Data Quality Assurance and Monitoring

Prevention is better than cure. Quality can largely be assured by building proactive assurance controls into your systems and processes at design time. Avoid bad data through interface controls, data standards, data models, database and data-processing controls, and business controls.

Data Interface Exchange Controls

Data exchange is one of the biggest sources of data quality issues in this ever-increasing world of interconnected systems. Data exchange across systems needs controls at the record, file, application, and database levels.

These controls are used when a soft file is generated from one system and uploaded into another. The source and destination could be within or across organizations or departments. Examples of such interface files include:

  • the billing file from your supplier.
  • the customer file from your outsourced application capture operator.
  • the point-of-sale transaction file being sent to core processing systems.
  • the disbursement file crediting the payees.
  • the online web capture file from the hosting database to your core systems.
  • the production details file from your sub-contractor.

Checksum Data Interface Control

A checksum is a method by which the entire content of the interface file is subjected to a mathematical function to arrive at a number, called the 'checksum'. If any character of the file changes, the checksum will change. The checksum can be sent as part of:

  • the file name.
  • the header or footer within the file.
  • a separate transmission.
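As a minimal sketch of how a checksum control can be implemented, the sender computes a hash over the whole file and the receiver recomputes it and compares. This example uses Python's standard `hashlib` with SHA-256; the function name `file_checksum` and the chunk size are illustrative choices, not a prescribed standard.

```python
import hashlib

def file_checksum(path: str, algorithm: str = "sha256") -> str:
    """Compute a checksum over the entire file content, reading in chunks
    so that large interface files do not have to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()
```

The sending system would place the returned hex string in the file name, header/footer, or a separate message; the receiving system recomputes it on arrival and rejects the file on any mismatch.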

Without a checksum, any garbling, corruption, or manipulation of the file during transmission between source and destination could go undetected.

There are hundreds of different checksum algorithms. A checksum algorithm has to ensure that it does not produce the same value for different inputs.

TIP- The checksum control is typically used for mission-critical files. Checksums carry a processing load, since the receiving system must recalculate the checksum. For typical business purposes, lighter-weight methods such as 'Aggregation Control Checks' are used.

Duplicate File Data Exchange Controls

The receiving system should be able to detect whether the interface file has already been uploaded.

If a file has already been uploaded and has populated the destination system's database, it is possible that further online activity has happened on that data. A duplicate file upload may overwrite the changes made to the database since the last file was uploaded.

An interface file should carry an ID number in its header or footer. The destination system should maintain a table recording the status of each file uploaded, and check that table before uploading a file to confirm it has not already been processed.

There are many other ways of checking whether a file has already been uploaded, such as checking the date and time stamps of destination tables, or applying other business rules that can validate that a file has not already been uploaded.
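The file-status table described above can be sketched as a small registry. This is a hypothetical in-memory illustration; in practice the statuses would live in a database table so they survive restarts.

```python
class FileUploadRegistry:
    """Tracks which interface files have already been uploaded.
    An in-memory sketch of the status table a destination system
    would normally keep in its database."""

    def __init__(self):
        self._statuses = {}  # file_id -> status string

    def can_upload(self, file_id: str) -> bool:
        """A file may be uploaded only if it has never been seen before."""
        return file_id not in self._statuses

    def mark(self, file_id: str, status: str) -> None:
        """Record the outcome of an upload attempt for this file ID."""
        self._statuses[file_id] = status
```

Before processing, the loader calls `can_upload()` with the ID from the file's header or footer and skips the file if it returns False.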

Duplicate Records

There should not be duplicate records within the file being uploaded, or across multiple files.

For example, a duplicate record (such as the same disbursement record or the same billing record) has two possible implications:

  • You could bill twice or pay twice.
  • You could end up disturbing activities already performed on the record after the previous upload.

The system should enforce this through database primary keys on the destination tables, and place additional checks so as not to overwrite records that have already been updated.
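In addition to the database's primary-key enforcement, the loader can screen the file itself for duplicate keys before attempting any inserts. A minimal sketch, assuming records arrive as dictionaries and the key field names are supplied by the caller:

```python
def find_duplicate_keys(records, key_fields):
    """Return the set of primary-key values that appear more than once
    in a batch of records, so the load can be rejected or cleaned first."""
    seen, dupes = set(), set()
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes
```

The same function can be run across several files' records at once to catch duplicates spanning multiple files.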

Data File Aggregation Control Checks

This involves placing the overall row count, the sums of numeric fields, and similar totals in the header or footer of the interface file. The receiving system should compute the aggregates from the transaction-level records and reconcile them against the pre-aggregated values in the header/footer.
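The reconciliation step can be sketched as follows. The field names (`amount`, `row_count`, `amount_total`) are illustrative assumptions about the file layout, not a fixed convention.

```python
def reconcile_aggregates(records, footer):
    """Compare the row count and the sum of a numeric field computed from
    the transaction records against the control totals in the footer."""
    count = len(records)
    total = round(sum(r["amount"] for r in records), 2)
    return count == footer["row_count"] and total == footer["amount_total"]
```

If this returns False, the file should be rejected as incomplete or corrupted rather than partially loaded.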

Database-to-Database Controls / Interface-File-to-Database Controls

Even if the interface file received by the destination system has integrity, its upload into the destination database may still develop problems. Therefore, the final check on an uploaded interface file is to verify the after-state of the database post-upload. For example, a customer ID in the source system may not exist in the destination system (because the customer master has not been replicated), resulting in the rejection of the record.

Database-to-database checks are typically run after the interface files are processed, using various data-monitoring methods and automated or manual report reconciliation.
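One simple form of post-upload reconciliation is comparing the record identifiers present in the source extract against those that actually landed in the destination table. A sketch, assuming both sides can supply their ID lists:

```python
def reconcile_tables(source_ids, dest_ids):
    """Post-upload database-to-database check: report records that failed
    to land in the destination, and records present there unexpectedly."""
    source, dest = set(source_ids), set(dest_ids)
    return {
        "missing_in_dest": source - dest,      # rejected or dropped on upload
        "unexpected_in_dest": dest - source,   # possible duplicate or stray data
    }
```

Any non-empty result feeds the reject-investigation process described under audit controls below.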

Time Cut-off Controls for Batch processing

Certain files have to be uploaded before another business cycle or process starts. Typically this is linked to bringing systems back online after batch processing, or to some other time-bound process. Time cut-off controls allow interface files to be uploaded only during certain time slots. For example, the card spending file has to be uploaded by, say, 2:00 a.m., before the card statement program starts, so that card statements are generated by 8:00 a.m. in time for the courier company's pick-up.

Lack of this control can cause conflicts with time-critical programs, corrupting both the file uploads and the time-bound program. It can also compromise data integrity if data processing and uploading happen at the same time.
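A time cut-off control can be as simple as a guard that refuses an upload outside the permitted window. A minimal sketch, assuming a same-day window such as 00:00 to 02:00:

```python
from datetime import datetime, time

def within_upload_window(now: datetime, window_start: time, window_end: time) -> bool:
    """Allow uploads only inside the permitted time slot, so batch jobs
    (e.g. statement generation) never run concurrently with file loads."""
    return window_start <= now.time() <= window_end
```

The loader checks this before starting, and either queues the file for the next window or raises an alert if the cut-off was missed.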

Data Upload Validation Controls for Data Exchange

File uploads should be accompanied by validation rules to ensure that correct data gets loaded into the system. At the time of upload, the system should check whether the file contains valid customer codes, supplier codes, product codes, dates, and so on.
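A record-level validation step can be sketched as below. The field names and the reference sets of valid codes are hypothetical; in practice they would come from the destination system's master tables.

```python
from datetime import datetime

def validate_record(rec, valid_customers, valid_products):
    """Return a list of validation errors for one uploaded record,
    checking codes against reference data and the date format."""
    errors = []
    if rec.get("customer_code") not in valid_customers:
        errors.append("invalid customer code")
    if rec.get("product_code") not in valid_products:
        errors.append("invalid product code")
    try:
        datetime.strptime(rec.get("date", ""), "%Y-%m-%d")
    except ValueError:
        errors.append("invalid date")
    return errors
```

Records returning a non-empty error list are rejected and written to the audit trail described in the next section, rather than loaded.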

Upload Audit Controls at File and Record Level for Data Interface

There should be records maintained on:

  • which file was received, and when.
  • when it was uploaded.
  • whether it was rejected or accepted.
  • whether it was partly uploaded.
  • which records were rejected, and why.

Without this audit trail, one will not know how the upload process happened and may miss a duplicate file getting uploaded. With audit data, one can investigate the rejects and explain the outcome of the processing that followed the upload. For example, this audit may help IT forewarn the billing function of a delay in billing, or produce a list of invalid customer records for the customer unit to correct.
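The audit records listed above can be sketched as a small log structure capturing both file-level status and record-level rejects. This is an illustrative in-memory shape; a real system would persist it as audit tables.

```python
from datetime import datetime

class UploadAuditLog:
    """File- and record-level audit trail for interface uploads:
    what was received when, its status, and which records were rejected."""

    def __init__(self):
        self.entries = []

    def log_file(self, file_id, status):
        """Record a file-level event: received / accepted / rejected / partial."""
        self.entries.append({"file_id": file_id, "status": status,
                             "timestamp": datetime.now(), "rejects": []})

    def log_reject(self, file_id, record_id, reason):
        """Attach a rejected record and its reason to the file's entry."""
        for e in self.entries:
            if e["file_id"] == file_id:
                e["rejects"].append((record_id, reason))
```

The reject list is exactly what feeds the example above: the list of invalid customer records handed back to the customer unit for correction.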

TIP- There are many enterprise application integration tools which can create robust application data exchange interfaces. They also help create data exchange standards, which provide excellent economies of scale (like having one standard customer-information interface data model for all data exchange). Too often, due to cost constraints, teams fall into the trap of point-to-point file exchange. We would recommend giving a higher level of empowerment and authority to the integration team.

TIP- Create a dedicated data management role in IT (just as you have a data steward in the business domain).