Version: v2.6

Reconciliation

Reconciliation is the art of verifying transformed data as part of a data migration.

Data migration is a critical process for organizations looking to upgrade or change their systems. One of the essential components of a successful data migration is the verification that the data transferred from the source system to the target system is accurate, complete, and consistent with the requirements for the migration. This process helps in identifying and resolving discrepancies that may arise during the migration.

It is the very nature of a data migration that the data are transformed along the way. In most cases, this transformation is extensive and complicated. As a consequence, the data in the source system is not directly comparable to the migration result in the target system.

There seems to be a general conception that this verification is almost like an audit: that it must be done completely independently from the migration, and should verify that "nothing has changed".

This approach is problematic for two reasons:

  • Lots of things have changed, for good reasons.
  • Some things should not have changed.

Where everything must match identically, row by row and field by field, the approach in practice means performing the migration transformation twice: once for the actual migration, and, in parallel, a second time in order to produce something to compare against.

This is a very labor-intensive way to reconcile. It simply verifies that, given the same input, the two transformation processes produce the same result. It does not verify that the result of the data migration meets the expectations of the business.

A more nuanced approach is needed, one focused on what must be verified, what can be tested, and what is supposed to have changed, hence the need for predefined criteria or specific controls.

In the context of verification, Hopp distinguishes between these terms and processes:

  • Reconciliation is the process of verifying that the migration of data from the legacy source system to the new target system meets the expectations of the business owning the data.
  • Test is the verification, through test cases, that individual data items are migrated according to requirements. For instance, to verify that names are migrated correctly, it is sufficient to test a selected sample of names and confirm that each one is migrated as required. Note that this does not automatically imply that the names should be identical.
  • Migration Audit is the data produced by the migration process to document the transformations that have taken place. These audit data are crucial for the reconciliation process, bridging the transformation gap between the expectations and the migration result.

In short, it is the recommendation of Hopp that a migration project step back from any ambition to automatically compare the result of the actual migration to the result of a parallel transformation performed in a separate track.

Firstly, this is not a real reconciliation. Secondly, as a migration increases in scope and complexity, it is likely that this effort will prove too costly, too complicated, and too error-prone, and will be abandoned at some point along the way.

Instead, a project should define and develop a realistic, business-driven reconciliation between the golden business expectations and the migration result. In addition, a project should combine this with well-defined test cases, in order to verify, through testing, the correctness of migration results of lesser importance.

Background

In almost all data migrations, there is a business requirement for reconciliation. In general terms, the term covers a mechanism, preferably automated, to verify that the data from the legacy source system was migrated in accordance with requirements.

This is a large topic, and the purpose of this article is not to cover it in depth. The purpose is rather to place the Hopp software in the broader context of reconciliation, and to explain the rationale and the approach behind what Hopp provides.

While everybody agrees on the broader concept of reconciliation, it is more difficult to pin down once you get into the details.

In broad terms, reconciliation covers two aspects:

  • Completeness: that all in-scope data in the legacy source system is either migrated or discarded in accordance with requirements.
  • Correctness: that the data which was migrated is accurate and transformed in accordance with requirements.

Easily stated, but what does this actually mean? In conceptual terms, reconciliation means to reconcile expected results and actual results.

In some migrations, it is enough to count things. "We had x invoices in the legacy source system, y invoices were discarded by the migration as required, and z invoices were delivered to the target system. If x minus y equals z, everything is ok." For this kind of simple counting, the Hopp Portal is often quite enough to satisfy reconciliation requirements.
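
The counting check can be sketched in a few lines of Python. This is only an illustration: the function name and the invoice figures are hypothetical and are not part of Hopp.

```python
def counts_reconcile(source_count: int, discarded_count: int, delivered_count: int) -> bool:
    """Return True when every source item is accounted for: x - y == z."""
    return source_count - discarded_count == delivered_count


# Hypothetical figures: 1000 legacy invoices, 40 discarded as required, 960 delivered.
balanced = counts_reconcile(source_count=1000, discarded_count=40, delivered_count=960)

# One invoice is missing from the target, so the counts no longer add up.
unbalanced = counts_reconcile(source_count=1000, discarded_count=40, delivered_count=959)
```

A check like this confirms completeness only. It says nothing about whether the delivered invoices carry the correct values.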

But in many migrations, this simple approach does not work. Things then become much more complicated, and a generic migration tool like Hopp cannot provide a generic reconciliation. What a more advanced reconciliation actually means depends entirely on what is being migrated.

Hopp is built for complex data migration, so extensive transformations are a fact of life in any Hopp-driven migration.

Hopp delivers a safe and open mechanism to collect the audit data representing the actual-results side of the reconciliation equation. It is up to the migration project to establish a reconciliation of the audit data with the expected results, from outside Hopp.

The Challenge

Reconciling transformed data during data migrations can be quite challenging, due to several factors. Some of the key difficulties organizations face are:

  • One of the primary challenges is that data transformations often change the structure and content of the data. For example, duplicate customers might be merged into one, or products might be redefined, split into components, or bundled into more complex service and product offerings. This means that a simple one-to-one or number-to-number comparison between the source and target data is not always possible or sufficient.
  • Another difficulty is ensuring data quality. Data transformations can correct, or introduce, errors and inconsistencies, making it harder to verify that the data has been accurately migrated. Organizations typically invest time and resources in data cleansing and validation as part of their data migration.
  • The complexity of legacy systems also adds to the challenge. Legacy systems often have intricate data structures and dependencies that can complicate the migration process. Understanding and mapping these structures to the new system requires careful planning and expertise.

Additionally, reconciling transformed data requires robust methodologies and tools. Traditional methods, such as custom programs and packaged ETL solutions, often require a large degree of bespoke development. These efforts tend to become IT projects by themselves, building a one-off, use-once solution at minimum cost and time, which can compromise quality.

Finally, managing the change and getting all stakeholders on board can be challenging. Migrating to a new system often involves changes in processes and data structures. Organizations must make sure that all stakeholders understand and support these changes, so that the transition runs smoothly.

Our Approach to Reconciliation

At Hopp, we understand the importance of reconciliation in data migration. Over the years we have seen several different attempts to reconcile data migrations, more or less successfully.

There does not seem to be an accepted approach, or a standard way of doing this. Organizations tend to invent a way, and build bespoke software and processes to support it.

Hopp has a well-defined view of reconciliation: what it should cover and not cover, where a generic migration tool can support a specific reconciliation, and where a bespoke effort is needed to get all the way home.

Hopp divides the reconciliation into three parts: the business expectations, the migration result, and the migration audit data that bridges them.

Reconciliation flow

Reconciliation brings together the business expectations and the migration result; the migration audit data bridges the transformation between them.

One: Expectations

As one side of the reconciliation, it is indispensable that the business provides well-defined requirements.

This means that the business must define the dimensions it wishes to reconcile. For example, reconciliation dimensions could be:

  • Counts, such as customers, or accounts per customer.
  • Amounts, such as grand totals per account type, or totals per customer.

Together with these dimensions, the business must provide datasets expressing the values it expects for those dimensions. Ideally, the origin of these datasets should be different from the source data being migrated.

In many cases, the business will have financial, accounting, reporting, or data-warehouse systems in place in parallel to the operational systems being migrated. These systems are often the ideal place to go, both for a good take on which dimensions are relevant for reconciliation, and for the actual datasets expressing the business expectations to be met.

The datasets expressing the expectations of the business will in most cases be aggregated on the dimensions. In other words, the expectations for a given dimension are expressed as sums for each dimension value. If the dimension is account type, the dataset could contain a list of account types, each with the summed balance of all accounts of that type.
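
As a minimal sketch of such an aggregated expectation dataset, the Python below sums balances per account type. The account records and the helper name `sum_by_dimension` are assumptions made for illustration. In practice the figures would come from the finance or reporting system, not from code like this.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical account records from a reporting system: (account_type, balance).
accounts = [
    ("A", Decimal("100")),
    ("X", Decimal("100")),
    ("Y", Decimal("150")),
    ("Y", Decimal("50")),
]


def sum_by_dimension(rows):
    """Sum the amounts for each dimension value (here: account type)."""
    totals = defaultdict(Decimal)  # Decimal() starts every total at 0
    for dimension_value, amount in rows:
        totals[dimension_value] += amount
    return dict(totals)


# One summed balance per account type: the expectation dataset for that dimension.
expected_by_account_type = sum_by_dimension(accounts)
```

`Decimal` is used here instead of floating point so that monetary sums are compared exactly.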

Two: Migration Result

The other side of the reconciliation is the result of the migration. For the final reconciliation, the migration result must be expressed in datasets extracted from the target system.

For the reconciliation to be meaningful, the datasets extracted from the target system must be on the same dimensions defined by the business.

This is where things get complicated, because the migration has transformed the data. This means that the dataset extracted from the target system normally is not immediately comparable to the expectation datasets provided by the business.

As noted earlier, it is of course an option to develop functionality to perform the required transformations on the expectation data, in order to reconcile it against the migration result.

Hopp does not recommend this approach. To a large extent it means developing two parallel transformations in order to reconcile. That is a lot of effort for little, and indeed often negative, value.

A difference uncovered by the reconciliation should ideally signal a problem or an error in the migration. But with the duplicated-transformation approach, it could instead mean that there is an error in the duplicated transformation logic. Even worse, if the same error is present in both the migration logic and the duplicated logic, the reconciliation may completely miss a serious issue.

Three: Migration Audit Data

The approach of Hopp is to let the migration deliver migration audit data documenting the transformations performed in the migration. In the end, it is these audit data that build the bridge and enable the reconciliation.

The Hopp migration tool includes interfaces that allow the migration team to collect audit data from the migrated data.

The data presented by the Hopp tool for audit collection through these interfaces are the pure results of the transformations performed by the migration engine. In other words, for a given Business Object, the audit interfaces receive both the input data to, and the output data from, the migration engine.

The audit data contains complete data-lineage information. This means that it is straightforward for the audit collection to pair the input and output data, and so detect the actual transformations performed by the Hopp tool.

It is important to underline that the data passed to the audit collection interfaces are the result of all the mapping, transformations, and rules applied by the Hopp migration tool. Through the interfaces, you collect the result. You do not replicate the transformation.

The audit collection interface acts as a bridge between the business expectations, expressed in source terms, and the migration result, expressed in target terms. This completely removes the need to duplicate the migration rules merely to translate the business expectations into target terms and values.

Two samples illustrate the point

Sample 1

  • A customer holds an account with a balance of 100, of account type A.
  • In the target system, account types are numbered, not named. Account type A becomes account type 1.
  • The migration has transformed account type A to account type 1.
  • The audit interface can collect: account type A to account type 1, a balance of 100.

Sample 2

  • A customer holds two accounts: one with a balance of 100, of account type X, and one with a balance of 200, of account type Y.
  • In the target system, account types X and Y are both merged into account type 2.
  • The migration has transformed account types X and Y to account type 2.
  • The audit interface can collect: account type X to account type 2, a balance of 100; and account type Y to account type 2, a balance of 200.

Reconciliation of Sample 1 and 2

  • The expectation of the business:
    • Account type A: 100
    • Account type X: 100
    • Account type Y: 200
  • The migration result from the target system:
    • Account type 1: 100
    • Account type 2: 300
  • The collected audit data bridges the transformation, in particular the merge of account types X and Y into account type 2, and so completes the reconciliation:
    • Account type A to account type 1: 100
    • Account type X to account type 2: 100
    • Account type Y to account type 2: 200
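
The reconciliation of the two samples can be sketched in Python as follows. The data layout and the helper names `totals` and `differences` are assumptions made for illustration and do not represent the Hopp audit interface. The sketch sums the audit rows by source account type to compare against the expectations, then sums them by target account type to compare against the migration result.

```python
from collections import defaultdict

# Business expectations in source terms (from the samples above).
expected = {"A": 100, "X": 100, "Y": 200}
# Migration result in target terms, as extracted from the target system.
migrated = {"1": 100, "2": 300}
# Collected audit rows: (source account type, target account type, balance).
audit = [("A", "1", 100), ("X", "2", 100), ("Y", "2", 200)]


def totals(rows, key_index):
    """Sum the balance (position 2) per source type (0) or target type (1)."""
    sums = defaultdict(int)
    for row in rows:
        sums[row[key_index]] += row[2]
    return dict(sums)


def differences(left, right):
    """Return {key: (left value, right value)} wherever the two sides disagree."""
    keys = sorted(left.keys() | right.keys())
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


# Step 1: expectations against the audit data, both in source terms.
source_side_diffs = differences(expected, totals(audit, 0))
# Step 2: audit data against the migration result, both in target terms.
target_side_diffs = differences(totals(audit, 1), migrated)

reconciled = not source_side_diffs and not target_side_diffs
```

Neither step re-implements the rule that maps X and Y to account type 2. The mapping is read from the audit rows, so a difference on either side points at the data, not at a second copy of the transformation logic.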

Conclusion

Reconciliation is a vital component of data migration, ensuring that the data transferred from the source system to the target system is accurate and consistent. However, it is not the only process that should be in place. It should be viewed as one of several activities that verify the correctness of the migration. Reconciliation should be seen as part of a whole, including test, acceptance, and audit processes.

We believe that, to arrive at a fit-for-purpose solution, a project needs to change its approach and focus on the specific need for reconciliation, instead of pursuing a pure migration-replication approach. A project taking the replication approach will end up spending too much time, effort, and money without getting the expected or required outcome. A differentiated and focused approach is much more beneficial.

By leveraging Hopp's approach to reconciliation, including detailed audit interfaces, reconciliation reports, and continuous monitoring, a project can verify the completeness and correctness of its data migration.

Where Hopp already runs the migration, its audit data is the natural foundation for reconciliation. Building on it is a surer step than standing up a separate solution that has to reproduce the same transformations.