If you’re going to change databases, do it in one go

At Uboot, there was the suggestion that we should change database technology. This was because we would save money with by moving away from Oracle to e.g. MySQL.

It was further decided, as we had 200k+ lines of code, and 100+ tables, that if a migration was to occur, it shouldn't happen  "all at once", but instead one would have some parts of the system migrated with other parts of the system not yet migrated.

But the strategy of migrating databases piece by piece has the following problems:

  1. Data consistency. Initially, all important data are within the source database. It is possible to guarantee certain consistency constraints (e.g. all photos are owned by existing users). However, if e.g. photos are migrated to MySQL, while users remain in Oracle, no such consistency can be guaranteed. Inconsistency has the consequence that software must accept not only valid data, but also invalid data. It is impossible to test the software for all combinations of invalid data, and with the rule that “software which is not tested does not work”, one can deduce that the software will then not work. The set of inconsistencies will increase over time, so the set of users who experience bugs will also increase over time. Afterwards, going from an inconsistent state to a consistent state is nearly impossible.
  2. Reporting over multiple system areas. With one database, one can ask a question such as “how many photos belong to users logged in in the last month”. A “SQL join” is used. However if the tables reside in different database, no such query can be asked. One would have to export the data from one database to another, in order to ask the query. Or export all data into a centralized data warehouse. Or write software to do what a single line of "join" would do. All this effort is saved, by using a single database.
  3. Point-in-time recovery. With one system, if the database crashes or an error is made (e.g. “drop table account” is executed by an employee), a “point-in-time recovery” can be performed. Data is restored to its consistent state as it was at a certain point in time. If one uses more than one database, a point-in-time recovery will be more difficult. For example if the databases are out-of-sync by 1 second, then some photos will exist for which there are no users, or vice versa, and data consistency problems exist as described in point 2.
  4. Spending money on migration; no saving in license fees. If a migration is done, time/money is spent. Not only in doing the migration, but fixing performance problems after the system is done. If money can be saved on the Oracle license, perhaps this time is worth it, however if one is using multiple databases then one is still paying for Oracle.

The solution is either to change databases in one go, or simply don't change databases. You want to have your data in one place, at all times.

P.S. I recently created a nerdy privacy-respecting tool called When Will I Run Out Of Money? It's available for free if you want to check it out.

This article is © Adrian Smith.
It was originally published on 25 Jun 2013
More on: Databases | Software Architecture | uboot.com