Merging and Branching a Database
Webinar: DB Maestro and Delphix
Three-part post
- part 1 Version Control Meets Data Control
- part 2 Databases for every developer like source code?
- part 3 Merging and Branching a Database (this post)
In the application world, merging and branching looks something like the picture below. Copies are made from confirmed releases on the trunk which can then be operated on by developers as a branch of the main code. Changes are made, tested, and then merged back into the trunk for a new release.
Imagine if we could do this in the database world as well, similar to this image:
When developing applications that run on top of a database, each branch of code theoretically requires a copy of the source database where the necessary schema and data changes can be applied. For maximum development capability, each developer would need a full copy of the source database so they could work unimpeded, without the risk of breaking changes that other developers are making. That’s the theory, anyway. For this to truly work, we also need a way to merge changes from other developers into each development environment, as well as a way to merge developer changes back into the trunk environment.
- Challenge #1: Creating additional databases for development work to be performed in parallel to the trunk.
- Challenge #2: Merging branches back into a trunk so at the end of the day everyone’s work is combined into a single release that can be pushed to testing and hopefully to production.
Creating a copy of a database can take days or even weeks and require extensive resources. This is where database virtualization comes into play. With database virtualization, creating a database copy takes only a few clicks of a mouse. Within a couple of minutes, and with almost no storage footprint, a fully functional read/write database is available for development.
The time and space savings come from the fact that virtual databases share the majority of their storage. Only a single copy of the production database and its change records is needed to create as many branches, for as many developers, as required. Each developer or development team has their own full copy of the source database(s), and different versions of the code can be correlated with tagged versions of virtual databases.
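To make the storage sharing concrete, here is a minimal sketch of the copy-on-write idea in Python. It is only an illustration under simplifying assumptions, not how any particular virtualization product is implemented: the class names `SourceSnapshot` and `VirtualDatabase` and the notion of numbered "blocks" are hypothetical.

```python
class SourceSnapshot:
    """Hypothetical read-only set of blocks captured from the source database."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_id):
        return self._blocks[block_id]


class VirtualDatabase:
    """Hypothetical virtual copy: shares the snapshot, stores only its own changes."""

    def __init__(self, snapshot):
        self.snapshot = snapshot      # shared with every other virtual copy
        self.private_blocks = {}      # blocks this copy has modified

    def read(self, block_id):
        # Prefer this copy's own version of a block, else fall back to the shared one.
        if block_id in self.private_blocks:
            return self.private_blocks[block_id]
        return self.snapshot.read(block_id)

    def write(self, block_id, data):
        # Copy-on-write: the shared snapshot is never modified.
        self.private_blocks[block_id] = data

    def private_block_count(self):
        return len(self.private_blocks)


# One shared snapshot, two independent developer copies.
snapshot = SourceSnapshot({i: f"row data {i}" for i in range(1000)})
dev_a = VirtualDatabase(snapshot)
dev_b = VirtualDatabase(snapshot)

dev_a.write(7, "changed by developer A")
```

In this sketch both developers see all 1,000 blocks, but after developer A's write only one extra block is stored, and developer B's copy still reads the original contents of block 7.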
This solves one of the most challenging aspects of database management in development environments while also solving the problem of branching for source control. However, the question of how to merge changes from different development environments with different copies into a single coherent version still remains.
Figuring out how to combine everything together (let’s call it merging, shall we?) is the next challenge, as conflicts arise (as they usually do) and all involved developers need to be consulted during the merge.
But as we can see from this diagram, remembering everything that was changed and what it might conflict with is not practical. Even locating and focusing on the conflicts to be merged can be a time-consuming, error-prone, ‘eyeball-tested’ process. Risky is one word that leaps to mind. Damage is another one.
We really need to ask ourselves:
- Did we consider all changes? Overlooking a change that was introduced to Branch 1 and needs to be updated in the trunk will backfire. If Murphy is right (and it is his law) it will happen when we are already in production.
- Do we understand the history and origin of each change? Locating changes is one thing, but do we really understand why and where each change was introduced? Should it be integrated? If a branch and a trunk are different, which version is the right one? The last thing we want to do is overwrite the new version of a hot bug fix in the trunk with an older version from Branch 2 while performing the integration!
- Can we recognize a conflict if we see one? Ok, so we definitely know that object “x” in Branch 2 was changed, and we do see a difference in the trunk. So it’s safe to overwrite the trunk, right? Wrong! What happens if the same object undergoes different changes in both trunk and branch? Neither version is the right one, because the right one is a combination of both.
As we can see, dealing with branch merges can be quite a challenge. Eyeballing changes is risky business. Simply comparing branch and trunk to find these changes is risky as well, because we want to make sure we don’t overwrite necessary changes: it’s not A or B, it’s a combination of both.
Easy creation of branches or developer sandboxes can be achieved with database virtualization. Safe branch merges require conflict detection and merging capabilities that are part of database deployment automation and connected to database version control, for safe and informed decision making.
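The questions above boil down to comparing three versions of each object rather than two: the common baseline the branch was created from, the trunk, and the branch. The following Python sketch illustrates that three-way comparison under simplifying assumptions. Object definitions are reduced to plain strings, and the names `Action`, `classify`, and `plan_merge` are hypothetical, not part of any specific tool.

```python
from enum import Enum


class Action(Enum):
    UNCHANGED = "no difference between trunk and branch"
    KEEP_TRUNK = "only the trunk changed; keep the trunk version"
    TAKE_BRANCH = "only the branch changed; apply the branch version"
    CONFLICT = "both changed differently; a manual merge is needed"


def classify(baseline, trunk, branch):
    """Decide what to do with one object, given its three versions."""
    if trunk == branch:
        return Action.UNCHANGED
    if branch == baseline:
        return Action.KEEP_TRUNK    # e.g. a hot fix applied to the trunk
    if trunk == baseline:
        return Action.TAKE_BRANCH   # safe to bring the branch change in
    return Action.CONFLICT          # the right version is a combination of both


def plan_merge(baseline, trunk, branch):
    """Classify every object that appears in any of the three versions."""
    names = sorted(set(baseline) | set(trunk) | set(branch))
    return {
        name: classify(baseline.get(name), trunk.get(name), branch.get(name))
        for name in names
    }


# Illustrative data only: object name -> definition at each point.
baseline = {"orders": "v1", "customers": "v1", "get_total": "v1", "audit": "v1"}
trunk = {"orders": "v1", "customers": "v1", "get_total": "v2-hotfix", "audit": "v2-trunk"}
branch = {"orders": "v2", "customers": "v1", "get_total": "v1", "audit": "v2-branch"}

plan = plan_merge(baseline, trunk, branch)
for name, action in plan.items():
    print(f"{name}: {action.name}")
```

A plain two-way comparison of trunk and branch would flag `get_total` and `audit` in the same way, as "different". With the baseline included, `get_total` is recognized as a trunk-only hot fix that must not be overwritten, while `audit` is a true conflict that needs a developer's decision.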
See these practices in action!
Join us for a webinar with DB Maestro Thu, Oct 24, 2013 12:00 PM – 1:00 PM EDT.