Data Control

September 9th, 2013


Video of a webinar on Data Control given by me and James Murtagh of Red Gate

Businesses today face a difficult conundrum: industry giants such as Google, Facebook, and others have taught people to expect service that’s reliable and fast, yet budgets are getting tighter by the fiscal year. Instead of being able to provide quality, on-time applications that are bug free and high performance, struggling IT departments are forced to cut back on systems and resources for developers.

The most vital resource for applications can also be one of the most cumbersome: the database. Data is the backbone to our most important applications, and as such it is stockpiled and maintained in a robust database engine—yet the contents can become quite large and unruly. This makes data difficult to manage, particularly when mobility is required. A lack of data mobility can cause severe issues in the lifecycle of a product as it impacts good QA and development.

This problem has become so widespread in fact that development teams have turned to new measures in order to complete their work despite bottlenecks in data management. These measures are collectively summed up as the DevOps movement.

From Wikipedia: DevOps (a portmanteau of development and operations) is a software development method that stresses communication, collaboration and integration between software developers and information technology (IT) professionals. DevOps is a response to the interdependence of software development and IT operations. It aims to help an organization rapidly produce software products and services.

One of the most important aspects of DevOps is release management. Standardized development environments are becoming more common and tools like git and SVN are almost universally usable to ensure a quality release and deployment process. This facilitates the goals of fast development, quality assurance, and rapid feature/maintenance release schedules. There’s only one problem with this: the database doesn’t play along. Yet.

Databases have no release-based data control like developers have for code. With code it is easy to have a trunk with branches, merge in changes, and preserve integrity with a check-out and check-in process. In fact, most databases used for development purposes are not only woefully behind production in terms of data freshness, but also are used for development across multiple releases without any possibility of identifying how the data looked or should look at the time of release. The database is just there—a giant glob of data that doesn’t follow the same rules as the code and often gets in the way of an otherwise efficient process.

In the database world, it’s common to hear the adage “Cheap, fast, or good – pick two.” Some might even say you can only pick one. In all, the outlook is very bleak.

Here is what we at Delphix are seeing in the industry:

  • Inefficient QA ends up costing companies thousands or millions as bugs make their way into production.
  • QA delays mean extra work for project managers, QA teams, and IT teams as people scramble to find and fix issues and manage project deadlines.
  • Developers and QA teams sharing databases causes bottlenecks that result in difficult coding and testing practices and slower time to delivery.
  • Creating development or QA databases from subsets of production data (usually due to size) results in bugs caused by unrepresentatively small data volumes.
  • Slow provisioning and infrastructure rollout means delays that impact an application from the very foundation on up and result in massive schedule overruns.

Let’s look at these issues in a bit more detail.


1. Inefficient QA


One Delphix customer we spoke with spent 94% of every QA cycle building the QA environment and only 6% actually running the QA testing suite, and this cycle repeated every single time a set of features required testing. The long pole in the environment build was provisioning the database the code had to run on. The burden was so overwhelming it completely monopolized their allotted time.


2. QA delays make fixing bugs more expensive


When building a QA environment takes a long time, the integrity of the application suffers. In the case above, with only 6% of QA time spent on actual testing, it is very easy for bugs to go unnoticed. When this happens there are only two possible outcomes: either the bug is eventually caught in QA, but the delay in capturing it means more code must be rolled back, fixed, and retested; or the bug goes unnoticed and makes its way into production, where it can cost customer confidence or large sums of revenue. The longer it takes for a problem to be found and fixed, the higher the cost.
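
As a rough sketch of that cost curve (the stage multipliers and base cost below are assumptions for illustration, not measured data):

```python
# Hypothetical illustration: the later a bug is caught, the more it costs
# to fix. Multipliers are illustrative assumptions, not measured figures.
stage_multiplier = {"development": 1, "qa": 5, "production": 50}

def fix_cost(stage, base_cost=100.0):
    """Cost to fix a bug first detected at the given stage."""
    return base_cost * stage_multiplier[stage]

for stage in ("development", "qa", "production"):
    print(stage, fix_cost(stage))
# development 100.0
# qa 500.0
# production 5000.0
```

Whatever the exact multipliers are in a given shop, the shape of the curve is the same: every week a bug sits undetected, the rollback-and-retest work around it grows.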


3. Using subset databases causes bugs


For most companies it is impossible to give every developer a full-size copy of the production database (some companies can’t even provide one full-size copy). Instead, they create subset copies of production for development and QA purposes. Writing and maintaining the scripts that build these subsets and verify data coherency and referential integrity takes a huge amount of time; in fact, one customer of ours estimates that 50% of their DBA time is dedicated to creating subset database clones. And because the data is not representative of production data, subsets often lead to the introduction of bugs. When the QA environment is similarly crippled, it may be impossible to isolate and verify a fix to the bug once it is caught. And with a subset, there will be bugs: another customer of ours estimated that 20% of their production bugs were due to using subsets.
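
A toy sketch of why subsetting scripts are so fragile (the tables and sampling strategy here are hypothetical): sampling parent and child tables independently leaves orphan child rows behind, which is exactly the kind of referential-integrity breakage those scripts must constantly be patched to avoid.

```python
# Hypothetical illustration: naively subsetting rows breaks referential
# integrity between a parent table and a child table.
customers = [{"id": i} for i in range(10)]
orders = [{"id": i, "customer_id": i % 10} for i in range(100)]

# A naive subsetting script samples each table independently.
subset_customers = customers[:2]   # keeps customer ids 0 and 1
subset_orders = orders[:20]        # keeps orders referencing ids 0..9

# Any order whose customer was not sampled is now an orphan row.
known_ids = {c["id"] for c in subset_customers}
orphans = [o for o in subset_orders if o["customer_id"] not in known_ids]
print(len(orphans))  # 16 of the 20 sampled orders point at missing customers
```

Real subsetting tools chase foreign keys to avoid this, but every schema change can break that chasing logic, which is where the maintenance burden comes from.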

 

Queries that run fine on subsets either run into new anomalies on production data, or hit the wall as performance of a query tanks when run on a full dataset size.


4. Development teams sharing one full copy leads to bottlenecks


An alternative to subsets (or sadly, sometimes used in combination with them) is having the development team share a single copy of the production database. Sharing a full copy requires that any changes to the database schema, content, or metadata pass through code review to ensure that the changes won’t break the work of other developers. Instead of performing their own work quickly and efficiently and merging it into the final release, developers are forced to work in a synchronous manner that wastes time. A large online auction site we talked to said this review cycle took one to two weeks before a developer could get changes into the shared database copy. In today’s fast-paced world that is simply unacceptable.

Because so many people are using the shared resource and because that resource is so big and time consuming to build, it can be a very long time between refreshes of the data. This leads to old unrepresentative data, which incurs its own set of risks every bit as dangerous as data subsets.



5. Slow environment builds and the culture of “NO”


A huge IT hurdle to tackle is creating development environments in the first place. Data is growing faster than our infrastructures are improving, and the size of our databases means that cloning or refreshing environments can take days or even weeks to accomplish. Because development environments are incredibly dependent on real-time data, this results in lost productivity and reliability. One large global bank told us that it generally took 6 months to build their environments out. Another customer who used NetApp storage snapshots said it still took them 2-4 days due to their system of entering tickets for requests and then trying to schedule time and resources from the storage admin, system admin, and DBA teams. All of this hassle leads to a culture of “NO”, where developers and other front-line product creators are simply told that meeting their needs is not possible. Sadly, many developers are used to the idea that if they need a copy of the data they will be told “NO” in most cases.


“Can I get a copy?” “No!” The result: developer motivation suffers.


Data is the Problem


Getting the right data to the right people at the right time and in the right form is the problem. The problem is data agility—or more to the point, a lack thereof.

A lot of people think they understand the extent of their data problems. But let’s go through what we’ve learned from our customers in the industry:

  • 94% of QA time spent building environments instead of running QA tests
  • 90% of lost development days attributed to waiting for development environments
  • 50% of DBA time spent making database copies
  • 20% of bugs caused by using subset data instead of full copies

Budget cuts put pressure on IT teams, which limits system resources, storage, and personnel. These limits reduce the number of systems available and increase the time needed to provision them, impacting the developers and contractors who work on those systems and leading to project delays and reduced quality.

 

The Total Cost of Data

Data is expensive. In most calculations, the total cost of data only includes storage costs; however, storage costs are just the tip of the iceberg. For every GB or TB of storage there are teams of storage admins, system admins, and DBAs who have to manage the storage, move data, backup data, and provision clones for development and other needs. The more data there is to move, the longer it takes and the more expensive it is due to the personnel involved, schedule overruns, and time spent discussing the requirements and tasks in meetings. The total cost of data goes far beyond simple storage costs.
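
As a back-of-the-envelope model (every number below is hypothetical, chosen only to make the point), the labor line items quickly dwarf the raw storage line item:

```python
# Hypothetical cost model: storage is only one line item in the total
# cost of a database copy. All rates and hours below are assumptions.
def total_cost_of_copy(tb, storage_per_tb=300.0,
                       admin_hours=40, meeting_hours=10,
                       hourly_rate=100.0):
    """Return (storage cost, people cost) for one provisioned copy."""
    storage = tb * storage_per_tb                        # raw storage
    labor = (admin_hours + meeting_hours) * hourly_rate  # admins, DBAs, meetings
    return storage, labor

storage, labor = total_cost_of_copy(tb=7)
print(storage, labor)  # 2100.0 5000.0 -- labor exceeds storage
```

Plug in your own rates; the point is that the people and schedule costs scale with every copy moved, not just the disks it lands on.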

 

Delphix Solves the Data Problem


I’ve written extensively on Delphix elsewhere. You can see some of the more pertinent cases here:

In a nutshell, Delphix makes creating clone copies so fast that it is nearly unimaginable to even the most modern of IT teams. These clones can be provisioned in only a few minutes no matter the size of the source database, and they require almost no storage overhead because duplicate blocks are shared across all clone copies. Because the clones are fast and almost free, they can be provisioned for all the developers; in fact, a developer can easily be given multiple copies of the full production database. The clones can also be made fresh, just a few minutes behind production, in a matter of minutes.
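
A conceptual sketch of that block sharing (this illustrates copy-on-write thin cloning in general, not Delphix’s actual implementation):

```python
# Conceptual sketch of copy-on-write thin cloning: every clone shares the
# unchanged blocks of the source image and stores only its own writes.
class ThinClone:
    def __init__(self, source_blocks):
        self.source = source_blocks  # shared, read-only base image
        self.delta = {}              # blocks this clone has overwritten

    def read(self, block_id):
        # Prefer this clone's own write; fall back to the shared base.
        return self.delta.get(block_id, self.source[block_id])

    def write(self, block_id, data):
        self.delta[block_id] = data  # only changed blocks consume space

production = {i: f"block-{i}" for i in range(1000)}  # stand-in for a big DB
dev1, dev2 = ThinClone(production), ThinClone(production)
dev1.write(42, "dev1-change")

print(len(dev1.delta), len(dev2.delta))  # 1 0
print(dev2.read(42))                     # block-42 (dev2 is unaffected)
```

This is why a second or tenth clone costs almost nothing: until a developer writes to a block, every clone is reading the same shared copy.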

When this technology was explained to one company we work with, the overwhelming response was “No way.” When the first clone of a 7TB database was created in 7 minutes and a second clone (from a time period in the past, no less) was created in an additional 10 minutes the response was, “No freaking way.” The implications of having two full-sized fresh clones to any point in time in a matter of minutes with no additional storage overhead were almost impossible to comprehend.


 

Database Clone Copies with Delphix: Full, Fresh, Fast, Free


 

What the Industry can accomplish with Delphix

 

  1. Efficient QA: Low cost, high utilization
  2. Quick QA: Fast bug fixes and problem resolution
  3. Developer Agility: Everyone gets a database for parallelized development
  4. Full DBs: Fewer bugs in development, QA, and production
  5. Fast builds: Fast development and the culture of Yes


1. Efficient QA


With Delphix, QA environments can be spun up in minutes and branched directly from the development database.

 

2. Quick QA

 


Because QA environments can be spun up quickly, QA can test immediately at the end of a sprint, finding bugs sooner and reducing the time and work needed to fix them.


3. Developer Agility

 


With Delphix, each developer can have one or more full copies of a database and can refresh with new data in minutes. Developers no longer block one another at a single shared point of contention.

 

4. Full DBs

 


With Delphix, the development databases are full copies so there are no longer bugs due to small and inadequate representations of real data.

 

5. Fast Builds

 


With Delphix, developers can provision their own clone copies in minutes without having to coordinate with storage admins, system admins and DBAs.


DevOps Management with Delphix


A big question is: If each developer has their own database, how do we manage the merging of their changes into a release? This is where Red Gate Schema Compare comes into play. For example, consider this process:

  1. Delphix: Create 3 Virtual Databases
    1. DEV1
    2. DEV2
    3. TRUNK
  2. Have two developers perform their work
    1. Developer 1 works on DEV1
    2. Developer 2 works on DEV2
  3. Red Gate Schema Compare performs merges
    1. DEV1 changes move into TRUNK
    2. DEV2 changes move into TRUNK
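
The merge step above can be sketched conceptually (Red Gate Schema Compare is the real tool here; this diff-and-apply is only an illustration of the idea, not its API):

```python
# Conceptual sketch of step 3: compute each developer's schema changes
# relative to the branch point, then apply them to TRUNK.
def changes_since(dev_schema, baseline):
    """Objects the developer added or altered since branching."""
    return {name: ddl for name, ddl in dev_schema.items()
            if baseline.get(name) != ddl}

baseline = {"orders": "CREATE TABLE orders (id INT)"}  # schema at branch time
trunk = dict(baseline)                                 # the TRUNK virtual database

dev1 = {"orders": "CREATE TABLE orders (id INT, total NUMBER)"}  # DEV1 alters orders
dev2 = {"orders": "CREATE TABLE orders (id INT)",
        "audit":  "CREATE TABLE audit (id INT)"}                 # DEV2 adds a table

trunk.update(changes_since(dev1, baseline))  # DEV1 changes move into TRUNK
trunk.update(changes_since(dev2, baseline))  # DEV2 changes move into TRUNK
print(sorted(trunk))  # ['audit', 'orders']
```

Diffing against the branch-point baseline (rather than against the current TRUNK) is what keeps DEV2’s untouched copy of `orders` from clobbering DEV1’s already-merged change.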

 

This process can be easily summed up in a visual flow:

 

[Figure: flow of DEV1 and DEV2 changes merging into TRUNK]

 

From a timeline point of view, the process is as simple as it is agile:

[Figure: timeline of parallel development and merges]

 

Quality Assurance can easily be a part of this process. The steps are more robust, but with near instant cloning it becomes trivial while remaining thorough:

[Figure: timeline including QA cloning and testing]

A few last words about Developers, DBAs and DevOps


 

 


Comments

  2. September 9th, 2013 at 19:45 | #1

    Hi Kyle,
    this may sound like a dumb question, but I still don’t really know what kind of environments need a setup where every developer gets his own database… As regards the environments I’ve worked in, I think that not only is there no desire for this, but perhaps even a feeling that having developers work on the same test database provides some kind of quality control (in the sense that at least, no one may break anyone else’s functionality…)
    Is this a matter of scale (very big companies versus medium ones) or am I missing something about the advantages of the one-database-per-developer scenario?
    Many thanks,
    Sigrid

  3. khailey
    September 9th, 2013 at 20:15 | #2

    Hi Sigrid, Thanks for the feedback. Most sites I talk to use a database per developer where the database is a subset of the source database (typically “production”). This of course causes problems as explained above, where code that runs on subsets breaks in production. For reference, RBS said they spend 50% of their DBA time making subsets. They use subsets since they don’t have the resources to provision full copies. The full copies are made into subsets with scripts, and those scripts are constantly breaking as things change and thus require modification. Another was a group at Ebay where the team shared one copy of production, and when developers wanted to change the database they submitted a request for code review, and the code review took 1-2 weeks. Personally that would bother me and impact my motivation. I have direct experience with this from when I worked for a startup as the DBA and the changes had to go through me. Generally it only took a day or half a day for me to validate a change and get it into the development database, but even that delay caused grumblings on the dev team, and they made the unilateral decision to go to an Entity-Attribute-Value (EAV) type database where they could make changes to the schema without stepping on each other’s feet while also bypassing me. The performance of this database was atrocious, and this little story shows one reason relational databases gave rise to NoSQL. See http://martinfowler.com/bliki/NoDBA.html
    For EAV performance and maintainability see the first couple of minutes of this awesome video http://www.roughsea.com/vids/SQL_Best_Practices.html
    more on EAV on wikipedia http://en.wikipedia.org/wiki/Entity-Attribute-Value_model
    One of Delphix’s customers Stubhub went from 5 copies of their database in development to 120 after using Delphix and saw at least 20% increase in developer productivity.
    Seems clear that having more copies increases productivity both for larger companies like Ebay and Stubhub as well as for a small company like the startup where I worked as a DBA. The startup didn’t have Delphix, so it used EAV, which had huge problems; but now there is Delphix, so there is a way to have parallelized, non-blocking developer work as well as full copies for all the developers.

  4. khailey
    September 9th, 2013 at 20:38 | #3

    PS here are some notes I took on Stephane Faroult’s awesome video – was going to write it up into a blog https://sites.google.com/site/oraclemonitor/eav

  5. September 10th, 2013 at 21:13 | #4

    @khailey
    Hi Kyle,
    thanks for the explanations! It’s easy to imagine the problems with subsetting like that… I still have to think about the advantages and disadvantages compared to a one-test-database, one-integration-test-database environment where the developers – AFAIK – are ok with the development cycle (and where a database-related code review process still has to be established) … I’ll check out the resources you mention, too – sounds interesting!
    Thanks,
    Sigrid

  6. October 4th, 2013 at 18:28 | #5

    Hi Kyle,

    Thanks for this nice article; it really helped with some issues, and it’s very detailed. At my business though, subsetting and DBs for each developer are not an option. We’re a small team (<30 developers), working on shared schemas (but big schemas). Still not doing real version control.

    In our case it is paramount that the whole team has full visibility of each other's changes, even if they break in unit testing. We keep track of recent failed builds "due to changeset X" and they're sorted out weekly.

    Regards,
    Alvaro

  7. khailey
    October 4th, 2013 at 18:58 | #6

    @Alvaro, thanks for the feedback. Always interested in hearing real world DevOps stories. Do you use any source control tools for schema changes ? Do you use change management tools like Red Gate or DB Maestro?

  8. October 5th, 2013 at 02:26 | #7

    Good stuff, Kyle. I’ve worked in both single stream database development environments (dev, int, qa, prod databases) and more recently in database-per-developer environments. Both models have challenges associated with schema development and evolution, data seeding, sufficiently sized test data, and multi-version roll-forward / roll-back scenarios.

    Most people are familiar with the challenges of the single stream model, including locking / overwrite and breakage delays due to changes, limited repeatability of deployment testing, and the difficulty of getting sufficiently sized test data in a rapidly repeatable fashion.

    I’ve had to solve many of the single stream problems without access to a tool like Delphix and it isn’t easy. Locking / overwrite issues require tools like Toad Team Coding or assigning different tables / objects to different people to work on. Breakage issues need to be handled with abstraction layers like synonyms & views or versioned stored procedure access to all tables. Repeatable deployment testing requires a resettable database (we used a daily reporting database which was manually re-instantiated nightly). Obtaining sufficiently sized test data required copying production into QA every three months and applying test data generation. Delphix clearly helps with both the repeatable deployment testing and sufficiently sized test database creation (and resetting of test data).

    Most recently I’ve switched to the database-per-developer model, which has its own challenges of synchronizing each database, maintaining database changelogs and changesets outside of the database infrastructure, and resetting databases from a data perspective to support repeatable test cases. In a continuous integration environment I have to worry that every single database change is basically a mini-deployment to every developer — it has to deploy without error, and potentially always be backward compatible. I need an easy way for developers to blow away their database and create it from scratch. I need an easy way for developers to reset or roll back changes to their database both from a data and structure perspective after failed deployments or test runs. I may need to maintain a “library” of databases at specific versions to support branching and maintenance development.

    While I’m not sure that providing developers with private databases always leads to higher productivity (you still need DBA-like skills which are in short-supply from a resource perspective), it definitely provides developers with more options regarding testing. This is particularly important when the developer wants to test against a specific version of the database before introducing new schema changes.

    I’m glad to see more attention being paid to these challenges, and particularly on how tools like Delphix integrate with tools like Red Gate, Liquibase, dbMaestro and Datical.

  9. October 5th, 2013 at 19:42 | #8

    Hiya Kyle. The only tools used until now are TOAD and OpenDBDiff/DBDIFF, along with some specially written scripts. They are only used to compare the working schemas in the DB used by developers against a master, frozen schema (which is updated less frequently). We’re thinking of deploying Microsoft TFS and Git for versioning.

    I’m using the excellent DBlint tool (for normalization and cross-db schema checking issues), from http://dblint.codeplex.com . Great tool.

    Regards,

