Designing IT for Data Manufacture
photo by Jason Mrachina
As a (recovering) Mechanical Engineer, one of the things I’ve studied in the past is Design for Assembly (DFA). In a nutshell, DFA aims to reduce assembly time and cost by using fewer parts and process steps, standardizing the parts and steps you do use, automating wherever you can, making parts easy to grasp, insert, and connect, and removing time-wasting steps such as having to orient the parts.
In a meeting with a customer a couple of days ago, it struck me that our IT departments, and increasingly our database and storage professionals, have become very much like a database factory. What also became clear is that the processes that exist are cumbersome, expensive, and often burdened with low-quality outputs.
If you think about the standard cycle to get a clone of a dataset provisioned in an organization, it can be long and expensive. There’s the process flow itself, which, although often automated, is also chock full of stop points, queues, wait times, and approvals. (At one major customer of mine on Wall St., a single request required 34 approvals.) Then, there are many decision points. For a simple clone, you might have to involve the project lead, the DBA, the storage team, the backup guy, the System Administrator, and in some cases even management. Each of these people has a queue, and each has to decide how important your request is, whether there is enough space, whether there is enough time, and so on. Then, even when it’s approved, maybe the Oracle RAC has a different methodology than the SQL Server, or maybe we store our datasets locally whereas our sister application has to fetch a backup from tape to restore from. All of this creates a process flow with lots of steps, moving parts, complexity, localizations and customizations, and potential for error and rework.
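To make the cost of all those queues concrete, here is a minimal sketch of the arithmetic. Every number and name in it (the Step class, CLONE_REQUEST, the hours) is a made-up assumption for illustration, not a measurement from any customer. It adds up hands-on time and queue time for a sequential hand-off chain and reports how little of the total lead time is actual work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hand-off in a clone request (hypothetical model)."""
    owner: str
    work_hours: float  # hands-on time spent on the request
    wait_hours: float  # time the request sits in this owner's queue


def touch_time(steps):
    """Total hands-on hours across all steps."""
    return sum(s.work_hours for s in steps)


def lead_time(steps):
    """Total elapsed hours when steps run strictly one after another."""
    return sum(s.work_hours + s.wait_hours for s in steps)


def flow_efficiency(steps):
    """Fraction of the lead time that is actual work (0.0 for an empty chain)."""
    total = lead_time(steps)
    return touch_time(steps) / total if total else 0.0


# Illustrative numbers only: assumed, not taken from a real organization.
CLONE_REQUEST = [
    Step("project lead", 0.5, 8.0),
    Step("DBA", 2.0, 16.0),
    Step("storage team", 1.0, 24.0),
    Step("backup admin", 3.0, 8.0),
    Step("system administrator", 1.0, 16.0),
    Step("management approval", 0.5, 24.0),
]

if __name__ == "__main__":
    print(f"touch time: {touch_time(CLONE_REQUEST):.1f} h")
    print(f"lead time:  {lead_time(CLONE_REQUEST):.1f} h")
    print(f"flow efficiency: {flow_efficiency(CLONE_REQUEST):.1%}")
```

With these assumed figures, 8 hours of real work takes 104 hours to deliver. Removing a step takes its queue with it, which is why cutting steps usually helps more than speeding up the work inside any one step.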
Principles of an efficient Data Factory
Considering IT as a data factory, we could apply the DFA principles to data and enumerate them as:
• Reduce the number of people, steps, and parts needed to get data clones provisioned.
• Simplify and standardize the steps that must remain.
• Automate provisioning wherever possible.
• Use standard Data Management processes across data clones of all flavors.
• Simplify the process of connecting data clones to the hosts and applications that need them.
• Encapsulate or eliminate the need for special knowledge to get a data clone to operate; make it Dead Simple.
Delphix is the Data Factory
One great capability of the Delphix Engine is that it fulfills the tenets of an efficient Data Factory.
First, by automating 99% of the collection, storage, provisioning, and synchronization of data clones, it radically shortens the provisioning and refreshing process. Storage administration often becomes a one-time activity, and provisioning or refreshing becomes a 3-click operation. Hours and days (and sometimes even weeks and months) become minutes.
Second, it simplifies data clone management to such a degree that developers and even business application managers can build databases themselves – whether they are Oracle or SQL Server.
Third, in addition to radical provision and refresh automation, all of the housekeeping (gathering change, integrating change, building snapshots, retaining and releasing data, scheduling refreshes) is automated to such an extent that refreshing, rewinding, and restoring are also 3-click operations. And doing things like a daily refresh for BI is a set-and-forget kind of operation.
Fourth, Data Management processes are standard across all flavors of supported databases. A refresh is a refresh. It shouldn’t matter to the end user that it’s a refresh for Oracle vs. a refresh for SQL Server or Postgres.
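As a rough illustration of "a refresh is a refresh," here is a minimal sketch of one entry point dispatching to per-database backends. This is not the Delphix API; every class and function name below (CloneBackend, BACKENDS, refresh) is hypothetical, and the backends only return a message where a real system would do the work.

```python
from abc import ABC, abstractmethod


class CloneBackend(ABC):
    """Hypothetical per-database backend hidden behind one common operation."""

    @abstractmethod
    def refresh(self, clone_name: str) -> str:
        """Refresh the named clone and return a status message."""


class OracleBackend(CloneBackend):
    def refresh(self, clone_name: str) -> str:
        # Oracle-specific steps would live here; this stub only reports.
        return f"oracle: refreshed {clone_name}"


class SqlServerBackend(CloneBackend):
    def refresh(self, clone_name: str) -> str:
        return f"sqlserver: refreshed {clone_name}"


class PostgresBackend(CloneBackend):
    def refresh(self, clone_name: str) -> str:
        return f"postgres: refreshed {clone_name}"


# Registry of supported flavors (assumed names, for illustration only).
BACKENDS = {
    "oracle": OracleBackend(),
    "sqlserver": SqlServerBackend(),
    "postgres": PostgresBackend(),
}


def refresh(flavor: str, clone_name: str) -> str:
    """The one call the end user makes, whatever database sits behind it."""
    try:
        backend = BACKENDS[flavor]
    except KeyError:
        raise ValueError(f"unsupported database flavor: {flavor}") from None
    return backend.refresh(clone_name)


if __name__ == "__main__":
    for flavor in BACKENDS:
        print(refresh(flavor, "dev1"))
```

The design point is that the caller's request looks the same for every flavor; the database-specific knowledge stays inside the backend, where the end user never has to see it.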
Fifth, by integrating with the mechanisms that let the database be ready for action (such as automatically registering with the Oracle Listener, or automatically applying masking scripts to make sure you’ve got obfuscated data to ship to your databases at Rackspace), the hosts and applications may not need to do anything except wait for the refresh to finish. No Request Necessary. No ticket to file. Nothing but fresh data in your database every morning ready to go!
Sixth, by encapsulating all of the difficult knowledge through automation or smart templating, it empowers a whole class of non-data professionals to perform their own Data Management. Letting Developers refresh for themselves completely takes the middle man out. No process needed at all.
If you’re a CIO, you may know that you’ve been operating your data factory like it’s 1965. You’ve been so far ahead of the game for so long that it has been inconceivable that there is a radically better way. That was the way that the American manufacturers thought before Deming and the other Total Quality Management gurus changed the way cars were manufactured in Japan. It’s time to bring your data factory into the 21st century. Stop trusting your data provisioning process to an outdated, overly complex, error-prone factory based on the IT organization of the 90s. Start trusting Delphix to deliver high quality database clones at much lower cost just-in-time for your needs.