How it Works
Typical Environment with multiple copies of Production Database
Production is copied off to development, QA, UAT among other locations. Each physical copy is a full size duplicate of production requiring time, resources and expertise and creating a system that is slow and difficult to refresh and maintain.
Same Scenario with Delphix database virtualization
Delphix seamlessly collects data from production, typically compresses it to about one third of its original size, and provides the appearance of full and private datafiles to each of the copies of production. Each copy can be spun up in minutes for almost no storage by a junior DBA or even an end user such as a developer. All data collection and data purging are automated, as is the management of the virtual databases.
Database Virtualization is too valuable a concept to be implemented only as a complex and vendor-specific technology. Though the capabilities of thin cloning have been available for years, they have traditionally been limited by restrictive requirements and have demanded strong storage expertise. For example, the scientific powerhouse CERN required a self-service interface for creating thin database clones on NetApp. Creating this solution required over 25,000 lines of code and an unknown number of man-hours from the development team. Today, self-service database thin cloning can be achieved with the Oracle 12c Cloud Control Database as a Service (DBaaS) tool in conjunction with applicable vendor packages such as NetApp SnapClone or Oracle ZFS Storage Appliance. While the integration of storage platform features into management software has been long in coming and is valuable to the industry, the strict vendor requirements create a barrier to entry for these solutions. In this case, vendor lock-in limits the capabilities of many businesses.
Delphix is a solution that breaks through these barriers by providing a software solution that is hardware independent. The core Delphix solution is supplied as an OVA file that installs as a virtual machine under VMware ESX, making it a true software-based Database Virtualization Appliance (DVA). And because it is software-based, it can be used with any type of storage solution, including EMC, NetApp, Hitachi, JBOD, and others. On the OS/DB side, Delphix supports Oracle 9.2 (Enterprise or Standard Editions) and higher on Solaris, Linux, AIX, HP-UX, and OpenSolaris. Real Application Clusters (RAC) are also supported. Delphix is a multi-database platform solution supporting not only Oracle but also SQL Server, with other database platforms to be announced.
An Agile Data Platform must be Database-centric and resource frugal. Most solutions devise expensive cloning practices or storage-specific solutions that do not address database complexities. Many platforms are able to provide quick clones, but they still involve far too much storage.
A key distinction with Delphix is that it is not fundamentally a thin cloning platform. While thin cloning and thin provisioning technology is part of its tool stack, at its core Delphix is a database virtualization platform. No extra components are required to make it work specifically for databases; that is the core purpose behind the architecture.
Because Delphix is designed to be a Database Virtualization platform, it offers many core and extended features that are useful for DBAs, Developers, QA staff, and others.
Core Delphix Features
Delphix is a software appliance that can be deployed on any ESX Virtual Machine infrastructure, a task that usually takes just a few minutes. Once installed, the primary interface to the software is accessed through your web browser. From this interface, a technologist can instantly begin choosing and setting up source databases within Delphix and choosing systems that will be used as deployment targets. Installing the software and beginning setup is a painless procedure with minimal configuration requirements.
Figure 1. The Delphix Web UI as it appears after a fresh installation.
The interface itself and how it is used will be described in detail later. For now, we will concentrate on the core features this page expresses.
Easy Graphical Management
Nearly everything in Delphix is done via the web interface. While there is a command line interface as well with some advanced functionality, the act of snapshotting and provisioning databases can be done with a few clicks by technical and non-technical staff.
Versatile Source DB Compatibility
Delphix is extremely versatile in that it allows a wide range of database versions both for Oracle and SQL Server on nearly any storage architecture to act as source systems. Delphix will fully automate linking the source database to the appliance. Incremental snapshots are also automated and managed by the software without the need for complex cleanup and maintenance processes.
TimeFlow Point-In-Time Provisioning
TimeFlow is a feature that takes and stores changes on a source database for a defined recovery window. The benefits of this feature are twofold. It keeps the source database represented in Delphix up to date, which is important for guaranteeing fresh data when provisioning or refreshing a target system. But just as importantly, it allows target databases to be provisioned from any point in time during the recovery window, down to the second.
This means that if you choose to keep 30 days of TimeFlow data, you can provision a Virtual Database to a target environment containing the data as it was 25 days ago, and another Virtual Database with the data as it was 20 days ago, and another as it was 15 days ago, and so on. The multiple copies and the multiple times from which they are created incur no additional space.
Figure 2. Delphix is made with DBMS software in mind. By taking a single backup of a production (source) database and propagating transaction history, many virtual databases can be provisioned from many points in time using its SCN-aware TimeFlow feature.
In Figure 1, two elements have “Delphix vs. Unvirtualized” ratios: TimeFlow Ratio and Consolidation Ratio. These sections are showing Delphix’s compression capabilities. When a source database is linked to Delphix and an initial snapshot taken, that snapshot will be compressed (usually to ½ or ¼ the original size) and the total sizing metrics will be shown in the Consolidation Ratio section. Likewise, when incremental snapshots are taken for TimeFlow, the change data will be compressed and shown in the TimeFlow Ratio.
NOTE The compression of base snapshots and incremental snapshots in Delphix does not require any advanced Oracle features such as Oracle Advanced Compression. Additionally, both Enterprise Edition and Standard Edition can be used.
Delphix Extended Features
The core functionality built into the Delphix software also creates additional benefits.
Huge Storage Savings
In addition to the storage savings achieved by compression of the source snapshot, the fact that provisioned databases all utilize the same source blocks means further storage savings. The initial snapshot is compressed, the TimeFlow incremental snapshots are compressed, and the target virtual databases are all using that same compressed space; the only extra space consumption is the changes made on each target virtual database.
Figure 1. The storage space required for virtual snapshots is far less than the incrementally growing storage space necessary for physical backups. Additionally, in order to provision a clone from a physical backup it will be necessary to duplicate the storage space. In a virtual database environment, the target systems will all use the same space for unchanged blocks.
For example, imagine a 9TB source database. The initial link to Delphix takes 3TB of storage after compression. Within the space of one normal full copy, that leaves 6TB to store changes from the source database. Because those 6TB hold compressed data, they represent about 18TB of uncompressed changes, so Delphix can store the baseline plus 18TB of changes in the space of one normal backup of the database. Those 18TB can represent months of changes, depending on the change rate of the database.
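The arithmetic in this example can be checked with a short Python sketch. The sizes come from the text; the daily change rate passed to `days_of_changes` is an assumed, illustrative value, since the text only says that the change rate varies by database.

```python
# Worked version of the 9TB example. Sizes come from the text;
# the daily change rate used below is an assumption for illustration.
SOURCE_TB = 9.0          # uncompressed source database
INITIAL_COPY_TB = 3.0    # compressed baseline held on Delphix

compression_ratio = SOURCE_TB / INITIAL_COPY_TB             # 3:1
remaining_tb = SOURCE_TB - INITIAL_COPY_TB                   # compressed space left
uncompressed_changes_tb = remaining_tb * compression_ratio   # changes that fit


def days_of_changes(daily_change_tb: float) -> float:
    """Days of source changes that fit in the remaining space."""
    return uncompressed_changes_tb / daily_change_tb


if __name__ == "__main__":
    # Hypothetical change rate of 0.25TB of changed data per day.
    print(f"{uncompressed_changes_tb:.0f}TB of changes fit, "
          f"about {days_of_changes(0.25):.0f} days at 0.25TB/day")
```

At the assumed 0.25TB of changes per day, the 18TB of headroom covers 72 days, which is consistent with the "months of changes" estimate.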
High Performance Block Sharing in Cache
The average organization provisions multiple copies of a single production database, which can include multiple copies of development, QA, UAT, stress testing, maintenance QA, and others. Each of these copies requires time to build and provision, and the end result is several identical instances with their own copies of largely identical data blocks across several shared memory regions. Delphix achieves a great reduction in disk usage by only saving identical blocks to disk once. But more importantly, Delphix also only caches unique data blocks in memory. This means that read performance across multiple systems that are provisioned from a single source can show dramatic improvements and scalability.
Figure 2. As the number of concurrent users (x-axis) increases, the TPM of a physical database reaches a limit when the I/O subsystem saturates. With virtual databases, once Delphix has cached the blocks for one virtual database they are also cached for the other virtual database(s), so TPM continues to increase and I/O latency stays low because most I/Os are satisfied by the cache on Delphix. The example represents a 200GB Swingbench database with 200GB of memory on Delphix being accessed by 2 concurrent clones. TPM is cumulative and latency is averaged.
As the number of databases provisioned from a single source and the number of concurrent users increase, performance in a Delphix environment can grow linearly. These factors would usually destroy performance on a physical database; however, shared block caching means increased TPM and low latency on virtual databases when more concurrent users are added.
Delphix is the perfect match for Oracle’s pluggable databases (PDBs). A core use case for PDBs is cloning, but each clone takes up more disk space and each clone needs its own memory for its data block buffer cache. With PDBs on Delphix, duplicate data blocks are stored only once on disk and shared in the cache on Delphix, minimizing the need for a large buffer cache on the container database (CDB).
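The effect of shared block caching can be illustrated with a toy model. This is a minimal sketch of the idea, not Delphix's implementation: the `SharedBlockCache` class, its `read` method, and the clone names are hypothetical, and a plain dictionary stands in for the storage layer.

```python
class SharedBlockCache:
    """Toy read cache keyed by block ID only, so clones share entries."""

    def __init__(self, disk: dict) -> None:
        self._disk = disk        # stands in for the storage layer
        self._cache: dict = {}   # block_id -> block contents
        self.disk_reads = 0      # reads that missed the cache

    def read(self, clone: str, block_id: int) -> bytes:
        # `clone` is deliberately not part of the cache key: a block cached
        # for one virtual database is a hit for every other one.
        if block_id not in self._cache:
            self._cache[block_id] = self._disk[block_id]
            self.disk_reads += 1
        return self._cache[block_id]


def demo() -> int:
    """Two clones each read the same 100 blocks; return disk reads used."""
    disk = {i: bytes([i]) * 8 for i in range(100)}
    cache = SharedBlockCache(disk)
    for clone in ("dev", "qa"):
        for block_id in range(100):
            cache.read(clone, block_id)
    return cache.disk_reads


if __name__ == "__main__":
    print(demo())
```

In this model the second clone's reads are all served from memory, so 200 logical reads cost only 100 disk reads. With separate physical copies, each clone would pay for its own 100.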
Application development generally would not be manageable without solid version control. Popular packages like SVN, Git, Mercurial, and CVS are used by project managers, development teams, and QA groups to maintain and keep track of software revisions, bug fixes, and major releases.
Yet for all the version control capabilities available to software developers, there are not many viable options for version control of the data in your database. Oracle’s built-in capabilities like Flashback Query can be useful, but require a rewrite of every query to include the AS OF syntax. Flashback Database is a viable option for rolling back an entire database to a point in time, but it requires a full copy of the database along with extra storage for flashback logs. Even more importantly, it all has to be manually managed and orchestrated by DBA staff.
Figure 3. Virtual Database points-in-time can be lined up to code releases, and tagging allows the DBA to easily keep track of the snapshots that correspond to an application patch or release.
Delphix is capable of temporal data provisioning, meaning that it can provision a virtual database from any time point in the TimeFlow retention window. Further, it can provision multiple target systems from multiple points on the same timeline without any extra storage or orchestration requirements. The Delphix software also allows tags to be created for different points in time for semantic recognition; for example, you can tag a point such as “July 25, 2012 13:00” with the words “Version 1.3.5 Bugfix Release”. Snapshots can also be “kept”, ensuring that a specific snapshot never ages out of the retention window.
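Tagging can be thought of as a lookup from a release label to a point in time. The sketch below models only that idea; `TimeFlowTags` and its methods are hypothetical names and do not reflect the Delphix API.

```python
from datetime import datetime


class TimeFlowTags:
    """Maps human-readable labels to points in time (hypothetical model)."""

    def __init__(self) -> None:
        self._tags: dict = {}   # label -> datetime

    def tag(self, when: datetime, label: str) -> None:
        """Attach a label to a point in time."""
        self._tags[label] = when

    def resolve(self, label: str) -> datetime:
        """Return the point in time a label refers to."""
        return self._tags[label]

    def labels_between(self, start: datetime, end: datetime) -> list:
        """Labels whose point in time falls in [start, end], oldest first."""
        hits = [(when, label) for label, when in self._tags.items()
                if start <= when <= end]
        return [label for _, label in sorted(hits)]


if __name__ == "__main__":
    tags = TimeFlowTags()
    # The tag from the example in the text.
    tags.tag(datetime(2012, 7, 25, 13, 0), "Version 1.3.5 Bugfix Release")
    print(tags.resolve("Version 1.3.5 Bugfix Release"))
```

A developer reproducing a bug in a given release could resolve the release label to its timestamp and request a virtual database from that point.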
Delphix is shipped and deployed as an OVA file that is placed in an ESX environment. While there are no strict requirements for installing Delphix, it is recommended that the Virtual Machine hosting the software appliance be configured with a large amount of RAM for shared caching and enough disk for your source databases and their incremental snapshots.
Once the Delphix Engine is installed, there are three main steps in order to create a new virtual database:
- Register host machines
  - Source machine running a master database
  - Target machine with database binaries installed
- Link a source database to Delphix
- Provision a virtual database from the master database onto a target machine
Registering Host Machines on Delphix
Source and target environments can be either standalone databases existing on a single host (either physical or virtual), or an Oracle RAC environment.
In order to register a host with Delphix, it must be managed as an Environment. This can be done by selecting the Manage window and Environments option on the dropdown.
Figure 6. Environment Management option in Delphix.
From the Environments window, a new environment can be added by clicking the green “+” sign.
Figure 7. The Environments window will show existing environments and allow you to add new ones.
Once you have selected to add a new environment, Delphix will display the Add Environment Wizard. This page contains all the options required to discover and link a source host to the Delphix appliance.
Figure 8. The Add Environment Wizard
In order to add a new environment into Delphix, you will need:
- Environment Name – Any name used to identify this machine in Delphix
- Environment Operating System and Type – The operating system of the host environment, and a selection of standalone server or Oracle Cluster
- Host Address – The DNS registered machine name or IP address of the host
- SSH Port – Port for Secure Shell, used for copying of files.
- OS Username – Operating system login for the machine. This should be the OS login for the “oracle” user.
- OS Password – Password for the “oracle” user (or public key if selected)
- Toolkit Location – The directory where Delphix can install its scripts and tools once it connects to the environment (fully qualified path is required)
- Any notes that will be helpful in organizing the Delphix host.
Environments are not just source systems. They are any host that will be used as either a Source or a Target. Before you go on with configuration it is a good idea to ensure all source and target hosts are present in Delphix.
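When many hosts need to be registered, it can help to collect the wizard inputs in one place and sanity-check them first. The sketch below is one way to do that in Python. The `EnvironmentSpec` class, its field names, and the sample host values are hypothetical rather than a Delphix data structure, and the password is left out on purpose so that credentials are not stored in plain text.

```python
from dataclasses import dataclass
import posixpath


@dataclass(frozen=True)
class EnvironmentSpec:
    """Inputs for the Add Environment Wizard (hypothetical field names)."""
    name: str              # any name used to identify the machine
    os_type: str           # operating system of the host, e.g. "Linux"
    is_cluster: bool       # False for standalone, True for an Oracle cluster
    host_address: str      # DNS name or IP address
    os_username: str       # normally the "oracle" OS user
    toolkit_path: str      # where Delphix installs its scripts and tools
    ssh_port: int = 22     # conventional SSH default
    notes: str = ""

    def problems(self) -> list:
        """Return a list of human-readable validation problems."""
        issues = []
        if not self.name.strip():
            issues.append("environment name is empty")
        if not self.host_address.strip():
            issues.append("host address is empty")
        if not 1 <= self.ssh_port <= 65535:
            issues.append("SSH port must be between 1 and 65535")
        if not posixpath.isabs(self.toolkit_path):
            issues.append("toolkit location must be a fully qualified path")
        return issues


if __name__ == "__main__":
    # Sample values are placeholders, not taken from a real environment.
    target = EnvironmentSpec("target1", "Linux", False, "target1.example.com",
                             "oracle", "/u01/app/delphix_toolkit")
    print(target.problems())
```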
Adding a Source Database to Delphix
After the source and target hosts have been added as Environments, the source database can be linked to the Delphix Engine. In the top menu, you can do this by selecting Databases -> Add dSource.
Figure 9. Selecting the Databases menu to add a dSource.
Like adding an environment, this option will open a wizard called the Add dSource Wizard. A dSource is a “data source”, which is a database that will be linked into Delphix for provisioning.
Figure 10. The Add dSource Wizard
Delphix will display a list of all the databases discovered on the source environment on the left side of the window. In Figure 10, the SOE1G database is selected for linking to the Delphix environment. All that is required of the administrator is a username/password on the source database.
NOTE The user you enter in the Add dSource Wizard should be a DBA type user. You can use SYSTEM or a Delphix-specific user.
Once a source database is selected and credentials have been entered and accepted, Delphix will take a full backup of the source database using RMAN APIs. The full backup will only be taken once during the initial linking in order to establish a baseline. Once that baseline has been established, Delphix will continuously collect incremental changes for the life of the source database. No additional full backups are required whatsoever. By default, these snapshots of the source database changes are retained for two weeks (the TimeFlow retention window), which means that a virtual database can be provisioned to represent the source database as it was at any point-in-time during that two week window. However, the default policy can be changed to retain snapshots for longer or shorter periods of time.
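The retention window can be modeled as a simple time comparison. The following sketch uses the two-week default mentioned above; the function names are hypothetical. It treats snapshots as independent timestamps, whereas a real incremental-forever store also has to preserve blocks that newer snapshots still reference.

```python
from datetime import datetime, timedelta

DEFAULT_RETENTION = timedelta(weeks=2)   # default window stated in the text


def within_window(point: datetime, now: datetime,
                  retention: timedelta = DEFAULT_RETENTION) -> bool:
    """True if `point` can still be provisioned under the policy."""
    return now - retention <= point <= now


def expire_snapshots(snapshots, now, retention=DEFAULT_RETENTION):
    """Split snapshot times into (retained, expired) lists."""
    cutoff = now - retention
    retained = [s for s in snapshots if within_window(s, now, retention)]
    expired = [s for s in snapshots if s < cutoff]
    return retained, expired


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    snaps = [now - timedelta(days=d) for d in (20, 10, 1)]
    kept, aged_out = expire_snapshots(snaps, now)
    print(len(kept), "retained,", len(aged_out), "expired")
```

Passing a longer `retention`, such as `timedelta(days=30)`, models changing the default policy to keep snapshots for a longer period.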
Figure 11. The Active Job for linking the SOE1G database.
After the linking process has been initiated, the RMAN backup will run as a background job. The progress of this job can be monitored by clicking the “Active Jobs” button in the upper left-hand corner (shown in Figure 11). As with any backup, this can be a time-consuming process, but it only needs to run once during the initial link. The amount of time required to create the link mostly depends on both the size of the source database and the type of connection between the source host and the Delphix Engine. For example, it can take days to link a database that is multiple terabytes over a single 1GbE WAN connection, as a 1GbE connection has a maximum throughput of around 100MB/s. The linking time can be reduced by using 10GbE or even aggregating multiple 10GbE NICs. Smaller databases on fast connections can be linked in minutes. In cases where the initial link may take hours, it can be set up to run during off hours when there is less network traffic (e.g. between midnight and 4AM).
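A rough estimate of the initial link time follows from dividing database size by sustained throughput. The sketch below uses the 100MB/s figure quoted above for 1GbE. The 1,000MB/s figure for 10GbE is an assumption obtained by scaling that number by ten. The calculation ignores compression, protocol overhead, and source I/O limits.

```python
def link_time_hours(db_size_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to transfer `db_size_tb` at a sustained throughput."""
    size_mb = db_size_tb * 1_000_000   # decimal units: 1TB = 1,000,000MB
    return size_mb / throughput_mb_s / 3600


if __name__ == "__main__":
    # 100MB/s is the figure from the text; 1000MB/s for 10GbE is assumed.
    links = {"1GbE": 100.0, "10GbE": 1000.0}
    for size_tb in (1, 5, 20):
        for name, rate in links.items():
            hours = link_time_hours(size_tb, rate)
            print(f"{size_tb:>3}TB over {name:>5}: {hours:6.1f} hours")
```

At 100MB/s a 20TB database needs about 56 hours, which is where the "days" estimate for multi-terabyte sources comes from.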
Provisioning a Virtual Database
Once you have completed the simple tasks of adding the source and target environments and linking an initial dSource database, you can immediately begin provisioning virtual databases. Since it is a newly linked environment you will only have a single snapshot (Figure 13); however, this is all Delphix needs to make its first clone.
Figure 13. The Delphix interface shows a single snapshot card for the source database that was just linked. Clicking on the Provision button will launch the Provisioning Wizard based on this source snapshot.
The Virtual Database (VDB) Provisioning Wizard will be auto-populated with information from the source database, though the administrator can modify the values if necessary. The only definite requirement is the target environment for the virtual database, making quick cloning and refreshing easy for both technical and non-technical staff. In Figure 14, the machine named “target1” is selected.
TIP The “Advanced” link allows you to specify custom parameters for the target database that will be provisioned.
Even though there is only a single Snapshot card for this database (as it is newly linked), it is possible to provision down to a specific System Change Number (SCN) or time of day by using the LogSync Control feature.
Figure 14. The Provision VDB window from a particular dSource Snapshot. Only a target is required to provision a VDB; however, advanced options are also available.
Provisioning a Virtual Database from a Specific Point In Time
Delphix regularly takes incremental snapshots of a source database that collect only changed blocks since the last snapshot. These incremental snapshots will be revealed in the Delphix interface as Snapshot cards. Over time, there will be multiple snapshot cards available for each source from which a virtual database can be provisioned. But those snapshot cards are like ‘bookmarks’ for each point that a snapshot was taken. Using LogSync, any point between snapshots can be chosen to provision a Virtual Database.
Figure 15. Opening LogSync from a particular Snapshot card. To provision a virtual database from a point in time between two snapshots, move the slider to the right above the first snapshot.
Opening the LogSync capability drills down into the timeframe between that Snapshot and the one following it, which allows a very granular view of the change iterations on the source database. A time can be selected (as shown in Figure 16), or an SCN can be used to select a specific change number.
Figure 16. With the slider above the snapshot card positioned to the right, Delphix shows the LogSync timeline below the card. A slider on this timeline can be positioned to a specific point in time or an SCN at which the virtual database should be provisioned.
Once the red triangle slider is positioned over the desired time, the provision button can be clicked to create a virtual database representing the source database at that specific point in time.
It is important to note that every single time-point inside LogSync is capable of being provisioned as a fully functional virtual database representing the source database as it was at that exact point in time. This is an important feature for version control and other temporal requirements.
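The text does not describe how Delphix resolves a point between two snapshots internally. A common approach to point-in-time recovery in general is to start from the latest snapshot at or before the requested time and apply logged changes forward from there. The sketch below models only that selection step, and `plan_point_in_time` is a hypothetical helper.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def plan_point_in_time(snapshots, target):
    """Return (base_snapshot, replay_span) for a requested point in time.

    `snapshots` must be sorted oldest first. `replay_span` is the stretch
    of logged changes between the base snapshot and the target.
    """
    idx = bisect_right(snapshots, target)   # snapshots[:idx] are <= target
    if idx == 0:
        raise ValueError("target is earlier than the first snapshot")
    base = snapshots[idx - 1]
    return base, target - base


if __name__ == "__main__":
    snaps = [datetime(2024, 1, d) for d in (1, 2, 3)]
    base, span = plan_point_in_time(snaps, datetime(2024, 1, 2, 15, 30))
    print("start from", base, "and roll forward", span)
```

This matches the description of snapshot cards as bookmarks: the card fixes the starting point, and the slider chooses how far to roll forward from it.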
Delphix is made up of three main components:
- DxFS – The Delphix FileSystem, responsible for the storage and management of database data along with performance optimization.
- DataVisor – The DataVisor is a core component that manages the orchestration of tasks including synchronization, synthesis, recording of changes, data movement across copies, and replication.
- Self-service Management – Policy driven automation and interfaces that use DxFS and DataVisor to enable integration in any business use case.
Figure 17. The Delphix ecosystem, consisting of DxFS, DataVisor, and Management layers.
DxFS – The Delphix Filesystem
DxFS is the filesystem used by the Delphix host for storage of snapshots. It includes features for block caching, filtering, compression, and the block mapping that forms the secret sauce behind the provisioning of many environments from a single backup source.
Snapshots and changes that are collected from the source database and managed by Delphix are filtered and compressed, eliminating empty and temporary blocks and freeing up large amounts of storage. Block mapping built into the filesystem keeps track of data blocks for multi-versioning over time and works together with the DataVisor tier to virtualize “target-ready” datafiles that the target virtual database can use to spin up in full read/write mode.
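The idea that many virtual databases read the same stored blocks while keeping their own changes private is commonly implemented as copy-on-write block mapping. The following is a minimal sketch of that general technique, not of DxFS internals; the `VirtualDatabase` class and its methods are hypothetical.

```python
class VirtualDatabase:
    """Toy copy-on-write view over a shared, read-only set of blocks."""

    def __init__(self, shared_blocks: dict) -> None:
        self._shared = shared_blocks   # never modified by a clone
        self._private: dict = {}       # only blocks this clone has changed

    def read(self, block_id: int) -> bytes:
        # Prefer the clone's own version, else fall back to the shared one.
        if block_id in self._private:
            return self._private[block_id]
        return self._shared[block_id]

    def write(self, block_id: int, data: bytes) -> None:
        # Writes land in private space; the shared block stays untouched.
        self._private[block_id] = data

    @property
    def extra_blocks(self) -> int:
        """Blocks of additional storage this clone consumes."""
        return len(self._private)


if __name__ == "__main__":
    source = {i: b"src" for i in range(1000)}
    dev, qa = VirtualDatabase(source), VirtualDatabase(source)
    dev.write(7, b"dev")
    print(dev.read(7), qa.read(7), dev.extra_blocks, qa.extra_blocks)
```

In this model each clone presents a full, writable set of datafiles, yet its storage cost is only the blocks it has changed.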
DataVisor – Data Orchestration
The DataVisor tier of Delphix orchestrates the flow of snapshots for all source databases on the Delphix system. Once a database is linked, DataVisor monitors and maintains the incremental-forever synchronization strategy used in Delphix to save time and resources over the life of your provisioning architecture.
DataVisor also manages the provisioned targets and the changes that are made on those systems. The synthesis of the snapshots from the source database is what builds the TimeFlow feature. Once a virtual database is deployed from a point in TimeFlow, the DDL and DML performed on that target virtual database (private only to that virtual database) are also synthesized and managed by DataVisor. In conjunction with DxFS, Delphix can store roughly 50 days of continuous recovery points (any point-in-time) in the space of one full normal data copy. DataVisor keeps track of all of the work behind the scenes.
Self Service Management – Policies and UI
The Management tier of Delphix powers the three main interfaces:
- The web-based GUI for human interface
- A web services API to build provisioning into your existing code
- A command line interface (CLI) for scripts
In addition to providing the UI components, the Management layer also includes a policy framework for users and the data they can access, provision, refresh, etc. Retention policies are also defined here and used by the DataVisor tier in the management of TimeFlow snapshots.
Last is the automation framework. Refreshes can be scheduled to occur on a specified basis; for instance, the beginning of each month or end of the year. During the VDB creation process, pre- and post-provisioning scripts can be identified to execute security procedures, handle specific QA team requests, or whatever else might be necessary to have the provisioned environment in a ready-for-use state.
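The ordering of pre- and post-provisioning scripts can be sketched as a small wrapper. This is an illustrative pattern only: `provision_with_hooks` and the callables passed to it are hypothetical stand-ins, not Delphix functions.

```python
from typing import Callable, Iterable


def provision_with_hooks(
    provision: Callable[[], str],
    pre_hooks: Iterable[Callable[[], None]] = (),
    post_hooks: Iterable[Callable[[str], None]] = (),
) -> str:
    """Run pre-hooks, provision, then post-hooks; return the VDB name.

    An exception in a pre-hook stops the run before provisioning starts.
    """
    for hook in pre_hooks:
        hook()
    vdb_name = provision()
    for hook in post_hooks:
        hook(vdb_name)       # e.g. mask sensitive data, notify the QA team
    return vdb_name


if __name__ == "__main__":
    events = []

    def fake_provision() -> str:
        events.append("provision")
        return "vdb1"            # placeholder VDB name

    provision_with_hooks(
        fake_provision,
        pre_hooks=[lambda: events.append("pre")],
        post_hooks=[lambda name: events.append(f"post:{name}")],
    )
    print(events)
```

Passing the VDB name to the post-hooks lets steps such as data masking target the newly created database before it is handed over for use.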
Delphix provides a clean, simple, easy to use interface to link to source databases and automatically collect snapshots of incremental changes from them. Those snapshots are part of the TimeFlow feature, which is the core of the Delphix provisioning capability.
In addition to the fast and storage-reducing provisioning capabilities present in most virtual database platforms, Delphix offers key features in automation, performance, and flexibility. Collection of snapshots, integration into TimeFlow, and the provisioning process in Delphix are fully automated and managed by the DataVisor layer, keeping technical requirements easy and predictable. The DxFS filesystem behind Delphix improves performance on virtual databases by acting as a shared block cache in RAM. The caching done on Delphix is like a super SGA, since all common blocks across virtual databases only need to be cached once to be used by all the virtual databases.
Most importantly, Delphix is flexible. Adding new environments, database sources, and provisioning targets is a simple process that can be done by technical or non-technical staff, or automated with no staff requirement at all. Any disk technology or server technology can be used on the source and target environments—which means no vendor lock-in to enjoy the features of thin provisioning. This flexibility makes Delphix a true Agile Data Platform that can solve critical application lifecycle and strategic issues both for the IT teams and the business.