Amazon Relational Database Service (RDS) provides a managed relational database service platform in AWS for a number of data platforms, including RDS for Oracle. The platform simplifies much of the operational complexity that comes with running a relational database in production, including hardware provisioning, database setup, patching, and backups.
While RDS for Oracle does support database snapshots and clones, they are primarily built for operational use. Concepts such as masking, data refresh, and shared storage are not provided by the platform, and each copy can be time consuming to create and expensive to maintain. This makes it difficult to use RDS Oracle for non-production workflows such as development and QA.
The Delphix Dynamic Data Platform, on the other hand, provides lightweight data pods with powerful dynamic data controls to support self-service access to secure data for non-production use. Combined together, these two solutions provide the best of both worlds: operational simplicity and robustness in production, with agility and cost savings in non-production.
While the the Dynamic Data Platform does not support direct ingestion from RDS, it is possible to use the Amazon Database Migration Service (DMS) to continuously load data into the platform, and then use Delphix capabilities for creating and managing non-production environments. Once configured, RDS sources will operate like any other Delphix source, and can be masked, shared, and distributed across different on-prem and cloud environments.
The solution relies on Amazon DMS to load and replicate data into a virtual database instances within the Delphix platform. Data operators first create a staging Oracle virtual database (VDB) within the Delphix platform, which is then used as the target for DMS replication. While not strictly required, the Delphix platform node should be running on EC2 in the same VPC, availability zone, and subnet as the RDS source. The staging VDB and all descendant VDBs (e.g. non-production databases) are created and managed on EC2 instances. Users are responsible for ensuring they have the licenses and resources required to run the virtual databases.
There are a number of DMS-specific concepts within this architecture: replication instances, endpoints, and tasks. The replication instance is a compute resource configured within the same VPC, subnet, and availability group as the RDS instance and Delphix platform node. One endpoint is associated with the production RDS source, and the other with the virtual staging database. A DMS task is then configured to perform a one-time load followed by continuous replication to the staging source.
At this point in time, only Oracle is supported, as change data capture is not yet supported for RDS SQL Server. While DMS supports replication between heterogeneous database types, such use is beyond the scope of this document.
DMS has a number of requirements and limitations that may not make it suitable for all databases. It is recommended that users first manually verify that their databases can be replicated using DMS prior to committing to using the Delphix DDP.
Configuring the Staging Instance
Users will first need to install and configure a Delphix DDP node in EC2. While DMS can replicate to targets outside of EC2 (the only restriction is one of the source or target must be in AWS), the best performance will be achieved if the Delphix node and DMS replication instance are in the same availability zone, VPC, and subnet as the RDS source. To configure the node, follow the prerequisites and instructions in our documentation.
You will need EC2 instances configured with the same Oracle version as the RDS source. DMS can replicate across database versions, but recommended best practices are to keep database versions the same. Once you have an EC2 instance with Oracle up and running, you will need to add the environment to the node as described here.
Once added, you will need to configure an “empty” VDB. Unlike traditional linking, Delphix does not ingest data directly from the RDS source, so there is no dSource as there would be with a traditional Oracle source. Instead, you must do the following:
- Create an temporary instance on the target, using standard Oracle tools such as DBCA or the “create database” command.
- Create a dSource from this empty instance using the instructions here.
- Detach the dSource
- Delete the temporary instance
- Create a VDB on the target environment (Staging Instance)
This VDB will be the target that we replicate data into. Since it will not appear as a dSource, you should give it an identifiable name that distinguishes it as a staging instance.
First, you must configure the source and target to be used by DMS.
- For the RDS source, follow these DMS instructions. We will be using continuous replication, so the source needs to be configured for continuous data capture (CDC).
- For the staging VDB follow these DMS instructions. You will need to configure an account with the appropriate set of privileges required by DMS.
Next, you will need to follow these steps for DMS:
- Create a replication instance, preferably in the same availability zone, VPC, and subnet as the RDS source.
- Create an endpoint using the RDS source endpoint, and a user account with sufficient privileges for DMS
- Create an endpoint using the staging VDB created in the previous section, and the user account configured in the previous step
- Create a DMS task using the replication instance, and connecting both endpoints. Select full load with continuous replication.
- As VDB Snapshots occur on the staging instance, the snapshots will be available for creation of new VDBs, masking, and self-service.