Unveiling the hidden cost of AWS DMS: what you need to know.
AWS Database Migration Service, commonly abbreviated as "AWS DMS", is a service designed to streamline the migration of customers' databases from on-premises systems to the AWS Cloud.
Since its launch in general availability in 2016, this service has undergone numerous updates, with the most significant recent enhancement being the introduction of DMS Serverless.
Full Load vs Change Data Capture (CDC)
DMS offers 3 possibilities to migrate your database:
- 1️⃣ Full Load - Migrate existing data from your source database to the target of your choice.
- 2️⃣ Full Load and CDC - Migrate existing data and replicate ongoing changes.
- 3️⃣ CDC only - Replicate ongoing changes only.
Option 1️⃣ is the one recommended by AWS, as it follows a lift-and-shift migration strategy. The other two are designed to fit each customer's specific use cases and constraints, but they are not recommended for daily production usage.
The hidden cost
Even so, many companies still rely on DMS to continuously replicate data from their legacy on-premises databases to RDS instances. Moving an entire company, with thousands of technical stacks or products (and their associated data), can be a long and complicated journey, with technical challenges and the delays that come with them.
Because observability is good practice for any production system, log export to CloudWatch should be enabled through the replication task settings:
"Logging": {
"EnableLogging": true
},
DMS creates the log group for the replication instance on your behalf and writes to the log stream of the associated task. DMS also deletes the log group and its streams when the replication instance is deleted.
The main issue here lies in the CloudWatch log group settings:
- 🏷️ - No tags are propagated from your replication instance to the log group. This breaks ABAC (attribute-based access control) policies.
- 💵 - No retention policy is set at creation, so the log group falls back to the CloudWatch default of "never expire".
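To find the log groups affected by the second point, the output of the CloudWatch Logs `DescribeLogGroups` API can be filtered for entries that lack a `retentionInDays` field, which is how a "never expire" group appears in the response. The sketch below is a minimal illustration on plain dictionaries. The helper name `groups_without_retention` is hypothetical, and the `dms-tasks-` name prefix is an assumption about how DMS names its log groups that should be checked against your own account.

```python
from typing import Any

# Assumption: DMS task log groups share this name prefix; verify in your account.
DMS_LOG_GROUP_PREFIX = "dms-tasks-"


def groups_without_retention(
    log_groups: list[dict[str, Any]],
    prefix: str = DMS_LOG_GROUP_PREFIX,
) -> list[str]:
    """Return names of log groups under `prefix` that have no retention policy.

    `log_groups` is expected to have the shape of the `logGroups` list
    returned by the CloudWatch Logs DescribeLogGroups API.
    """
    return [
        group["logGroupName"]
        for group in log_groups
        if group["logGroupName"].startswith(prefix)
        and "retentionInDays" not in group
    ]


# Hand-written sample data, not real API output.
sample = [
    {"logGroupName": "dms-tasks-legacy-replication", "storedBytes": 1024},
    {"logGroupName": "dms-tasks-orders", "retentionInDays": 30},
    {"logGroupName": "/aws/lambda/unrelated"},
]
print(groups_without_retention(sample))
```

With boto3, the input list could come from the `describe_log_groups` paginator of the `logs` client. That part is left out here so the sketch runs without AWS credentials.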
To me, the last point is the most critical: for a DMS replication task that has been running in production for years in CDC mode with log export enabled (at the default verbosity), monthly costs can be huge:
According to the AWS Pricing Calculator, 2405.22 GB of log data billed in eu-central-1 (Frankfurt) costs:
- ~$1500 per month.
- ~$18,000 per year.
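These figures imply an effective rate per gigabyte, which helps when estimating the bill for a different log volume. The sketch below derives that rate purely from the numbers quoted in this article (with the monthly cost rounded to ~$1500). It is not an official AWS price, and the constant and function names are illustrative.

```python
# Figures quoted in the article (the monthly cost is rounded).
BILLED_GB = 2405.22
MONTHLY_COST_USD = 1500.0

# Effective rate implied by those two figures, not an official AWS price.
implied_rate_per_gb = MONTHLY_COST_USD / BILLED_GB
yearly_cost_usd = MONTHLY_COST_USD * 12


def estimate_monthly_cost(gb: float, rate_per_gb: float = implied_rate_per_gb) -> float:
    """Scale the implied rate linearly to another log volume (a simplification)."""
    return gb * rate_per_gb


print(f"implied rate: ${implied_rate_per_gb:.3f} per GB")
print(f"yearly cost: ${yearly_cost_usd:,.0f}")
print(f"500 GB would cost about ${estimate_monthly_cost(500):,.0f} per month")
```

For a real estimate, replace the implied rate with the current CloudWatch Logs prices for your region.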
Mitigate the cost
Because DMS manages the lifecycle of the log group (creation and deletion), you cannot pre-create a CloudWatch log group and ask DMS to use it. Doing so results in an error.
I recommend the following approach to mitigate both the ABAC and billing issues:
- Create a Step Function with two steps:
  - CloudWatch Logs - TagLogGroup
  - CloudWatch Logs - PutRetentionPolicy
- Create an EventBridge rule triggered by the CreateLogGroup API call.
  - Use the Step Function as the target.
This solution is easy to create, maintain, and observe, and it is cost-effective yet powerful.
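The two building blocks of this approach can be sketched as plain data structures: an EventBridge event pattern and an Amazon States Language definition. This is a minimal illustration, not a tested deployment. It assumes that CloudTrail management events are recorded in the region (EventBridge receives API calls through CloudTrail), that DMS log groups are named with a `dms-tasks-` prefix, and that a 30-day retention and a hypothetical `team` tag fit your policies. The function names are illustrative.

```python
import json
from typing import Any, Optional

# Assumptions for illustration only; adjust to your own conventions.
DMS_LOG_GROUP_PREFIX = "dms-tasks-"
RETENTION_IN_DAYS = 30
TAGS = {"team": "data-platform"}  # hypothetical tag


def build_event_pattern(prefix: str = DMS_LOG_GROUP_PREFIX) -> dict[str, Any]:
    """EventBridge pattern matching CreateLogGroup calls recorded by CloudTrail."""
    return {
        "source": ["aws.logs"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["logs.amazonaws.com"],
            "eventName": ["CreateLogGroup"],
            "requestParameters": {"logGroupName": [{"prefix": prefix}]},
        },
    }


def build_state_machine(
    retention_days: int = RETENTION_IN_DAYS,
    tags: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Two-step state machine: tag the log group, then set its retention."""
    # The state machine input is the EventBridge event itself.
    log_group_path = "$.detail.requestParameters.logGroupName"
    return {
        "StartAt": "TagLogGroup",
        "States": {
            "TagLogGroup": {
                "Type": "Task",
                "Resource": "arn:aws:states:::aws-sdk:cloudwatchlogs:tagLogGroup",
                "Parameters": {
                    "LogGroupName.$": log_group_path,
                    "Tags": tags or TAGS,
                },
                # Discard the task result so the next state still sees the event.
                "ResultPath": None,
                "Next": "PutRetentionPolicy",
            },
            "PutRetentionPolicy": {
                "Type": "Task",
                "Resource": "arn:aws:states:::aws-sdk:cloudwatchlogs:putRetentionPolicy",
                "Parameters": {
                    "LogGroupName.$": log_group_path,
                    "RetentionInDays": retention_days,
                },
                "End": True,
            },
        },
    }


print(json.dumps(build_event_pattern(), indent=2))
print(json.dumps(build_state_machine(), indent=2))
```

The printed JSON can be pasted into whichever infrastructure-as-code tool you use. The state machine's execution role would need `logs:TagLogGroup` and `logs:PutRetentionPolicy`, and the EventBridge rule needs a role allowed to call `states:StartExecution` on the state machine.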