AWS Database Migration Service (DMS)
Author: Venkata Sudhakar
AWS Database Migration Service (DMS) is a managed service for migrating databases to AWS with minimal downtime. It supports both homogeneous migrations (for example, MySQL to RDS MySQL) and heterogeneous migrations (for example, Oracle to Aurora PostgreSQL). AWS DMS runs a replication instance in your VPC that connects to the source and target databases, performs an initial full load of existing data, and then uses change data capture (CDC) to continuously replicate ongoing changes until you are ready to cut over. Like GCP Datastream, it removes the need to run your own Debezium or log-shipping infrastructure.
AWS DMS works with a wide range of sources, including Oracle, SQL Server, MySQL, PostgreSQL, MongoDB, SAP ASE, and IBM Db2, and with targets including Aurora, RDS, Redshift, S3, DynamoDB, Kinesis Data Streams, and Kafka. For heterogeneous migrations (different source and target database engines), AWS also provides the Schema Conversion Tool (SCT), which automatically converts stored procedures, views, and functions from Oracle or SQL Server to Aurora PostgreSQL or MySQL syntax and flags items that require manual review.
The example below shows how to create an AWS DMS replication task using the AWS CLI that migrates from on-premises MySQL to Amazon Aurora PostgreSQL with a full load followed by continuous CDC.
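A minimal sketch of the provisioning steps, wrapped in shell functions so each step can be run on its own. The identifiers `prod-migration-instance`, `dms.r5.large`, and `mysql-source` are taken from the sample output; the target endpoint name `aurora-target`, the 100 GB storage size, the ports, and the `MYSQL_*` / `AURORA_*` environment variables are assumptions for illustration only. In a real setup you would likely also pass VPC and subnet group options, and keep credentials out of the command line.

```shell
# Sketch only: values not present in the sample output are placeholders.

# Create the replication instance that runs the migration inside your VPC.
create_replication_instance() {
  aws dms create-replication-instance \
    --replication-instance-identifier prod-migration-instance \
    --replication-instance-class dms.r5.large \
    --allocated-storage 100   # assumed storage size in GB
}

# Register the on-premises MySQL database as the source endpoint.
create_source_endpoint() {
  aws dms create-endpoint \
    --endpoint-identifier mysql-source \
    --endpoint-type source \
    --engine-name mysql \
    --server-name "$MYSQL_HOST" \
    --port 3306 \
    --username "$MYSQL_USER" \
    --password "$MYSQL_PASSWORD"
}

# Register Aurora PostgreSQL as the target endpoint (name is hypothetical).
create_target_endpoint() {
  aws dms create-endpoint \
    --endpoint-identifier aurora-target \
    --endpoint-type target \
    --engine-name aurora-postgresql \
    --server-name "$AURORA_HOST" \
    --port 5432 \
    --database-name "$AURORA_DB" \
    --username "$AURORA_USER" \
    --password "$AURORA_PASSWORD"
}

# Ask the replication instance to test connectivity to one endpoint.
# $1 = replication instance ARN, $2 = endpoint ARN
test_endpoint_connection() {
  aws dms test-connection \
    --replication-instance-arn "$1" \
    --endpoint-arn "$2"
}
```

Call the functions in order once the variables are exported, waiting for the instance to become available before testing connections. The connection test runs asynchronously, so the final status is usually read back afterwards with `aws dms describe-connections`.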
It gives the following output:
# Replication instance creation (takes 5-10 minutes):
{
  "ReplicationInstance": {
    "ReplicationInstanceIdentifier": "prod-migration-instance",
    "ReplicationInstanceClass": "dms.r5.large",
    "ReplicationInstanceStatus": "creating"
  }
}
# Connection test results:
{
  "Connection": {
    "ReplicationInstanceArn": "arn:aws:dms:...:rep:prod-migration-instance",
    "EndpointArn": "arn:aws:dms:...:endpoint:mysql-source",
    "Status": "successful",
    "LastFailureMessage": ""
  }
}
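Once both endpoints test successfully, the replication task itself can be created and started. The sketch below assumes a single source schema named `appdb` and ARNs held in the hypothetical variables `SOURCE_ENDPOINT_ARN`, `TARGET_ENDPOINT_ARN`, and `REPLICATION_INSTANCE_ARN`; only the task identifier `mysql-to-aurora-task` comes from the sample output.

```shell
# Write a table-mapping file that selects every table in one schema.
# The schema name "appdb" is an assumption for illustration.
write_table_mappings() {
  cat > table-mappings.json <<'EOF'
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-app-schema",
      "object-locator": { "schema-name": "appdb", "table-name": "%" },
      "rule-action": "include"
    }
  ]
}
EOF
}

# Create a task that performs a full load and then switches to CDC.
create_migration_task() {
  aws dms create-replication-task \
    --replication-task-identifier mysql-to-aurora-task \
    --source-endpoint-arn "$SOURCE_ENDPOINT_ARN" \
    --target-endpoint-arn "$TARGET_ENDPOINT_ARN" \
    --replication-instance-arn "$REPLICATION_INSTANCE_ARN" \
    --migration-type full-load-and-cdc \
    --table-mappings file://table-mappings.json
}

# Start the task for the first time. $1 = replication task ARN
start_migration_task() {
  aws dms start-replication-task \
    --replication-task-arn "$1" \
    --start-replication-task-type start-replication
}
```

The `full-load-and-cdc` migration type is what makes the task continue replicating changes after the initial copy finishes.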
It gives the following output during migration:
# Task started:
{
  "ReplicationTask": {
    "ReplicationTaskIdentifier": "mysql-to-aurora-task",
    "Status": "starting"
  }
}
# After full load completes (switches to CDC automatically):
{
  "Status": "running",
  "FullLoadProgress": 100,
  "TablesLoaded": 47,
  "CDCLatency": 2
}
# CDCLatency=2 means 2 seconds behind source - near real-time
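One way to watch the task move from full load into CDC is to poll `describe-replication-tasks`. This is a sketch rather than the exact command behind the summary above: the `--query` expression picks a few fields from the task description, and the output key names (`Status`, `FullLoadPercent`, `TablesLoaded`) are chosen here for readability. CDC latency is typically read from the CloudWatch metrics rather than from this call.

```shell
# Print the task status and full-load progress for the sample task.
task_progress() {
  aws dms describe-replication-tasks \
    --filters Name=replication-task-id,Values=mysql-to-aurora-task \
    --query 'ReplicationTasks[0].{Status:Status,FullLoadPercent:ReplicationTaskStats.FullLoadProgressPercent,TablesLoaded:ReplicationTaskStats.TablesLoaded}' \
    --output json
}
```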
# To perform cutover:
# 1. Stop writes to MySQL source
# 2. Wait for CDCLatency to reach 0
# 3. Stop the DMS task
aws dms stop-replication-task \
  --replication-task-arn arn:aws:dms:...:task:mysql-to-aurora-task
# 4. Verify final row counts match
# 5. Update application connection strings to Aurora PostgreSQL
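Step 4 above can be sketched in two parts: DMS's own per-table counters, and a comparison of row counts exported from each database. The helper `compare_row_counts` and its input format (one `<table> <row_count>` pair per line) are hypothetical; how you produce those files, for example with `COUNT(*)` queries through the `mysql` and `psql` clients, is left open.

```shell
# Show per-table load and CDC counters reported by DMS.
# $1 = replication task ARN
show_table_statistics() {
  aws dms describe-table-statistics \
    --replication-task-arn "$1" \
    --query 'TableStatistics[].[SchemaName,TableName,FullLoadRows,Inserts,Updates,Deletes]' \
    --output table
}

# Compare row counts from source ($1) and target ($2).
# Prints one line per problem and returns non-zero if any are found.
compare_row_counts() {
  awk '
    NR == FNR { src[$1] = $2; next }   # first file: remember source counts
    {
      seen[$1] = 1
      if (!($1 in src)) { print "EXTRA   " $1; bad = 1 }
      else if (src[$1] != $2) {
        print "DIFF    " $1 " source=" src[$1] " target=" $2; bad = 1
      }
    }
    END {
      for (t in src) if (!(t in seen)) { print "MISSING " t; bad = 1 }
      exit bad
    }
  ' "$1" "$2"
}
```

For example, `compare_row_counts source_counts.txt target_counts.txt && echo "row counts match"` only prints the success message when every table has the same count on both sides.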
AWS DMS best practices:
Pre-create target schema - DMS migrates data but does not create indexes or foreign keys on the target (to maximise load speed). Create the full schema with indexes on Aurora before starting DMS, then disable foreign key checks during the load.
LOB handling - Large Object columns (BLOB, CLOB, TEXT) require special configuration. Use Limited LOB mode with a maximum LOB size of 32 KB for most cases (values larger than the limit are truncated), or Full LOB mode for larger values, which is slower.
Parallel load - For large tables, enable parallel full load in the task settings to split the table across multiple threads.
Monitor CloudWatch - DMS publishes the CDCLatencySource, FullLoadThroughputRowsSource, and CDCIncomingChanges metrics to CloudWatch. Set an alarm on CDCLatencySource greater than 60 seconds to detect replication issues early.
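The practices above might translate into configuration roughly as follows. This is a sketch under stated assumptions: the file names, the `appdb.orders` table, the `id` range boundaries, the alarm name, and the `TASK_RESOURCE_ID` and `SNS_TOPIC_ARN` variables are all hypothetical. `DO_NOTHING` as the table preparation mode reflects a pre-created target schema, and `LobMaxSize` is expressed in KB.

```shell
# Task settings: keep pre-created target tables and use Limited LOB mode (32 KB).
write_task_settings() {
  cat > task-settings.json <<'EOF'
{
  "TargetMetadata": {
    "SupportLobs": true,
    "FullLobMode": false,
    "LimitedSizeLobMode": true,
    "LobMaxSize": 32
  },
  "FullLoadSettings": {
    "TargetTablePrepMode": "DO_NOTHING",
    "MaxFullLoadSubTasks": 8
  }
}
EOF
}

# Table mappings: load one large table in parallel by ranges of its id column.
write_parallel_load_mappings() {
  cat > table-mappings-parallel.json <<'EOF'
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-app-schema",
      "object-locator": { "schema-name": "appdb", "table-name": "%" },
      "rule-action": "include"
    },
    {
      "rule-type": "table-settings",
      "rule-id": "2",
      "rule-name": "parallel-load-orders",
      "object-locator": { "schema-name": "appdb", "table-name": "orders" },
      "parallel-load": {
        "type": "ranges",
        "columns": ["id"],
        "boundaries": [["1000000"], ["2000000"], ["3000000"]]
      }
    }
  ]
}
EOF
}

# Alarm when source CDC latency stays above 60 seconds for 5 minutes.
create_latency_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name dms-cdc-latency-source-high \
    --namespace AWS/DMS \
    --metric-name CDCLatencySource \
    --dimensions Name=ReplicationInstanceIdentifier,Value=prod-migration-instance Name=ReplicationTaskIdentifier,Value="$TASK_RESOURCE_ID" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 5 \
    --threshold 60 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$SNS_TOPIC_ARN"
}
```

The two JSON files can be passed to `aws dms create-replication-task` through `--replication-task-settings file://task-settings.json` and `--table-mappings file://table-mappings-parallel.json`. Check the dimension values against what your task actually publishes in CloudWatch before relying on the alarm.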