How to replicate your AWS S3 bucket files to another bucket in realtime
There are many use cases where you want to replicate the contents of one S3 bucket to another S3 bucket in real time.
Example cases are listed below:
- For compliance, you need to maintain an exact copy of your S3 bucket files
- You require the same objects in buckets in different regions
- You want to share some bucket data with another AWS account
- You need to change the storage class while copying the same objects
- You need objects replicated within a certain amount of time (~15 min)
AWS offers 3 ways to copy your S3 bucket files to another bucket. Only the third one works in real time without manual runs
- aws s3 cp (AWS S3 Copy): this command copies all or selected S3 bucket objects to the local system or to another S3 bucket
- aws s3 sync (AWS S3 Sync): this command syncs all or selected S3 bucket objects to the local system or to another S3 bucket, and can propagate object deletions as well (with the --delete flag)
- AWS S3 replication: this AWS functionality, once set up, automatically replicates all or selected objects to the specified S3 bucket
- The 1st and 2nd are commands that need to be run manually using aws-cli or the AWS SDK as required; cron jobs can also be set up
- The 3rd is automation performed directly by AWS
- Here, "selected objects" means you can add rules based on a prefix
- Data transfer and storage charges apply while performing these operations
Usage and implementation of the aws s3 cp commands are already covered in a separate article [1]; you can read it for more information
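As a quick illustration of the two manual options, here is a minimal Python sketch that only assembles the `aws s3 cp` and `aws s3 sync` commands so you can review them before running. The bucket names and the `build_s3_command` helper are hypothetical and used only for this example; `--recursive` and `--delete` are the standard aws-cli flags.

```python
import shlex


def build_s3_command(action, source, destination, recursive=False, delete=False):
    """Return an `aws s3 cp` or `aws s3 sync` command as a list of arguments."""
    if action not in ("cp", "sync"):
        raise ValueError("action must be 'cp' or 'sync'")
    command = ["aws", "s3", action, source, destination]
    if action == "cp" and recursive:
        # --recursive copies every object under the source path
        command.append("--recursive")
    if action == "sync" and delete:
        # --delete removes destination objects that no longer exist in the source
        command.append("--delete")
    return command


# Hypothetical bucket names, used only for illustration
copy_cmd = build_s3_command(
    "cp", "s3://example-source-bucket", "s3://example-destination-bucket", recursive=True
)
sync_cmd = build_s3_command(
    "sync", "s3://example-source-bucket", "s3://example-destination-bucket", delete=True
)

# Print the commands so they can be reviewed before running them
print(shlex.join(copy_cmd))
print(shlex.join(sync_cmd))
```

To actually run one of them, you could pass the list to `subprocess.run(copy_cmd, check=True)` on a machine where aws-cli is installed and configured, or paste the printed command into a cron job.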
Now, let’s go ahead and see how to implement AWS S3 replication
1. Prerequisites
- In AWS, you will require IAM access
- You need S3 access for the buckets you are going to use
2. Choose S3 buckets for replication
- Decide whether both S3 buckets belong to the same account or to different accounts
- Create the 2nd (replica) bucket if it is not present
- Check whether Bucket Versioning is enabled. It needs to be enabled on both buckets
  - To check, go to AWS S3 console -> Open S3 Bucket -> Properties -> Bucket Versioning
- Make sure you have access to both S3 buckets, to copy content and make changes in S3 bucket configurations
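If you prefer to check versioning from code instead of the console, the following is a minimal sketch using the AWS SDK for Python (boto3). The helper name `ensure_versioning` and the bucket names are assumptions made for this example; it expects a boto3 S3 client and configured AWS credentials.

```python
def ensure_versioning(s3_client, bucket_name):
    """Enable versioning on the bucket if it is not already enabled.

    Returns True if versioning was switched on, False if it was already enabled.
    """
    response = s3_client.get_bucket_versioning(Bucket=bucket_name)
    # "Status" is absent when versioning has never been configured
    if response.get("Status") == "Enabled":
        return False
    s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )
    return True


def main():
    # boto3 is a third-party package; it is imported here so the helper
    # above can be read and tested without it
    import boto3

    s3 = boto3.client("s3")
    # Hypothetical bucket names, replace with your own
    for name in ("example-source-bucket", "example-destination-bucket"):
        changed = ensure_versioning(s3, name)
        print(name, "-> versioning enabled now" if changed else "-> already enabled")
```

Call `main()` once your credentials are configured. Both the source and the destination bucket need to pass this check before a replication rule can be added.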
3. Make a copy of existing data (if required)
- S3 bucket replication only applies to objects uploaded after it is set up; if you require the old bucket data, copy that data before getting started
- To copy data, use the AWS console or, from the command line, run aws s3 cp commands (see [1])
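The one-time copy of existing data can also be done through the SDK. Below is a rough sketch with boto3, assuming a boto3 S3 client is passed in; the function name `copy_existing_objects` is made up for this example, and it copies only the current version of each object.

```python
def copy_existing_objects(s3_client, source_bucket, destination_bucket, prefix=""):
    """Copy every current object under `prefix` from source to destination.

    Returns the list of keys that were copied.
    """
    copied_keys = []
    # list_objects_v2 returns at most 1000 keys per call, so paginate
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=source_bucket, Prefix=prefix):
        # "Contents" is missing when a page has no objects
        for obj in page.get("Contents", []):
            s3_client.copy(
                {"Bucket": source_bucket, "Key": obj["Key"]},
                destination_bucket,
                obj["Key"],
            )
            copied_keys.append(obj["Key"])
    return copied_keys
```

With real credentials you would call it as `copy_existing_objects(boto3.client("s3"), "your-source-bucket", "your-destination-bucket")`. For large buckets, the `aws s3 sync` command is usually the simpler choice.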
4. Create an IAM role to allow S3 to manage replication automatically
- Go to the IAM console of the source S3 bucket’s account
- Go to IAM -> Roles -> Create Role
- Select S3, then the use case S3 (Allows S3 to call AWS services on your behalf)
- Click Next: Permissions
- Click Create Policy
  - Go to JSON and paste the policy given below
  - Click on Next: Tags
  - Create tags if required (optional)
  - Click Next: Review
  - Name your policy and add an appropriate description
  - Click on Create Policy
{
"Version":"2012-10-17",
"Statement":[
{
"Effect":"Allow",
"Action":[
"s3:GetReplicationConfiguration",
"s3:ListBucket"
],
"Resource":[
"arn:aws:s3:::identicalcloud-replication-bucket1"
]
},
{
"Effect":"Allow",
"Action":[
"s3:GetObjectVersionForReplication",
"s3:GetObjectVersionAcl",
"s3:GetObjectVersionTagging"
],
"Resource":[
"arn:aws:s3:::identicalcloud-replication-bucket1/*"
]
},
{
"Effect":"Allow",
"Action":[
"s3:ReplicateObject",
"s3:ReplicateDelete",
"s3:ReplicateTags"
],
"Resource":"arn:aws:s3:::identicalcloud-replication-bucket2/*"
}
]
}
Replace,
– identicalcloud-replication-bucket1 with your source bucket name
– identicalcloud-replication-bucket2 with your destination bucket name
The above policy is for S3 replication within the same AWS account. For buckets in different AWS accounts, you also need to add a policy in the destination AWS account to grant access; refer to the AWS S3 replication documentation [2] to add that policy
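To avoid editing the bucket names by hand, the same policy can be generated from the two names. This is a small sketch; the function name `build_replication_policy` is an assumption for this example, while the actions and ARN layout are taken from the policy above.

```python
import json


def build_replication_policy(source_bucket, destination_bucket):
    """Return the same-account replication policy for the given bucket names."""
    source_arn = f"arn:aws:s3:::{source_bucket}"
    destination_arn = f"arn:aws:s3:::{destination_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read the replication configuration and list the source bucket
                "Effect": "Allow",
                "Action": ["s3:GetReplicationConfiguration", "s3:ListBucket"],
                "Resource": [source_arn],
            },
            {
                # Read object versions, ACLs and tags from the source bucket
                "Effect": "Allow",
                "Action": [
                    "s3:GetObjectVersionForReplication",
                    "s3:GetObjectVersionAcl",
                    "s3:GetObjectVersionTagging",
                ],
                "Resource": [f"{source_arn}/*"],
            },
            {
                # Write replicas, delete markers and tags to the destination bucket
                "Effect": "Allow",
                "Action": ["s3:ReplicateObject", "s3:ReplicateDelete", "s3:ReplicateTags"],
                "Resource": f"{destination_arn}/*",
            },
        ],
    }


policy = build_replication_policy(
    "identicalcloud-replication-bucket1", "identicalcloud-replication-bucket2"
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the JSON tab of the Create Policy screen.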
- Go back to the previous tab and refresh using the in-page refresh button
- Search for the policy by the name given in the previous step and select it
- Click on Next: Tags
- Create tags if required (optional)
- Click Next: Review
- Name your role and add an appropriate description
- Click on Create Role
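The console steps of this section could be scripted roughly as follows with boto3. This is a sketch under assumptions: the function name `create_replication_role`, the role name and the policy name are placeholders, and `permissions_policy` is expected to be the policy document shown earlier as a Python dict. The trust policy lets the S3 service (`s3.amazonaws.com`) assume the role.

```python
import json

# Trust policy that allows the S3 service to assume the role
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def create_replication_role(iam_client, role_name, policy_name, permissions_policy):
    """Create the managed policy and the role, attach them, return the role ARN."""
    policy_response = iam_client.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(permissions_policy),
    )
    role_response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(TRUST_POLICY),
    )
    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn=policy_response["Policy"]["Arn"],
    )
    return role_response["Role"]["Arn"]
```

With credentials configured, a call such as `create_replication_role(boto3.client("iam"), "example-replication-role", "example-replication-policy", policy)` returns the role ARN, which is needed when adding the replication rule.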
5. Go to the source S3 bucket and add replication rules
- Go to Amazon S3 -> Source Bucket
- Go to Management
- Click on Add Replication Rule
  - Name your replication rule
  - Select Status, Enabled
  - Select all objects or define a prefix rule; for this tutorial, we will select all objects
- In the next step,
  - Select the Destination bucket name
    - For the same AWS account, just select the bucket name
    - For a different AWS account, specify the 12-digit AWS account number and then enter your bucket name
- Choose the IAM role that we created in the previous step
- Next: tick the box if you want to enable encryption on S3 objects (optional)
- Next: select the same storage class for destination objects, or choose from the listed categories
- In the next step, tick as required from the options below
  - Replication Time Control (RTC), which replicates 99.99% of objects within 15 minutes
  - Replication metrics and notifications, which notify you if replication fails
  - Delete marker replication: tick this box if you want object deletions on the source bucket to be applied to the destination bucket as well; I recommend not ticking this box
  - Replica modification sync: if you tick this and make metadata changes on the destination bucket, they will also be reflected in the source bucket
- Click on Save
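For reference, a same-account rule like the one configured above can be expressed with boto3 roughly as below. Treat it as a sketch: the helper names, the default rule ID and the bucket names are assumptions for this example, and it leaves out RTC, metrics, encryption and the cross-account settings.

```python
def build_replication_configuration(
    role_arn,
    destination_bucket,
    rule_id="replicate-all-objects",
    prefix="",
    storage_class=None,
    replicate_delete_markers=False,
):
    """Build a single-rule replication configuration for put_bucket_replication."""
    destination = {"Bucket": f"arn:aws:s3:::{destination_bucket}"}
    if storage_class:
        # For example "STANDARD_IA"; omit to keep the source storage class
        destination["StorageClass"] = storage_class
    rule = {
        "ID": rule_id,
        "Priority": 1,
        "Status": "Enabled",
        # An empty prefix matches all objects in the bucket
        "Filter": {"Prefix": prefix},
        "DeleteMarkerReplication": {
            "Status": "Enabled" if replicate_delete_markers else "Disabled"
        },
        "Destination": destination,
    }
    return {"Role": role_arn, "Rules": [rule]}


def apply_replication(s3_client, source_bucket, configuration):
    """Attach the replication configuration to the source bucket."""
    s3_client.put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=configuration,
    )
```

Here `role_arn` is the ARN of the IAM role created in the previous step. Delete marker replication defaults to disabled in this sketch, in line with the recommendation above.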
6. Done and Test
- Replication has been set up; now it’s time to test it
- Go ahead and upload an object to the source bucket
- It will be replicated to the destination bucket. Allow up to 15 minutes, although small objects are usually replicated within seconds
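Instead of refreshing the destination bucket by hand, you can poll the replication status of the uploaded object on the source bucket. The sketch below assumes a boto3 S3 client; the function name `wait_for_replication` and its defaults are made up for this example. It relies on the `ReplicationStatus` field that `head_object` returns for objects covered by a replication rule.

```python
import time


def wait_for_replication(s3_client, source_bucket, key, timeout_seconds=900, poll_seconds=10):
    """Poll the source object until its replication status leaves PENDING.

    Returns the last status seen, or None if the object has no replication
    status (for example, when no rule applies to it).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = s3_client.head_object(Bucket=source_bucket, Key=key)
        status = response.get("ReplicationStatus")
        # Anything other than PENDING is a final answer for this sketch
        if status != "PENDING":
            return status
        if time.monotonic() >= deadline:
            return status
        time.sleep(poll_seconds)
```

The default timeout of 900 seconds matches the 15-minute window mentioned above. If the function returns a failed status, checking the IAM role permissions and bucket versioning is a sensible first step.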
Let us know in the comment section if this article was helpful to you! Share it with your colleagues and friends
Drafted On,
22nd January 2022
DevOps @identicalCloud.com
References
[1] AWS S3 Cli cp commands
[2] AWS S3 Replication