Cross-Regional Data Backups on AWS

Like any data-conscious, security-minded IT professional, you’ve already moved your data to Amazon’s Simple Storage Service (“S3”). But now what? How do you ensure the longevity of your data? I mean, when that volcano erupts from underneath your region’s data centers and Tommy Lee Jones isn’t around this time to help, whaddya do?! Well, fortunately enough, the fine folks at AWS thought of that scenario too and have a method to automatically replicate your data from the volcano-ridden data centers of the west coast or Jake Gyllenhaal’s snow-bound arctic data centers in the north-east to someplace safe from Hollywood’s [hopefully] fictitious natural disasters.

Not only does this data replication protect you from a wide-scale regional outage, but you can also leverage it to protect against malicious attacks or disgruntled employees. You achieve this protection by replicating the data not only to another region, but also to an entirely different AWS account that only your most trusted users have access to. We’ll demonstrate some of that setup later in this post.

In a previous tutorial about creating an SFTP server on S3, we alluded to this cross-regional replication to help protect your users’ file uploads. Leveraging cross-regional data replication for FTP files is just one simple use case. Perhaps your devops setup leverages data replication as a means of deploying to higher environments. Or maybe you want to minimize application latency by getting the files to regions closer to your end users. Whatever your use case may be, this tutorial strives to show you how to set up S3 replication to achieve your goal.

“Spit Spot! And Off We Go.”

We’re assuming that you already have a bucket created in S3 for purposes of this tutorial. If, however, you do not have a bucket set up, refer to Amazon’s official documentation on creating a bucket before proceeding with this tutorial.

Additionally, we’ll be using the console rather than the AWS CLI throughout this tutorial for ease of use. Yeah, we figured you’d appreciate that!

Create Destination Bucket

While this tutorial assumes you already have a bucket in place that you want to set up replication on, replication does require both source and destination buckets. So, if you don’t already have a destination bucket set up in a different region, take another gander at those handy-dandy AWS docs to create the destination bucket. The point of what we’re doing, though, is to get your data replicated into a different region than your source bucket, so when you start the creation of the bucket, be sure to specify a region different from your source.
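If you’d rather script this step than click through it, here is a minimal sketch using boto3 (the AWS SDK for Python, which this tutorial does not otherwise use). The bucket name and regions are placeholders, and the helper names are our own invention for illustration. The one wrinkle worth knowing is that us-east-1 is S3’s default location and is not passed as a LocationConstraint.

```python
def create_bucket_kwargs(bucket_name: str, region: str) -> dict:
    """Build keyword arguments for boto3's S3 `create_bucket` call."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and is not sent as a
    # LocationConstraint; every other region has to be named explicitly.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


def check_regions_differ(source_region: str, destination_region: str) -> None:
    """Refuse to continue if both buckets would live in the same region."""
    if source_region == destination_region:
        raise ValueError("destination bucket must be in a different region")


def create_destination_bucket(bucket_name: str, source_region: str,
                              destination_region: str) -> None:
    """Create the destination bucket (needs AWS credentials to run)."""
    import boto3  # third-party, imported lazily: pip install boto3

    check_regions_differ(source_region, destination_region)
    s3 = boto3.client("s3", region_name=destination_region)
    s3.create_bucket(**create_bucket_kwargs(bucket_name, destination_region))
```

Calling `create_destination_bucket("destination-bucket", "us-west-2", "us-east-2")` with valid credentials would create the bucket; the first two helpers are plain functions you can try without touching AWS at all.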

Create Replication Rule

Open the bucket you wish to set up replication for, click the Management tab, open the Replication subsection as shown in the figure below, and click the Add rule button.

One of the benefits of using the console over the command line in this case is the simplicity of the setup. For instance, if you don’t already have object versioning turned on for your bucket (a requirement to establish replication), the console takes a quick detour to configure that with you…so, enable it! If your bucket already has versioning turned on, though, you won’t see this option.

The next section configures which objects within the bucket should be replicated. You can opt to replicate everything, or only the objects within a certain folder if you specify that folder as a prefix. Replicating multiple folders (short of everything in the bucket) requires setting up multiple replication rules, as you can only specify a single prefix (folder) per replication rule. For this tutorial, we’ll be leaving the selection at the default “All contents” option.

The next screen presents some pretty interesting options: choose a destination bucket in this account, or choose a bucket in a different account. For now, let’s choose the same account, but we’ll address the other option, a different account, later in this tutorial.
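For the curious, the same choices can be sketched in Python with boto3. This is an illustration under assumptions, not what the console runs for you: the function names and rule IDs are made up, the role ARN stands in for the IAM role covered in the permissions step, and the rules use the older prefix-based form of the replication configuration. It does show the “one prefix per rule” idea nicely, though.

```python
from typing import Dict, List


def versioning_kwargs(bucket_name: str) -> Dict:
    """Arguments for boto3's `put_bucket_versioning` that switch versioning on."""
    return {
        "Bucket": bucket_name,
        "VersioningConfiguration": {"Status": "Enabled"},
    }


def build_replication_config(role_arn: str, destination_bucket: str,
                             prefixes: List[str]) -> Dict:
    """One rule per prefix; a single empty prefix ("") means all contents."""
    rules = []
    for number, prefix in enumerate(prefixes, start=1):
        rules.append({
            "ID": f"replication-rule-{number}",  # hypothetical naming scheme
            "Prefix": prefix,
            "Status": "Enabled",
            "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
        })
    return {"Role": role_arn, "Rules": rules}


def apply_replication(source_bucket: str, config: Dict) -> None:
    """Enable versioning on the source, then attach the replication rules."""
    import boto3  # third-party, imported lazily: pip install boto3

    s3 = boto3.client("s3")
    s3.put_bucket_versioning(**versioning_kwargs(source_bucket))
    s3.put_bucket_replication(Bucket=source_bucket,
                              ReplicationConfiguration=config)
```

Passing `[""]` as the prefixes mirrors the “All contents” choice, while something like `["uploads/", "reports/"]` (made-up folder names) yields two separate rules.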

Choose the destination bucket that you created in the section above. Similar to the source bucket, the destination bucket also needs versioning enabled. If you didn’t already have that set up, the console works with you once again to get that configured. If you already set versioning up on your destination bucket, you won’t see the versioning warning section as depicted in the following figure.

You also have the option to change the storage class for the replicated objects, which is a great money-saving opportunity. With S3, you pay for the storage you use. When you upload files to your source bucket, they are replicated to your destination bucket, and so it only makes sense that you pay for the duplicated storage. But, if you’re not terribly worried about disasters striking your primary region, perhaps you can stomach a riskier storage option in your replicated region in favor of a lower cost to you. Or, perhaps your business requirements or policies demand a higher level of resiliency in your data backups, and so you leave the “Change the storage class” option unchecked. That decision is entirely up to you, but you should check S3’s pricing page to help you make that choice.
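In configuration terms, the storage class choice is just one optional field on the rule’s destination. A small sketch, assuming the dictionary layout that boto3’s `put_bucket_replication` accepts; `destination_settings` is a hypothetical helper name, and `STANDARD_IA` is only one example of a cheaper class:

```python
from typing import Optional


def destination_settings(destination_bucket: str,
                         storage_class: Optional[str] = None) -> dict:
    """Build the `Destination` part of a replication rule.

    Leaving `storage_class` as None mirrors an unchecked
    "Change the storage class" box: replicas keep the source's class.
    """
    destination = {"Bucket": f"arn:aws:s3:::{destination_bucket}"}
    if storage_class is not None:
        destination["StorageClass"] = storage_class
    return destination


# Same class as the source objects.
same_class = destination_settings("destination-bucket")

# Cheaper, infrequent-access storage for the replicas.
cheaper = destination_settings("destination-bucket", "STANDARD_IA")
```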

The last set of options before reviewing and finalizing the replication setup is to specify permissions needed for S3 to be able to automatically replicate data from your source bucket to the destination bucket. Choose “Create new role” in the console for this to automatically be taken care of for you.

This automatic permissions creation sets a policy on your destination bucket similar to:

{
    "Version": "2008-10-17",
    "Id": "S3-Console-Replication-Policy",
    "Statement": [
        {
            "Sid": "S3ReplicationPolicyStmt1",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456987654:root"
            },
            "Action": [
                "s3:GetBucketVersioning",
                "s3:PutBucketVersioning",
                "s3:ReplicateObject",
                "s3:ReplicateDelete"
            ],
            "Resource": [
                "arn:aws:s3:::destination-bucket",
                "arn:aws:s3:::destination-bucket/*"
            ]
        }
    ]
}

Nearly identical permissions are created for the source bucket on the IAM Role that S3’s replication job will use to access the files in your source bucket for copying to the destination.
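If you ever need to build that role by hand, the sketch below shows roughly what it involves, based on the pattern in AWS’s replication setup documentation: a trust policy that lets the S3 service assume the role, plus a permissions policy to read from the source and write replicas to the destination. Treat the exact action list as an assumption (the console-generated role may include more actions), and note that the function names and the inline policy name are ours.

```python
import json


def replication_trust_policy() -> dict:
    """Trust policy that lets the S3 service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def replication_role_policy(source_bucket: str, destination_bucket: str) -> dict:
    """Read versions from the source, write replicas to the destination."""
    source_arn = f"arn:aws:s3:::{source_bucket}"
    destination_arn = f"arn:aws:s3:::{destination_bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow",
             "Action": ["s3:GetObjectVersion", "s3:GetObjectVersionAcl"],
             "Resource": f"{source_arn}/*"},
            {"Effect": "Allow",
             "Action": ["s3:ListBucket", "s3:GetReplicationConfiguration"],
             "Resource": source_arn},
            {"Effect": "Allow",
             "Action": ["s3:ReplicateObject", "s3:ReplicateDelete"],
             "Resource": f"{destination_arn}/*"},
        ],
    }


def create_replication_role(role_name: str, source_bucket: str,
                            destination_bucket: str) -> str:
    """Create the role and return its ARN (needs AWS credentials to run)."""
    import boto3  # third-party, imported lazily: pip install boto3

    iam = boto3.client("iam")
    role = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(replication_trust_policy()),
    )
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName="s3-replication",  # hypothetical inline policy name
        PolicyDocument=json.dumps(
            replication_role_policy(source_bucket, destination_bucket)),
    )
    return role["Role"]["Arn"]
```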

Review and create the replication rule, and that’s it! For extra data resiliency, your data is now being copied to another AWS region whenever you create, update, or delete files stored in your source bucket.
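Want to see it working? One way, sketched here with boto3 as an assumption on our part, is to upload a new file and then ask S3 for its metadata: the `head_object` response carries a `ReplicationStatus` field for objects a rule applies to. Keep in mind that a rule only picks up objects written after it was enabled, so files that were already sitting in the bucket won’t report a status. The function names and the wording of the returned labels are ours.

```python
def replication_state(head_response: dict) -> str:
    """Classify an object from the response of boto3's `head_object`."""
    status = head_response.get("ReplicationStatus")
    if status is None:
        # No field at all: no rule applied when this object was written.
        return "not covered by a replication rule"
    if status in ("COMPLETE", "COMPLETED"):
        return "replicated"
    if status == "REPLICA":
        return "this object is itself a replica"
    return status.lower()  # typically "pending" or "failed"


def check_object(bucket: str, key: str) -> str:
    """Look up one object's replication state (needs AWS credentials)."""
    import boto3  # third-party, imported lazily: pip install boto3

    s3 = boto3.client("s3")
    return replication_state(s3.head_object(Bucket=bucket, Key=key))
```

Running `check_object` against the source bucket shortly after an upload will usually show the object as pending before it flips to replicated.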

A Little Extra Magic

So, what was that option about choosing a bucket in a different account when picking your replication destination? Can you actually do that?! Why would you want to do that? Two answers: 1) yes, you can actually do that, and 2) because it’s totally awesome! “No such thing as a stupid question?!” …well, those were. Let’s entertain ’em anyway and dig a bit deeper into this scenario.

Replicating data to a different AWS account has several benefits:

  • Perhaps you have multiple application environments (QA, Staging, Production) and you segregate things by using an entirely different account for each environment as a way to fulfill compliance obligations. Maybe you set up replication from the lower environments to the higher ones as a form of devops deployment to allow development at the lower levels while preventing the developers from having direct access to your production environments.
  • If you have highly-valuable information that you want to ensure never gets deleted, maybe you create a new account that only you (or all your company’s VPs) have access to. This way, if there are any disgruntled employees that for some reason have or acquire bucket-delete permissions, they can only damage the data in the AWS account they have access to. AWS documentation states, “If a DELETE request specifies a particular object version ID to delete, Amazon S3 deletes that object version in the source bucket, but it does not delete the same object version from the destination bucket. This behavior protects data from malicious deletions.” Wow, they really did think of everything.

If you want to set up replication across accounts, you first need to do some configuration on the destination bucket in the other AWS account. That needs to happen manually because the console wizard that we used for replicating in the same account most likely does not initially have access to the destination account.

After you have logged in to the destination account and are looking at the properties of the destination bucket, configure a bucket policy as follows.

{
   "Version":"2008-10-17",
   "Id":"PolicyForDestinationBucket",
   "Statement":[
      {
         "Sid":"1",
         "Effect":"Allow",
         "Principal":{
            "AWS":"SourceBucketAcctID"
         },
         "Action":[
            "s3:ReplicateDelete",
            "s3:ReplicateObject"
         ],
         "Resource":"arn:aws:s3:::destinationbucket/*"
      },
      {
         "Sid":"2",
         "Effect":"Allow",
         "Principal":{
            "AWS":"SourceBucketAcctID"
         },
         "Action": [
            "s3:List*",
            "s3:GetBucketVersioning",
            "s3:PutBucketVersioning"
         ],
         "Resource":"arn:aws:s3:::destinationbucket"
      }
   ]
}

You will also need to manually go into the versioning options on the destination bucket and make sure versioning is enabled.

Once the destination bucket is properly configured in the destination account, log back in to the source account and follow the same steps from the section above to get the replication rule started. But, when you get to the step where you select the destination bucket, be sure to specify that your destination bucket is in another account as seen below.

Once you save the destination bucket settings, you will be provided with a bucket policy to apply to the destination bucket. We already took care of that with the policy code provided above, so you can just ignore that bit. Simply proceed with the instructions above as if you were replicating to the same account. Once complete, your replication should successfully duplicate files from your source bucket to the destination bucket in a different AWS account, providing you with the added benefits listed at the top of this section.
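The destination-account prep can also be scripted. The sketch below rebuilds the bucket policy shown earlier from two inputs and applies it, along with versioning, using credentials for the destination account. It assumes boto3 and a named credentials profile for that account; the profile name, the function names, and the choice to spell the principal as the account’s root ARN are ours.

```python
import json


def destination_bucket_policy(source_account_id: str,
                              destination_bucket: str) -> dict:
    """Bucket policy letting the source account replicate into the bucket."""
    principal = {"AWS": f"arn:aws:iam::{source_account_id}:root"}
    bucket_arn = f"arn:aws:s3:::{destination_bucket}"
    return {
        "Version": "2008-10-17",
        "Id": "PolicyForDestinationBucket",
        "Statement": [
            {"Sid": "1", "Effect": "Allow", "Principal": principal,
             "Action": ["s3:ReplicateDelete", "s3:ReplicateObject"],
             "Resource": f"{bucket_arn}/*"},  # object-level actions
            {"Sid": "2", "Effect": "Allow", "Principal": principal,
             "Action": ["s3:List*", "s3:GetBucketVersioning",
                        "s3:PutBucketVersioning"],
             "Resource": bucket_arn},  # bucket-level actions
        ],
    }


def prepare_destination(profile_name: str, source_account_id: str,
                        destination_bucket: str) -> None:
    """Run with destination-account credentials (needs AWS access)."""
    import boto3  # third-party, imported lazily: pip install boto3

    s3 = boto3.Session(profile_name=profile_name).client("s3")
    s3.put_bucket_versioning(
        Bucket=destination_bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )
    policy = destination_bucket_policy(source_account_id, destination_bucket)
    s3.put_bucket_policy(Bucket=destination_bucket, Policy=json.dumps(policy))
```

Something like `prepare_destination("vault-account", "123456987654", "destinationbucket")` would do both manual steps in one go, where `vault-account` is a made-up profile name.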

Resources

https://docs.aws.amazon.com/AmazonS3/latest/dev/crr-what-is-isnot-replicated.html
https://docs.aws.amazon.com/AmazonS3/latest/dev/crr-how-setup.html


Ryan Jensen

Ryan spent 10 years working at a prepaid card company, developing ordering and card balance platforms. At Sketch, he provides critical software development for our clients, and leads our managed service for cloud infrastructures. His many other hats include coaching, training (DevOps is his thing...among others), and...
