Let’s talk about my backup strategy

I am someone you would probably call a geek. Nearly everything I have created in my life is digital: computer programs, websites, designs, articles, scientific papers and photographs. All bits and bytes. Nevertheless, I only started thinking about a backup strategy quite recently.

When Mac OS X introduced Time Machine in 2007, I started creating regular backups on an external hard drive. Before that, I would only occasionally back up some important folders. Having Time Machine is a great improvement over having no backup at all. But many kinds of data loss would also mean losing my backup: a fire, a flood (less likely since I live on the fourth floor) or a thief could destroy or steal my digital legacy.

My backup strategy improved unintentionally with the rise of cloud services. Most of my documents are now stored in my Dropbox folder, which not only uploads them to a server but also keeps a version history. This lets me retrieve older versions of a file. My emails reside on Gmail, and I can barely remember the times when I had to delete emails from the server. Google gives me enough space to access years of personal and professional communication from every internet-enabled device.

All code I write is on GitHub or my own Git server. Even if my apartment burned down, I could get up the next morning, buy a new computer and resume working. For some months now I have even stored some important configuration files in a public GitHub repository. I use iTunes Match to be able to retrieve my music from Apple’s servers any time I want to. Music I buy from Amazon or other online music stores is stored in my Dropbox folder.

However, there are files which are only stored on my own hardware in my apartment. Many gigabytes of photographs sit on my hard drive without any good backup. This makes me nervous. I have tried several times to find a good solution to this problem. Backblaze seemed like a great option, but uploading files was too slow to get over 100 GB of images into the cloud. Another service I tried is Bitcasa. I managed to get all my files up there, but downloading them takes ages and the service is too new for me to trust it with my backups.

Amazon S3 seems like a perfect solution to my problem. It’s fast and I trust Amazon to be around for some more years. However, it would currently cost me approximately $15 per month to store my 155 GB backup on S3. This is too expensive for me. Luckily, Amazon introduced a new service this summer called Glacier. It is aimed at archiving large amounts of data for a very long time. Storing 155 GB on Glacier costs only $1.50 per month. Downloading these files is more expensive, but since I hope that I never have to, I am absolutely OK with paying $20 to fetch all my data.
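To make the comparison concrete, here is a rough sketch of the arithmetic in Python. It uses only the figures quoted above (155 GB, $15 and $1.50 per month, $20 for a full restore). The per-GB rates are derived from those figures and are not official Amazon prices, and the function names are made up for illustration.

```python
# Figures quoted in the post; everything else is derived from them.
BACKUP_GB = 155
S3_MONTHLY_USD = 15.0        # approximate monthly cost on S3
GLACIER_MONTHLY_USD = 1.5    # approximate monthly cost on Glacier
GLACIER_RESTORE_USD = 20.0   # approximate cost of fetching everything once


def per_gb_rate(monthly_usd, size_gb):
    """Implied storage price per GB per month (an estimate, not a price list)."""
    return monthly_usd / size_gb


def total_cost(monthly_usd, months, restores=0, restore_usd=0.0):
    """Storage cost over a number of months plus any full restores."""
    return monthly_usd * months + restores * restore_usd


def months_to_cover_restore(s3_monthly, glacier_monthly, restore_usd):
    """Months of Glacier savings needed to pay for one full restore."""
    return restore_usd / (s3_monthly - glacier_monthly)


if __name__ == "__main__":
    print("S3 per GB:     ", round(per_gb_rate(S3_MONTHLY_USD, BACKUP_GB), 4))
    print("Glacier per GB:", round(per_gb_rate(GLACIER_MONTHLY_USD, BACKUP_GB), 4))
    print("One year on S3:", total_cost(S3_MONTHLY_USD, 12))
    print("One year on Glacier, one restore:",
          total_cost(GLACIER_MONTHLY_USD, 12, 1, GLACIER_RESTORE_USD))
    print("Months of savings to cover a restore:",
          round(months_to_cover_restore(S3_MONTHLY_USD, GLACIER_MONTHLY_USD,
                                        GLACIER_RESTORE_USD), 2))
```

With these numbers, the monthly savings pay for one full restore in about a month and a half, so Glacier stays cheaper unless I need to restore everything very often.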

Glacier, like all Amazon Web Services, is not a consumer product, so Amazon does not offer an interface to create and upload backups. But there is Arq from Haystack Software. Arq offers a great interface for backing up data to either S3 or Glacier. Arq costs $29, and if you don’t already have a backup in the cloud (or have no backup at all), you should consider investing a little bit of money and time to set up a good backup strategy. Just in case.