Monday, November 20, 2006

How do I back up to the Amazon S3 storage service

I spent all of today putting together the tools to do incremental backups to the Amazon S3 storage service.

Amazon is a good brand, and storage as a service is a very useful utility. But I was not very happy that Amazon has tried to reinvent storage with this web service. Web services are a good interface for remote inter-process communication, but why make storage operations look like function calls? Anyway, the low cost of the storage and the good brand were enough for me to pursue it as my backup solution.

I spent the whole evening setting up different tools that provide a front end for the S3 service. The most useful tool would be a NAS server that provides NFS/CIFS mounts and does S3 transactions at the back end. However, it was hard to find any such tool. As of today there are many under-development or beta scripts written in Perl, Python, Ruby and Java. I gave a few of them a shot, but they weren't very handy for my simple backup problem.

JungleDisk is a very good front-end tool for S3. It provides a WebDAV interface (again, why WebDAV and not NFS?) and it's available for all platforms. It is good if you need manual access to the storage, but it is difficult to treat a WebDAV URL as a drive that a backup utility can write to just like any local drive. I investigated a lot of options, such as an NFS-to-WebDAV bridge and WebDAV CLI clients that can be called from backup scripts, but there isn't a great solution for all platforms.

My current need is Windows desktop backup. Finally I found 'S3Drive'. It mounts the S3 storage as just another drive on your Windows box, which was great for my current needs. One thing to note - JungleDisk does not show files stored in the same S3 bucket by other front-end tools - it didn't show the files stored by S3Drive. Some other tools, e.g. S3Safe, do show objects stored by other tools.

So storage is ready; now the backup software. I was using EzBack-it-up, but after reading through its docs I realized that it is very dumb about incremental backups. I had considered WinRAR some time back as a backup option, but didn't get its archiving logic. After reading through some documentation and its CLI options, I figured out a way to do incremental backups. When its CLI tool is run with the -ao option it "adds files with Archive attribute set". With the -ac option it "Clears Archive attribute after compression or extraction". Thus "rar a -ao -ac backup.tar " will back up only the files that have changed since the last backup, because Windows sets the Archive attribute again whenever a file is modified (unless something else changes the archive flag - I need to check my CVS tools for this). So I have my incremental backup tool.
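To see which files the -ao option would pick up, and to check whether some other tool is touching the archive flag, the attribute can be read from Python. This is only a minimal sketch: the function names are made up for illustration, and `st_file_attributes` is only available on Windows (Python 3.5 or later), so on other systems the walk simply reports nothing.

```python
import os
import stat


def has_archive_bit(attributes):
    """Return True if the Windows Archive attribute is set in a bitmask."""
    return bool(attributes & stat.FILE_ATTRIBUTE_ARCHIVE)


def files_needing_backup(root):
    """List files under root whose Archive attribute is currently set.

    st_file_attributes only exists on Windows; elsewhere every file is
    treated as having no attributes, so the result is an empty list.
    """
    pending = []
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            attributes = getattr(os.stat(path), "st_file_attributes", 0)
            if has_archive_bit(attributes):
                pending.append(path)
    return pending
```

Running `files_needing_backup` on the source folder before and after a CVS operation would show whether that tool sets or clears the flag on files it touches.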

How to enforce policies and schedules? CronForWindows solves the scheduling problem. I spent the better part of an hour on a very minor problem. Being allergic to BATCH files, I couldn't find a programmatic way to create a unique target tar file name with a timestamp embedded in it. I tried running a bash script under Cygwin, but backslashes and/or '%' symbols in the date command format caused problems. I thought of writing some quick Java code, but decided that wouldn't really be quick. So I decided to give Python a shot. After half an hour I came up with the following script:

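from datetime import date
import os

# Zero-padded MMDDYYYY stamp, so that e.g. 1/12 and 11/2 cannot collide.
today = date.today()
today_str = today.strftime("%m%d%Y")
# Raw string keeps the Windows backslashes literal; note the space after -ao.
command = r'rar a -ac -ao b:\e%s.tar c:\workspace\project' % today_str
os.system(command)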

I had to put rar.exe in one of the directories in PATH, but it worked. The Cron scheduler can run this script as per the schedule I set.
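The quoting trouble with backslashes and '%' can be sidestepped by handing the command to the OS as a list of arguments instead of one string. The following is a sketch under a few assumptions: the helper names are hypothetical, the `b:\` target and `c:\workspace\project` source are the ones from the script above, and the date is written in ISO order (a swap from the month-day-year order above) so that the archives sort by name.

```python
from datetime import date
import subprocess


def archive_name(day, prefix="e", extension=".tar"):
    """Return a name such as e2006-11-20.tar for the given date."""
    return "%s%s%s" % (prefix, day.isoformat(), extension)


def build_backup_command(day, target_dir, source_dir):
    """Build the rar call as an argument list, so no shell quoting is needed."""
    target = target_dir.rstrip("\\") + "\\" + archive_name(day)
    return ["rar", "a", "-ac", "-ao", target, source_dir]


def run_backup(day=None):
    """Run the incremental backup and return rar's exit code."""
    command = build_backup_command(
        day or date.today(), "b:\\", r"c:\workspace\project"
    )
    return subprocess.call(command)
```

The scheduler would then only need to call `run_backup()`; a non-zero return value means rar reported a problem.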

I have a source directory of a few MB that I need to back up. I expect its contents to change by under an MB per day. With the low prices of the S3 service, that looks like a pretty good deal.
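A rough back-of-the-envelope estimate can be sketched in a few lines. Everything numeric here is an assumption rather than something measured: the prices are the per-GB rates as I recall them from the S3 price list (check the current one), and "a few MB" and "under an MB per day" are read as 5 MB and 1 MB.

```python
# Assumed prices in USD, not taken from any bill: per GB stored per month,
# and per GB uploaded.
STORAGE_USD_PER_GB_MONTH = 0.15
UPLOAD_USD_PER_GB = 0.20
MB_PER_GB = 1024.0


def stored_mb(days, initial_mb, daily_mb):
    """Total MB held after `days` of incremental archives on top of a full one."""
    return initial_mb + days * daily_mb


def monthly_storage_cost(megabytes):
    """USD per month to keep `megabytes` stored at the assumed rate."""
    return megabytes / MB_PER_GB * STORAGE_USD_PER_GB_MONTH


def upload_cost(megabytes):
    """One-off USD cost to upload `megabytes` at the assumed rate."""
    return megabytes / MB_PER_GB * UPLOAD_USD_PER_GB


if __name__ == "__main__":
    # Assumed sizes: 5 MB full backup, then 1 MB of changes per day for a year.
    total = stored_mb(365, initial_mb=5, daily_mb=1)
    print("MB stored after a year: %.0f" % total)
    print("Monthly storage cost then: $%.3f" % monthly_storage_cost(total))
    print("Upload cost over the year: $%.3f" % upload_cost(total))
```

Even after a full year of daily increments, the stored total under these assumptions stays well below one GB, so the monthly charge is a fraction of the per-GB rate.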

Friday, November 10, 2006

Stranger than fiction

I saw this movie today. I don't know if it's a good movie or a bad movie. It's not about that. It certainly made me write about it.

Warning: If you have not watched this movie, don't read further. Go watch the movie and read this post later. You have been warned.

The movie creates an unusual plot, quite unlike anything that has been tried before (at least in my limited knowledge). The pinnacle of the story is the point where the professor (Dustin Hoffman) reads the ending and tells Harold that the end is worth dying for. The speech that the professor gives to Harold opens up numerous possibilities for how the story can end. Not to mention, this is also the most dramatic point in the whole film. It creates a lot of expectations for the finale.

Now, I thought that the director/creator of the film lost his grip on the story after this pinnacle. When the fatal climactic accident actually happens in the movie, it doesn't live up to the expectations the professor created about it. It may be poetic, sad and heartbreaking as described, but it certainly is not worth dying for. This is where the film loses the chance of being extraordinary.
..... But wait I haven't finished yet.

Now let's assume that the ending of the story was something really fascinating and worth dying for. (That's a detail after all - or is it?) So in the movie, the author of the story decides to sacrifice this fascinating ending for a mundane one, to save the life of a good soul - a moral decision with a fitting justification.

What baffles me is the recursive nature of this movie. Let's say what I said initially is true - that the movie goes downhill after creating a lot of anticipation. The author inside the movie settled for a mundane ending for moral reasons. Is that the same reason the director of the movie also decided to make a simple, easy ending? Was this recursion intended by the creator of the movie? Because if it was, then that's really amazing - the effect is created by stepping outside the movie (I hope you understand what I mean by this sentence, I can't explain it any more clearly).

... Anyway, if I could work in this art form, I would try to think of a more intriguing ending for this movie than the traditional one it has. As I said, the plot reaches a hilltop from where it can take numerous paths to a conclusion. Sadly, it takes the regular beaten path down the hill.

Wednesday, November 08, 2006

Gas prices... seems like a big scam

I told my friends last weekend that I was gonna fill my tank the day before the election, to avoid the gas price hike that would follow once the elections were over. I was only a little serious at the back of my mind. But I did fill my tank on Monday.

I was astonished to watch the gas price change on election day, however. In the morning I saw a rate of $2.07 per gallon. It did not even wait for the last voter to cast his vote before rising to $2.29 in the evening. I was dumbfounded. Moreover, I don't see this in the news this morning. I searched Google News for gas prices, but no one seems to have registered this fact. There are reports that the industry rate will go up by 4 cents or so in some parts. What I observed was 22 cents, almost a quarter, per gallon. By filling up 10 gallons on Monday I saved about $2.20. Are the local gas station owners hiking the prices on their own? Who knows!?

Anyway, I am glad I verified my theory (and saved a couple of bucks too).