Friday, December 21, 2012

EC2: move a large file between Windows instances

Moving a file between Linux machines is easy-peasy: just use scp. (You can be sure ssh/scp is on all your Linux EC2 machines.) Windows? Sigh, Windows. To get ssh/scp on Windows you need to install Cygwin, and that is a non-trivial step to take.

So, how to move a 60GB file from one Windows EC2 instance to another Windows EC2 instance (in a different region)? Here is what I did:
  1. Install CloudBerry Pro on the source server. (It must be the Pro version: you can get a 14-day free trial; when that expires it costs $30.)[1]
  2. Install CloudBerry free version on the target server.
  3. Test copying over a small file, via your S3 account, to make sure it works. I'm assuming you already know how to use this type of two-pane file-copy application. (I created the bucket in the same region as the target machine: that means the upload takes longer than the download.)
  4. In CloudBerry Pro, Tools menu, then Options, then choose Compression And Encryption tab. Check "Use Compression".[2]
  5. Copy the big file. It shows no progress indicator while it works.
  6. When it had finished it said it was 21% done. Very confusing. And on the server it just showed as a 13GB file, not a 60GB one.[3]
  7. Download to your target server, using CloudBerry free version. (Yes, the free version works fine for downloading large files and compressed files.)
  8. Rename your downloaded file with a ".gz" extension, as that is what it actually is.[3]
  9. Install 7-zip, if you don't have a program that can deal with gzip files. It tells me the file is 2GB compressed, 13GB uncompressed. Ignore those numbers, it is just being stupid. Decompress it, and you get a 60GB file.
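If Python happens to be installed on the target server, steps 8 and 9 could probably be done with a short script instead of 7-zip. This is only a minimal sketch: it assumes the downloaded file really is a plain gzip stream (as it appeared to be above), and the function name and the file paths in the usage comment are made up for illustration. It streams the data in chunks, so the 60GB never has to fit in memory.

```python
import gzip
import os
import shutil


def gunzip_file(src_path, dst_path, chunk_size=16 * 1024 * 1024):
    """Stream-decompress the gzip file at src_path into dst_path.

    Returns the size in bytes of the decompressed file.
    (gunzip_file is a hypothetical helper, not part of CloudBerry.)
    """
    # gzip.open does not care about the file extension, so the rename
    # to ".gz" in step 8 is optional when going this route.
    with gzip.open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # copyfileobj reads and writes chunk_size bytes at a time.
        shutil.copyfileobj(src, dst, chunk_size)
    return os.path.getsize(dst_path)


# Example usage (hypothetical paths):
#   size = gunzip_file(r"D:\incoming\bigfile", r"D:\data\bigfile.dat")
#   print(size)  # should match the size of the original on the source server
```

Comparing the returned size against the original file's size on the source server is a cheap sanity check that the whole thing arrived.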
Phew, hard work. If you need to do it regularly you should install Cygwin and use scp! <soap-box>Or port your applications to Linux, where the living is easy. Apart from the fact that running Windows machines is harder, Linux cloud instances are also significantly cheaper to run.</soap-box>

[1]: I've heard, but not confirmed, that you can uninstall it from one machine, then use the same install key on a different machine. If true, that is quite a fair license, and I encourage you to support them.

[2]: As we saw, this creates more work, so uncheck after doing your big file.

[3]: I think CloudBerry Pro should have put a .gz extension on the file, when it uploaded it, to make it clear what was going on.
