
Friday, February 6, 2015

A python script: example for downloading GlobSnow data (snow extent in this case)

Recently a friend asked me whether it is possible to selectively download folders from HTTP servers in Linux. It probably can be done with wget or rsync, but I have never managed to make them work exactly the way I needed. So I wrote a small script for the task and passed it to him, hoping it might be his first step toward learning Python. Below are two versions of the same script:
  1. The version I actually gave to my friend;
  2. An improved, somewhat scarier-looking version, which is closer to the way I think it should be written.
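As a rough illustration of the first, simple version, here is a minimal sketch. It assumes the server exposes a plain HTML directory listing with links to the data files; the GlobSnow URL in the usage example is only a placeholder, not a verified path, and the `.nc.gz` suffix is an assumption about the file naming.

```python
from html.parser import HTMLParser
import os
import urllib.request


class LinkParser(HTMLParser):
    """Collects the href targets from an HTML directory listing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def download_folder(base_url, dest_dir, suffix=".nc.gz"):
    """Download every file ending in `suffix` linked from `base_url`."""
    os.makedirs(dest_dir, exist_ok=True)

    # Fetch the directory page and extract the links from it
    parser = LinkParser()
    with urllib.request.urlopen(base_url) as resp:
        parser.feed(resp.read().decode("utf-8", errors="ignore"))

    for name in parser.links:
        if not name.endswith(suffix):
            continue  # skip sorting links, parent-directory links, etc.
        local_path = os.path.join(dest_dir, name)
        if not os.path.isfile(local_path):
            urllib.request.urlretrieve(base_url.rstrip("/") + "/" + name,
                                       local_path)


if __name__ == "__main__":
    # Placeholder URL: substitute the actual GlobSnow folder you need
    download_folder("http://www.globsnow.info/snow_extent/2010/",
                    "globsnow_2010")
```

Note that this version simply skips files that already exist locally, without checking whether they were downloaded completely.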

First, I show a quick and simple way of downloading files, with only minimal error handling. The next step is to check whether the sizes of the existing local files match the sizes of the remote files, and to re-download any bad files if required. Of course, downloading the data takes some time, especially if you need several years of it. If you work remotely, I would suggest using tmux or screen, so your program keeps running even if the SSH session is closed for some reason. If those are not installed, you can still get away with nohup, as follows:
nohup python download.py >& log.txt &
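The size-checking step of the improved version can be sketched roughly as follows, using only the standard library. It compares the Content-Length header reported by the server against the local file size and re-downloads on a mismatch; the function names here are my own, not taken from the original script, and this assumes the server actually reports Content-Length.

```python
import os
import urllib.request


def remote_size(url):
    """Return the Content-Length of a remote file, or None if not reported."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as resp:
        length = resp.headers.get("Content-Length")
        return int(length) if length is not None else None


def fetch_if_needed(url, local_path):
    """Download url to local_path unless a complete copy already exists.

    Returns True if a download happened, False if the existing file was kept.
    """
    expected = remote_size(url)
    if os.path.isfile(local_path):
        if expected is None or os.path.getsize(local_path) == expected:
            return False  # existing file looks complete, keep it
        os.remove(local_path)  # size mismatch: broken or partial download
    urllib.request.urlretrieve(url, local_path)
    return True
```

Running `fetch_if_needed` over the same list of links as the simple version then both fills in missing files and replaces truncated ones.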

Cheers, and any comments are welcome!