Vimalkumar Velayudhan


A Python script to update NCBI BLAST databases

The following command will download and/or update the swissprot protein database in the current directory:

update_blastdb --decompress --passive swissprot


Connected to NCBI
Downloading swissprot.tar.gz... [OK]

Here is the list of downloaded files:

swissprot.tar.gz  swissprot.tar.gz.md5
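For scripting, the same command can also be launched from Python with the standard subprocess module. This is only a minimal sketch, not the wrapper script described in this post. The function names build_command and download are made up for illustration, and the sketch assumes update_blastdb is available on the PATH under exactly that name.

```python
import subprocess


def build_command(database):
    """Return the update_blastdb command line shown above as an argument list."""
    return ["update_blastdb", "--decompress", "--passive", database]


def download(database, dest_dir="."):
    """Run update_blastdb inside dest_dir and return its exit status.

    The exit status is returned as-is; how to interpret it depends on the
    update_blastdb version, so check its documentation before relying on it.
    """
    result = subprocess.run(build_command(database), cwd=dest_dir)
    return result.returncode
```

Calling download("swissprot") from the target directory would run the same command as above.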

One issue with this approach is that any long-running BLAST jobs currently accessing the database will be aborted. To overcome this problem, I wrote a wrapper around the update_blastdb command.

It uses a symbolic link to the latest version of the database and only updates the link if the database is not being used. If the database is being used, the script adds a message to the log after the database download is complete. The link can then be updated manually later.
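The link handling described above can be sketched roughly as follows. This is an illustration, not the code from the actual script: the names switch_link and update_link_if_idle are hypothetical, and in_use stands for whatever check decides that the database is busy.

```python
import logging
import os


def switch_link(link_path, new_target):
    """Make link_path a symbolic link to new_target, replacing an old link."""
    # Create the new link under a temporary name first, then rename it over
    # the old one, so link_path never disappears in between.
    tmp_link = "{}.tmp.{}".format(link_path, os.getpid())
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)


def update_link_if_idle(link_path, new_target, in_use):
    """Switch the link unless in_use(link_path) reports the database as busy.

    in_use is a caller-supplied function (hypothetical) that takes a path
    and returns True while the database is being accessed.
    """
    if os.path.lexists(link_path) and in_use(link_path):
        # Leave the link alone and record where the new version is, so the
        # link can be updated by hand later.
        logging.warning(
            "%s is in use; link not updated. New version is at %s",
            link_path, new_target,
        )
        return False
    switch_link(link_path, new_target)
    return True
```

Building the new link under a temporary name and renaming it over the old one means there is no moment at which the database path is missing.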

This script will only work on Linux/Unix-like systems due to its dependence on the lsof command to check if a directory is being accessed.
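A check of this kind could look roughly like the sketch below. It is not taken from the script itself: the function names are hypothetical, and it assumes lsof is installed and that its -t (terse output, PIDs only) and +D (search a whole directory tree) options behave as described in the lsof manual.

```python
import os
import subprocess


def parse_pids(lsof_output):
    """Turn the output of `lsof -t` (one PID per line) into a list of ints."""
    return [int(item) for item in lsof_output.split() if item.isdigit()]


def directory_in_use(path):
    """Return True if lsof reports any process with files open under path.

    Raises FileNotFoundError if the lsof command is not installed.
    """
    # Resolve symbolic links so lsof inspects the real database directory.
    real_path = os.path.realpath(path)
    result = subprocess.run(
        ["lsof", "-t", "+D", real_path],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return bool(parse_pids(result.stdout))
```

The sketch looks at the PIDs printed by lsof rather than its exit status, since a non-zero status alone does not distinguish "nothing open" from an error.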

Download Python script

From the vimalkvn/sysadminbio repository on GitLab:

Save this script under /home/user/programs/ (this location is only used for the examples below; the script can be saved anywhere else).


Assuming you would like to download the swissprot database to /home/user/blast, use:

python /home/user/programs/ \
  -d swissprot -p /home/user/blast

A log file will be available under

To use the database in your BLAST search, you can use:

blastp -db /home/user/blast/swissprot/swissprot \
  -query sample.fasta

Other databases (supported by update_blastdb) can be downloaded in the same manner.

Automated update

An automated update can be set up using cron:

0 0 1 * * /home/user/programs/ -d swissprot -p /home/user/blast

The above cron job will update the database at midnight on the 1st of every month.

If you have any questions or comments on this post, please send them by email to vimal (at) disroot (dot) org.

If you would like your comment to remain anonymous, please state that in your email. In any case, your email address will not be published.