Vimalkumar Velayudhan


A Python script to update NCBI BLAST databases

The following command will download and/or update the swissprot protein database in the current directory:

update_blastdb --decompress --passive swissprot


Connected to NCBI
Downloading swissprot.tar.gz... [OK]

Here is the list of downloaded files:

swissprot.tar.gz  swissprot.tar.gz.md5

One issue with this approach is that any long-running BLAST jobs currently accessing the database will be aborted. To overcome this problem, I wrote a wrapper around the update_blastdb command.

It uses a symbolic link to the latest version of the database and only updates the link if the database is not being used. If the database is being used, the script adds a message to the log after the database download is complete. The link can then be updated manually later.

This script will only work on Linux/Unix-like systems due to its dependence on the lsof command to check if a directory is being accessed.
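The link-switching idea described above can be sketched roughly as follows. This is not the script from the repository, only a minimal illustration under stated assumptions: the function names `directory_in_use` and `switch_link` are hypothetical, and `lsof +D` is assumed to be an acceptable way to check whether any file under a directory is open.

```python
import os
import shutil
import subprocess


def directory_in_use(path):
    """Return True if lsof lists at least one open file under path.

    Hypothetical helper: assumes the lsof command is installed.
    """
    if shutil.which("lsof") is None:
        raise RuntimeError("lsof is required but was not found on PATH")
    # +D asks lsof to search the whole directory tree for open files.
    result = subprocess.run(
        ["lsof", "+D", path],
        capture_output=True,
        text=True,
        check=False,
    )
    # Any output is treated as "in use"; no output as "free".
    return bool(result.stdout.strip())


def switch_link(link_path, new_target, in_use=directory_in_use):
    """Point link_path at new_target unless the current target is busy.

    Returns True if the link was created or updated, False if the
    update was skipped because the current target is in use.
    """
    if os.path.islink(link_path):
        current_target = os.path.realpath(link_path)
        if in_use(current_target):
            return False
    # Build the new link under a temporary name, then rename it over
    # the old link so readers never see a missing link.
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    os.replace(tmp_link, link_path)
    return True
```

A caller would download the new database into a fresh directory, then call `switch_link` with the stable link path and that directory. If it returns `False`, the caller can write a log message so the link can be updated manually later. Note that lsof may not report processes owned by other users unless it is run with sufficient privileges.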

Download Python script

From vimalkvn/sysadminbio repository on GitLab:

Save this script under /home/user/programs/ (this location is used only for the examples below; the script can be saved somewhere else).


Assuming you would like to download the swissprot database to /home/user/blast, use:

python /home/user/programs/ \
  -d swissprot -p /home/user/blast

A log file will be available under

To use the database in your BLAST search, you can use:

blastp -db /home/user/blast/swissprot/swissprot \
  -query sample.fasta

Other databases (supported by update_blastdb) can be downloaded in the same manner.

Automated update

An automated update can be set up using cron:

0 0 1 * * /home/user/programs/ \
-d swissprot -p /home/user/blast

The above cron job will update the database on the 1st of every month.