AB-BLAST Memory Requirements and Usage

BLAST Memory Requirements

Memory Requirements for the Classical Ungapped BLAST Algorithm

Several characteristics of a BLAST search determine its heap memory requirements. For a rote implementation of the "classical" ungapped BLAST algorithm (Altschul et al., 1990), contributors to memory use include:

length of the query sequence, Q, measured in residues;
length of the longest sequence in the database, D, measured in residues;
whether one (S=1) or both (S=2) strands of a nucleotide query sequence are searched (re: BLASTN, BLASTX and TBLASTX);
whether one or both strands of a nucleotide sequence database are searched (re: TBLASTN and TBLASTX);
BLAST data structures whose total size, B, is a function of parameters that govern the algorithm's sensitivity; these parameters include the word length, W, and neighborhood word score threshold, T;
the alphabet size, A, where A=4 for DNA and A=28 for proteins (including amino acid ambiguity codes);
storage required for a data pointer, P (= 8 bytes on 64-bit computer systems);
whether low-complexity regions or repetitive elements have been masked from the query and/or database sequences;
the number of database hits that are found and cached for post-processing and subsequent output;
the number of CPUs or threads employed, C.

For example, the minimum storage required (in bytes) for a classical BLASTN search is approximately: 5SQ + C[8S(Q+D) + D] + B, where S=2 when both strands of the query are searched. In this example for BLASTN, B will be no more than (and often much less than) P(4^W), where the DNA alphabet size is 4. Using one processor or thread (C=1), this simplifies to 26Q + 17D + B bytes. If an additional processor or thread is used, the minimum memory requirement increases by 16Q + 17D, for a total of 42Q + 34D + B bytes.
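The arithmetic above can be checked with a short Python sketch. The function and parameter names are illustrative only and are not part of AB-BLAST; the formula and the bound on B are taken directly from the preceding paragraph.

```python
def blastn_min_bytes(q, d, b, strands=2, threads=1):
    """Approximate minimum heap bytes for classical BLASTN.

    Implements 5SQ + C[8S(Q+D) + D] + B, where q and d are lengths in
    residues, b is the size of the BLAST data structures in bytes,
    strands is S (1 or 2) and threads is C.
    """
    return 5 * strands * q + threads * (8 * strands * (q + d) + d) + b


def blastn_b_upper_bound(word_length, pointer_bytes=8):
    """Upper bound on B for BLASTN: P * 4**W (DNA alphabet size is 4)."""
    return pointer_bytes * 4 ** word_length


if __name__ == "__main__":
    # Hypothetical sizes: a 1,000-residue query and a longest database
    # sequence of 2,000 residues, with B left out to show the Q and D terms.
    print(blastn_min_bytes(1000, 2000, 0, threads=1))  # 26Q + 17D
    print(blastn_min_bytes(1000, 2000, 0, threads=2))  # 42Q + 34D
```

With both strands searched, each additional thread adds 16Q + 17D bytes, while the 10Q and B terms are counted once regardless of thread count.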

Memory requirements can be greatly reduced by limiting the number of threads employed with the cpus=# option on multiprocessor systems. The default behavior is to spawn one thread for every logical processor in the case of BLASTP, BLASTX, TBLASTN and TBLASTX; and up to 4 threads in the case of BLASTN. This default behavior can be altered in a local file named /etc/sysblast. An example sysblast.sample file is provided in AB-BLAST software distributions. The most efficient use of computing resources will often be obtained by limiting individual BLAST jobs to a single thread (cpus=1), so that the computational overhead of thread creation and memory management is avoided.

Simultaneous Multi-Threading (e.g., Intel HyperThreads or AMD Clustered Multi-Threading) creates additional logical processors. Software like BLAST may spawn an additional thread of execution for each one, with each thread demanding more memory. Use of SMT often (but not always) speeds up a search, but at the expense of increased memory usage. SMT threads are not as efficient as real cores, and higher overall system throughput may actually be achieved by running multiple concurrent single-threaded BLAST searches (specify the cpus=1 option).

The default behavior of the BLAST programs is to search both strands of a nucleotide query sequence or database. Memory use can be minimized by requesting just one strand at a time. Collating results from multiple searches may be impractical, however.

Sufficient real memory should be provided to the search programs so that they can run without spilling over into virtual memory swap storage, as this can be disastrous to BLAST performance. AB-BLAST tries to avoid using virtual memory swap storage by estimating the memory required per thread and only spawning as many threads as can safely be managed within the amount of free physical memory that is available when the job starts.
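The idea of capping the thread count by free physical memory can be sketched as follows. This is not AB-BLAST's actual logic, which is not described here. It is a minimal illustration that assumes the classical BLASTN formula given earlier, split into a part counted once (5SQ + B) and a part counted per thread (8S(Q+D) + D). All names are hypothetical.

```python
def threads_that_fit(free_bytes, shared_bytes, per_thread_bytes, max_threads):
    """Largest thread count whose estimated memory fits in free_bytes.

    Always returns at least 1, since a search needs one thread even if
    the estimate says memory is short.
    """
    if per_thread_bytes <= 0:
        raise ValueError("per_thread_bytes must be positive")
    available = free_bytes - shared_bytes
    fit = available // per_thread_bytes
    return max(1, min(max_threads, fit))


def blastn_thread_limit(free_bytes, q, d, b, strands=2, max_threads=4):
    """Apply threads_that_fit using the classical BLASTN estimate."""
    shared = 5 * strands * q + b               # counted once
    per_thread = 8 * strands * (q + d) + d     # counted for every thread
    return threads_that_fit(free_bytes, shared, per_thread, max_threads)
```

The default of max_threads=4 mirrors the BLASTN default thread limit mentioned above; the amount of free memory would have to be obtained from the operating system by other means.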

Database File Caching

Beyond the above requirements for program heap storage, additional memory may improve BLAST performance through in-memory caching of database files from previous searches. When databases are searched repeatedly (e.g., by an automated analysis pipeline), caching of database files avoids the latency and throughput limitations of disk I/O, as well as avoiding contention between different jobs for the same disk resources.

If memory is sufficient to cache the files for only a subset of the databases, file caching will not be effective. Files are usually cached by the operating system in a FIFO (first in/first out) manner, such that files accessed earlier in a job stream will be dropped from the cache to make room for files accessed later. Overall system throughput may improve if the job stream can be structured to search all queries against one cacheable subset of the databases before proceeding to search the next cacheable subset, and so on, until all of the desired databases have been searched. In this manner, analysis pipelines run on memory-limited computers can still benefit from caching.
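One way to structure such a job stream is sketched below in Python. The grouping strategy (filling each subset in the given order until the cache budget would be exceeded) is an assumption made for illustration, and the function names are hypothetical.

```python
def cacheable_subsets(db_sizes, cache_bytes):
    """Group databases, in the given order, into subsets that fit the cache.

    db_sizes maps a database name to the total size in bytes of the files
    to be cached. A database larger than cache_bytes gets a subset of
    its own.
    """
    subsets, current, used = [], [], 0
    for name, size in db_sizes.items():
        if current and used + size > cache_bytes:
            subsets.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        subsets.append(current)
    return subsets


def job_order(queries, db_sizes, cache_bytes):
    """Yield (query, database) pairs, finishing one subset before the next."""
    for subset in cacheable_subsets(db_sizes, cache_bytes):
        for query in queries:
            for db in subset:
                yield query, db
```

For example, three databases of 4 units each and a cache of 8 units give the subsets [a, b] and [c]; every query is searched against a and b before any query touches c.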

How much additional memory is useful for file caching? Typical BLAST searches involve a sequential search through an entire database. For AB-BLAST databases in XDF format, each search requires that the entirety of the .x[np]s file be read, in addition to the associated .x[np]t file. For any database hits, the associated .x[np]d file will be read to obtain sequence descriptions. Sufficient memory should be available to cache the .x[np]s and .x[np]t files, plus a large portion (if not all) of the .x[np]d file. Due to the FIFO nature of cache management, adding some memory is unlikely to improve performance if there is still not enough to cache the entire .x[np]s and .x[np]t files.
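A rough cache target for one XDF database can be obtained by summing the sizes of these files. The Python sketch below assumes that the files are named by appending .xns/.xnt/.xnd (nucleotide) or .xps/.xpt/.xpd (protein) to the database path, as the .x[np]s notation suggests. The function names and the description_fraction parameter are hypothetical.

```python
import os


def xdf_file_sizes(db_path, seq_type="n"):
    """Return sizes in bytes of the s, t and d files of an XDF database.

    seq_type is "n" for nucleotide or "p" for protein. A missing file is
    reported as 0 bytes.
    """
    sizes = {}
    for suffix in ("s", "t", "d"):
        path = f"{db_path}.x{seq_type}{suffix}"
        sizes[suffix] = os.path.getsize(path) if os.path.exists(path) else 0
    return sizes


def cache_target_bytes(sizes, description_fraction=1.0):
    """Bytes to cache: all of s and t, plus a chosen fraction of d."""
    return sizes["s"] + sizes["t"] + int(sizes["d"] * description_fraction)
```

Summing this target over all databases that a pipeline searches repeatedly gives an estimate of the memory that is useful beyond the heap requirements of the search programs themselves.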

One should be wary of other jobs executing simultaneously with BLAST, whose actions may purge the file cache of BLAST database files. If other jobs besides BLAST are active, additional memory should be provided for them to function within memory, too.

References

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). Basic local alignment search tool. J. Mol. Biol. 215:403-10.


Last updated: 2020-06-05