UNDER CONSTRUCTION
As installed at the factory, or when installed using normal procedures, operating systems are often not configured to support the large problems users may face when performing BLAST searches, particularly when whole chromosomes or genomes are involved. Large jobs may fail even when the hardware is physically capable, simply because of system software configuration settings that may or may not be adjustable by normal users. Sometimes large jobs can be executed successfully by the SuperUser (also known as "root"), while normal users are not so fortunate. The information below is intended as guidance for system administrators with SuperUser privileges on how to boost the resource limits configured in their kernel, so that normal users can run large jobs (and not just large BLAST jobs, by the way). Inevitably, the limitation that prevents a user from executing a large job will be one of memory, data or virtual memory use, whether the limitation is a physical one or merely an issue of software configuration.
Physical limits must of course be addressed at the hardware level, either by replacing or augmenting existing components. While the size of a virtual memory swap partition can technically be a configuration issue, its size may in fact be limited by the storage capacity of the disk drive on which the swap partition resides. Additional swap partitions or swap files can often be defined to augment the primary partition. However, for many applications (such as BLAST), performance deteriorates precipitously once real (solid-state RAM) memory is exhausted. In this case, the size of swap should probably not enter into consideration; the real focus should be on increasing the amount of real (RAM) memory available.
Even if many gigabytes (GB) of memory are installed in a computer, 32-bit microprocessors are incapable of simultaneously addressing more than about 2-4 GB within any given program. Most BLAST jobs fit well within 2 GB, but jobs involving large chromosomal sequences frequently demand more memory than a 32-bit processor with its 32-bit virtual addressing can support. For these larger jobs, a 64-bit microprocessor supporting 64-bit virtual addressing is required. Since 64-bit virtual addressing imposes greater memory requirements than 32-bit virtual addressing for the same-sized job, and because many BLAST jobs don't require more than 2 GB memory to begin with, AB-BLAST is typically available in a 32-bit virtual addressing version even for 64-bit computing platforms. This not only allows smaller, 32-bit-compatible jobs to run in even less memory, but it also avoids the performance penalty, which can be significant, that is exacted when running applications in 64-bit virtual addressing mode.
If real memory is not a limitation, it's time to look at software-imposed restrictions. Before making any changes to the so-called hard resource limits configured in the operating system's kernel or kernel configuration, be sure the problem isn't simply that the user is bumping up against a soft resource limit which they can adjust themselves. Soft limits can be adjusted using the command shell's built-in "ulimit" command (Bourne-style shells such as sh, ksh and bash) or the "limit" and "unlimit" commands (csh and tcsh). The "limit" command (or "ulimit -a") can be executed before and after to see what changes (if any) have been made. For example, "unlimit datasize" will typically raise the soft limit on data usage to whatever the system-imposed hard limit on data usage is. Any limits on "datasize", "memoryuse", "vmemoryuse", "memorysize", "addressspace", etc. may also need to be increased. Note: the specific names of relevant resource limits vary from OS to OS. The details of these variables aren't really important, as long as you have tried to relax the soft limits on all relevant variables before you begin mucking around with hard limits configured in the kernel.
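The soft-versus-hard distinction can also be inspected from within a program. The following is a minimal sketch using Python's standard resource module (Python is used here only for illustration; it is not part of the BLAST distribution). It raises each soft limit to its hard limit, which is roughly what "unlimit" does, and reports the result. The function names are hypothetical, and since the RLIMIT_* names vary from OS to OS, any that are missing are skipped.

```python
import resource

# Resource names that may matter for large jobs; availability varies by OS.
CANDIDATES = ("RLIMIT_DATA", "RLIMIT_AS", "RLIMIT_RSS", "RLIMIT_STACK")


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def raise_soft_limits(names=CANDIDATES):
    """Raise each soft limit to its hard limit; return {name: (soft, hard)}."""
    result = {}
    for name in names:
        res = getattr(resource, name, None)
        if res is None:  # this OS does not define the limit
            continue
        soft, hard = resource.getrlimit(res)
        if soft != hard:
            try:
                resource.setrlimit(res, (hard, hard))
            except (ValueError, OSError):
                pass  # some systems refuse; leave the limit unchanged
        result[name] = resource.getrlimit(res)
    return result


if __name__ == "__main__":
    for name, (soft, hard) in raise_soft_limits().items():
        print(f"{name}: soft={describe(soft)} hard={describe(hard)}")
```

The change applies only to the calling process and the programs it launches afterwards. If the hard limits reported are themselves too small, the kernel configuration is the next place to look.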
Users should also note that memory used by the BLAST programs is a linear function of the number of threads or CPUs employed. AB-BLAST is a multithreaded application that uses multiple CPUs or processors by default (as did the earliest versions of NCBI BLAST) on computers that are configured with multiple CPUs. The default number of CPUs employed can be overridden on the command line with the "cpus=#" option. A different default number of CPUs can be established system-wide by setting the desired value in an optional /etc/sysblast configuration file. A sample sysblast configuration file is included in BLAST software distributions. Since the /etc directory is local to every UN*X computer system, a copy of /etc/sysblast must be installed on every computer where these configuration controls are desired. More information about /etc/sysblast is available in the README.html file that accompanies the software distribution, in the section on Installation.
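Because memory use grows linearly with the thread count, a value for "cpus=#" can be estimated from the memory available. The sketch below assumes a simple model, memory = base + cpus × per-thread, where the base and per-thread figures are hypothetical and would have to be measured for a given query and database. The function names are illustrative only.

```python
GIB = 2**30


def estimated_memory(cpus, base_bytes, per_thread_bytes):
    """Linear model: a fixed base cost plus a cost for every thread."""
    return base_bytes + cpus * per_thread_bytes


def max_cpus(available_bytes, base_bytes, per_thread_bytes, installed_cpus):
    """Largest thread count whose estimated memory fits in available_bytes."""
    if per_thread_bytes <= 0:
        raise ValueError("per_thread_bytes must be positive")
    spare = available_bytes - base_bytes
    if spare < per_thread_bytes:
        return 0  # not even one thread fits under this model
    return min(installed_cpus, spare // per_thread_bytes)


if __name__ == "__main__":
    # Hypothetical figures: 4 GiB usable, 1 GiB base, 1 GiB per thread, 8 CPUs.
    print("cpus =", max_cpus(4 * GIB, 1 * GIB, 1 * GIB, installed_cpus=8))
```

With these made-up figures only three of the eight processors could be used before the job outgrows real memory, which is the situation the "cpus=#" option is meant to address.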
Finally, many programs, including BLAST, perform exceptionally poorly when their memory requirements extend beyond the bounds of the available real memory into virtual swap (disk-based) memory. If your principal usage of a computer is to run programs that exhibit this behavior, it is advisable to confine the per-process memory allocation to be no more than real memory. At the same time, BLAST routinely uses memory-mapped I/O for more efficient access to the sequence databases; and memory-mapped I/O of large databases benefits from having large virtual address spaces. The need to restrict data usage to available real memory without restricting virtual addressing is not a conundrum for 64-bit platforms, as will be seen below.
This advice is relevant to version 4.0F of Compaq's Tru64 UNIX. Users of Tru64 version 5.0 and higher should refer to the manual page for the sysconfig or sysconfigdb utility. For more information, see this page on Compaq's web site.
Compaq recommends using the sysconfig or sysconfigdb utility to change kernel parameters. Read the man page for these programs before use. The subsystems you'll possibly want to edit include proc and vm.
Alternatively (AND NOT RECOMMENDED unless you know what you're doing as root), make a backup copy of the file /etc/sysconfigtab. Then edit /etc/sysconfigtab to include lines and limits similar to the following:
proc:
    max-proc-per-user=128
    per-proc-stack-size=33554432
    per-proc-data-size=4000000000
    per-proc-address-space=107374182400
    max-per-proc-stack-size=134217728
    max-per-proc-data-size=4000000000
    max-per-proc-address-space=107374182400
vm:
    vm-maxvas=107374182400
    vm-mapentries=1000
The above limits were chosen for a system containing 4 GB physical memory with the consideration that only one large job would be run at any given time. While 4000000000 is some 294967296 bytes (~281 MB) less than 4 GB, this margin provides needed memory for the operating system and other user processes. You will likely want to use different limits than these if your computer has more or less memory than 4 GB. You may even want to consider restricting memory use to something less than the maximum, as a safeguard against wayward users or unruly programs accidentally bringing the system to its knees. Furthermore, memory that is consumed by programs is memory that can no longer be used for file caching. Caching of BLAST database files from run-to-run is often an important factor in search speed, which presents another reason to restrict the per-process data size to something less than the maximum.
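The margin quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch; the helper name is hypothetical, and the figures are the ones from the sample configuration for a 4 GB system.

```python
GIB = 2**30
MIB = 2**20


def headroom(physical_bytes, data_limit_bytes):
    """Bytes of physical memory left over beyond the per-process data limit."""
    return physical_bytes - data_limit_bytes


if __name__ == "__main__":
    margin = headroom(4 * GIB, 4_000_000_000)
    print(margin, "bytes, about", round(margin / MIB), "MB")
```

Rerunning the same calculation with your own memory size and candidate limit shows how much is being left for the operating system, other processes and the file cache.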
That figure of 107374182400 for per-proc-address-space, max-per-proc-address-space, and vm-maxvas is the equivalent of 100 GB, to facilitate memory-mapped I/O of large files (hopefully for some time to come). To support virtual BLAST databases comprised of numerous component databases, you may also wish to increase the value of vm-mapentries, which restricts the number of memory-mapped files in a given process. Since each component of a virtual database requires a minimum of 3 mapped files, the vm-mapentries limit should be at least 3 (if not 4 or more) times the maximum number of BLAST database components you will ever need to search. Keep in mind, though, that setting a limit that is well beyond what you will ever need may waste real memory.
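The sizing rules for the address space and for vm-mapentries amount to simple multiplication, sketched below. The function names are hypothetical; the factor of 3 mapped files per component is the minimum given in the text, and a larger factor can be passed for extra margin.

```python
GIB = 2**30


def gib_to_bytes(gib):
    """Convert a size in GB (2**30 bytes) to the byte count used in sysconfigtab."""
    return gib * GIB


def min_map_entries(max_components, files_per_component=3):
    """Smallest vm-mapentries value that covers the given number of components."""
    return max_components * files_per_component


def max_components(map_entries, files_per_component=3):
    """Number of database components a given vm-mapentries value can cover."""
    return map_entries // files_per_component


if __name__ == "__main__":
    print("100 GB in bytes:", gib_to_bytes(100))
    print("components covered by vm-mapentries=1000:", max_components(1000))
    print("vm-mapentries for 500 components, 4 files each:", min_map_entries(500, 4))
```

By this estimate, the sample value of vm-mapentries=1000 covers a virtual database of roughly 333 components at 3 mapped files each, or 250 at 4 files each.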
Reboot or run “sysconfigdb -s” to have the changes take effect.
Hard upper limits are established by the max- parameters; default (soft) limits are established by the rest. Your command shell's "limit" command can be used to raise or lower soft limits within the ranges established by the hard upper limits.
This advice is relevant to HP-UX version 11 running on both PA-RISC and Intel IA-64 ("Itanium") processor architectures.
As root, run the sam (System Administration Manager) program to increase the configurable kernel parameter values for maxdsiz, maxdsiz_64bit, maxtsiz and maxtsiz_64bit. These parameters may be found beneath the "Kernel Configuration" menu. In each pair, the parameter without a suffix governs binaries that employ 32-bit virtual addressing; the parameter with the _64bit suffix governs binaries that use 64-bit virtual addressing. As on several other operating systems, AB-BLAST supports both 32-bit and 64-bit addressing under HP-UX, so don't neglect the 32-bit parameter setting. The 32-bit AB-BLAST binaries (the so-called "p32" binaries, because they use 32-bit C language pointers) provide increased speed and more efficient use of memory over their 64-bit siblings and should satisfy all but the most demanding BLAST jobs.
While you're tweaking the kernel with sam, you might also want to increase the maximum stack size by a few MB, perhaps to bring it to 32 MB. If 32 MB sounds small, keep in mind that the stack is an entirely separate storage area from the heap. Most memory allocations occur in the heap and the majority of programs never use more than several KB of stack. Stack size is governed by the maxssiz and maxssiz_64bit parameters. If your computer system contains multiple processors, you might make sure the stack size limit is at least 2 MB per processor.
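The two stack-size suggestions can be combined into a single rule of thumb. The sketch below takes the larger of 32 MB and 2 MB per processor; combining them this way is an assumption made for illustration, and the function name is hypothetical.

```python
MIB = 2**20


def suggested_stack_bytes(processors, floor_mib=32, per_processor_mib=2):
    """Larger of a fixed floor and a per-processor allowance, in bytes."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return max(floor_mib, per_processor_mib * processors) * MIB


if __name__ == "__main__":
    for n in (4, 16, 32):
        print(n, "processors:", suggested_stack_bytes(n), "bytes")
```

Under this rule the 32 MB figure suffices up to 16 processors, and only larger machines need a bigger value for maxssiz and maxssiz_64bit.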
Other kernel parameters to consider tweaking (most likely increasing) on Itanium-based systems are maxrsessiz and maxrsessiz_64bit, which govern the size of the per-process Register Stack Engine (RSE) stack for 32-bit and 64-bit processes, respectively. These parameters are specified in bytes. The value of maxrsessiz should be raised if user processes are being terminated with the signal SIGBUS due to overflow of the RSE stack.
After you're done with sam, it will ask whether to build and install the new kernel and reboot.
On some HP-UX systems, the alternative to using sam may be /usr/sbin/kcweb or /usr/sbin/kctune. Check "man kcweb" and "man kctune" for further details.
This advice is relevant to Solaris version 8 and later.
The speed of BLAST searches can benefit from caching of database files, which are sequentially accessed and are often multiple gigabytes in size. This is particularly true for BLASTN searches, which default to low-sensitivity but faster parameters that frequently reveal I/O bottlenecks, particularly on multi-processor computers. Under Solaris, the caching of large, sequentially-accessed files can be tuned by adding the following lines to the /etc/system file.
    set ufs:freebehind=0
    set segmap_percent=12
Disabling ufs:freebehind permits the kernel to cache large files. The definition of “large” can be altered via the smallfile parameter whose default value is only 32768. The default value for segmap_percent is 12, leaving 88 percent of physical memory to devote to application programs like BLAST. On systems with many gigabytes of physical memory, cache performance can be improved by increasing segmap_percent from its default value. Avoid setting segmap_percent too high, though. The largest BLAST jobs can require multiple gigabytes of memory per processor and might perform even worse with a high segmap_percent value than if the database had not been cached.
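To see what a given segmap_percent implies, the split between file cache and application memory can be tabulated. This sketch treats segmap_percent simply as a share of physical memory, as the text describes; the kernel's actual accounting may differ, and the function name is hypothetical.

```python
GIB = 2**30


def memory_split(physical_bytes, segmap_percent=12):
    """Return (cache_bytes, application_bytes) for a given segmap_percent."""
    if not 0 <= segmap_percent <= 100:
        raise ValueError("segmap_percent must be between 0 and 100")
    cache = physical_bytes * segmap_percent // 100
    return cache, physical_bytes - cache


if __name__ == "__main__":
    physical = 16 * GIB  # hypothetical machine size
    for percent in (12, 25, 50):
        cache, apps = memory_split(physical, percent)
        print(f"segmap_percent={percent}: cache {cache / GIB:.2f} GB, "
              f"applications {apps / GIB:.2f} GB")
```

Comparing the application figure against the memory your largest BLAST jobs need is one way to judge how far segmap_percent can safely be raised.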
This advice is relevant to Solaris X86 version 9 but may be applicable to earlier and later releases.
For a computer with an Intel IA-32 or X86 processor capable of managing more than 4 GB of memory (e.g., the PentiumPro, PentiumII, PentiumIII, Pentium4 or Xeon processors), the theoretical maximum per-process address space under Solaris X86 is 3.75 GB. However, on computer systems with more than 4 GB memory installed, the imposed limit may be even less than 3 GB. To increase the per-process limit in such cases, the base address where the kernel should be loaded may be tuned with an eeprom setting. (See the Solaris eeprom man page in section 1M). To make the change, while root, enter the command "eeprom kernelbase=0xD9800000" and reboot.
N.B. The kernel base can supposedly be loaded at the slightly higher address of 0xE0000000, but practice has shown that for at least one installed instance of Solaris X86 version 9, the system would hang if the kernel base was set to that value.
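The two kernelbase values can be compared by converting them to sizes. The sketch below assumes that the addresses below kernelbase are what remain available to a user process; that reading is an assumption here, so treat the results as upper bounds rather than guaranteed usable memory. The function name is hypothetical.

```python
MIB = 2**20
GIB = 2**30


def space_below_kernelbase(kernelbase):
    """Bytes of 32-bit address space lying below the kernel base address."""
    if not 0 < kernelbase < 2**32:
        raise ValueError("kernelbase must be a 32-bit address")
    return kernelbase


if __name__ == "__main__":
    for base in (0xD9800000, 0xE0000000):
        size = space_below_kernelbase(base)
        print(f"kernelbase={base:#010x}: {size // MIB} MB ({size / GIB:.2f} GB)")
```

By this measure, the difference between the recommended setting and the slightly higher one that caused hangs is about 104 MB of address space.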