Manual page for apps.PsiBlastSearch program

PsiBlastSearch - runs the PsiBlast program. It is possible to scan a whole range of PsiBlast input parameters. The jobs may be distributed among several threads.

-align.gap_extend=<value> Penalty for gap extension that will be used for sequence or profile alignment. When used with BLOSUM or PAM matrices, it is usually in the range [-2,-1].
-align.gap_open=<value> Penalty for gap opening that will be used for sequence or profile alignment. When used with BLOSUM or PAM matrices, it is usually in the range [-12,-10].
-align.query.fasta=<file name> an input query (target) sequence(s) in FASTA format
-align.query.pdb=<file name> an input query (target) file in the PDB format
-align.query.pir=<file name> reads an input file in the PIR format and extracts all sequences. These will be used as queries (targets) in alignment calculations
-align.query.pssm=<file name> an input file in the PsiBlast profile (PSSM) format
-align.query.seq=<file name> an input query sequence in SEQ format
-align.query.string=<seq_string> provide a query sequence as a FASTA-like string
-blast.cmd=<string> full command to run blast, including path
-blast.db=<string> full path to the database to be used in blast searches
-blast.dry_run in dry run mode the program does not run (Psi)Blast; all the commands are printed on stdout
-blast.evalue=<value> E-value cutoff to report a hit.
-blast.evalue_niter=<file name> The option expects a simple two-column file: the first column should contain E-values and the second column the number of iterations. The program will launch a PsiBlast search for each listed combination, i.e. as many runs as the number of rows in the file.
-blast.matrix=<string> name of the similarity matrix
-blast.n_alignments=<value> how many alignments to be reported.
-blast.n_cpu=<number> how many CPUs to be used for blast threads.
-blast.n_hits=<value> how many hits (sequences) to be reported.
-blast.n_iter=<value> how many PsiBlast iterations should be done.
-blast.outname_root=<string> sets a root name for all output files; for example, when the user puts -o=myprot, then the output PSSM will be in the file myprot.pssm
-blast.param_scan=<T|F> scans through a set of predefined PsiBlast parameters. If set to false, only a single PsiBlast job will be launched for every query sequence, with the PsiBlast parameters extracted from PsiBlastSearch flags
-blast.plus=<T|F> if true, it is assumed that the BLAST+ version of the software is used
-blast.prof_evalue=<value> E-value cutoff when a hit is included into a profile for the next iteration of PsiBlast.
-blast.show_cfg prints the Psi-Blast configuration on stdout
-h print a brief summary of available options
-help=<name-part> print a help message on the screen - ANSI terminal version with visual enhancements. If the <name-part> argument is given, the program will print only those options whose names contain that substring
-help.dox=<name> print a help message in doxygen (*.dox) format on the screen for the PsiBlastSearch program
-help.md=<name> print a help message in markdown (*.md) format on the screen
-help.option=<option-name> print a help message for a single option on the screen.
-help.plain=<name-part> print a help message on the screen - plain text version. If the <name-part> argument is given, the program will print only those options whose names contain that substring
-in.pdb.all_models=<T|F> forces PDB reader to take all the models from a PDB file.
-in.pdb.comma_separated forces the -in.pdb option to look for several PDB file names, separated by commas. In this case none of the file names may contain a comma character.
Example:
-in.pdb.comma_separated -in.pdb=2gb1.pdb,2aza.pdb
-in.pdb.create_bu=<T|F> forces the PDB reader to create a biological unit for each structure. Biological unit creation is solely based on information stored in the PDB file header. Any error in the header will affect the resulting biological unit. This option forces the PDB reading mechanism to create an array of structures for each structure (MODEL data) in the given PDB. Therefore (to avoid handling too many molecules) it is advised NOT to use -in.pdb.all_models combined with this option.
-in.pdb.first_model_only=<T|F> forces PDB reader to take only the first model from a PDB file.
-in.pdb.online=<T|F> download a PDB file from www.rcsb.org rather than reading a file. In this case the parameter given to -ip must provide a valid four-character PDB code
-in.pdb.read_hydrogens=<T|F> forces PDB reader to read in all hydrogen atoms. This by default is switched off and all hydrogens are discarded
-in.pdb.search_path=<path string> provides a path where -ip and other PDB-reading options will look for PDB data. In this case the parameter given to -ip must provide a valid four-character PDB code rather than a file name. Then, for the code (say, 1abc), several possible file locations will be tested, e.g.:
PATH/1abc
PATH/1abc.pdb
PATH/1abc.pdb.gz
PATH/pdb1abc.ent
PATH/pdb1abc.ent.gz
PATH/1ABC
PATH/1ABC.PDB
PATH/ab/pdb1abc.ent
PATH/ab/pdb1abc.ent.gz
-in.pdb.skip_header=<T|F> skip a header when parsing a PDB file.
-mute suppress all messages from a given package or class, e.g. “-mute=jbcl.data.formats”, or “-mute=jbcl.calc.structural.Crmsd”. It is also possible to switch off a whole branch of the jbcl library, e.g. “-mute=jbcl.data” will mute all messages coming from jbcl.data.formats, jbcl.data.types, jbcl.data.dict and jbcl.data.basic. To switch off all the messages, say: “-mute=jbcl” or simply “-mute”, because the default behaviour is to mute everything. This option is executed AFTER -verbose, so the user may increase the verbosity level to a desired value and then selectively switch off logging from some packages
-verbose=<integer> Sets the verbosity level to a given value. The argument should be an integer in the range from 0 (no messages at all, which is equivalent to -mute=jbcl) to 6, when everything is logged. See -mute for additional information.
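
The list of locations tested for -in.pdb.search_path can be illustrated with a short Python sketch. It is not the jbcl implementation: the function name candidate_pdb_paths is hypothetical, the order simply follows the list given for that option, and taking the two middle characters of the code as the subdirectory name ("ab" for 1abc) is an assumption drawn from that single example. The manual lists the locations with "e.g.", so the real program may test others as well.

```python
from pathlib import PurePosixPath


def candidate_pdb_paths(search_path: str, code: str) -> list[str]:
    """Hypothetical helper: candidate file locations for a four-character PDB code."""
    lo, up = code.lower(), code.upper()
    mid = lo[1:3]  # assumption: subdirectory is the two middle characters of the code
    root = PurePosixPath(search_path)
    names = [
        lo,
        f"{lo}.pdb",
        f"{lo}.pdb.gz",
        f"pdb{lo}.ent",
        f"pdb{lo}.ent.gz",
        up,
        f"{up}.PDB",
        f"{mid}/pdb{lo}.ent",
        f"{mid}/pdb{lo}.ent.gz",
    ]
    # Join every candidate name with the search path, keeping the listed order
    return [str(root / name) for name in names]


for path in candidate_pdb_paths("PATH", "1abc"):
    print(path)
```

A reader would presumably try these locations in order and use the first file that exists.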

EXAMPLES

      (1) Runs PsiBlast with a set of predefined parameters.
Distributes the jobs over 8 CPUs.
    java apps.PsiBlastSearch -blast.db=../db/nr -if=target.fasta -blast.n_cpu=8 -blast.param_scan


      (2) As above, but also specify the location of the blastpgp executable
and the database
    java apps.PsiBlastSearch -blast.cmd=<path to blastpgp> -blast.db=../db/nr -if=target.fasta -blast.n_cpu=8 -blast.param_scan
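
The two-column file expected by -blast.evalue_niter can be pictured with the Python sketch below. It is only an illustration of the described format, not the parser used by PsiBlastSearch: the function name read_evalue_niter is hypothetical, whitespace-separated columns are an assumption, and the sample values are made up. Each row yields one (E-value, number of iterations) pair, i.e. one PsiBlast run.

```python
def read_evalue_niter(text: str) -> list[tuple[float, int]]:
    """Hypothetical helper: parse rows of '<e-value> <n_iterations>'."""
    runs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: blank or incomplete rows are skipped
        runs.append((float(fields[0]), int(fields[1])))
    return runs


# Made-up sample content: three rows, hence three PsiBlast runs
example = """\
1e-3 3
1e-5 5
0.001 2
"""

for evalue, n_iter in read_evalue_niter(example):
    # Show the per-run settings using the flag names documented above
    print(f"-blast.evalue={evalue} -blast.n_iter={n_iter}")
```

Combining such a file with -blast.dry_run should make it possible to inspect the generated commands before any search is started.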