Manual page for apps.StrCalc program

StrCalc - simple calculations on protein structures.
The program can calculate various distances and angles. It can also convert a protein representation into a reduced model based on a config file provided by the user.

-calc.mass=<T|F> calculates molecular mass, result is reported in Daltons
-first_map_index=<integer> sets the index of the first residue in a contact map, distance map or minimum distance map to a given value.
-h print a brief summary of available options
-help=<name-part> print a help message on the screen - ANSI terminal version with visual enhancements. If the <name-part> argument is given, the program will print only those options that contain that substring
-help.dox=<name> print a help message in doxygen (*.dox) format on the screen for the StrCalc program
-help.md=<name> print a help message in markdown (*.md) format on the screen
-help.option=<option-name> print a help message for a single option on the screen.
-help.plain=<name-part> print a help message on the screen - plain text version. If the <name-part> argument is given, the program will print only those options that contain that substring
-in.dssp=<file name> an input file in the DSSP format
-in.fasta=<file name> an input file in the FASTA format
-in.pdb=<file name> input data in the PDB format. This option accepts the following types of input:
- file in PDB format (possibly gzip’ped)
- several files in PDB format (possibly gzip’ped) - in this case you must use -in.pdb.comma_separated option to let the program know to split the input string
- just PDB code, with -online option the data is downloaded from www.rcsb.org
- just PDB code, with -in.pdb.search_path option the program will look for the right file
-in.pdb.all_models=<T|F> forces PDB reader to take all the models from a PDB file.
-in.pdb.comma_separated forces -in.pdb option to look for several PDB file names, separated by a comma. In this case none of the file names may contain a comma character.
Example:
-in.pdb.comma_separated -in.pdb=2gb1.pdb,2aza.pdb
-in.pdb.create_bu=<T|F> forces PDB reader to create a biological unit for each structure. Biological unit creation is based solely on information stored in the PDB file header. Any error in the header will affect the resulting biological unit. This option forces the PDB reading mechanism to create an array of structures for each structure (MODEL data) in the given PDB. Therefore (to avoid handling too many molecules) it is advised NOT to use -in.pdb.all_models combined with this option.
-in.pdb.dir=<dir name> provides directory with PDB files
-in.pdb.file_mask=<mask_regexp> provides a mask for directory-based input, e.g. for the -in.pdb.dir option. Without a file mask, directory-related options try to read all possible files in a directory; the mask narrows the selection. In general a file mask should follow the rules of regular expressions in Java. The only exceptions are >.< (dot character) and >*< (asterisk), which should be given explicitly (with no escaping). Example masks are: *.pdb 1[ABC]*.pdb 1(MBA|;mba).pdb
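One possible reading of the mask rules is that a mask is a regular expression in which a dot stands for a literal dot and an asterisk for any run of characters. The sketch below is a shell approximation under that assumption, not the StrCalc implementation: it rewrites a mask into an anchored extended regular expression and uses it to filter file names. The helper mask_to_regex and the file names are hypothetical.

```shell
# Hypothetical helper. Assumption: '.' is a literal dot and '*' matches
# any run of characters; everything else is passed through unchanged.
mask_to_regex() {
  # escape dots first, then turn each '*' into '.*', then anchor both ends
  printf '%s\n' "$1" | sed -e 's/\./\\./g' -e 's/\*/.*/g' -e 's/^/^/' -e 's/$/$/'
}

regex=$(mask_to_regex '1[ABC]*.pdb')
printf '%s\n' "$regex"

# Only names accepted by the mask are printed (made-up file names).
printf '%s\n' 1Axy.pdb 1Dxy.pdb 2Axy.pdb 1Bxy.ent | grep -E "$regex"
```

The dots are escaped before the asterisks are expanded, so the dot introduced by '.*' is not escaped by mistake.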
-in.pdb.first_model_only=<T|F> forces PDB reader to take only the first model from a PDB file.
-in.pdb.list=<file name> an input text file that lists names of PDB files (with paths if necessary)
-in.pdb.online=<T|F> download a PDB file from www.rcsb.org rather than reading a file. In this case the parameter given to -ip must provide a valid four-character PDB code
-in.pdb.read_hydrogens=<T|F> forces PDB reader to read in all hydrogen atoms. This by default is switched off and all hydrogens are discarded
-in.pdb.search_path=<path string> provides a path where -ip and other PDB-reading options will look for PDB data. In this case the parameter given to -ip must provide a valid four-character PDB code rather than a file name. Then, for the code (say, 1abc), several possible file locations will be tested, e.g.:
PATH/1abc
PATH/1abc.pdb
PATH/1abc.pdb.gz
PATH/pdb1abc.ent
PATH/pdb1abc.ent.gz
PATH/1ABC
PATH/1ABC.PDB
PATH/ab/pdb1abc.ent
PATH/ab/pdb1abc.ent.gz
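The candidate locations listed above can be reproduced in the shell, which is handy for checking where a file would be found. This is only an illustration: the helpers pdb_candidates and pdb_locate are hypothetical, the order simply follows the list above, and since that list is introduced with "e.g." the program may test other locations as well.

```shell
# Hypothetical helper: print the candidate locations for a PDB code,
# in the order of the list above.
pdb_candidates() {
  dir=$1
  lc=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  uc=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
  mid=$(printf '%s' "$lc" | cut -c2-3)   # "ab" for 1abc
  printf '%s\n' \
    "$dir/$lc" "$dir/$lc.pdb" "$dir/$lc.pdb.gz" \
    "$dir/pdb$lc.ent" "$dir/pdb$lc.ent.gz" \
    "$dir/$uc" "$dir/$uc.PDB" \
    "$dir/$mid/pdb$lc.ent" "$dir/$mid/pdb$lc.ent.gz"
}

# Hypothetical helper: print the first candidate that exists as a file.
pdb_locate() {
  pdb_candidates "$1" "$2" | while IFS= read -r f; do
    if [ -f "$f" ]; then printf '%s\n' "$f"; break; fi
  done
}

pdb_candidates ./pdb 1abc
```

Calling pdb_locate ./pdb 1abc prints nothing when none of the candidates exists.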
-in.pdb.skip_header=<T|F> skip a header when parsing a PDB file.
-in.seq=<file name> an input file in the SEQ format
-in.xyz=<file name> an input file in the XYZ format
-index_map_by_residues=<T|F> contact map, distance map or minimum distance map will use residue names and indexes instead of raw integers. This makes the output more human-readable but the resulting files are much bigger
-mute suppress all messages from a given package or class, e.g. “-mute=jbcl.data.formats”, or “-mute=jbcl.calc.structural.Crmsd”. It is also possible to switch off a whole branch of the jbcl library, e.g. “-mute=jbcl.data” will mute all messages coming from jbcl.data.formats, jbcl.data.types, jbcl.data.dict and jbcl.data.basic. To switch off all the messages, say: “-mute=jbcl” or simply “-mute” because the default behaviour is to mute everything. This option is executed AFTER -verbose, so the user may increase the verbosity level to a desired value and then selectively switch off logging from some packages
-select.aa selects only amino acid residues.
-select.atoms=<strings> selects atoms defined by their PDB name. Example: -select.atoms=CA,N,C,O select all backbone atoms
-select.bb filters input protein structure(s) and removes all the atoms except its backbone.
-select.bb_cb filters input protein structure(s) and removes all the atoms except its backbone or beta carbon.
-select.ca filters input protein structure(s) and removes all the atoms except alpha carbons.
-select.chains=<characters> selects chains defined by their PDB id (single character per chain). Example: -select.chains=ABD
-select.elements=<strings> selects atoms defined by their chemical element.
Example: -select.elements=N,O select all oxygens and nitrogens
-select.elements=C -select.bb_cb select all carbons from backbone + CB, effectively carbonyl C, CA and CB
-select.fragment=<selection expression> selects residues and chains defined by their PDB residue ID and chain ID. The selection string may be a combination of selectors, separated with a semicolon. For example the following: -select.fragment=A.43:78;B.12:89 selects residues 43:78 from chain A and residues 12:89 from chain B.
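To make the selector syntax concrete, the sketch below splits a selection string into its parts. It is not StrCalc's own parser: the helper split_fragment is hypothetical, and it assumes every selector has the form CHAIN.FIRST:LAST as in the example above (negative residue numbers and insertion codes are not handled).

```shell
# Hypothetical helper: print "chain first last" for every selector.
# Assumes each selector looks like CHAIN.FIRST:LAST.
split_fragment() {
  printf '%s\n' "$1" | tr ';' '\n' | awk -F'[.:]' 'NF == 3 { print $1, $2, $3 }'
}

# Quote the argument: an unquoted semicolon would end the shell command.
split_fragment 'A.43:78;B.12:89'
```

The same quoting applies when passing such a selection string to a program on the command line.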
-select.residues_by_id=<residue selection> selects residues defined by their PDB ID. All chains will be processed separately.
-select.residues_by_index=<residue selection> selects residues defined by the order they appear in a structure. The first residue has index 0. All chains will be processed separately.
-strcalc.a13 calculates all possible a13 planar angles (between three consecutive CA atoms)
-strcalc.bf report average temperature factor for each residue
-strcalc.chi calculates all possible dihedral chi angles
-strcalc.distmap.ca=<file name> calculates a distance map between CA atoms and prints results into a file
-strcalc.distmap.full=<file name> calculates a distance map between all atoms (full-atom distance map) and prints results into a file
-strcalc.distmap.minres=<file name> calculates a map of the minimum distances between residues and prints results into a file
-strcalc.full_ss_string=<T|F> report secondary structure for 6 residues rather than for the central one
-strcalc.missing report the number of missing atoms in each residue
-strcalc.omega calculates all possible omega dihedral angles (between CA - C - N - CA atoms)
-strcalc.phi calculates all possible Phi dihedral angles (between C - N - CA - C atoms)
-strcalc.phi_psi calculates all possible Phi and Psi dihedral angles
-strcalc.psi calculates all possible Psi dihedral angles (between N - CA - C - N atoms)
-strcalc.r12 calculates all possible r12 distances (between adjacent CA atoms)
-strcalc.r13 calculates all possible r13 distances (between CA atoms of residues i and i+2)
-strcalc.r14x calculates all possible r14x distances (between CA atoms of residues i and i+3)
-strcalc.r15 calculates all possible r15 distances (between CA atoms of residues i and i+4)
-strcalc.s2=<T|F> calculates the squared radius of gyration of a protein
-strcalc.t14 calculates all possible t14 dihedral angles (between four consecutive CA atoms)
-strcalc.tau calculates all possible tau planar angles (between N - CA - C atoms)
-strcalc.torsion_class detects torsion class for each residue (ABEGO classification)
-strcalc.use_degrees by default BioShell reports results in radians. Use this option to see degrees
-verbose=<integer> Sets the verbosity level to a given value. The argument should be an integer in the range from 0 (no messages at all, which is equivalent to -mute=jbcl) to 6, when everything is logged. See -mute for additional information.

EXAMPLES

      (1) Print torsion angles for a given protein: backbone Phi, Psi and
Omega angles as well as Chi angles in side chains.
    java apps.StrCalc -ip=2GB1.pdb -phi_psi -omega -chi


      (2) Calculate a distance map based on CA atoms. Force residue indexes
to start from 1, not 0
    java apps.StrCalc -ip=2GB1.pdb -strcalc.distmap.ca -first_map_index=1


      (3) Calculate a contact map in all-atom definition. This example
employs awk program to convert distances to contacts.
    java apps.StrCalc -ip=2GB1.pdb -strcalc.distmap.minres | awk '{if($3<4.5) print $0}'
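The awk step of example (3) can be wrapped in a small function so that the cutoff becomes a parameter. This is a sketch under stated assumptions: each input line is taken to hold two residue identifiers followed by a distance (the layout implied by the $3 test above), the function names are hypothetical, and the sample values are made up rather than produced by StrCalc.

```shell
# Hypothetical helper: keep the pairs closer than CUTOFF.
# usage: contacts CUTOFF < distances   (assumed layout: "i j distance")
contacts() {
  awk -v cutoff="$1" '$3 < cutoff { print $1, $2 }'
}

# Made-up sample data standing in for the program output.
sample() {
  printf '%s\n' '1 2 3.8' '1 3 5.9' '1 4 7.0' '2 3 3.8' '2 4 6.1' '3 4 3.9'
}

# List the contacts at a 4.5 cutoff.
sample | contacts 4.5

# Count how many contacts each residue takes part in.
sample | contacts 4.5 |
  awk '{ n[$1]++; n[$2]++ } END { for (r in n) print r, n[r] }' | sort -n
```

In real use the sample function would be replaced by the StrCalc command from example (3), piped into contacts.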


      (4) Gather Phi/Psi observations for GLU residues that ARE NOT followed
by a PRO. Read all PDB files from a given directory that match the
provided pattern. StrCalc prints all the observations, which are then
filtered by awk: it remembers the previous line and prints it only when
the residue on the current line is different from PRO
    java apps.StrCalc -in.pdb.file_mask=2g*.pdb  -in.pdb.dir=./pdb/ -phi_psi -deg -v=1 | awk '{if($2!="PRO") print prevLine; prevLine = $0;}'
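The awk filter in example (4) prints every residue that is not followed by PRO; restricting the output to GLU needs one more condition. A minimal sketch follows, assuming (as the filter above does) that the residue name is in column 2. The function name and the sample lines are hypothetical and do not reproduce the real StrCalc output format.

```shell
# Hypothetical helper: print a line only if it describes GLU and the
# next line is not PRO. Assumes the residue name is in column 2.
glu_not_before_pro() {
  awk 'prevName == "GLU" && $2 != "PRO" { print prev }
       { prev = $0; prevName = $2 }'
}

# Made-up sample lines: index, residue name, Phi, Psi.
sample() {
  printf '%s\n' '1 MET -1.10 2.40' '2 GLU -1.05 -0.70' '3 PRO -1.20 2.60' \
                '4 GLU -1.00 -0.80' '5 ALA -1.15 -0.65'
}

sample | glu_not_before_pro
```

As in the original filter, the last input line is never printed because no following residue is available, and lines coming from different files or chains are treated as neighbours.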