Introduction
A detailed introduction is pending.
1. Layout of Biskit
This is a schematic overview of the Biskit project. The main parts are:
Biskit module outline
The most important folder for you as a developer is biskit/Biskit, which you should add to your PYTHONPATH. The Biskit module is structured into:
- . -- the base package with common classes
- PVM -- parallelisation-related classes
- Dock -- protein-protein complexes and docking
- Mod -- homology modelling
2. Before you start...
Please acquaint yourself with the basic array handling methods of Numeric Python (Numpy), in particular with:
- take (extracting part of an array given positions)
- compress (extracting part of an array given a mask)
- put (assign values to array positions)
Other useful methods are: shape, ravel, argsort, argmax, nonzero, clip, sum, resize. The Numpy tutorial gives an introduction to these and other functions.
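The three array methods listed above can be tried on a small array. This is a minimal sketch using plain numpy with made-up values; the variable names are chosen for illustration only.

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50])

# take: extract elements at the given positions
picked = np.take(values, [0, 2, 4])

# compress: extract elements where a boolean mask is true
mask = values > 25
kept = np.compress(mask, values)

# put: assign values to the given positions (modifies the array in place)
changed = values.copy()
np.put(changed, [1, 3], [-1, -2])

# nonzero turns a mask into positions, which links compress and take
positions = np.nonzero(mask)[0]
same_as_kept = np.take(values, positions)
```

A mask has the same length as the array, while a position list can be shorter and in any order; nonzero converts the former into the latter.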
Numpy functions are not only heavily used in Biskit; Biskit also implements functions of the same name to extract and manipulate parts of structures and trajectories.
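To illustrate the idea of structure methods that mirror the numpy functions of the same name, here is a toy sketch. ToyAtoms is a hypothetical class invented for this example; it is not part of Biskit and says nothing about how PDBModel is implemented.

```python
import numpy as np

class ToyAtoms(object):
    """Hypothetical stand-in for a structure: parallel arrays of atom names and coordinates."""

    def __init__(self, names, xyz):
        self.names = np.asarray(names)
        self.xyz = np.asarray(xyz, dtype=float)

    def __len__(self):
        return len(self.names)

    def take(self, positions):
        """Return a new ToyAtoms with only the atoms at the given positions."""
        return ToyAtoms(np.take(self.names, positions),
                        np.take(self.xyz, positions, axis=0))

    def compress(self, mask):
        """Return a new ToyAtoms with only the atoms where mask is true."""
        return ToyAtoms(np.compress(mask, self.names),
                        np.compress(mask, self.xyz, axis=0))

# made-up atoms placed along the x axis
atoms = ToyAtoms(['N', 'CA', 'C', 'O', 'CA'],
                 [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0], [4, 0, 0]])

ca_only = atoms.compress(atoms.names == 'CA')   # selection by mask
first_two = atoms.take([0, 1])                  # selection by position
```

Both methods return a new object and keep names and coordinates in step, so selections can be chained.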
3. Basic classes
The most central classes of the Biskit package are probably:
PDBModel plus, depending on what you want to do, Trajectory, PDBDope and Dock/Complex. Until we put a short tutorial online, please have a look at the extensive documentation and example code of these classes. See also the online API reference, which is generated from the source code. Here is a schematic overview of the data model (see also our application note):
A detailed description of PDBModel can be found in our draft tutorial for using PDBModel!
4. Examples
Just as an appetizer -- assuming Biskit is installed, you can calculate the CA RMSD of two closely related proteins (with some residues or atoms different) as follows:
import Biskit as B
m1 = B.PDBModel( "your/structure1.pdb" )
m2 = B.PDBModel( "your/related/structure.pdb" )
# align both models to the same residue content
# and atom order and content
i1, i2 = m1.compareAtoms( m2 )
m1 = m1.take( i1 ) # take atoms common with m2
m2 = m2.take( i2 ) # take atoms common with m1
## some checking for demonstration
assert len( m1 ) == len( m2 )
assert m1.sequence() == m2.sequence()
## get CA structures
ca1 = m1.compress( m1.maskCA() )
ca2 = m2.compress( m2.maskCA() )
## RMSD
rms = ca1.rms( ca2, fit=1 )
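For readers who want to see what an RMSD after fitting involves, the following sketch works on plain numpy coordinate arrays. It uses the standard Kabsch algorithm for the superposition; this is an assumption made for illustration and not necessarily how the rms method with fit=1 is implemented in Biskit. The functions rmsd and superpose and all coordinates are made up for this example.

```python
import numpy as np

def rmsd(a, b):
    """Root-mean-square deviation between two (n, 3) coordinate arrays."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return np.sqrt(np.mean(np.sum(diff * diff, axis=1)))

def superpose(mobile, target):
    """Rotate and translate mobile onto target (Kabsch algorithm)."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    p = mobile - mobile_center
    q = target - target_center
    # SVD of the 3x3 covariance matrix yields the optimal rotation
    u, s, vt = np.linalg.svd(np.dot(p.T, q))
    # flip the last axis if needed so that the result is a proper rotation
    d = -1.0 if np.linalg.det(np.dot(vt.T, u.T)) < 0 else 1.0
    rotation = np.dot(vt.T, np.dot(np.diag([1.0, 1.0, d]), u.T))
    return np.dot(p, rotation.T) + target_center

# made-up coordinates: b is a rotated and shifted copy of a
a = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0], [0.0, 1.5, 1.0]])
angle = 0.3
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle),  np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
b = np.dot(a, rot_z.T) + np.array([5.0, -2.0, 1.0])

rms_raw = rmsd(a, b)                 # large: the copies are not superposed
rms_fit = rmsd(superpose(a, b), b)   # close to zero after the fit
```

Both arrays must list equivalent atoms in the same order, which is why the example above first reduces the two models to their common atoms.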
Here is a second example that calculates what fraction of a protein's surface is contributed by beta sheets.
import Biskit as B
import numpy as N
m = B.PDBModel('structure.pdb')
## add information calculated by external programs
d = B.PDBDope(m)
d.addSurfaceRacer() # MS, AS, relMS, relAS and curvature
d.addSecondaryStructure() # secondary structure from DSSP
## molecular surface of all atoms
ms_profile = m['MS']
## mask for all atoms of the given secondary structure
ss_mask = N.array( m.res2atomProfile('secondary') ) == 'E'
## fraction of total surface
ss_fraction = N.sum(ms_profile * ss_mask) / N.sum(ms_profile)
Note: This example requires the installation of two external programs: (1) SurfaceRacer and (2) DSSP. Please follow the links to installation instructions!
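The masking arithmetic of the last two lines can be tried without Biskit or the external programs. The following sketch uses made-up surface values and secondary structure codes in place of the real profiles.

```python
import numpy as np

# made-up per-atom values standing in for the 'MS' profile
ms_profile = np.array([10.0, 20.0, 30.0, 40.0])

# made-up per-atom secondary structure codes
secondary = np.array(['E', 'H', 'E', '.'])

# boolean mask: True for atoms in an extended strand
ss_mask = secondary == 'E'

# multiplying by the mask zeroes out all other atoms before summing
ss_fraction = np.sum(ms_profile * ss_mask) / np.sum(ms_profile)
```

Here the two strand atoms contribute 10 + 30 out of a total of 100, so ss_fraction is 0.4.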
5. Tips and hints / finding help
Python has several methods to explore the content and documentation of a module interactively:
- dir() -- lists the content of a module, object or the current namespace
- object.__doc__ -- contains the doc-string of a function or class
For example:
>>> import Biskit as B
>>> dir( B )
['AmberCrdParser', 'AmberParmBuilder', 'AmberRstParser', 'AmbiguousMatch', 'BisList',
'BisListError', 'BiskitError'...]
>>> print B.PDBDope.__doc__
Decorate a PDBModel with calculated properties (profiles)
>>> dir( B.PDBDope )
['__doc__', '__init__', '__module__', 'addASA', 'addConservation', 'addDensity', 'addFoldX',
'addIntervor', 'addSecondaryStructure', 'addSurfaceMask', 'addSurfaceRacer', 'model', 'version']
>>> print B.PDBDope.addSecondaryStructure.__doc__
Adds a residue profile with the secondary structure as
calculated by the DSSP program.
Profile code::
B = residue in isolated beta-bridge
E = extended strand, participates in beta ladder
G = 3-helix (3/10 helix)
I = 5 helix (pi helix)
T = hydrogen bonded turn
S = bend
. = loop or irregular
@raise ExeConfigError: if external application is missing
The info function in Biskit.tools combines dir() and __doc__ strings to give a quick overview of the fields and methods of a class, object or module:
>>> from Biskit.tools import info
>>> from Biskit import PDBDope
>>> info( PDBDope )
NAME: PDBDope
ID: 139712764
TYPE: <type 'classobj'>
VALUE: <class Biskit.PDBDope.PDBDope at 0x853d8fc>
CALLABLE: Yes
DOC: Decorate a PDBModel with calculated properties (profiles)
METHODS
__init__ : @param model: model to dope
addASA : Add profiles of Accessible Surface Area: 'relASA', 'ASA_tota..
addConservation: Adds a conservation profile. See L{Biskit.Hmmer}
addDensity : Count the number of heavy atoms within the given radius.
addFoldX : Adds dict with fold-X energies to PDBModel's info dict.
addIntervor : Triangulate a protein-protein interface with intervor.
addSecondaryStructure: Adds a residue profile with the secondary structure as
addSurfaceMask : Adds a surface mask profie that contains atoms with > 40% ex..
addSurfaceRacer: Always adds three different profiles as calculated by fastSu..
model : @return: model
version : @return: version of class
FIELDS
addFoldX : <function addFoldX at 0x8544bc4>
...
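The idea of combining dir() with doc-strings can be sketched in a few lines of standard-library Python. The quick_info function and the Demo class below are hypothetical and written for this example only; they are not the implementation of Biskit.tools.info and produce a simpler report.

```python
import inspect

def quick_info(obj, width=60):
    """Return one line per public member of obj: kind, name, short description."""
    lines = []
    for name in sorted(dir(obj)):
        if name.startswith('_'):
            continue  # skip private and special members
        member = getattr(obj, name)
        if callable(member):
            # methods: show the first line of the doc-string, if any
            doc = inspect.getdoc(member) or ''
            text = doc.splitlines()[0] if doc else ''
            kind = 'method'
        else:
            # fields: show the value itself
            text = repr(member)
            kind = 'field'
        if len(text) > width:
            text = text[:width] + '..'
        lines.append('%-8s %-12s: %s' % (kind, name, text))
    return lines

class Demo(object):
    """Tiny example class."""
    version = 1

    def run(self):
        """Run the demo."""
        return True

for line in quick_info(Demo):
    print(line)
```

Truncating long doc-strings to their first line keeps the report to one line per member, similar in spirit to the listing above.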