The biomolecule object used in PDB2PQR and associated methods.
This module should be broken into separate files.
Authors: Todd Dolinsky, Yong Huang
This class represents the parsed PDB, and provides a hierarchy of information - each Biomolecule object contains a list of Chain objects as provided in the PDB file. Each Chain then contains its associated list of Residue objects, and each Residue contains a list of Atom objects, completing the hierarchy.
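The four-level hierarchy described above can be pictured with a minimal sketch. The dataclasses below are simplified stand-ins written for illustration, not the actual PDB2PQR classes; the field names and the `iter_atoms` helper are assumptions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Atom:
    name: str


@dataclass
class Residue:
    name: str
    res_seq: int
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Biomolecule:
    chains: List[Chain] = field(default_factory=list)

    def iter_atoms(self) -> Iterator[Atom]:
        """Walk chain -> residue -> atom in stored order (hypothetical helper)."""
        for chain in self.chains:
            for residue in chain.residues:
                yield from residue.atoms


# Two-residue toy chain: GLY followed by ALA.
gly = Residue("GLY", 1, [Atom("N"), Atom("CA"), Atom("C"), Atom("O")])
ala = Residue("ALA", 2, [Atom("N"), Atom("CA"), Atom("C"), Atom("O"), Atom("CB")])
mol = Biomolecule([Chain("A", [gly, ala])])
atom_names = [atom.name for atom in mol.iter_atoms()]
```

Keeping each level as a plain list means that traversal order follows the order of records in the input file.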
Initialize using a parsed PDB file.
Add the hydrogens to the biomolecule.
This requires either the rebuild_tetrahedral function for tetrahedral geometries or the standard quatfit methods. These methods use three nearby bonds to rebuild the atom; the closer the bonds, the more accurate the results. As such the peptide bonds are used when available.
Apply the forcefield to the atoms within the biomolecule.
Parameters: forcefield (Forcefield) – forcefield object Returns: (list of atoms that were found in the forcefield, list of atoms that were not found in the forcefield) Return type: (list, list)
Apply the naming scheme of the given forcefield.
Parameters: forcefield (Forcefield) – forcefield object
Apply a patch to the given residue.
This is one of the key functions in PDB2PQR. A similar function appears in definitions; that version is needed for residue-level substitutions so that certain protonation states (e.g., CYM, HSE) are detectable on input.
This version looks up the particular patch name in the patch_map stored in the biomolecule, and then applies the various commands to the reference and actual residue structures.
See the inline comments for a more detailed explanation.
apply_pka_values(force_field, ph, pkadic)
Apply calculated pKa values to assign titration states.
assign_termini(chain, neutraln=False, neutralc=False)
Assign the termini for the given chain.
Assignment made by looking at the start and end residues.
Calculate dihedral angles for every residue in the biomolecule.
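For reference, a dihedral angle is defined by four points. The sketch below is a generic, self-contained implementation of the standard atan2 formula, not the routine PDB2PQR itself uses; the function and helper names are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec, b: Vec) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Return the signed dihedral angle p0-p1-p2-p3 in degrees."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal to the plane p0, p1, p2
    n2 = _cross(b2, b3)  # normal to the plane p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Central bond along z; the outer atoms are placed by hand.
cis = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
trans = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
quarter = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
```

For a protein backbone, the four points would be consecutive backbone atoms, for example C(i-1), N(i), CA(i), C(i) for phi.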
Get the total charge on the biomolecule
Since the misslist is used to identify incorrect charge assignments, this routine does not list the 3' and 5' termini of nucleic acid chains as having non-integer charge, even though they are (correctly) non-integer.
Returns: (list of residues with non-integer charges, the total charge on the biomolecule) Return type: (list, float)
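The return value described above can be illustrated with a rough sketch. It assumes per-residue lists of atomic charges keyed by a residue label; the `tol` and `skip` parameters are hypothetical, with `skip` standing in for the nucleic acid termini that are deliberately left out of the list.

```python
from typing import Dict, FrozenSet, List, Tuple


def total_charge(
    residue_charges: Dict[str, List[float]],
    tol: float = 0.001,                 # assumed tolerance for "integer"
    skip: FrozenSet[str] = frozenset(),  # residues never reported (assumption)
) -> Tuple[List[str], float]:
    """Return (residues with non-integer charge, total charge)."""
    total = 0.0
    non_integer = []
    for label, charges in residue_charges.items():
        res_charge = sum(charges)
        total += res_charge  # skipped residues still count toward the total
        if label in skip:
            continue
        if abs(res_charge - round(res_charge)) > tol:
            non_integer.append(label)
    return non_integer, total


charges = {
    "ASP 1": [-0.5, -0.5],
    "GLY 2": [0.25, -0.25],
    "BAD 3": [0.3, 0.1],
}
misslist, total = total_charge(charges)
misslist_skipped, _ = total_charge(charges, skip=frozenset({"BAD 3"}))
```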
Create an HTML typemap file at the desired location.
If a type cannot be found for an atom a blank is listed.
Create a residue object.
If the resname is a known residue type, try to make that specific object, otherwise just make a standard residue object.
Returns: the residue object
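The fallback behavior described here (specific class when the name is known, generic residue otherwise) might look like the sketch below. The class names and the `KNOWN_RESIDUES` registry are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Type


class Residue:
    """Generic residue (simplified stand-in)."""

    def __init__(self, atoms: List[str]):
        self.atoms = list(atoms)


class Alanine(Residue):
    """Example of a residue-specific class."""


class Water(Residue):
    """Example of a residue-specific class."""


# Hypothetical registry mapping residue names to specific classes.
KNOWN_RESIDUES: Dict[str, Type[Residue]] = {"ALA": Alanine, "HOH": Water}


def create_residue(atoms: List[str], resname: str) -> Residue:
    """Build the specific class for known names, else a generic Residue."""
    residue_class = KNOWN_RESIDUES.get(resname, Residue)
    return residue_class(atoms)


ala = create_residue(["N", "CA", "C", "O", "CB"], "ALA")
unknown = create_residue(["C1", "O1"], "LIG")
```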
Set fixed state of specified residues.
Parameters: hlist ([(str, str, str)]) – list of (res_seq, chainid, ins_code) specifying the residues for altering fixed state status.
Return the number of ATOM (not HETATM) records in the biomolecule.
Returns: number of ATOM records Return type: int
Return number of biomolecular heavy atoms in structure.
Figure out if this is redundant with
Includes hydrogens (but those are stripped off eventually)
Returns: number of heavy atoms Return type: int
Return number of missing biomolecular heavy atoms in structure.
Returns: number of missing heavy atoms in structure Return type: int
Remove hydrogens from the biomolecule.
Repair all heavy atoms.
Unfortunately, the first time we get to an atom we might not be able to rebuild it; it might depend on other atoms being rebuilt first (think side chains). As such, a ‘seenmap’ is used to keep track of what we’ve already seen and of subsequent attempts to rebuild the atom.
Raises: ValueError – missing atoms prevent reconstruction
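The retry logic can be sketched as a queue plus an attempt counter. This is an illustration of the idea only: the dependency table, the `max_attempts` limit, and the function name are assumptions, and no coordinates are actually rebuilt.

```python
from collections import deque
from typing import Dict, List, Set


def repair_order(
    present: Set[str],
    missing: Dict[str, List[str]],  # missing atom -> atoms it is built from
    max_attempts: int = 3,          # assumed retry limit per atom
) -> List[str]:
    """Return the order in which missing atoms can be rebuilt."""
    placed = set(present)
    queue = deque(missing)
    seenmap: Dict[str, int] = {}  # failed attempts per atom
    order: List[str] = []
    while queue:
        name = queue.popleft()
        if all(dep in placed for dep in missing[name]):
            placed.add(name)
            order.append(name)
            continue
        # Dependencies not ready yet: remember the attempt and retry later.
        seenmap[name] = seenmap.get(name, 0) + 1
        if seenmap[name] > max_attempts:
            raise ValueError(f"missing atoms prevent reconstruction of {name}")
        queue.append(name)
    return order


# CG is encountered first but needs CB, which is rebuilt on the same pass.
order = repair_order(
    present={"N", "CA", "C"},
    missing={"CG": ["CB", "CA"], "CB": ["CA", "N", "C"]},
)
```

A fixed retry limit is a simple way to guarantee termination; a long chain of dependencies visited in reverse order would need a larger limit.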
Generate new serial numbers for atoms in the biomolecule.
Set the donors and acceptors within the biomolecule.
Set all HIS states to HIP.
Set the distance to the CA atom in the residue.
This is necessary for determining which atoms are allowed to move during rotations. Uses the shortest_path() algorithm found in
Raises: ValueError – if shortest path cannot be found (e.g., if the atoms are not connected)
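As an illustration, the sketch below measures distance from CA as the number of bonds along the shortest path, using a plain breadth-first search as a stand-in for the shortest_path() routine mentioned above. The bond table format and the function name are assumptions.

```python
from collections import deque
from typing import Dict, List


def bond_distances(bonds: Dict[str, List[str]], start: str = "CA") -> Dict[str, int]:
    """Return the number of bonds between `start` and every atom in `bonds`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in bonds.get(current, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    unreachable = sorted(set(bonds) - set(dist))
    if unreachable:
        raise ValueError(f"no path from {start} to: {', '.join(unreachable)}")
    return dist


# Serine-like residue: backbone plus a CB-OG side chain.
serine_bonds = {
    "N": ["CA"],
    "CA": ["N", "C", "CB"],
    "C": ["CA", "O"],
    "O": ["C"],
    "CB": ["CA", "OG"],
    "OG": ["CB"],
}
distances = bond_distances(serine_bonds)
```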
Set the state of each residue.
This is the last step before assigning the forcefield, but is necessary so as to distinguish between various protonation states.
See aa for residue-specific functions.
Set the termini for a protein.
First set all known termini by looking at the ends of the chain. Then examine each residue, looking for internal chain breaks.
This function needs to be cleaned and simplified
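The two-pass idea (chain ends first, then internal breaks) can be sketched as follows. The criterion used here, a C(i) to N(i+1) distance above a cutoff, and the 1.7 Å value are assumptions made for illustration; the actual test used by PDB2PQR may differ.

```python
import math
from typing import Dict, List, Tuple

Coord = Tuple[float, float, float]


def find_termini(
    residues: List[Dict[str, Coord]],
    max_peptide_bond: float = 1.7,  # assumed cutoff in angstroms
) -> Tuple[List[int], List[int]]:
    """Return (indices of N-terminal residues, indices of C-terminal residues)."""
    if not residues:
        return [], []
    n_termini = [0]  # the first residue of the chain is a known N-terminus
    c_termini = []
    for i in range(len(residues) - 1):
        gap = math.dist(residues[i]["C"], residues[i + 1]["N"])
        if gap > max_peptide_bond:
            # Internal chain break: residue i ends one segment, i + 1 starts the next.
            c_termini.append(i)
            n_termini.append(i + 1)
    c_termini.append(len(residues) - 1)  # the last residue is a known C-terminus
    return n_termini, c_termini


# Four residues laid out along x with a gap between the second and third.
chain = [
    {"N": (0.0, 0.0, 0.0), "C": (2.4, 0.0, 0.0)},
    {"N": (3.7, 0.0, 0.0), "C": (6.1, 0.0, 0.0)},
    {"N": (12.0, 0.0, 0.0), "C": (14.4, 0.0, 0.0)},
    {"N": (15.7, 0.0, 0.0), "C": (18.1, 0.0, 0.0)},
]
n_termini, c_termini = find_termini(chain)
```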
Update the bonding network of the biomolecule.
This happens in 3 steps:
- Apply the PEPTIDE patch to all Amino residues to add references to the N(i+1) and C(i-1) atoms
- Update the internal bonds for inter-residue linking
- Set the links to the N(i+1) and C(i-1) atoms
Update the internal bonding network.
Update using the reference objects in each atom.
Find the type of residue as notated in the Amino Acid definition.
Why are we setting residue types to numeric values (see code)?
Check and set SS-bridge partners.
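One plausible way to pair cysteines is by the distance between their SG atoms. The sketch below assumes that criterion and a 2.5 Å cutoff; both are assumptions for illustration, as are the function name and input format.

```python
import math
from itertools import combinations
from typing import Dict, Tuple

Coord = Tuple[float, float, float]


def find_ss_bridges(sg_coords: Dict[str, Coord], cutoff: float = 2.5) -> Dict[str, str]:
    """Map each bridged cysteine label to its partner's label."""
    partners: Dict[str, str] = {}
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sg_coords.items(), 2):
        if res_a in partners or res_b in partners:
            continue  # each cysteine takes part in at most one bridge
        if math.dist(xyz_a, xyz_b) <= cutoff:
            partners[res_a] = res_b
            partners[res_b] = res_a
    return partners


sg_atoms = {
    "CYS 6": (0.0, 0.0, 0.0),
    "CYS 127": (2.03, 0.0, 0.0),  # close enough to CYS 6 to bridge
    "CYS 30": (10.0, 0.0, 0.0),   # isolated, stays unpaired
}
bridges = find_ss_bridges(sg_atoms)
```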