biomolecule
The biomolecule object used in PDB2PQR and associated methods.
Todo
This module should be broken into separate files.
Authors: Todd Dolinsky, Yong Huang
-
class pdb2pqr.biomolecule.Biomolecule(pdblist, definition)
Biomolecule class.
This class represents the parsed PDB, and provides a hierarchy of information - each Biomolecule object contains a list of Chain objects as provided in the PDB file. Each Chain then contains its associated list of Residue objects, and each Residue contains a list of Atom objects, completing the hierarchy.
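The containment hierarchy described above can be sketched with plain dataclasses. This is only an illustration: the class and attribute names below (`Molecule`, `chains`, `residues`, `atoms`, `iter_atoms`) are hypothetical stand-ins and are not the PDB2PQR classes or their attributes.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Atom:
    name: str


@dataclass
class Residue:
    name: str
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Molecule:
    """Hypothetical stand-in for the top of the hierarchy."""

    chains: List[Chain] = field(default_factory=list)

    def iter_atoms(self) -> Iterator[Atom]:
        # Walk the hierarchy: chain -> residue -> atom.
        for chain in self.chains:
            for residue in chain.residues:
                yield from residue.atoms


gly = Residue("GLY", [Atom("N"), Atom("CA"), Atom("C"), Atom("O")])
ala = Residue("ALA", [Atom("N"), Atom("CA"), Atom("C"), Atom("O"), Atom("CB")])
mol = Molecule([Chain("A", [gly, ala])])
atom_names = [atom.name for atom in mol.iter_atoms()]
```

Keeping each level as a simple list makes whole-structure operations a matter of nested iteration.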
-
__init__(pdblist, definition)
Initialize using parsed PDB file.
Parameters:
- pdblist (list) – list of objects from pdb, parsed from the lines of the PDB file
- definition (Definition) – topology definition object
-
add_hydrogens(hlist=None)
Add the hydrogens to the biomolecule.
This requires either the rebuild_tetrahedral function for tetrahedral geometries or the standard quatfit methods. These methods use three nearby bonds to rebuild the atom; the closer the bonds, the more accurate the results. As such the peptide bonds are used when available.
-
apply_force_field(forcefield_)
Apply the forcefield to the atoms within the biomolecule.
Parameters: forcefield (Forcefield) – forcefield object
Returns: (list of atoms that were found in the forcefield, list of atoms that were not found in the forcefield)
Return type: (list, list)
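The (found, missing) return shape can be illustrated with a small stand-alone sketch. It assumes a hypothetical parameter table keyed by (residue name, atom name). The function name, the table layout, and the numeric values are made up for illustration and do not reflect the actual Forcefield object.

```python
def partition_by_forcefield(atoms, params):
    """Split atoms into (found, missing) by parameter lookup.

    `atoms` is a list of (residue_name, atom_name) tuples and `params` is a
    dict keyed by the same tuples. Both are hypothetical stand-ins.
    """
    found, missing = [], []
    for atom in atoms:
        if atom in params:
            found.append(atom)
        else:
            missing.append(atom)
    return found, missing


# Made-up (charge, radius) values, for illustration only.
params = {("ALA", "CA"): (0.1, 1.9), ("ALA", "CB"): (-0.2, 1.9)}
atoms = [("ALA", "CA"), ("ALA", "CB"), ("LIG", "X1")]
found, missing = partition_by_forcefield(atoms, params)
```

Returning both lists lets the caller decide whether unparameterized atoms are an error or only a warning.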
-
apply_name_scheme(forcefield_)
Apply the naming scheme of the given forcefield.
Parameters: forcefield (Forcefield) – forcefield object
-
apply_patch(patchname, residue)
Apply a patch to the given residue.
This is one of the key functions in PDB2PQR. A similar function appears in definitions; that version is needed for residue-level substitutions so that certain protonation states (e.g., CYM, HSE) are detectable on input.
This version looks up the particular patch name in the patch_map stored in the biomolecule, and then applies the various commands to the reference and actual residue structures.
See the inline comments for a more detailed explanation.
Parameters:
-
apply_pka_values(force_field, ph, pkadic)
Apply calculated pKa values to assign titration states.
Parameters:
-
assign_termini(chain, neutraln=False, neutralc=False)
Assign the termini for the given chain.
The assignment is made by looking at the start and end residues.
Parameters:
-
calculate_dihedral_angles()
Calculate dihedral angles for every residue in the biomolecule.
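For readers unfamiliar with the quantity, a dihedral angle is defined by four consecutive points. The sketch below uses the standard atan2 formulation in pure Python. It is a generic illustration, not the routine PDB2PQR uses, and the function name `dihedral_degrees` is hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle about the p1-p2 axis, in degrees, in (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = _dot(b2, _cross(n1, n2))
    x = math.sqrt(_dot(b2, b2)) * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Planar arrangements: cis gives 0 degrees, trans gives 180 degrees.
cis = dihedral_degrees((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
trans = dihedral_degrees((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
perp = dihedral_degrees((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 1.0))
```

Using atan2 rather than acos keeps the sign of the angle and avoids numerical trouble near 0 and 180 degrees.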
-
charge
Get the total charge on the biomolecule.
Todo
Since the misslist is used to identify incorrect charge assignments, this routine does not list the 3' and 5' termini of nucleic acid chains as having non-integer charge even though they are (correctly) non-integer.
Returns: (list of residues with non-integer charges, the total charge on the biomolecule)
Return type: (list, float)
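The return value pairs a list of suspect residues with a running total. A minimal sketch of that bookkeeping follows. It assumes per-residue charges are already available as (label, charge) pairs. The function name and the 0.001 tolerance are assumptions, not values taken from PDB2PQR.

```python
def summarize_charge(residue_charges, tol=0.001):
    """Return (labels of residues with non-integer charge, total charge).

    `residue_charges` is a list of (label, charge) pairs. `tol` is an assumed
    tolerance for deciding whether a charge counts as an integer.
    """
    non_integer = []
    total = 0.0
    for label, charge in residue_charges:
        total += charge
        if abs(charge - round(charge)) > tol:
            non_integer.append(label)
    return non_integer, total


residues = [("ASP 10", -1.0), ("LYS 11", 1.0), ("LIG 1", 0.35)]
flagged, total = summarize_charge(residues)
```

A residue that appears in the flagged list usually signals a missing atom or an unparameterized group rather than a real fractional charge.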
-
create_html_typemap(definition, outfilename)
Create an HTML typemap file at the desired location.
If a type cannot be found for an atom, a blank is listed.
Parameters:
- definition (Definition) – the definition objects.
- outfilename (str) – the name of the file to write
-
create_residue(residue, resname)
Create a residue object.
If the resname is a known residue type, try to make that specific object; otherwise just make a standard residue object.
Parameters:
Returns: the residue object
Return type:
-
hold_residues(hlist)
Set fixed state of specified residues.
Parameters: hlist ([(str, str, str)]) – list of (res_seq, chainid, ins_code) specifying the residues for altering fixed state status.
-
num_bio_atoms
Return the number of ATOM (not HETATM) records in the biomolecule.
Returns: number of ATOM records
Return type: int
-
num_heavy
Return number of biomolecular heavy atoms in structure.
Todo
Figure out if this is redundant with Biomolecule.num_bio_atoms()
Note
Includes hydrogens (but those are stripped off eventually)
Returns: number of heavy atoms
Return type: int
-
num_missing_heavy
Return number of missing biomolecular heavy atoms in structure.
Returns: number of missing heavy atoms in structure
Return type: int
-
repair_heavy()
Repair all heavy atoms.
Unfortunately, the first time we get to an atom we might not be able to rebuild it, because it might depend on other atoms being rebuilt first (think side chains). As such, a ‘seenmap’ is used to keep track of what we’ve already seen and of subsequent attempts to rebuild the atom.
Raises: ValueError – missing atoms prevent reconstruction
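The retry idea behind the seenmap can be sketched as a work queue with a per-atom visit counter. This is an illustration under assumptions, not the PDB2PQR implementation: the function name, the dependency dictionary, and the `max_visits` cutoff are all hypothetical.

```python
def rebuild_in_dependency_order(missing, present, max_visits=10):
    """Return the order in which missing atoms can be rebuilt.

    `missing` maps each missing atom name to the set of atom names that must
    exist before it can be placed. `present` holds atoms already in the
    residue. `max_visits` is an assumed cutoff that stops endless retries.
    """
    present = set(present)
    queue = list(missing)
    seen = {}  # atom name -> number of rebuild attempts so far
    order = []
    while queue:
        name = queue.pop(0)
        seen[name] = seen.get(name, 0) + 1
        if seen[name] > max_visits:
            raise ValueError(f"cannot rebuild {name}: prerequisites never available")
        if missing[name] <= present:
            # All prerequisites exist, so the atom can be placed now.
            present.add(name)
            order.append(name)
        else:
            # Defer and retry after other atoms have been rebuilt.
            queue.append(name)
    return order


missing = {"CD": {"CG", "CB"}, "CG": {"CB", "CA"}}
order = rebuild_in_dependency_order(missing, {"N", "CA", "C", "CB"})
```

In this example CD is deferred on the first pass because CG does not exist yet, then rebuilt once CG has been placed.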
-
set_reference_distance()
Set the distance to the CA atom in the residue.
This is necessary for determining which atoms are allowed to move during rotations. Uses the shortest_path() algorithm found in utilities.
Raises: ValueError – if shortest path cannot be found (e.g., if the atoms are not connected)
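The notion of a bonded distance to CA can be illustrated with a breadth-first search over a bond graph. This stand-alone sketch does not call or reproduce shortest_path() from utilities. The function name `bond_distance` and the adjacency-dict layout are assumptions made for the example.

```python
from collections import deque


def bond_distance(bonds, start, goal):
    """Number of bonds on the shortest path from `start` to `goal`.

    `bonds` is a hypothetical adjacency dict: atom name -> bonded atom names.
    Raises ValueError when the two atoms are not connected.
    """
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if atom == goal:
            return dist[atom]
        for neighbor in bonds.get(atom, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    raise ValueError(f"no bonded path from {start} to {goal}")


bonds = {
    "N": ["CA"],
    "CA": ["N", "C", "CB"],
    "C": ["CA", "O"],
    "O": ["C"],
    "CB": ["CA", "CG"],
    "CG": ["CB", "CD"],
    "CD": ["CG"],
    "ZN": [],  # an isolated atom with no bonds
}
cd_to_ca = bond_distance(bonds, "CD", "CA")
o_to_ca = bond_distance(bonds, "O", "CA")
```

Atoms farther from CA along the side chain get larger distances, which is the kind of ordering a rotation routine can use to decide which atoms move.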
-
set_states()
Set the state of each residue.
This is the last step before assigning the forcefield, and is necessary to distinguish between the various protonation states.
See aa for residue-specific functions.
-
set_termini(neutraln=False, neutralc=False)
Set the termini for a protein.
First set all known termini by looking at the ends of the chain. Then examine each residue, looking for internal chain breaks.
Todo
This function needs to be cleaned and simplified.
Parameters:
-
update_bonds()
Update the bonding network of the biomolecule.
This happens in 3 steps:
- Apply the PEPTIDE patch to all Amino residues to add references for the N(i+1) and C(i-1) atoms
- Call update_internal_bonds() for inter-residue linking
- Set the links to the N(i+1) and C(i-1) atoms
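The N(i+1) and C(i-1) linking in the last step can be pictured as each residue holding a reference to its sequence neighbors. The sketch below shows only that neighbor bookkeeping on a list of residue names. The function name and the `prev_c` and `next_n` keys are hypothetical, and it does not perform the patching or internal bond updates.

```python
def link_neighbors(residue_names):
    """For residue i, record which residue supplies C(i-1) and N(i+1).

    Chain ends have no neighbor on one side, which is marked with None.
    """
    links = []
    last = len(residue_names) - 1
    for i, name in enumerate(residue_names):
        links.append(
            {
                "residue": name,
                "prev_c": residue_names[i - 1] if i > 0 else None,
                "next_n": residue_names[i + 1] if i < last else None,
            }
        )
    return links


links = link_neighbors(["ALA", "GLY", "SER"])
```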
-
update_internal_bonds()
Update the internal bonding network.
Update using the reference objects in each atom.
-