pdb

PDB parsing class

This module parses PDB files in accordance with the PDB Format Description Version 2.2 (1996); it is not very forgiving. Each class in this module corresponds to a record in the PDB Format Description. Much of the documentation for the classes is taken directly from that description.

Code author: Todd Dolinsky

Code author: Yong Huang

Code author: Nathan Baker

class pdb2pqr.pdb.ANISOU(line)[source]

ANISOU class

The ANISOU records present the anisotropic temperature factors.

__init__(line)[source]

Initialize by parsing line:

COLUMNS TYPE FIELD DEFINITION
7-11 int serial Atom serial number.
13-16 string name Atom name.
17 string alt_loc Alternate location indicator.
18-20 string res_name Residue name.
22 string chain_id Chain identifier.
23-26 int res_seq Residue sequence number.
27 string ins_code Insertion code.
29-35 int u00 U(1,1)
36-42 int u11 U(2,2)
43-49 int u22 U(3,3)
50-56 int u01 U(1,2)
57-63 int u02 U(1,3)
64-70 int u12 U(2,3)
73-76 string seg_id Segment identifier, left-justified.
77-78 string element Element symbol, right-justified.
79-80 string charge Charge on the atom.
Parameters:line (str) – line with PDB class
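
The six U fields in the table above can be read as fixed-width integers and arranged into a symmetric tensor. The following is a minimal sketch rather than the pdb2pqr implementation: `parse_anisou_tensor` is a hypothetical helper, the sample values are made up, and the division by 10**4 assumes the scaling that the PDB format applies to ANISOU integers.

```python
def parse_anisou_tensor(line):
    """Hypothetical helper: return the 3x3 U tensor from an ANISOU line."""
    # Six 7-column integer fields cover columns 29-70.
    raw = [int(line[start:start + 7]) for start in range(28, 70, 7)]
    # Assumption: the stored integers are U values scaled by 10**4.
    u11, u22, u33, u12, u13, u23 = (value / 1.0e4 for value in raw)
    return [
        [u11, u12, u13],
        [u12, u22, u23],
        [u13, u23, u33],
    ]


# Made-up sample line, assembled so each value lands in its documented columns.
prefix = f"{'ANISOU':<6}{1:>5} {' N':<4} {'MET':>3} A{1:>4}  "
sample = prefix + "".join(f"{v:>7}" for v in (2406, 1892, 1614, 198, 519, -328))
tensor = parse_anisou_tensor(sample)
print(tensor[0])
```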
class pdb2pqr.pdb.ATOM(line)[source]

ATOM class

The ATOM records present the atomic coordinates for standard residues. They also present the occupancy and temperature factor for each atom. Heterogen coordinates use the HETATM record type. The element symbol is always present on each ATOM record; segment identifier and charge are optional.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
7-11 int serial Atom serial number.
13-16 string name Atom name.
17 string alt_loc Alternate location indicator.
18-20 string res_name Residue name.
22 string chain_id Chain identifier.
23-26 int res_seq Residue sequence number.
27 string ins_code Code for insertion of residues.
31-38 float x Orthogonal coordinates for X in Angstroms.
39-46 float y Orthogonal coordinates for Y in Angstroms.
47-54 float z Orthogonal coordinates for Z in Angstroms.
55-60 float occupancy Occupancy.
61-66 float temp_factor Temperature factor.
73-76 string seg_id Segment identifier, left-justified.
77-78 string element Element symbol, right-justified.
79-80 string charge Charge on the atom.
Parameters:line (str) – line with PDB class
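
The column table above translates directly into string slices. The sketch below illustrates that mapping with a standalone, hypothetical `parse_atom` helper; it does not reproduce the ATOM class itself, and the sample line is made up.

```python
def field(line, start, end):
    """Return the text in 1-based, inclusive columns start..end, stripped."""
    return line[start - 1:end].strip()


def parse_atom(line):
    """Hypothetical helper: split an ATOM line using the column table above."""
    return {
        "serial": int(field(line, 7, 11)),
        "name": field(line, 13, 16),
        "alt_loc": field(line, 17, 17),
        "res_name": field(line, 18, 20),
        "chain_id": field(line, 22, 22),
        "res_seq": int(field(line, 23, 26)),
        "ins_code": field(line, 27, 27),
        "x": float(field(line, 31, 38)),
        "y": float(field(line, 39, 46)),
        "z": float(field(line, 47, 54)),
        "occupancy": float(field(line, 55, 60)),
        "temp_factor": float(field(line, 61, 66)),
        "seg_id": field(line, 73, 76),
        "element": field(line, 77, 78),
        "charge": field(line, 79, 80),
    }


# Made-up sample line, assembled with explicit widths so that every value
# lands in its documented columns.
sample = (
    f"{'ATOM':<6}{1:>5} {' N':<4}{'':1}{'MET':>3} {'A':1}{1:>4}{'':1}   "
    f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}{1.00:6.2f}{9.67:6.2f}"
    f"{'':6}{'':<4}{'N':>2}{'':2}"
)
atom = parse_atom(sample)
print(atom["name"], atom["res_name"], atom["res_seq"], atom["x"])
```

Because the format is positional, splitting on whitespace is not a safe substitute: adjacent fields may touch, and optional fields such as alt_loc may be blank.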
class pdb2pqr.pdb.AUTHOR(line)[source]

AUTHOR field

The AUTHOR record contains the names of the people responsible for the contents of the entry.

__init__(line)[source]

Initialize by parsing a line

COLUMNS TYPE FIELD DEFINITION
11-70 string author_list List of the author names, separated by commas
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.BaseRecord(line)[source]

Base class for all records.

Verifies the received record type.

__init__(line)[source]

Initialize self. See help(type(self)) for accurate signature.

record_type()[source]

Return PDB record type as string.

Returns:record type
Return type:str
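
One way a base class could verify the record type is to compare the first six columns of the line with the subclass name. The sketch below shows that idea only; `RecordSketch` and its attributes are hypothetical and are not taken from the pdb2pqr source.

```python
class RecordSketch:
    """Hypothetical stand-in for a record base class (not pdb2pqr's code)."""

    def __init__(self, line):
        # Assumption: the record name occupies columns 1-6 of the line.
        found = line[0:6].strip()
        expected = type(self).__name__
        if found != expected:
            raise ValueError(f"expected {expected} record, found {found!r}")
        self.original_text = line.rstrip("\r\n")

    def record_type(self):
        """Return the record type as a string."""
        return type(self).__name__


class END(RecordSketch):
    """Example subclass whose name doubles as the expected record type."""


record = END("END")
print(record.record_type())
```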
class pdb2pqr.pdb.CAVEAT(line)[source]

CAVEAT field

CAVEAT warns of severe errors in an entry. Use caution when using an entry containing this record.

__init__(line)[source]

Initialize by parsing line.

COLUMNS TYPE FIELD DEFINITION
12-15 string id_code PDB ID code of this entry.
20-70 string comment Free text giving the reason for the CAVEAT.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.CISPEP(line)[source]

CISPEP field

CISPEP records specify the prolines and other peptides found to be in the cis conformation. This record replaces the use of footnote records to list cis peptides.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 int ser_num Record serial number.
12-14 string pep1 Residue name.
16 string chain_id1 Chain identifier.
18-21 int seq_num1 Residue sequence number.
22 string icode1 Insertion code.
26-28 string pep2 Residue name.
30 string chain_id2 Chain identifier.
32-35 int seq_num2 Residue sequence number.
36 string icode2 Insertion code.
44-46 int mod_num Identifies the specific model.
54-59 float measure Measure of the angle in degrees.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.COMPND(line)[source]

COMPND field

The COMPND record describes the macromolecular contents of an entry. Each macromolecule found in the entry is described by a set of token: value pairs, and is referred to as a COMPND record component. Since the concept of a molecule is difficult to specify exactly, PDB staff may exercise editorial judgment in consultation with depositors in assigning these names.

For each macromolecular component, the molecule name, synonyms, number assigned by the Enzyme Commission (EC), and other relevant details are specified.

__init__(line)[source]

Initialize by parsing a line

COLUMNS TYPE FIELD DEFINITION
11-70 string compound Description of the molecular list components.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.CONECT(line)[source]

CONECT class

The CONECT records specify connectivity between atoms for which coordinates are supplied. The connectivity is described using the atom serial number as found in the entry. CONECT records are mandatory for HET groups (excluding water) and for other bonds not specified in the standard residue connectivity table which involve atoms in standard residues (see Appendix 4 for the list of standard residues). These records are generated by the PDB.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
7-11 int serial Atom serial number
12-16 int serial1 Serial number of bonded atom
17-21 int serial2 Serial number of bonded atom
22-26 int serial3 Serial number of bonded atom
27-31 int serial4 Serial number of bonded atom
32-36 int serial5 Serial number of hydrogen bonded atom
37-41 int serial6 Serial number of hydrogen bonded atom
42-46 int serial7 Serial number of salt bridged atom
47-51 int serial8 Serial number of hydrogen bonded atom
52-56 int serial9 Serial number of hydrogen bonded atom
57-61 int serial10 Serial number of salt bridged atom
Parameters:line (str) – line with PDB class
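
CONECT lines are often shorter than the full table because unused partner fields are left blank. The sketch below pads the line and keeps only the fields that are present; `parse_conect` is a hypothetical helper and the serial numbers are made up.

```python
def parse_conect(line):
    """Hypothetical helper: read a CONECT line using the column table above."""
    padded = line.rstrip("\r\n").ljust(61)
    serial = int(padded[6:11])
    partners = {}
    # serial1..serial10 are ten consecutive 5-column fields starting at column 12.
    for index in range(10):
        start = 11 + 5 * index
        text = padded[start:start + 5].strip()
        if text:
            partners[f"serial{index + 1}"] = int(text)
    return serial, partners


# Made-up sample: atom 1179 bonded to four other atoms.
sample = f"{'CONECT':<6}{1179:>5}{746:>5}{1184:>5}{1195:>5}{1203:>5}"
serial, partners = parse_conect(sample)
print(serial, partners)
```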
class pdb2pqr.pdb.CRYST1(line)[source]

CRYST1 class

The CRYST1 record presents the unit cell parameters, space group, and Z value. If the structure was not determined by crystallographic means, CRYST1 simply defines a unit cube.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
7-15 float a a (Angstroms).
16-24 float b b (Angstroms).
25-33 float c c (Angstroms).
34-40 float alpha alpha (degrees).
41-47 float beta beta (degrees).
48-54 float gamma gamma (degrees).
56-66 string space_group Space group.
67-70 int z Z value.
Parameters:line (str) – line with PDB class
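
The unit-cube convention for non-crystallographic entries can be checked once the fields are parsed. This is a minimal sketch: `parse_cryst1` and `is_unit_cube` are hypothetical helpers written against the column table above, not functions provided by this module.

```python
import math


def parse_cryst1(line):
    """Hypothetical helper: read a CRYST1 line using the column table above."""
    return {
        "a": float(line[6:15]),
        "b": float(line[15:24]),
        "c": float(line[24:33]),
        "alpha": float(line[33:40]),
        "beta": float(line[40:47]),
        "gamma": float(line[47:54]),
        "space_group": line[55:66].strip(),
        "z": int(line[66:70]),
    }


def is_unit_cube(cell):
    """Return True when the cell is 1 x 1 x 1 Angstroms with 90-degree angles."""
    lengths = (cell["a"], cell["b"], cell["c"])
    angles = (cell["alpha"], cell["beta"], cell["gamma"])
    return all(math.isclose(v, 1.0) for v in lengths) and all(
        math.isclose(v, 90.0) for v in angles
    )


# Sample line of the kind written for structures not solved by crystallography.
sample = (
    f"{'CRYST1':<6}{1.0:9.3f}{1.0:9.3f}{1.0:9.3f}"
    f"{90.0:7.2f}{90.0:7.2f}{90.0:7.2f} {'P 1':<11}{1:>4}"
)
cell = parse_cryst1(sample)
print(cell["space_group"], is_unit_cube(cell))
```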
class pdb2pqr.pdb.DBREF(line)[source]

DBREF field

The DBREF record provides cross-reference links between PDB sequences and the corresponding database entry or entries. A cross reference to the sequence database is mandatory for each peptide chain with a length greater than ten (10) residues. For nucleic acid entries a DBREF record pointing to the Nucleic Acid Database (NDB) is mandatory when the corresponding entry exists in NDB.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS TYPE FIELD DEFINITION
8-11 string id_code ID code of this entry.
13 string chain_id Chain identifier.
15-18 int seq_begin Initial sequence number of the PDB sequence segment.
19 string insert_begin Initial insertion code of the PDB sequence segment.
21-24 int seq_end Ending sequence number of the PDB sequence segment.
25 string insert_end Ending insertion code of the PDB sequence segment.
27-32 string database Sequence database name. “PDB” when a corresponding sequence database entry has not been identified.
34-41 string db_accession Sequence database accession code. For GenBank entries, this is the NCBI gi number.
43-54 string db_id_code Sequence database identification code. For GenBank entries, this is the accession code.
56-60 int db_seq_begin Initial sequence number of the database segment.
61 string db_ins_begin Insertion code of initial residue of the segment, if PDB is the reference.
63-67 int dbseq_end Ending sequence number of the database segment.
68 string db_ins_end Insertion code of the ending residue of the segment, if PDB is the reference.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.END(line)[source]

END class

The END record marks the end of the PDB file.

__init__(line)[source]

Initialize with line.

Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.ENDMDL(line)[source]

ENDMDL class

The ENDMDL records are paired with MODEL records to group individual structures found in a coordinate entry.

__init__(line)[source]

Initialize self. See help(type(self)) for accurate signature.

class pdb2pqr.pdb.EXPDTA(line)[source]

EXPDTA field

The EXPDTA record identifies the experimental technique used. This may refer to the type of radiation and sample, or include the spectroscopic or modeling technique. Permitted values include:

  • ELECTRON DIFFRACTION
  • FIBER DIFFRACTION
  • FLUORESCENCE TRANSFER
  • NEUTRON DIFFRACTION
  • NMR
  • THEORETICAL MODEL
  • X-RAY DIFFRACTION
__init__(line)[source]

Initialize by parsing a line

COLUMNS TYPE FIELD DEFINITION
11-70 string technique The experimental technique(s) with optional comment describing the sample or experiment
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.FORMUL(line)[source]

FORMUL field

The FORMUL record presents the chemical formula and charge of a non-standard group.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
9-10 int comp_num Component number
13-15 string hetatm_id Het identifier
19 string asterisk * for water
20-70 string text Chemical formula
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.HEADER(line)[source]

HEADER field

The HEADER record uniquely identifies a PDB entry through the id_code field. This record also provides a classification for the entry. Finally, it contains the date the coordinates were deposited at the PDB.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS TYPE FIELD DEFINITION
11-50 string classification Classifies the molecule(s)
51-59 string dep_date Deposition date. This is the date the coordinates were received by the PDB
63-66 string id_code This identifier is unique within the PDB
Parameters:line (str) – line with PDB class
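
The three HEADER fields can be read as fixed-width slices. The sketch below uses a hypothetical `parse_header` helper and a made-up sample line; the classification, date, and ID code shown are placeholders, not a real entry.

```python
def parse_header(line):
    """Hypothetical helper: read a HEADER line using the column table above."""
    padded = line.rstrip("\r\n").ljust(66)
    return {
        "classification": padded[10:50].strip(),
        "dep_date": padded[50:59].strip(),
        "id_code": padded[62:66].strip(),
    }


# Placeholder values, positioned in the documented columns.
sample = f"{'HEADER':<6}{'':4}{'HYDROLASE':<40}{'01-JAN-00':<9}{'':3}{'1ABC':<4}"
header = parse_header(sample)
print(header)
```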
class pdb2pqr.pdb.HELIX(line)[source]

HELIX field

HELIX records are used to identify the position of helices in the molecule. Helices are both named and numbered. The residues where the helix begins and ends are noted, as well as the total length.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 int ser_num Serial number of the helix. This starts at 1 and increases incrementally.
12-14 string helix_id Helix identifier. In addition to a serial number, each helix is given an alphanumeric character helix identifier.
16-18 string init_res_name Name of the initial residue.
20 string init_chain_id Chain identifier for the chain containing this helix.
22-25 int init_seq_num Sequence number of the initial residue.
26 string init_i_code Insertion code of the initial residue.
28-30 string end_res_name Name of the terminal residue of the helix.
32 string end_chain_id Chain identifier for the chain containing this helix.
34-37 int end_seq_num Sequence number of the terminal residue.
38 string end_i_code Insertion code of the terminal residue.
39-40 int helix_class Helix class (see below).
41-70 string comment Comment about this helix.
72-76 int length Length of this helix.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.HET(line)[source]

HET field

HET records are used to describe non-standard residues, such as prosthetic groups, inhibitors, solvent molecules, and ions for which coordinates are supplied. Groups are considered HET if they are:

  • not one of the standard amino acids, and
  • not one of the nucleic acids (C, G, A, T, U, and I), and
  • not one of the modified versions of nucleic acids (+C, +G, +A, +T, +U, and +I), and
  • not an unknown amino acid or nucleic acid where UNK is used to indicate the unknown residue name.

Het records also describe heterogens for which the chemical identity is unknown, in which case the group is assigned the hetatm_id UNK.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 string hetatm_id Het identifier, right-justified.
13 string ChainID Chain identifier.
14-17 int seq_num Sequence number.
18 string ins_code Insertion code.
21-25 int num_het_atoms Number of HETATM records.
31-70 string text Text describing Het group.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.HETATM(line, sybyl_type='A.aaa', l_bonds=[], l_bonded_atoms=[])[source]

HETATM class

The HETATM records present the atomic coordinate records for atoms within “non-standard” groups. These records are used for water molecules and atoms presented in HET groups.

__init__(line, sybyl_type='A.aaa', l_bonds=[], l_bonded_atoms=[])[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
7-11 int serial Atom serial number.
13-16 string name Atom name.
17 string alt_loc Alternate location indicator.
18-20 string res_name Residue name.
22 string chain_id Chain identifier.
23-26 int res_seq Residue sequence number.
27 string ins_code Code for insertion of residues.
31-38 float x Orthogonal coordinates for X in Angstroms.
39-46 float y Orthogonal coordinates for Y in Angstroms.
47-54 float z Orthogonal coordinates for Z in Angstroms.
55-60 float occupancy Occupancy.
61-66 float temp_factor Temperature factor.
73-76 string seg_id Segment identifier, left-justified.
77-78 string element Element symbol, right-justified.
79-80 string charge Charge on the atom.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.HETNAM(line)[source]

HETNAM field

This record gives the chemical name of the compound with the given hetatm_id.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
12-14 string hetatm_id Het identifier, right-justified.
16-70 string text Chemical name.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.HETSYN(line)[source]

HETSYN field

This record provides synonyms, if any, for the compound in the corresponding (i.e., same hetatm_id) HETNAM record. This is to allow greater flexibility in searching for HET groups.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
12-14 string hetatm_id Het identifier, right-justified.
16-70 string hetatm_synonyms List of synonyms
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.HYDBND(line)[source]

HYDBND field

The HYDBND records specify hydrogen bonds in the entry.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
13-16 string name1 Atom name.
17 string alt_loc1 Alternate location indicator.
18-20 string res_name1 Residue name.
22 string chain1 Chain identifier.
23-27 int res_seq1 Residue sequence number.
28 string i_code1 Insertion code.
30-33 string name_h Hydrogen atom name.
34 string alt_loc_h Alternate location indicator.
36 string chain_h Chain identifier.
37-41 int res_seq_h Residue sequence number.
42 string ins_codeH Insertion code.
44-47 string name2 Atom name.
48 string alt_loc2 Alternate location indicator.
49-51 string res_name2 Residue name.
53 string chain_id2 Chain identifier.
54-58 int res_seq2 Residue sequence number.
59 string ins_code2 Insertion code.
60-65 string sym1 Symmetry operator for 1st non-hydrogen atom.
67-72 string sym2 Symmetry operator for 2nd non-hydrogen atom.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.JRNL(line)[source]

JRNL field

The JRNL record contains the primary literature citation that describes the experiment which resulted in the deposited coordinate set. There is at most one JRNL reference per entry. If there is no primary reference, then there is no JRNL reference. Other references are given in REMARK 1.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
13-70 string text See details on web.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.KEYWDS(line)[source]

KEYWDS field

The KEYWDS record contains a set of terms relevant to the entry. Terms in the KEYWDS record provide a simple means of categorizing entries and may be used to generate index files. This record addresses some of the limitations found in the classification field of the HEADER record. It provides the opportunity to add further annotation to the entry in a concise and computer-searchable fashion.

__init__(line)[source]

Initialize by parsing a line

COLUMNS TYPE FIELD DEFINITION
11-70 string keywds Comma-separated list of keywords relevant to the entry
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.LINK(line)[source]

LINK field

The LINK records specify connectivity between residues that is not implied by the primary structure. Connectivity is expressed in terms of the atom names. This record supplements information given in CONECT records and is provided here for convenience in searching.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
13-16 string name1 Atom name.
17 string alt_loc1 Alternate location indicator.
18-20 string res_name1 Residue name.
22 string chain_id1 Chain identifier.
23-26 int res_seq1 Residue sequence number.
27 string ins_code1 Insertion code.
43-46 string name2 Atom name.
47 string alt_loc2 Alternate location indicator.
48-50 string res_name2 Residue name.
52 string chain_id2 Chain identifier.
53-56 int res_seq2 Residue sequence number.
57 string ins_code2 Insertion code.
60-65 string sym1 Symmetry operator for 1st atom.
67-72 string sym2 Symmetry operator for 2nd atom.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.MASTER(line)[source]

MASTER class

The MASTER record is a control record for bookkeeping. It lists the number of lines in the coordinate entry or file for selected record types.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
11-15 int num_remark Number of REMARK records
21-25 int num_het Number of HET records
26-30 int numHelix Number of HELIX records
31-35 int numSheet Number of SHEET records
36-40 int numTurn Number of TURN records
41-45 int numSite Number of SITE records
46-50 int numXform Number of coordinate transformation records (ORIGX+SCALE+MTRIX)
51-55 int numCoord Number of atomic coordinate records (ATOM+HETATM)
56-60 int numTer Number of TER records
61-65 int numConect Number of CONECT records
66-70 int numSeq Number of SEQRES records
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.MODEL(line)[source]

MODEL class

The MODEL record specifies the model serial number when multiple structures are presented in a single coordinate entry, as is often the case with structures determined by NMR.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
11-14 int serial Model serial number.
Parameters:line (str) – line with PDB class
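
MODEL and ENDMDL records bracket the coordinate records of each structure in a multi-model entry. The sketch below groups ATOM and HETATM lines by model serial number; `split_models` is a hypothetical helper and the sample lines are truncated, made-up records.

```python
def split_models(lines):
    """Hypothetical helper: map each model serial to its coordinate lines."""
    models = {}
    current = None
    for line in lines:
        record = line[0:6].strip()
        if record == "MODEL":
            current = int(line[10:14])  # columns 11-14: model serial number
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif current is not None and record in ("ATOM", "HETATM"):
            models[current].append(line)
    return models


# Made-up, truncated sample lines; only the record names matter here.
sample = [
    f"{'MODEL':<6}{'':4}{1:>4}",
    "ATOM      1  N   GLY A   1",
    "ATOM      2  CA  GLY A   1",
    "ENDMDL",
    f"{'MODEL':<6}{'':4}{2:>4}",
    "ATOM      1  N   GLY A   1",
    "ENDMDL",
]
models = split_models(sample)
print({serial: len(atoms) for serial, atoms in models.items()})
```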
class pdb2pqr.pdb.MODRES(line)[source]

MODRES field

The MODRES record provides descriptions of modifications (e.g., chemical or post-translational) to protein and nucleic acid residues. Included are a mapping between residue names given in a PDB entry and standard residues.

__init__(line)[source]

Initialize by parsing a line

COLUMNS TYPE FIELD DEFINITION
8-11 string id_code ID code of this entry.
13-15 string res_name Residue name used in this entry.
17 string chain_id Chain identifier.
19-22 int seq_num Sequence number.
23 string ins_code Insertion code.
25-27 string stdRes Standard residue name.
30-70 string comment Description of the residue modification.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.MTRIX1(line)[source]

MTRIX1 PDB entry

class pdb2pqr.pdb.MTRIX2(line)[source]

MTRIX2 PDB entry

class pdb2pqr.pdb.MTRIX3(line)[source]

MTRIX3 PDB entry

class pdb2pqr.pdb.MTRIXn(line)[source]

MTRIXn baseclass

The MTRIXn (n = 1, 2, or 3) records present transformations expressing non-crystallographic symmetry.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 int serial Serial number
11-20 float mn1 Mn1
21-30 float mn2 Mn2
31-40 float mn3 Mn3
46-55 float vn Vn
60 int i_given 1 if coordinates for the representations which are approximately related by the transformations of the molecule are contained in the entry. Otherwise, blank.
Parameters:line (str) – line with PDB class
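
Taken together, the MTRIX1, MTRIX2, and MTRIX3 records supply the three rows of a 3x3 matrix M and a translation vector V, and a coordinate x is transformed as x' = M x + V. The sketch below assembles and applies that transformation; `parse_mtrix`, `apply_mtrix`, and `make_line` are hypothetical helpers, and the sample matrix (a 90-degree rotation about z plus a shift) is made up.

```python
def parse_mtrix(line):
    """Hypothetical helper: read one MTRIXn line using the column table above."""
    n = int(line[5])  # sixth character of MTRIX1, MTRIX2, or MTRIX3
    row = [float(line[10:20]), float(line[20:30]), float(line[30:40])]
    shift = float(line[45:55])
    return n, row, shift


def apply_mtrix(lines, point):
    """Apply x' = M x + V, taking row n of M and V from record MTRIXn."""
    rows = {n: (row, shift) for n, row, shift in map(parse_mtrix, lines)}
    return [
        sum(m * p for m, p in zip(rows[n][0], point)) + rows[n][1]
        for n in (1, 2, 3)
    ]


def make_line(n, row, shift, serial=1):
    """Build a sample MTRIXn line with each value in its documented columns."""
    return (
        f"MTRIX{n} {serial:>3}"
        f"{row[0]:10.6f}{row[1]:10.6f}{row[2]:10.6f}{'':5}{shift:10.5f}"
    )


sample = [
    make_line(1, (0.0, -1.0, 0.0), 1.0),
    make_line(2, (1.0, 0.0, 0.0), 2.0),
    make_line(3, (0.0, 0.0, 1.0), 3.0),
]
moved = apply_mtrix(sample, (1.0, 0.0, 0.0))
print(moved)
```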
class pdb2pqr.pdb.NUMMDL(line)[source]

NUMMDL class

The NUMMDL record indicates total number of models in a PDB entry.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
11-14 int modelNumber Number of models.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.OBSLTE(line)[source]

OBSLTE field

This record acts as a flag in an entry which has been withdrawn from the PDB’s full release. It indicates which, if any, new entries have replaced the withdrawn entry.

The format allows for the case of multiple new entries replacing one existing entry.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS TYPE FIELD DEFINITION
12-20 string replace_date Date that this entry was replaced.
22-25 string id_code ID code of this entry.
32-35 string rid_code ID code of entry that replaced this one.
37-40 string rid_code ID code of entry that replaced this one.
42-45 string rid_code ID code of entry that replaced this one.
47-50 string rid_code ID code of entry that replaced this one.
52-55 string rid_code ID code of entry that replaced this one.
57-60 string rid_code ID code of entry that replaced this one.
62-65 string rid_code ID code of entry that replaced this one.
67-70 string rid_code ID code of entry that replaced this one.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.ORIGX1(line)[source]

ORIGX1 PDB entry

class pdb2pqr.pdb.ORIGX2(line)[source]

ORIGX2 PDB entry

class pdb2pqr.pdb.ORIGX3(line)[source]

ORIGX3 PDB entry

class pdb2pqr.pdb.ORIGXn(line)[source]

ORIGXn class

The ORIGXn (n = 1, 2, or 3) records present the transformation from the orthogonal coordinates contained in the entry to the submitted coordinates.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
11-20 float on1 On1
21-30 float on2 On2
31-40 float on3 On3
46-55 float tn Tn
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.REMARK(line)[source]

REMARK field

REMARK records present experimental details, annotations, comments, and information not included in other records. In a number of cases, REMARKs are used to expand the contents of other record types. A new level of structure is being used for some REMARK records. This is expected to facilitate searching and will assist in the conversion to a relational database.

__init__(line)[source]

Initialize by parsing line.

Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.REVDAT(line)[source]

REVDAT field

REVDAT records contain a history of the modifications made to an entry since its release.

__init__(line)[source]

Initialize by parsing a line.

Todo

If multiple modifications are present, only the last one in the file is preserved.

COLUMNS TYPE FIELD DEFINITION
8-10 int mod_num Modification number.
14-22 string mod_date Date of modification (or release for new entries).
24-28 string mod_id Identifies this particular modification. It links to the archive used internally by PDB.
32 int mod_type An integer identifying the type of modification. In case of revisions with more than one possible mod_type, the highest value applicable will be assigned.
40-45 string record Name of the modified record.
47-52 string record Name of the modified record.
54-59 string record Name of the modified record.
61-66 string record Name of the modified record.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SCALE1(line)[source]

SCALE1 PDB entry

class pdb2pqr.pdb.SCALE2(line)[source]

SCALE2 PDB entry

class pdb2pqr.pdb.SCALE3(line)[source]

SCALE3 PDB entry

class pdb2pqr.pdb.SCALEn(line)[source]

SCALEn baseclass

The SCALEn (n = 1, 2, or 3) records present the transformation from the orthogonal coordinates as contained in the entry to fractional crystallographic coordinates. Non-standard coordinate systems should be explained in the remarks.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
11-20 float sn1 Sn1
21-30 float sn2 Sn2
31-40 float sn3 Sn3
46-55 float un Un
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SEQADV(line)[source]

SEQADV field

The SEQADV record identifies conflicts between sequence information in the ATOM records of the PDB entry and the sequence database entry given on DBREF. Please note that these records were designed to identify differences and not errors. No assumption is made as to which database contains the correct data. PDB may include REMARK records in the entry that reflect the depositor’s view of which database has the correct sequence.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-11 string id_code ID code of this entry.
13-15 string res_name Name of the PDB residue in conflict.
17 string chain_id PDB chain identifier.
19-22 int seq_num PDB sequence number.
23 string ins_code PDB insertion code.
25-28 string database Sequence database name.
30-38 string db_id_code Sequence database accession number.
40-42 string db_res Sequence database residue name.
44-48 int db_seq Sequence database sequence number.
50-70 string conflict Conflict comment.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SEQRES(line)[source]

SEQRES field

SEQRES records contain the amino acid or nucleic acid sequence of residues in each chain of the macromolecule that was studied.

__init__(line)[source]

Initialize by parsing a line

COLUMNS TYPE FIELD DEFINITION
9-10 int ser_num Serial number of the SEQRES record for the current chain. Starts at 1 and increments by one each line. Reset to 1 for each chain.
12 string chain_id Chain identifier. This may be any single legal character, including a blank which is used if there is only one chain.
14-17 int num_res Number of residues in the chain. This value is repeated on every record.
20-22 string res_name Residue name.
24-26 string res_name Residue name.
28-30 string res_name Residue name.
32-34 string res_name Residue name.
36-38 string res_name Residue name.
40-42 string res_name Residue name.
44-46 string res_name Residue name.
48-50 string res_name Residue name.
52-54 string res_name Residue name.
56-58 string res_name Residue name.
60-62 string res_name Residue name.
64-66 string res_name Residue name.
68-70 string res_name Residue name.
Parameters:line (str) – line with PDB class
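
Each SEQRES line carries up to thirteen residue names in repeating 4-column slots, so a loop over the slots is simpler than thirteen separate slices. The sketch below shows this with a hypothetical `parse_seqres` helper and a made-up five-residue chain.

```python
def parse_seqres(line):
    """Hypothetical helper: read a SEQRES line using the column table above."""
    padded = line.rstrip("\r\n").ljust(70)
    ser_num = int(padded[8:10])
    chain_id = padded[11]
    num_res = int(padded[13:17])
    names = []
    # Residue names sit in columns 20-22, 24-26, ..., 68-70.
    for start in range(19, 70, 4):
        name = padded[start:start + 3].strip()
        if name:
            names.append(name)
    return ser_num, chain_id, num_res, names


# Made-up sample: the first (and only) SEQRES line of a five-residue chain A.
residues = ["MET", "GLY", "ALA", "SER", "LYS"]
sample = f"{'SEQRES':<6}  {1:>2} {'A':1} {5:>4}  " + " ".join(residues)
ser_num, chain_id, num_res, names = parse_seqres(sample)
print(chain_id, num_res, names)
```

A full chain is recovered by concatenating the name lists of consecutive lines that share a chain_id, in ser_num order.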
class pdb2pqr.pdb.SHEET(line)[source]

SHEET field

SHEET records are used to identify the position of sheets in the molecule. Sheets are both named and numbered. The residues where the sheet begins and ends are noted.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 int strand Strand number which starts at 1 for each strand within a sheet and increases by one.
12-14 string sheet_id Sheet identifier.
15-16 int num_strands Number of strands in sheet.
18-20 string init_res_name Residue name of initial residue.
22 string init_chain_id Chain identifier of initial residue in strand.
23-26 int init_seq_num Sequence number of initial residue in strand.
27 string init_i_code Insertion code of initial residue in strand.
29-31 string end_res_name Residue name of terminal residue.
33 string end_chain_id Chain identifier of terminal residue.
34-37 int end_seq_num Sequence number of terminal residue.
38 string end_i_code Insertion code of terminal residue.
39-40 int sense Sense of strand with respect to previous strand in the sheet. 0 if first strand, 1 if parallel, -1 if anti-parallel.
42-45 string cur_atom Registration. Atom name in current strand.
46-48 string curr_res_name Registration. Residue name in current strand.
50 string curChainId Registration. Chain identifier in current strand.
51-54 int curr_res_seq Registration. Residue sequence number in current strand.
55 string curr_ins_code Registration. Insertion code in current strand.
57-60 string prev_atom Registration. Atom name in previous strand.
61-63 string prev_res_name Registration. Residue name in previous strand.
65 string prevChainId Registration. Chain identifier in previous strand.
66-69 int prev_res_seq Registration. Residue sequence number in previous strand.
70 string prev_ins_code Registration. Insertion code in previous strand.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SIGATM(line)[source]

SIGATM class

The SIGATM records present the standard deviation of atomic parameters as they appear in ATOM and HETATM records.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
7-11 int serial Atom serial number.
13-16 string name Atom name.
17 string alt_loc Alternate location indicator.
18-20 string res_name Residue name.
22 string chain_id Chain identifier.
23-26 int res_seq Residue sequence number.
27 string ins_code Code for insertion of residues.
31-38 float sig_x Standard deviation of orthogonal coordinates for X in Angstroms.
39-46 float sig_y Standard deviation of orthogonal coordinates for Y in Angstroms.
47-54 float sig_z Standard deviation of orthogonal coordinates for Z in Angstroms.
55-60 float sig_occ Standard deviation of occupancy.
61-66 float sig_temp Standard deviation of temperature factor.
73-76 string seg_id Segment identifier, left-justified.
77-78 string element Element symbol, right-justified.
79-80 string charge Charge on the atom.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SIGUIJ(line)[source]

SIGUIJ class

The SIGUIJ records present the standard deviations of the anisotropic temperature factors.

__init__(line)[source]

Initialize by parsing line:

COLUMNS TYPE FIELD DEFINITION
7-11 int serial Atom serial number.
13-16 string name Atom name.
17 string alt_loc Alternate location indicator.
18-20 string res_name Residue name.
22 string chain_id Chain identifier.
23-26 int res_seq Residue sequence number.
27 string ins_code Insertion code.
29-35 int sig11 Sigma U(1,1)
36-42 int sig22 Sigma U(2,2)
43-49 int sig33 Sigma U(3,3)
50-56 int sig12 Sigma U(1,2)
57-63 int sig13 Sigma U(1,3)
64-70 int sig23 Sigma U(2,3)
73-76 string seg_id Segment identifier, left-justified.
77-78 string element Element symbol, right-justified.
79-80 string charge Charge on the atom.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SITE(line)[source]

SITE class

The SITE records supply the identification of groups comprising important sites in the macromolecule.

__init__(line)[source]

Initialize by parsing the line

COLUMNS TYPE FIELD DEFINITION
8-10 int seq_num Sequence number.
12-14 string site_id Site name.
16-17 int num_res Number of residues comprising site.
19-21 string res_name1 Residue name for first residue comprising site.
23 string chain_id1 Chain identifier for first residue comprising site.
24-27 int seq1 Residue sequence number for first residue comprising site.
28 string ins_code1 Insertion code for first residue comprising site.
30-32 string res_name2 Residue name for second residue comprising site.
34 string chain_id2 Chain identifier for second residue comprising site.
35-38 int seq2 Residue sequence number for second residue comprising site.
39 string ins_code2 Insertion code for second residue comprising site.
41-43 string res_name3 Residue name for third residue comprising site.
45 string chain_id3 Chain identifier for third residue comprising site.
46-49 int seq3 Residue sequence number for third residue comprising site.
50 string ins_code3 Insertion code for third residue comprising site.
52-54 string res_name4 Residue name for fourth residue comprising site.
56 string chain_id4 Chain identifier for fourth residue comprising site.
57-60 int seq4 Residue sequence number for fourth residue comprising site.
61 string ins_code4 Insertion code for fourth residue comprising site.
Parameters:line (str) – line with PDB class
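
The four residue groups in the table share one layout and start 11 columns apart (columns 19, 30, 41 and 52), so they can be read in a loop. The sketch below assumes a hypothetical helper, parse_site_residues, and an illustrative SITE line; it is not the pdb2pqr implementation:

```python
def parse_site_residues(line):
    """Collect (res_name, chain_id, seq, ins_code) groups from a SITE line.

    Hypothetical helper for illustration only; not part of pdb2pqr.
    """
    residues = []
    for start in (18, 29, 40, 51):  # 0-based offsets of columns 19, 30, 41, 52
        res_name = line[start:start + 3].strip()
        if not res_name:
            continue  # unused slot on a partially filled line
        residues.append(
            (
                res_name,
                line[start + 4:start + 5].strip(),        # chain identifier
                int(line[start + 5:start + 9].strip()),   # residue sequence number
                line[start + 9:start + 10].strip(),       # insertion code
            )
        )
    return residues


# Illustrative record with three residues; the fourth slot is left empty.
line = (
    "SITE  "  # columns 1-6: record name
    " "       # column 7: blank
    "  1"     # columns 8-10: seq_num
    " "       # column 11: blank
    "AC1"     # columns 12-14: site_id
    " "       # column 15: blank
    " 3"      # columns 16-17: num_res
    " "       # column 18: blank
    "HIS A  94  "  # columns 19-29: first residue group
    "HIS A  96  "  # columns 30-40: second residue group
    "HIS A 119  "  # columns 41-51: third residue group
)

residues = parse_site_residues(line)
num_res = int(line[15:17])
print(num_res, residues)
```

Skipping empty slots lets the same loop handle lines that list fewer than four residues.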
class pdb2pqr.pdb.SLTBRG(line)[source]

SLTBRG field

The SLTBRG records specify salt bridges in the entry. This record supplements information given in CONECT records and is provided here for convenience in searching.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
13-16 string name1 Atom name.
17 string alt_loc1 Alternate location indicator.
18-20 string res_name1 Residue name.
22 string chain_id1 Chain identifier.
23-26 int res_seq1 Residue sequence number.
27 string ins_code1 Insertion code.
43-46 string name2 Atom name.
47 string alt_loc2 Alternate location indicator.
48-50 string res_name2 Residue name.
52 string chain_id2 Chain identifier.
53-56 int res_seq2 Residue sequence number.
57 string ins_code2 Insertion code.
60-65 string sym1 Symmetry operator for 1st atom.
67-72 string sym2 Symmetry operator for 2nd atom.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SOURCE(line)[source]

SOURCE field

The SOURCE record specifies the biological and/or chemical source of each biological molecule in the entry. Sources are described by both the common name and the scientific name, e.g., genus and species. Strain and/or cell-line for immortalized cells are given when they help to uniquely identify the biological entity studied.

__init__(line)[source]

Initialize by parsing a line

COLUMNS TYPE FIELD DEFINITION
11-70 string source Identifies the source of the macromolecule in a token: value format
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SPRSDE(line)[source]

SPRSDE field

The SPRSDE records contain a list of the ID codes of entries that were made obsolete by the given coordinate entry and withdrawn from the PDB release set. One entry may replace many. It is PDB policy that only the principal investigator of a structure has the authority to withdraw it.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
12-20 string super_date Date this entry superseded the listed entries.
22-25 string id_code ID code of this entry.
32-35 string sid_code ID code of a superseded entry.
37-40 string sid_code ID code of a superseded entry.
42-45 string sid_code ID code of a superseded entry.
47-50 string sid_code ID code of a superseded entry.
52-55 string sid_code ID code of a superseded entry.
57-60 string sid_code ID code of a superseded entry.
62-65 string sid_code ID code of a superseded entry.
67-70 string sid_code ID code of a superseded entry.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.SSBOND(line)[source]

SSBOND field

The SSBOND record identifies each disulfide bond in protein and polypeptide structures by identifying the two residues involved in the bond.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 int ser_num Serial number.
16 string chain_id1 Chain identifier.
18-21 int seq_num1 Residue sequence number.
22 string icode1 Insertion code.
30 string chain_id2 Chain identifier.
32-35 int seq_num2 Residue sequence number.
36 string icode2 Insertion code.
60-65 string sym1 Symmetry operator for 1st residue.
67-72 string sym2 Symmetry operator for 2nd residue.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.TER(line)[source]

TER class

The TER record indicates the end of a list of ATOM/HETATM records for a chain.

__init__(line)[source]

Initialize by parsing line:

COLUMNS TYPE FIELD DEFINITION
7-11 int serial Serial number.
18-20 string res_name Residue name.
22 string chain_id Chain identifier.
23-26 int res_seq Residue sequence number.
27 string ins_code Insertion code.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.TITLE(line)[source]

TITLE field

The TITLE record contains a title for the experiment or analysis that is represented in the entry. It should identify an entry in the PDB in the same way that a title identifies a paper.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS TYPE FIELD DEFINITION
11-70 string title Title of the experiment
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.TURN(line)[source]

TURN field

The TURN records identify turns and other short loop turns which normally connect other secondary structure segments.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 int seq Turn number; starts with 1 and increments by one.
12-14 string turn_id Turn identifier.
16-18 string init_res_name Residue name of initial residue in turn.
20 string init_chain_id Chain identifier for the chain containing this turn.
21-24 int init_seq_num Sequence number of initial residue in turn.
25 string init_i_code Insertion code of initial residue in turn.
27-29 string end_res_name Residue name of terminal residue of turn.
31 string end_chain_id Chain identifier for the chain containing this turn.
32-35 int end_seq_num Sequence number of terminal residue of turn.
36 string end_i_code Insertion code of terminal residue of turn.
41-70 string comment Associated comment.
Parameters:line (str) – line with PDB class
class pdb2pqr.pdb.TVECT(line)[source]

TVECT class

The TVECT records present the translation vector for infinite covalently connected structures.

__init__(line)[source]

Initialize by parsing line

COLUMNS TYPE FIELD DEFINITION
8-10 int serial Serial number
11-20 float t1 Components of translation vector
21-30 float t2 Components of translation vector
31-40 float t3 Components of translation vector
41-70 string text Comments
Parameters:line (str) – line with PDB class
pdb2pqr.pdb.read_atom(line)[source]

If the ATOM/HETATM is not column-formatted, try to get some information by parsing whitespace from the right. Look for five floating point numbers followed by the residue number.

Parameters:line (str) – the line to parse
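
The right-to-left scan described above can be sketched as follows. The function name parse_loose_atom, the returned keys, and the interpretation of the five floats as x, y, z, occupancy and temperature factor are assumptions for illustration; the actual read_atom may differ and handles more cases:

```python
def parse_loose_atom(line):
    """Recover the residue number and five floats from a loosely formatted line.

    Hypothetical sketch; not the pdb2pqr implementation. It ignores edge
    cases such as tokens like "nan" that Python also accepts as floats.
    """
    words = line.split()
    run = []  # consecutive floats seen so far, collected right to left
    for index in range(len(words) - 1, -1, -1):
        try:
            run.append(float(words[index]))
        except ValueError:
            run = []  # a non-numeric token breaks the run
            continue
        if len(run) == 5:
            if index == 0:
                break  # nothing left of the floats to act as residue number
            x, y, z, occupancy, temp_factor = reversed(run)
            return {
                "res_seq": int(words[index - 1]),
                "x": x,
                "y": y,
                "z": z,
                "occupancy": occupancy,
                "temp_factor": temp_factor,
            }
    raise ValueError("could not find five consecutive floats")


# Whitespace-separated line that does not follow the fixed columns.
loose = "ATOM 1 N GLY A 13 11.104 6.134 -6.504 1.00 20.00 N"
atom = parse_loose_atom(loose)
print(atom["res_seq"], atom["x"], atom["temp_factor"])
```

Scanning from the right avoids guessing how many tokens precede the coordinates, since optional fields such as the chain identifier may be missing on the left.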
pdb2pqr.pdb.read_pdb(file_)[source]

Parse PDB-format data into a list of record objects.

Parameters:file_ (file) – open file-like object
Returns:(a list of objects from this module, a list of record names that couldn’t be parsed)
Return type:(list, list)
pdb2pqr.pdb.register_line_parser(klass)[source]

Register a line parser in the global dictionary.

Parameters:klass – class for line parser
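
One way a parser registry and a dispatching reader can fit together is sketched below. It assumes the class name doubles as the record name, and the names LINE_PARSERS and read_records as well as the toy TITLE class are hypothetical stand-ins rather than the pdb2pqr code:

```python
LINE_PARSERS = {}  # hypothetical global registry: record name -> parser class


def register_line_parser(klass):
    """Class decorator that files a parser under its class name."""
    LINE_PARSERS[klass.__name__] = klass
    return klass


@register_line_parser
class TITLE:
    """Toy parser for the TITLE record (columns 11-70)."""

    def __init__(self, line):
        self.title = line[10:70].strip()


def read_records(lines):
    """Dispatch each line on its record name (columns 1-6).

    Returns (parsed objects, record names that could not be parsed),
    mirroring the return shape documented for read_pdb.
    """
    records, errors = [], []
    for line in lines:
        name = line[0:6].strip()
        if not name:
            continue  # skip blank lines
        parser = LINE_PARSERS.get(name)
        if parser is None:
            errors.append(name)  # no parser registered for this record
            continue
        try:
            records.append(parser(line))
        except (ValueError, IndexError):
            errors.append(name)  # parser rejected a malformed line
    return records, errors


records, errors = read_records(["TITLE     CRAMBIN", "FOOBAR    unknown record"])
print(records[0].title, errors)
```

Returning the decorated class unchanged keeps the decorator transparent, so registered classes can still be instantiated directly.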