pdb

PDB parsing class

This module parses PDB files in accordance with the PDB Format Description Version 2.2 (1996); it is not very forgiving of deviations from that format. Each class in this module corresponds to a record in the PDB Format Description. Much of the documentation for the classes is taken directly from that description.

Code author: Todd Dolinsky

Code author: Yong Huang

Code author: Nathan Baker

class pdb2pqr.pdb.ANISOU(line)[source]

ANISOU class

The ANISOU records present the anisotropic temperature factors.

__init__(line)[source]

Initialize by parsing line:

COLUMNS  TYPE    FIELD     DEFINITION
7-11     int     serial    Atom serial number.
13-16    string  name      Atom name.
17       string  alt_loc   Alternate location indicator.
18-20    string  res_name  Residue name.
22       string  chain_id  Chain identifier.
23-26    int     res_seq   Residue sequence number.
27       string  ins_code  Insertion code.
29-35    int     u00       U(1,1)
36-42    int     u11       U(2,2)
43-49    int     u22       U(3,3)
50-56    int     u01       U(1,2)
57-63    int     u02       U(1,3)
64-70    int     u12       U(2,3)
73-76    string  seg_id    Segment identifier, left-justified.
77-78    string  element   Element symbol, right-justified.
79-80    string  charge    Charge on the atom.

Parameters:

line (str) – line with PDB class
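The six integer fields in columns 29-70 can be assembled into a symmetric tensor. The sketch below is illustrative only and is not the pdb2pqr implementation: `anisou_tensor` is a hypothetical helper, and the division by 10^4 relies on the PDB Format Description stating that ANISOU values are stored as integers scaled by that factor, which this page does not say itself.

```python
def anisou_tensor(line):
    """Return the symmetric 3x3 U tensor in square Angstroms (hypothetical helper)."""
    # Six 7-character integer fields in columns 29-70, in the order
    # U(1,1), U(2,2), U(3,3), U(1,2), U(1,3), U(2,3).
    # Assumption: values are scaled by 10**4, per the PDB Format Description.
    u11, u22, u33, u12, u13, u23 = (
        int(line[start:start + 7]) / 1.0e4 for start in range(28, 70, 7)
    )
    return [
        [u11, u12, u13],
        [u12, u22, u23],
        [u13, u23, u33],
    ]


# Build a sample line so each field lands in its documented columns.
values = (688, 1234, 806, -19, -49, 178)
sample = (
    f"{'ANISOU':<6}{1:>5} {' N':<4} {'MET':>3} {'A'}{1:>4}  "
    + "".join(f"{value:>7}" for value in values)
)
tensor = anisou_tensor(sample)
print(tensor[0])
```

The off-diagonal terms are mirrored so that the result can be passed directly to code expecting a full 3x3 matrix.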

class pdb2pqr.pdb.ATOM(line)[source]

ATOM class

The ATOM records present the atomic coordinates for standard residues. They also present the occupancy and temperature factor for each atom. Heterogen coordinates use the HETATM record type. The element symbol is always present on each ATOM record; segment identifier and charge are optional.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD        DEFINITION
7-11     int     serial       Atom serial number.
13-16    string  name         Atom name.
17       string  alt_loc      Alternate location indicator.
18-20    string  res_name     Residue name.
22       string  chain_id     Chain identifier.
23-26    int     res_seq      Residue sequence number.
27       string  ins_code     Code for insertion of residues.
31-38    float   x            Orthogonal coordinates for X in Angstroms.
39-46    float   y            Orthogonal coordinates for Y in Angstroms.
47-54    float   z            Orthogonal coordinates for Z in Angstroms.
55-60    float   occupancy    Occupancy.
61-66    float   temp_factor  Temperature factor.
73-76    string  seg_id       Segment identifier, left-justified.
77-78    string  element      Element symbol, right-justified.
79-80    string  charge       Charge on the atom.

Parameters:

line (str) – line with PDB class
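As a rough illustration of how a fixed-column record maps onto fields, the sketch below slices an ATOM line using the column ranges listed for this class. It is not the pdb2pqr implementation: `ATOM_COLUMNS` and `parse_fixed_columns` are hypothetical names, and storing blank fields as `None` is an assumption made for the example.

```python
# Illustrative sketch only: not the pdb2pqr implementation.
# Column ranges are 1-based and inclusive, as in the ATOM field table.
ATOM_COLUMNS = [
    ("serial", 7, 11, int),
    ("name", 13, 16, str),
    ("alt_loc", 17, 17, str),
    ("res_name", 18, 20, str),
    ("chain_id", 22, 22, str),
    ("res_seq", 23, 26, int),
    ("ins_code", 27, 27, str),
    ("x", 31, 38, float),
    ("y", 39, 46, float),
    ("z", 47, 54, float),
    ("occupancy", 55, 60, float),
    ("temp_factor", 61, 66, float),
    ("seg_id", 73, 76, str),
    ("element", 77, 78, str),
    ("charge", 79, 80, str),
]


def parse_fixed_columns(line, columns):
    """Slice each field out of a fixed-width record line (hypothetical helper)."""
    record = {}
    for field, first, last, convert in columns:
        text = line[first - 1:last].strip()
        # Assumption: blank fields are stored as None rather than converted.
        record[field] = convert(text) if text else None
    return record


# Build an 80-column sample line so every field lands in its documented columns.
sample = (
    f"{'ATOM':<6}{1:>5} {' N':<4}{' '}{'MET':>3} {'A'}{1:>4}{' '}   "
    f"{27.340:8.3f}{24.430:8.3f}{2.614:8.3f}{1.00:6.2f}{9.67:6.2f}"
    f"{'':6}{'':<4}{'N':>2}{'':2}"
)
atom = parse_fixed_columns(sample, ATOM_COLUMNS)
print(atom["serial"], atom["name"], atom["res_name"], atom["x"])
```

Keeping the column specification in a table-like list means the same helper could be reused for other record types by swapping in a different list.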

class pdb2pqr.pdb.AUTHOR(line)[source]

AUTHOR field

The AUTHOR record contains the names of the people responsible for the contents of the entry.

__init__(line)[source]

Initialize by parsing a line

COLUMNS  TYPE    FIELD        DEFINITION
11-70    string  author_list  List of the author names, separated by commas

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.BaseRecord(line)[source]

Base class for all records.

Verifies the received record type.

__init__(line)[source]

record_type()[source]

Return PDB record type as string.

Returns:

record type

Return type:

str
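This page does not describe how the record type is verified. One plausible check, sketched below, compares the record name in columns 1-6 of the line with the name the caller expects. Both `record_name` and `check_record` are hypothetical helpers for illustration, not part of pdb2pqr.

```python
def record_name(line):
    """Return the record name held in columns 1-6 (hypothetical helper)."""
    return line[0:6].strip()


def check_record(line, expected):
    """Raise ValueError when the line does not hold the expected record type."""
    found = record_name(line)
    if found != expected:
        raise ValueError(f"expected {expected} record, found {found!r}")
    return found


print(check_record("HEADER    HYDROLASE", "HEADER"))
```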

class pdb2pqr.pdb.CAVEAT(line)[source]

CAVEAT field

CAVEAT warns of severe errors in an entry. Use caution when using an entry containing this record.

__init__(line)[source]

Initialize by parsing line.

COLUMNS  TYPE    FIELD    DEFINITION
12-15    string  id_code  PDB ID code of this entry.
20-70    string  comment  Free text giving the reason for the CAVEAT.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.CISPEP(line)[source]

CISPEP field

CISPEP records specify the prolines and other peptides found to be in the cis conformation. This record replaces the use of footnote records to list cis peptides.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD      DEFINITION
8-10     int     ser_num    Record serial number.
12-14    string  pep1       Residue name.
16       string  chain_id1  Chain identifier.
18-21    int     seq_num1   Residue sequence number.
22       string  icode1     Insertion code.
26-28    string  pep2       Residue name.
30       string  chain_id2  Chain identifier.
32-35    int     seq_num2   Residue sequence number.
36       string  icode2     Insertion code.
44-46    int     mod_num    Identifies the specific model.
54-59    float   measure    Measure of the angle in degrees.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.COMPND(line)[source]

COMPND field

The COMPND record describes the macromolecular contents of an entry. Each macromolecule found in the entry is described by a set of token: value pairs, and is referred to as a COMPND record component. Since the concept of a molecule is difficult to specify exactly, PDB staff may exercise editorial judgment in consultation with depositors in assigning these names.

For each macromolecular component, the molecule name, synonyms, number assigned by the Enzyme Commission (EC), and other relevant details are specified.

__init__(line)[source]

Initialize by parsing a line

COLUMNS  TYPE    FIELD     DEFINITION
11-70    string  compound  Description of the molecular list components.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.CONECT(line)[source]

CONECT class

The CONECT records specify connectivity between atoms for which coordinates are supplied. The connectivity is described using the atom serial number as found in the entry. CONECT records are mandatory for HET groups (excluding water) and for other bonds not specified in the standard residue connectivity table which involve atoms in standard residues (see Appendix 4 for the list of standard residues). These records are generated by the PDB.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD     DEFINITION
7-11     int     serial    Atom serial number
12-16    int     serial1   Serial number of bonded atom
17-21    int     serial2   Serial number of bonded atom
22-26    int     serial3   Serial number of bonded atom
27-31    int     serial4   Serial number of bonded atom
32-36    int     serial5   Serial number of hydrogen bonded atom
37-41    int     serial6   Serial number of hydrogen bonded atom
42-46    int     serial7   Serial number of salt bridged atom
47-51    int     serial8   Serial number of hydrogen bonded atom
52-56    int     serial9   Serial number of hydrogen bonded atom
57-61    int     serial10  Serial number of salt bridged atom

Parameters:

line (str) – line with PDB class
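Because most CONECT lines fill only a few of the ten partner fields, a reader has to skip the blank ones. The following sketch shows one way to do that. `parse_conect` is a hypothetical helper, not the pdb2pqr implementation, and returning a dict keyed by the field names listed for this class is an assumption made for the example.

```python
def parse_conect(line):
    """Return (serial, partners) from a CONECT line (hypothetical helper)."""
    serial = int(line[6:11])
    partners = {}
    # Columns 12-61 hold ten 5-character serial fields: serial1 .. serial10.
    for index in range(10):
        start = 11 + 5 * index
        chunk = line[start:start + 5].strip()
        if chunk:  # blank or missing fields are skipped
            partners[f"serial{index + 1}"] = int(chunk)
    return serial, partners


# Sample line: atom 1179 bonded to four other atoms.
sample = "CONECT" + "".join(f"{n:>5}" for n in (1179, 746, 1184, 1195, 1203))
print(parse_conect(sample))
```

Keeping the field names lets a caller tell covalent partners (serial1 to serial4) from hydrogen-bonded and salt-bridged ones without re-reading the line.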

class pdb2pqr.pdb.CRYST1(line)[source]

CRYST1 class

The CRYST1 record presents the unit cell parameters, space group, and Z value. If the structure was not determined by crystallographic means, CRYST1 simply defines a unit cube.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD        DEFINITION
7-15     float   a            a (Angstroms).
16-24    float   b            b (Angstroms).
25-33    float   c            c (Angstroms).
34-40    float   alpha        alpha (degrees).
41-47    float   beta         beta (degrees).
48-54    float   gamma        gamma (degrees).
56-66    string  space_group  Space group.
67-70    int     z            Z value.

Parameters:

line (str) – line with PDB class
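The unit-cube convention for non-crystallographic structures can be detected after parsing. The sketch below is a minimal illustration, assuming the column ranges listed for this class. `parse_cryst1` and `is_unit_cube` are hypothetical helpers and do not reflect how pdb2pqr itself handles the record.

```python
def parse_cryst1(line):
    """Slice the CRYST1 fields by column (hypothetical helper)."""
    return {
        "a": float(line[6:15]),
        "b": float(line[15:24]),
        "c": float(line[24:33]),
        "alpha": float(line[33:40]),
        "beta": float(line[40:47]),
        "gamma": float(line[47:54]),
        "space_group": line[55:66].strip(),
        "z": int(line[66:70]),
    }


def is_unit_cube(cell):
    """True when the cell is 1 x 1 x 1 Angstroms with 90-degree angles."""
    lengths = (cell["a"], cell["b"], cell["c"])
    angles = (cell["alpha"], cell["beta"], cell["gamma"])
    return all(v == 1.0 for v in lengths) and all(v == 90.0 for v in angles)


# Sample line of the kind written for a structure not determined by crystallography.
sample = (
    f"{'CRYST1':<6}{1.0:9.3f}{1.0:9.3f}{1.0:9.3f}"
    f"{90.0:7.2f}{90.0:7.2f}{90.0:7.2f} {'P 1':<11}{1:>4}"
)
cell = parse_cryst1(sample)
print(cell["space_group"], is_unit_cube(cell))
```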

class pdb2pqr.pdb.DBREF(line)[source]

DBREF field

The DBREF record provides cross-reference links between PDB sequences and the corresponding database entry or entries. A cross reference to the sequence database is mandatory for each peptide chain with a length greater than ten (10) residues. For nucleic acid entries a DBREF record pointing to the Nucleic Acid Database (NDB) is mandatory when the corresponding entry exists in NDB.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS  TYPE    FIELD         DEFINITION
8-11     string  id_code       ID code of this entry.
13       string  chain_id      Chain identifier.
15-18    int     seq_begin     Initial sequence number of the PDB sequence segment.
19       string  insert_begin  Initial insertion code of the PDB sequence segment.
21-24    int     seq_end       Ending sequence number of the PDB sequence segment.
25       string  insert_end    Ending insertion code of the PDB sequence segment.
27-32    string  database      Sequence database name. “PDB” when a corresponding sequence database entry has not been identified.
34-41    string  db_accession  Sequence database accession code. For GenBank entries, this is the NCBI gi number.
43-54    string  db_id_code    Sequence database identification code. For GenBank entries, this is the accession code.
56-60    int     db_seq_begin  Initial sequence number of the database segment.
61       string  db_ins_begin  Insertion code of initial residue of the segment, if PDB is the reference.
63-67    int     dbseq_end     Ending sequence number of the database segment.
68       string  db_ins_end    Insertion code of the ending residue of the segment, if PDB is the reference.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.END(line)[source]

END class

The END record marks the end of the PDB file.

__init__(line)[source]

Initialize with line.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.ENDMDL(line)[source]

ENDMDL class

The ENDMDL records are paired with MODEL records to group individual structures found in a coordinate entry.

__init__(line)[source]

class pdb2pqr.pdb.EXPDTA(line)[source]

EXPDTA field

The EXPDTA record identifies the experimental technique used. This may refer to the type of radiation and sample, or include the spectroscopic or modeling technique. Permitted values include:

  • ELECTRON DIFFRACTION

  • FIBER DIFFRACTION

  • FLUORESCENCE TRANSFER

  • NEUTRON DIFFRACTION

  • NMR

  • THEORETICAL MODEL

  • X-RAY DIFFRACTION

__init__(line)[source]

Initialize by parsing a line

COLUMNS  TYPE    FIELD      DEFINITION
11-70    string  technique  The experimental technique(s) with optional comment describing the sample or experiment

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.FORMUL(line)[source]

FORMUL field

The FORMUL record presents the chemical formula and charge of a non-standard group.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD      DEFINITION
9-10     int     comp_num   Component number
13-15    string  hetatm_id  Het identifier
19       string  asterisk   * for water
20-70    string  text       Chemical formula

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.HEADER(line)[source]

HEADER field

The HEADER record uniquely identifies a PDB entry through the id_code field. This record also provides a classification for the entry. Finally, it contains the date the coordinates were deposited at the PDB.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS  TYPE    FIELD           DEFINITION
11-50    string  classification  Classifies the molecule(s)
51-59    string  dep_date        Deposition date. This is the date the coordinates were received by the PDB
63-66    string  id_code         This identifier is unique within PDB

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.HELIX(line)[source]

HELIX field

HELIX records are used to identify the position of helices in the molecule. Helices are both named and numbered. The residues where the helix begins and ends are noted, as well as the total length.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD          DEFINITION
8-10     int     ser_num        Serial number of the helix. This starts at 1 and increases incrementally.
12-14    string  helix_id       Helix identifier. In addition to a serial number, each helix is given an alphanumeric character helix identifier.
16-18    string  init_res_name  Name of the initial residue.
20       string  init_chain_id  Chain identifier for the chain containing this helix.
22-25    int     init_seq_num   Sequence number of the initial residue.
26       string  init_i_code    Insertion code of the initial residue.
28-30    string  end_res_name   Name of the terminal residue of the helix.
32       string  end_chain_id   Chain identifier for the chain containing this helix.
34-37    int     end_seq_num    Sequence number of the terminal residue.
38       string  end_i_code     Insertion code of the terminal residue.
39-40    int     helix_class    Helix class (see below).
41-70    string  comment        Comment about this helix.
72-76    int     length         Length of this helix.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.HET(line)[source]

HET field

HET records are used to describe non-standard residues, such as prosthetic groups, inhibitors, solvent molecules, and ions for which coordinates are supplied. Groups are considered HET if they are:

  • not one of the standard amino acids, and

  • not one of the nucleic acids (C, G, A, T, U, and I), and

  • not one of the modified versions of nucleic acids (+C, +G, +A, +T, +U, and +I), and

  • not an unknown amino acid or nucleic acid where UNK is used to indicate the unknown residue name.

HET records also describe heterogens for which the chemical identity is unknown, in which case the group is assigned the hetatm_id UNK.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD          DEFINITION
8-10     string  hetatm_id      Het identifier, right-justified.
13       string  ChainID        Chain identifier.
14-17    int     seq_num        Sequence number.
18       string  ins_code       Insertion code.
21-25    int     num_het_atoms  Number of HETATM records.
31-70    string  text           Text describing Het group.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.HETATM(line, sybyl_type='A.aaa', l_bonds=[], l_bonded_atoms=[])[source]

HETATM class

The HETATM records present the atomic coordinate records for atoms within “non-standard” groups. These records are used for water molecules and atoms presented in HET groups.

__init__(line, sybyl_type='A.aaa', l_bonds=[], l_bonded_atoms=[])[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD        DEFINITION
7-11     int     serial       Atom serial number.
13-16    string  name         Atom name.
17       string  alt_loc      Alternate location indicator.
18-20    string  res_name     Residue name.
22       string  chain_id     Chain identifier.
23-26    int     res_seq      Residue sequence number.
27       string  ins_code     Code for insertion of residues.
31-38    float   x            Orthogonal coordinates for X in Angstroms.
39-46    float   y            Orthogonal coordinates for Y in Angstroms.
47-54    float   z            Orthogonal coordinates for Z in Angstroms.
55-60    float   occupancy    Occupancy.
61-66    float   temp_factor  Temperature factor.
73-76    string  seg_id       Segment identifier, left-justified.
77-78    string  element      Element symbol, right-justified.
79-80    string  charge       Charge on the atom.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.HETNAM(line)[source]

HETNAM field

This record gives the chemical name of the compound with the given hetatm_id.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD      DEFINITION
12-14    string  hetatm_id  Het identifier, right-justified.
16-70    string  text       Chemical name.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.HETSYN(line)[source]

HETSYN field

This record provides synonyms, if any, for the compound in the corresponding (i.e., same hetatm_id) HETNAM record. This is to allow greater flexibility in searching for HET groups.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD            DEFINITION
12-14    string  hetatm_id        Het identifier, right-justified.
16-70    string  hetatm_synonyms  List of synonyms

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.HYDBND(line)[source]

HYDBND field

The HYDBND records specify hydrogen bonds in the entry.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD      DEFINITION
13-16    string  name1      Atom name.
17       string  alt_loc1   Alternate location indicator.
18-20    string  res_name1  Residue name.
22       string  chain1     Chain identifier.
23-27    int     res_seq1   Residue sequence number.
28       string  i_code1    Insertion code.
30-33    string  name_h     Hydrogen atom name.
34       string  alt_loc_h  Alternate location indicator.
36       string  chain_h    Chain identifier.
37-41    int     res_seq_h  Residue sequence number.
42       string  ins_codeH  Insertion code.
44-47    string  name2      Atom name.
48       string  alt_loc2   Alternate location indicator.
49-51    string  res_name2  Residue name.
53       string  chain_id2  Chain identifier.
54-58    int     res_seq2   Residue sequence number.
59       string  ins_code2  Insertion code.
60-65    string  sym1       Symmetry operator for 1st non-hydrogen atom.
67-72    string  sym2       Symmetry operator for 2nd non-hydrogen atom.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.JRNL(line)[source]

JRNL field

The JRNL record contains the primary literature citation that describes the experiment which resulted in the deposited coordinate set. There is at most one JRNL reference per entry. If there is no primary reference, then there is no JRNL reference. Other references are given in REMARK 1.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD  DEFINITION
13-70    string  text   See details on web.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.KEYWDS(line)[source]

KEYWDS field

The KEYWDS record contains a set of terms relevant to the entry. Terms in the KEYWDS record provide a simple means of categorizing entries and may be used to generate index files. This record addresses some of the limitations found in the classification field of the HEADER record. It provides the opportunity to add further annotation to the entry in a concise and computer-searchable fashion.

__init__(line)[source]

Initialize by parsing a line

COLUMNS  TYPE    FIELD   DEFINITION
11-70    string  keywds  Comma-separated list of keywords relevant to the entry

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.LINK(line)[source]

LINK field

The LINK records specify connectivity between residues that is not implied by the primary structure. Connectivity is expressed in terms of the atom names. This record supplements information given in CONECT records and is provided here for convenience in searching.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD      DEFINITION
13-16    string  name1      Atom name.
17       string  alt_loc1   Alternate location indicator.
18-20    string  res_name1  Residue name.
22       string  chain_id1  Chain identifier.
23-26    int     res_seq1   Residue sequence number.
27       string  ins_code1  Insertion code.
43-46    string  name2      Atom name.
47       string  alt_loc2   Alternate location indicator.
48-50    string  res_name2  Residue name.
52       string  chain_id2  Chain identifier.
53-56    int     res_seq2   Residue sequence number.
57       string  ins_code2  Insertion code.
60-65    string  sym1       Symmetry operator for 1st atom.
67-72    string  sym2       Symmetry operator for 2nd atom.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.MASTER(line)[source]

MASTER class

The MASTER record is a control record for bookkeeping. It lists the number of lines in the coordinate entry or file for selected record types.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD       DEFINITION
11-15    int     num_remark  Number of REMARK records
21-25    int     num_het     Number of HET records
26-30    int     numHelix    Number of HELIX records
31-35    int     numSheet    Number of SHEET records
36-40    int     numTurn     Number of TURN records
41-45    int     numSite     Number of SITE records
46-50    int     numXform    Number of coordinate transformation records (ORIGX+SCALE+MTRIX)
51-55    int     numCoord    Number of atomic coordinate records (ATOM+HETATM)
56-60    int     numTer      Number of TER records
61-65    int     numConect   Number of CONECT records
66-70    int     numSeq      Number of SEQRES records

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.MODEL(line)[source]

MODEL class

The MODEL record specifies the model serial number when multiple structures are presented in a single coordinate entry, as is often the case with structures determined by NMR.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD   DEFINITION
11-14    int     serial  Model serial number.

Parameters:

line (str) – line with PDB class
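MODEL and ENDMDL records bracket the coordinate records of each structure. The sketch below groups raw lines by model serial number to illustrate that pairing. `split_models` is a hypothetical helper, not a pdb2pqr function, and dropping lines that fall outside any MODEL/ENDMDL pair is an assumption made to keep the example short.

```python
def split_models(lines):
    """Group record lines by MODEL serial number (hypothetical helper)."""
    models = {}
    current = None
    for line in lines:
        name = line[0:6].strip()
        if name == "MODEL":
            # Model serial number sits in columns 11-14.
            current = int(line[10:14])
            models[current] = []
        elif name == "ENDMDL":
            current = None
        elif current is not None:
            models[current].append(line)
        # Assumption: lines outside a MODEL/ENDMDL pair are ignored here.
    return models


lines = [
    f"{'HEADER':<6}",
    f"{'MODEL':<6}{'':4}{1:>4}",
    f"{'ATOM':<6}{1:>5}",
    f"{'ATOM':<6}{2:>5}",
    "ENDMDL",
    f"{'MODEL':<6}{'':4}{2:>4}",
    f"{'ATOM':<6}{1:>5}",
    "ENDMDL",
]
models = split_models(lines)
print({serial: len(records) for serial, records in models.items()})
```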

class pdb2pqr.pdb.MODRES(line)[source]

MODRES field

The MODRES record provides descriptions of modifications (e.g., chemical or post-translational) to protein and nucleic acid residues. Included are a mapping between residue names given in a PDB entry and standard residues.

__init__(line)[source]

Initialize by parsing a line

COLUMNS  TYPE    FIELD     DEFINITION
8-11     string  id_code   ID code of this entry.
13-15    string  res_name  Residue name used in this entry.
17       string  chain_id  Chain identifier.
19-22    int     seq_num   Sequence number.
23       string  ins_code  Insertion code.
25-27    string  stdRes    Standard residue name.
30-70    string  comment   Description of the residue modification.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.MTRIX1(line)[source]

MTRIX1 PDB entry

class pdb2pqr.pdb.MTRIX2(line)[source]

MTRIX2 PDB entry

class pdb2pqr.pdb.MTRIX3(line)[source]

MTRIX3 PDB entry

class pdb2pqr.pdb.MTRIXn(line)[source]

MTRIXn baseclass

The MTRIXn (n = 1, 2, or 3) records present transformations expressing non-crystallographic symmetry.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD    DEFINITION
8-10     int     serial   Serial number
11-20    float   mn1      Mn1
21-30    float   mn2      Mn2
31-40    float   mn3      Mn3
46-55    float   vn       Vn
60       int     i_given  1 if coordinates for the representations which are approximately related by the transformations of the molecule are contained in the entry. Otherwise, blank.

Parameters:

line (str) – line with PDB class
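Taken together, the MTRIX1, MTRIX2, and MTRIX3 lines with the same serial number give the rows of a 3x3 matrix M and a translation vector V, and the PDB Format Description applies them as x' = M x + V. The sketch below parses three such rows and transforms one point. The helper names are hypothetical and this is not the pdb2pqr implementation.

```python
def parse_mtrix_row(line):
    """Return (serial, [mn1, mn2, mn3], vn) from one MTRIXn line."""
    serial = int(line[7:10])
    row = [float(line[10:20]), float(line[20:30]), float(line[30:40])]
    shift = float(line[45:55])
    return serial, row, shift


def apply_transform(rows, shifts, point):
    """Apply x' = M x + V using the three parsed rows and shifts."""
    return [
        sum(m * p for m, p in zip(row, point)) + shift
        for row, shift in zip(rows, shifts)
    ]


def mtrix_line(n, serial, row, shift):
    """Format a sample MTRIXn line with fields in their documented columns."""
    body = "".join(f"{value:10.6f}" for value in row)
    return f"MTRIX{n} {serial:>3}{body}{'':5}{shift:10.5f}"


# A 90-degree rotation about z followed by a shift of 5 Angstroms along x.
lines = [
    mtrix_line(1, 1, (0.0, -1.0, 0.0), 5.0),
    mtrix_line(2, 1, (1.0, 0.0, 0.0), 0.0),
    mtrix_line(3, 1, (0.0, 0.0, 1.0), 0.0),
]
parsed = [parse_mtrix_row(line) for line in lines]
rows = [row for _, row, _ in parsed]
shifts = [shift for _, _, shift in parsed]
print(apply_transform(rows, shifts, (1.0, 2.0, 3.0)))
```

The ORIGXn and SCALEn records have the same three-row layout, so the same row-plus-shift arithmetic would apply to them.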

class pdb2pqr.pdb.NUMMDL(line)[source]

NUMMDL class

The NUMMDL record indicates total number of models in a PDB entry.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD        DEFINITION
11-14    int     modelNumber  Number of models.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.OBSLTE(line)[source]

OBSLTE field

This record acts as a flag in an entry which has been withdrawn from the PDB’s full release. It indicates which, if any, new entries have replaced the withdrawn entry.

The format allows for the case of multiple new entries replacing one existing entry.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS  TYPE    FIELD         DEFINITION
12-20    string  replace_date  Date that this entry was replaced.
22-25    string  id_code       ID code of this entry.
32-35    string  rid_code      ID code of entry that replaced this one.
37-40    string  rid_code      ID code of entry that replaced this one.
42-45    string  rid_code      ID code of entry that replaced this one.
47-50    string  rid_code      ID code of entry that replaced this one.
52-55    string  rid_code      ID code of entry that replaced this one.
57-60    string  rid_code      ID code of entry that replaced this one.
62-65    string  rid_code      ID code of entry that replaced this one.
67-70    string  rid_code      ID code of entry that replaced this one.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.ORIGX1(line)[source]

ORIGX1 PDB entry

class pdb2pqr.pdb.ORIGX2(line)[source]

ORIGX2 PDB entry

class pdb2pqr.pdb.ORIGX3(line)[source]

ORIGX3 PDB entry

class pdb2pqr.pdb.ORIGXn(line)[source]

ORIGXn class

The ORIGXn (n = 1, 2, or 3) records present the transformation from the orthogonal coordinates contained in the entry to the submitted coordinates.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD  DEFINITION
11-20    float   on1    On1
21-30    float   on2    On2
31-40    float   on3    On3
46-55    float   tn     Tn

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.REMARK(line)[source]

REMARK field

REMARK records present experimental details, annotations, comments, and information not included in other records. In a number of cases, REMARKs are used to expand the contents of other record types. A new level of structure is being used for some REMARK records. This is expected to facilitate searching and will assist in the conversion to a relational database.

__init__(line)[source]

Initialize by parsing line.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.REVDAT(line)[source]

REVDAT field

REVDAT records contain a history of the modifications made to an entry since its release.

__init__(line)[source]

Initialize by parsing a line.

Todo

If multiple modifications are present, only the last one in the file is preserved.

COLUMNS  TYPE    FIELD     DEFINITION
8-10     int     mod_num   Modification number.
14-22    string  mod_date  Date of modification (or release for new entries).
24-28    string  mod_id    Identifies this particular modification. It links to the archive used internally by PDB.
32       int     mod_type  An integer identifying the type of modification. In case of revisions with more than one possible mod_type, the highest value applicable will be assigned.
40-45    string  record    Name of the modified record.
47-52    string  record    Name of the modified record.
54-59    string  record    Name of the modified record.
61-66    string  record    Name of the modified record.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SCALE1(line)[source]

SCALE1 PDB entry

class pdb2pqr.pdb.SCALE2(line)[source]

SCALE2 PDB entry

class pdb2pqr.pdb.SCALE3(line)[source]

SCALE3 PDB entry

class pdb2pqr.pdb.SCALEn(line)[source]

SCALEn baseclass

The SCALEn (n = 1, 2, or 3) records present the transformation from the orthogonal coordinates as contained in the entry to fractional crystallographic coordinates. Non-standard coordinate systems should be explained in the remarks.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD  DEFINITION
11-20    float   sn1    Sn1
21-30    float   sn2    Sn2
31-40    float   sn3    Sn3
46-55    float   un     Un

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SEQADV(line)[source]

SEQADV field

The SEQADV record identifies conflicts between sequence information in the ATOM records of the PDB entry and the sequence database entry given on DBREF. Please note that these records were designed to identify differences and not errors. No assumption is made as to which database contains the correct data. PDB may include REMARK records in the entry that reflect the depositor’s view of which database has the correct sequence.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD       DEFINITION
8-11     string  id_code     ID code of this entry.
13-15    string  res_name    Name of the PDB residue in conflict.
17       string  chain_id    PDB chain identifier.
19-22    int     seq_num     PDB sequence number.
23       string  ins_code    PDB insertion code.
25-28    string  database    Sequence database name.
30-38    string  db_id_code  Sequence database accession number.
40-42    string  db_res      Sequence database residue name.
44-48    int     db_seq      Sequence database sequence number.
50-70    string  conflict    Conflict comment.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SEQRES(line)[source]

SEQRES field

SEQRES records contain the amino acid or nucleic acid sequence of residues in each chain of the macromolecule that was studied.

__init__(line)[source]

Initialize by parsing a line

COLUMNS  TYPE    FIELD     DEFINITION
9-10     int     ser_num   Serial number of the SEQRES record for the current chain. Starts at 1 and increments by one each line. Reset to 1 for each chain.
12       string  chain_id  Chain identifier. This may be any single legal character, including a blank which is used if there is only one chain.
14-17    int     num_res   Number of residues in the chain. This value is repeated on every record.
20-22    string  res_name  Residue name.
24-26    string  res_name  Residue name.
28-30    string  res_name  Residue name.
32-34    string  res_name  Residue name.
36-38    string  res_name  Residue name.
40-42    string  res_name  Residue name.
44-46    string  res_name  Residue name.
48-50    string  res_name  Residue name.
52-54    string  res_name  Residue name.
56-58    string  res_name  Residue name.
60-62    string  res_name  Residue name.
64-66    string  res_name  Residue name.
68-70    string  res_name  Residue name.

Parameters:

line (str) – line with PDB class
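Since each SEQRES line carries at most thirteen residue names, a chain's full sequence has to be stitched together across lines. The sketch below shows one way to do that from the column ranges listed for this class. The helper names are hypothetical and this is not how pdb2pqr necessarily assembles sequences.

```python
def parse_seqres(line):
    """Return (chain_id, residue_names) from one SEQRES line (hypothetical helper)."""
    chain_id = line[11]
    residues = []
    # Thirteen residue-name slots: columns 20-22, 24-26, ..., 68-70.
    for start in range(19, 70, 4):
        name = line[start:start + 3].strip()
        if name:  # trailing slots on the last line of a chain are blank
            residues.append(name)
    return chain_id, residues


def chain_sequences(lines):
    """Concatenate residue names per chain, in file order."""
    sequences = {}
    for line in lines:
        if line[0:6].strip() != "SEQRES":
            continue
        chain_id, residues = parse_seqres(line)
        sequences.setdefault(chain_id, []).extend(residues)
    return sequences


def seqres_line(ser_num, chain_id, num_res, names):
    """Format a sample SEQRES line with fields in their documented columns."""
    body = " ".join(f"{name:>3}" for name in names)
    return f"SEQRES {ser_num:>3} {chain_id} {num_res:>4}  {body}"


first = ["MET", "GLY", "SER", "ALA", "LYS", "VAL", "ILE",
         "LEU", "PRO", "THR", "ASP", "GLU", "PHE"]
second = ["TRP", "TYR"]
lines = [seqres_line(1, "A", 15, first), seqres_line(2, "A", 15, second)]
sequences = chain_sequences(lines)
print(len(sequences["A"]), sequences["A"][-2:])
```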

class pdb2pqr.pdb.SHEET(line)[source]

SHEET field

SHEET records are used to identify the position of sheets in the molecule. Sheets are both named and numbered. The residues where the sheet begins and ends are noted.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD          DEFINITION
8-10     int     strand         Strand number which starts at 1 for each strand within a sheet and increases by one.
12-14    string  sheet_id       Sheet identifier.
15-16    int     num_strands    Number of strands in sheet.
18-20    string  init_res_name  Residue name of initial residue.
22       string  init_chain_id  Chain identifier of initial residue in strand.
23-26    int     init_seq_num   Sequence number of initial residue in strand.
27       string  init_i_code    Insertion code of initial residue in strand.
29-31    string  end_res_name   Residue name of terminal residue.
33       string  end_chain_id   Chain identifier of terminal residue.
34-37    int     end_seq_num    Sequence number of terminal residue.
38       string  end_i_code     Insertion code of terminal residue.
39-40    int     sense          Sense of strand with respect to previous strand in the sheet. 0 if first strand, 1 if parallel, -1 if anti-parallel.
42-45    string  cur_atom       Registration. Atom name in current strand.
46-48    string  curr_res_name  Registration. Residue name in current strand.
50       string  curChainId     Registration. Chain identifier in current strand.
51-54    int     curr_res_seq   Registration. Residue sequence number in current strand.
55       string  curr_ins_code  Registration. Insertion code in current strand.
57-60    string  prev_atom      Registration. Atom name in previous strand.
61-63    string  prev_res_name  Registration. Residue name in previous strand.
65       string  prevChainId    Registration. Chain identifier in previous strand.
66-69    int     prev_res_seq   Registration. Residue sequence number in previous strand.
70       string  prev_ins_code  Registration. Insertion code in previous strand.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SIGATM(line)[source]

SIGATM class

The SIGATM records present the standard deviation of atomic parameters as they appear in ATOM and HETATM records.

__init__(line)[source]

Initialize by parsing line

COLUMNS  TYPE    FIELD     DEFINITION
7-11     int     serial    Atom serial number.
13-16    string  name      Atom name.
17       string  alt_loc   Alternate location indicator.
18-20    string  res_name  Residue name.
22       string  chain_id  Chain identifier.
23-26    int     res_seq   Residue sequence number.
27       string  ins_code  Code for insertion of residues.
31-38    float   sig_x     Standard deviation of orthogonal coordinates for X in Angstroms.
39-46    float   sig_y     Standard deviation of orthogonal coordinates for Y in Angstroms.
47-54    float   sig_z     Standard deviation of orthogonal coordinates for Z in Angstroms.
55-60    float   sig_occ   Standard deviation of occupancy.
61-66    float   sig_temp  Standard deviation of temperature factor.
73-76    string  seg_id    Segment identifier, left-justified.
77-78    string  element   Element symbol, right-justified.
79-80    string  charge    Charge on the atom.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SIGUIJ(line)[source]

SIGUIJ class

The SIGUIJ records present the standard deviations of the anisotropic temperature factors.

__init__(line)[source]

Initialize by parsing line:

COLUMNS  TYPE    FIELD     DEFINITION
7-11     int     serial    Atom serial number.
13-16    string  name      Atom name.
17       string  alt_loc   Alternate location indicator.
18-20    string  res_name  Residue name.
22       string  chain_id  Chain identifier.
23-26    int     res_seq   Residue sequence number.
27       string  ins_code  Insertion code.
29-35    int     sig11     Sigma U(1,1)
36-42    int     sig22     Sigma U(2,2)
43-49    int     sig33     Sigma U(3,3)
50-56    int     sig12     Sigma U(1,2)
57-63    int     sig13     Sigma U(1,3)
64-70    int     sig23     Sigma U(2,3)
73-76    string  seg_id    Segment identifier, left-justified.
77-78    string  element   Element symbol, right-justified.
79-80    string  charge    Charge on the atom.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SITE(line)[source]

SITE class

The SITE records supply the identification of groups comprising important sites in the macromolecule.

__init__(line)[source]

Initialize by parsing the line

COLUMNS

TYPE

FIELD

DEFINITION

8-10

int

seq_num

Sequence number.

12-14

string

site_id

Site name.

16-17

int

num_res

Number of residues comprising site.

19-21

string

res_name1

Residue name for first residue comprising site.

23

string

chain_id1

Chain identifier for first residue comprising site.

24-27

int

seq1

Residue sequence number for first residue comprising site.

28

string

ins_code1

Insertion code for first residue comprising site.

30-32

string

res_name2

Residue name for second residue comprising site.

34

string

chain_id2

Chain identifier for second residue comprising site.

35-38

int

seq2

Residue sequence number for second residue comprising site.

39

string

ins_code2

Insertion code for second residue comprising site.

41-43

string

res_name3

Residue name for third residue comprising site.

45

string

chain_id3

Chain identifier for third residue comprising site.

46-49

int

seq3

Residue sequence number for third residue comprising site.

50

string

ins_code3

Insertion code for third residue comprising site.

52-54

string

res_name4

Residue name for fourth residue comprising site.

56

string

chain_id4

Chain identifier for fourth residue comprising site.

57-60

int

seq4

Residue sequence number for fourth residue comprising site.

61

string

ins_code4

Insertion code for fourth residue comprising site.

Parameters:

line (str) – line with PDB class
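The four residue groups in the table above repeat with a stride of 11 columns (residue names start at columns 19, 30, 41, and 52). A minimal sketch of how that regularity could be exploited is shown below. `parse_site_residues` is a hypothetical helper, not the pdb2pqr implementation, and skipping blank slots is an assumption about how unused groups appear in a line.

```python
def parse_site_residues(line):
    """Return (res_name, chain_id, seq, ins_code) tuples from a SITE line."""
    residues = []
    for group in range(4):
        offset = 11 * group  # each residue group spans 11 columns
        res_name = line[18 + offset:21 + offset].strip()
        if not res_name:
            continue  # assumption: an unused slot is left blank
        chain_id = line[22 + offset:23 + offset].strip()
        seq = int(line[23 + offset:27 + offset])
        ins_code = line[27 + offset:28 + offset].strip()
        residues.append((res_name, chain_id, seq, ins_code))
    return residues


# Build a sample line: record name, seq_num (8-10), site_id (12-14),
# num_res (16-17), then three 11-column residue groups.
sample = "SITE  " + " " + "  1" + " " + "AC1" + " " + " 3"
for res_name, chain_id, seq in [("HIS", "A", 94), ("HIS", "A", 96), ("HIS", "A", 119)]:
    sample += " %3s %1s%4d " % (res_name, chain_id, seq)

print(parse_site_residues(sample))
```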

class pdb2pqr.pdb.SLTBRG(line)[source]

SLTBRG field

The SLTBRG records specify salt bridges in the entry. The information is provided here for convenience in searching.

__init__(line)[source]

Initialize by parsing line

COLUMNS

TYPE

FIELD

DEFINITION

13-16

string

name1

Atom name.

17

string

alt_loc1

Alternate location indicator.

18-20

string

res_name1

Residue name.

22

string

chain_id1

Chain identifier.

23-26

int

res_seq1

Residue sequence number.

27

string

ins_code1

Insertion code.

43-46

string

name2

Atom name.

47

string

alt_loc2

Alternate location indicator.

48-50

string

res_name2

Residue name.

52

string

chain_id2

Chain identifier.

53-56

int

res_seq2

Residue sequence number.

57

string

ins_code2

Insertion code.

60-65

string

sym1

Symmetry operator for 1st atom.

67-72

string

sym2

Symmetry operator for 2nd atom.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SOURCE(line)[source]

SOURCE field

The SOURCE record specifies the biological and/or chemical source of each biological molecule in the entry. Sources are described by both the common name and the scientific name, e.g., genus and species. Strain and/or cell-line for immortalized cells are given when they help to uniquely identify the biological entity studied.

__init__(line)[source]

Initialize by parsing a line

COLUMNS

TYPE

FIELD

DEFINITION

11-70

string

source

Identifies the source of the macromolecule in a token: value format

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.SPRSDE(line)[source]

SPRSDE field

The SPRSDE records contain a list of the ID codes of entries that were made obsolete by the given coordinate entry and withdrawn from the PDB release set. One entry may replace many. It is PDB policy that only the principal investigator of a structure has the authority to withdraw it.

__init__(line)[source]

Initialize by parsing line

COLUMNS

TYPE

FIELD

DEFINITION

12-20

string

super_date

Date this entry superseded the listed entries.

22-25

string

id_code

ID code of this entry.

32-35

string

sid_code

ID code of a superseded entry.

37-40

string

sid_code

ID code of a superseded entry.

42-45

string

sid_code

ID code of a superseded entry.

47-50

string

sid_code

ID code of a superseded entry.

52-55

string

sid_code

ID code of a superseded entry.

57-60

string

sid_code

ID code of a superseded entry.

62-65

string

sid_code

ID code of a superseded entry.

67-70

string

sid_code

ID code of a superseded entry.

Parameters:

line (str) – line with PDB class
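Because the eight `sid_code` slots share one field name and sit at regular 5-column intervals, they are naturally collected into a list. The sketch below illustrates that idea. `parse_superseded_ids` is a hypothetical helper rather than pdb2pqr's code, and the ID codes in the sample line are made up.

```python
def parse_superseded_ids(line):
    """Return the non-blank superseded ID codes from an SPRSDE line."""
    id_codes = []
    # The eight sid_code slots start at columns 32, 37, ..., 67 (1-based),
    # and each slot is four columns wide.
    for start in range(32, 68, 5):
        code = line[start - 1:start + 3].strip()
        if code:
            id_codes.append(code)
    return id_codes


# Made-up entry: super_date in columns 12-20, id_code in 22-25,
# and two superseded ID codes in columns 32-35 and 37-40.
sample = "SPRSDE" + " " * 5 + "01-JAN-00" + " " + "9XYZ" + " " * 6 + "1ABC 2ABC"
print(parse_superseded_ids(sample))
```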

class pdb2pqr.pdb.SSBOND(line)[source]

SSBOND field

The SSBOND record identifies each disulfide bond in protein and polypeptide structures by identifying the two residues involved in the bond.

__init__(line)[source]

Initialize by parsing line

COLUMNS

TYPE

FIELD

DEFINITION

8-10

int

ser_num

Serial number.

16

string

chain_id1

Chain identifier.

18-21

int

seq_num1

Residue sequence number.

22

string

icode1

Insertion code.

30

string

chain_id2

Chain identifier.

32-35

int

seq_num2

Residue sequence number.

36

string

icode2

Insertion code.

60-65

string

sym1

Symmetry operator for 1st residue.

67-72

string

sym2

Symmetry operator for 2nd residue.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.TER(line)[source]

TER class

The TER record indicates the end of a list of ATOM/HETATM records for a chain.

__init__(line)[source]

Initialize by parsing line:

COLUMNS

TYPE

FIELD

DEFINITION

7-11

int

serial

Serial number.

18-20

string

res_name

Residue name.

22

string

chain_id

Chain identifier.

23-26

int

res_seq

Residue sequence number.

27

string

ins_code

Insertion code.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.TITLE(line)[source]

TITLE field

The TITLE record contains a title for the experiment or analysis that is represented in the entry. It should identify an entry in the PDB in the same way that a title identifies a paper.

__init__(line)[source]

Initialize by parsing a line.

COLUMNS

TYPE

FIELD

DEFINITION

11-70

string

title

Title of the experiment

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.TURN(line)[source]

TURN field

The TURN records identify turns and other short loops that normally connect other secondary structure segments.

__init__(line)[source]

Initialize by parsing line

COLUMNS

TYPE

FIELD

DEFINITION

8-10

int

seq

Turn number; starts with 1 and increments by one.

12-14

string

turn_id

Turn identifier.

16-18

string

init_res_name

Residue name of initial residue in turn.

20

string

init_chain_id

Chain identifier for the chain containing this turn.

21-24

int

init_seq_num

Sequence number of initial residue in turn.

25

string

init_i_code

Insertion code of initial residue in turn.

27-29

string

end_res_name

Residue name of terminal residue of turn.

31

string

end_chain_id

Chain identifier for the chain containing this turn.

32-35

int

end_seq_num

Sequence number of terminal residue of turn.

36

string

end_i_code

Insertion code of terminal residue of turn.

41-70

string

comment

Associated comment.

Parameters:

line (str) – line with PDB class

class pdb2pqr.pdb.TVECT(line)[source]

TVECT class

The TVECT records present the translation vector for infinite covalently connected structures.

__init__(line)[source]

Initialize by parsing line

COLUMNS

TYPE

FIELD

DEFINITION

8-10

int

serial

Serial number

11-20

float

t1

Components of translation vector

21-30

float

t2

Components of translation vector

31-40

float

t3

Components of translation vector

41-70

string

text

Comments

Parameters:

line (str) – line with PDB class
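The three vector components occupy consecutive 10-column fields, so they can be read in a short loop. The sketch below is illustrative only. `parse_tvect` is a hypothetical function, not the pdb2pqr class, and the sample values are made up.

```python
def parse_tvect(line):
    """Return (serial, (x, y, z), text) from a TVECT line."""
    serial = int(line[7:10])  # columns 8-10
    # Columns 11-20, 21-30 and 31-40 hold the three vector components.
    vector = tuple(float(line[start:start + 10]) for start in (10, 20, 30))
    text = line[40:70].strip()  # columns 41-70
    return serial, vector, text


# "TVECT" is five letters, so it is padded to column 7 before the serial.
sample = "TVECT".ljust(7) + "  1" + "%10.5f%10.5f%10.5f" % (0.0, 0.0, 28.3) + "sample comment"
print(parse_tvect(sample))
```

`int()` and `float()` both ignore surrounding whitespace, so the fixed-width slices do not need to be stripped first.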

pdb2pqr.pdb.read_atom(line)[source]

If the ATOM/HETATM is not column-formatted, try to get some information by parsing whitespace from the right. Look for five floating point numbers followed by the residue number.

Parameters:

line (str) – the line to parse
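The right-to-left search described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the pdb2pqr implementation: `find_floats_from_right` is a hypothetical name, the floats are assumed to be consecutive whitespace-separated tokens, and the residue number is assumed to be the token immediately to their left.

```python
def find_floats_from_right(line, count=5):
    """Scan tokens from the right for `count` consecutive floats.

    Returns (residue_number, floats), where the residue number is the
    token immediately to the left of the run of floats.
    """
    tokens = line.split()
    run = []
    for index in range(len(tokens) - 1, -1, -1):
        try:
            run.append(float(tokens[index]))
        except ValueError:
            run = []  # a non-numeric token breaks the run; start over
            continue
        if len(run) == count:
            if index == 0:
                break  # no token left to serve as the residue number
            return int(tokens[index - 1]), run[::-1]
    raise ValueError("could not find %d consecutive floats" % count)


# A whitespace-separated ATOM line that does not follow the column layout.
sample = "ATOM 1 N ALA A 1 11.104 6.134 -6.504 1.00 0.00 N"
print(find_floats_from_right(sample))
```

Scanning from the right tolerates trailing tokens such as an element symbol, because a non-numeric token simply resets the run.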

pdb2pqr.pdb.read_pdb(file_)[source]

Parse PDB-format data into a list of record objects from this module.

Parameters:

file_ (file) – open file-like object

Returns:

(a list of objects from this module, a list of record names that couldn’t be parsed)

Return type:

(list, list)
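The two-list return value can be pictured with a small standalone loop. The sketch below is a hedged illustration of that return shape and does not reproduce pdb2pqr's logic: `read_records` and the `parsers` dictionary are hypothetical, and treating `IndexError` and `ValueError` as parse failures is an assumption.

```python
import io


def read_records(file_, parsers):
    """Return (parsed records, record names that could not be parsed)."""
    records = []
    errors = []
    for line in file_:
        record_name = line[0:6].strip()  # columns 1-6 hold the record name
        if not record_name:
            continue  # skip blank lines
        parser = parsers.get(record_name)
        if parser is None:
            errors.append(record_name)
            continue
        try:
            records.append(parser(line))
        except (IndexError, ValueError):
            errors.append(record_name)
    return records, errors


# Hypothetical parser table with a single entry for TITLE (columns 11-70).
parsers = {"TITLE": lambda line: ("TITLE", line[10:70].strip())}
handle = io.StringIO("TITLE     EXAMPLE ENTRY\nFOOBAR    not a record\n")
records, errors = read_records(handle, parsers)
print(records, errors)
```

Any iterable of lines works here, so an `io.StringIO` object can stand in for an open file when testing.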

pdb2pqr.pdb.register_line_parser(klass)[source]

Register a line parser in the global dictionary.

Parameters:

klass – class for line parser
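A registration function of this kind is often used as a class decorator that fills a name-to-class dictionary. The sketch below shows that general pattern under stated assumptions. `LINE_PARSER_SKETCH` and `register_sketch` are hypothetical names, and keying the dictionary by class name is an assumption, not a confirmed detail of pdb2pqr.

```python
# Hypothetical global registry mapping record names to parser classes.
LINE_PARSER_SKETCH = {}


def register_sketch(klass):
    """Store klass under its class name and return it unchanged."""
    LINE_PARSER_SKETCH[klass.__name__] = klass
    return klass  # returning the class lets this act as a decorator


@register_sketch
class TITLE:
    """Toy parser for the TITLE record (columns 11-70)."""

    def __init__(self, line):
        self.title = line[10:70].strip()


# Look up the parser class from the record name in columns 1-6.
line = "TITLE     EXAMPLE ENTRY"
parser_class = LINE_PARSER_SKETCH[line[0:6].strip()]
print(parser_class(line).title)
```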