Energy evaluation

The first step in force-field based protein modeling involves the calculation of conformational energies. Such energies can be used to judge the relative stability of different conformations for a given protein structure. An important application is the use of force-field based energies as a scoring function in order to find biologically relevant conformations from a set of previously generated structures. This approach is based on the very reasonable assumption that biologically important, native, protein conformations reside at the global energy minimum and that a gradual decrease in energy towards this minimum can be used to discriminate models that are closer to the native structure from models that are further away.

The energy quantity that is of importance here is the free energy that contains not just enthalpic but also entropic contributions of both the protein chain and the surrounding solvent. The enthalpic energy component is readily available from a given protein force field as the internal interaction energy within the protein and between the protein and the solvent environment. The entropic components are generally more difficult to calculate. If an implicit solvent representation is used, based on a dielectric continuum rather than explicit solvent molecules, free energies of solvation are available directly. This leaves only the protein chain entropies that need to be added to obtain a complete estimate of the relative free energy of a given protein conformation. This approach, named MMPB/SA, has been put forth by Kollman and Case and involves the following terms:

G = U(int) + G(solv,p) + G(solv,np) + S(prot)
In this formula, the internal energy U(int) contains all of the intra-molecular bonded and non-bonded interactions within the protein as given by the force field. The free energy of solvation, which is the energy required to transfer a given system from vacuum into a solvent environment with a given dielectric constant, is calculated in two parts. The polar, or electrostatic, contribution G(solv,p) can be calculated from solutions to the Poisson-Boltzmann equation, hence the name MMPB/SA. Since Poisson-Boltzmann calculations are fairly expensive, especially if larger systems are considered, an analytical approximation based on computationally much more efficient Generalized Born formalisms is commonly used The resulting free energy estimates are then called MMGB/SA.
The non-polar contribution to the free energy of solvation, G(solv,np), stems from the work required to form a cavity within the solvent to accomodate the protein solute as well as van der Waals interactions between protein and solvent. This energy term is usually approximated as a simple linear function of the solvent accessible surface area.
Finally, the chain entropy of the protein, S(prot), may be estimated from normal modes of a well minimized structure. Such a calculation is generally quite costly, however, and often not very practical for a large number of conformations. Since chain entropies vary little for similar or even similarly compact structures, it is often calculated only for a small set of representative conformations or omitted altogether.

Within the MMTSB Tool Set conformational energies can be calculated with CHARMM or Amber through the utilities enerCHARMM.pl and enerAmber.pl that provide interfaces to either modeling package. Within CHARMM, Amber and OPLS force fields can be used in addition to standard CHARMM force fields. These tools, as all the other tools for force-field based all-atom modeling, require that recent versions of CHARMM (c29a2 or newer) and/or Amber (7 or newer) are installed and error messages will result if the program packages are either missing or only older versions are available.

The perl tools enerCHARMM.pl and enerAmber.pl, as well as other tools for running minimizations and molecular dynamics, are meant to relieve the user from having to learn how to use the fairly complex molecular modeling packages directly if the applications consist of relatively simple routine tasks. The tool set then automatically takes care of all necessary setup tasks under a reasonable set of assumptions that should be appropriate for most standard applications. It is still possible, though, to control the energy terms and simulation parameters through an extensive set of options that can be specified to alter the default behavior, if necessary.

In order to start with a simple example, let us use enerCHARMM.pl to obtain the internal energy of a given protein conformation from a PDB file:
enerCHARMM.pl 1vii.exp.min.pdb
      -513.6242
In this example, the CHARMM program is started, topology and parameter files are read from the (default) CHARMM22 force field, a molecule structure is setup with cartesian coordinates from the given PDB file, missing atoms such as hydrogens or termini are added if possible, electrostatic interactions are setup for a vacuum environment with a 18 Å cutoff, and the total internal energy, consisting of bonded and non-bonded terms, is calculated, returned and printed out. These are a lot of steps for such a simple command that would have to be done explicitly if CHARMM was run directly.

The two main sets of options that are available with enerCHARMM.pl are used to control the interaction with CHARMM (-par) and indicate which energy terms should be written out (-out) since not just the total energy, but a number of energy components are also available from CHARMM.

Energy components are identified through keywords such as bonds for direct bond interactions between covalently linked atoms, angles for three-atom angle terms between bonded atoms, dihedrals for four-atom torsion terms, vdwaals for van der Waals energies and elec for electrostatic interactions between non-bonded atoms, just to list the most important terms. One or more terms in any order can be requested with a comma-separated list as in the following example:
enerCHARMM.pl 
-out bonds,angles,vdwaals,elec 1vii.exp.min.pdb
bonds                        41.2286
angles                       95.2096
vdwaals                     -70.1491
elec                       -756.0559
In some cases, when this output is processed is further, the multiline output format is inconvenient. With the option -oneline it can be changed to a single line for all values in the order in which the energy terms were given on the command line:
enerCHARMM.pl -oneline 
-out bonds,vdwaals,elec 1vii.exp.min.pdb
        41.2286        -70.1491       -756.0559
Furthermore, it is possible to combine energy terms through addition or substraction by using the (+) and (-) operators. An example for such expressions is shown in the following:
enerCHARMM.pl -oneline 
-out bonds+angles+dihedrals,vdwaals+elec 
1vii.exp.min.pdb
       295.1188       -826.2051
In this case the bonded and non-bonded energy terms are summed up individually to give the two energy values shown in the output.