The objective of this tutorial is to
      become familiar with the MMTSB Tool Set and learn how to prepare a system for 
      modeling tasks starting with a file from the Protein Data Bank. The goal
      is to prepare a solvated system of a protein-RNA complex that can be used as the input for
      simulation studies.
    
	
	  | In this tutorial the double-stranded RNA-binding protein 
              Xlrbpa from Xenopus laevis is used as an example. A crystal structure for the 
	      protein-RNA complex is available from the Protein Data Bank 
	      with the PDB code 
	      1DI2. A picture of the asymmetric unit is shown on the right. More
               information about the crystal structure is available from this paper. The system consists of two RNA-binding domains (chain A and B), four chains
              of RNA (C, D, E, G), and the oxygen positions of 359 crystallographic water molecules.
              The RNA in the biological complex is obtained by using the symmetry copy of chain E
              instead of chain G. The first RNA strand consists of chain E followed by chain C, the
              second RNA strand of chain D followed by the symmetry copy of chain E. According to
              the PDB entry, the crystal structure was solved for a mutated protein (N112M). |   | 
        |  | 
         
	  | 1. Obtain/copy the PDB file 1DI2.pdb from the Protein Data Bank to the current
               working directory. | 
        | 
 | 
	
	  | 2. Extract the protein chains A and B 
    convpdb.pl -chain A 1DI2.pdb > 1di2.proteinA.pdb
    convpdb.pl -chain B 1DI2.pdb > 1di2.proteinB.pdb
             | 
        | 
 | 
	
	  | 3. Reverse the N112M mutation to obtain the biological sequence with mutate.pl. Because the
               mutation script does not handle chain IDs well, the chain ID of the input structure is first removed
               (set to blank) and later reset to A or B respectively. Note, how multiple commands can be combined through
               pipes so that the output from one command is used as input for the next command. 
 
    convpdb.pl -setchain " " 1di2.proteinA.pdb | mutate.pl -seq 112:N  | \ 
               convpdb.pl -setchain A > 1di2.proteinA.mutated.pdb
            Repeat for chain B.
 | 
        | 
 | 
	
	  | 4. Combine the two protein chains into a single PDB file containing both chains. 
    convpdb.pl -merge 1di2.proteinB.mutated.pdb 1di2.proteinA.mutated.pdb > \
               1di2.protein.pdb
            Take a look at the resulting structure with VMD. It should look like this:
 
  
 | 
        | 
 | 
 	
	  | 5. Extract the nucleic acid chains C, D, and E 
    convpdb.pl -chain C 1DI2.pdb > 1di2.rnaC.pdb
    ...
             | 
        | 
 | 
	
	  | 6. Generate the symmetry copy of chain E. The rotation matrix is available from the header
               of the PDB file (see the second matrix under 'REMARK 350 APPLY THE FOLLOWING TO CHAINS: E'). 
    convpdb.pl  -rotate -1 0 0 0 1 0 0 0 -1 1di2.rnaE.pdb > 1di2.rnaE.sc.pdb
             | 
	| 
 | 
	
	  | 7. Generate the first RNA strand from chain E followed by chain C. This requires renumbering
               of the chains from the original PDB. 
    convpdb.pl -setchain C -renumber 1 1di2.rnaE.pdb > \
               1di2.rna.strand1.part1.pdb
    convpdb.pl -renumber 11 1di2.rnaC.pdb > 1di2.rna.strand1.part2.pdb
            Merge the two parts into a single file:
    convpdb.pl -merge 1di2.rna.strand1.part2.pdb \
                1di2.rna.strand1.part1.pdb > 1di2.rna.strand1.pdb
            Repeat to generate strand 2 from chain D followed by the symmetry copy of chain E.
 Merge the two strands into a single file:
 
    convpdb.pl -merge 1di2.rna.strand2.pdb 1di2.rna.strand1.pdb > \
               1di2.rna.pdb
            Take a look at the resulting structure with VMD. It should look like this:
 
  
 | 
	| 
 | 
	
	  | 8. If you took a closer look at the resulting RNA duplex you might have noticed that the two
               strands appear to be broken in the middle. This is an artifact of how the crystal structure
               was obtained. The strand breaks can be "repaired" with a very short minimization in CHARMM because
               CHARMM can automatically add missing atoms. The following command carries out only 10 steps of
               steepest descent minimization. The 'nodeoxy' flag is needed to tell CHARMM that we are
               working with RNA instead of DNA. 
     minCHARMM.pl -par nodeoxy,minsteps=0,sdsteps=10 1di2.rna.pdb > \
                   1di2.rna.fixed.pdb
            Take a look at the resulting structure. The strand breaks should be gone. Also, the structure 
               now has all of the hydrogens compared to the PDB structure where hydrogens are missing because
               they are difficult to resolve experimentally. | 
	| 
 | 
	
	  | 9. We are now ready to combine the protein and RNA into a single file: 
      convpdb.pl -merge 1di2.rna.fixed.pdb 1di2.protein.pdb > 1di2.complex.pdb
            The model built so far should look like in the following image:
 
  
 | 
	| 
 | 
	
	  | 10. Next, we will add solvent to the complex. The crystal structure already 
            contains a number of water molecules. We will keep those and then add additional
            waters around to fill a simulation box. 
 First, we extract the waters from the PDB file and add missing hydrogens. It is
            convenient to keep the X-ray waters in a separate chain (chain X):
 
      convpdb.pl -nsel water 1DI2.pdb | complete.pl | convpdb.pl \ 
                 -setchain X > 1di2.xray.waters.pdb
            The  X-ray waters are combined with the complex ...
      convpdb.pl -merge 1di2.xray.waters.pdb 1di2.complex.pdb > \
                1di2.complex.waters.pdb
            ... and then solvated in a cubic box with at least 9 A between the complex or any 
            of the X-ray waters to the edge of the box:
      convpdb.pl -solvate -cutoff 8 -cubic 1di2.complex.waters.pdb > \
                  1di2.complex.solvated.pdb 
             | 
	| 
 | 
	
	  | 11. As a last step we need to add counterions to neutralize the system. The charge of 
            the protein-DNA complex can be obtained from CHARMM with the following command. Again,
            the 'nodeoxy' option is used because the complex contains RNA instead of DNA. 
      enerCHARMM.pl -par nodeoxy -charge 1di2.complex.pdb
            Counterions are added to solvated system by specifying the number of positive (SOD) and/or 
            negative ions (CLA). Decide how many and which type of ions we need to neutralize this
            system from the output of the previous command and then run the following command:
      convpdb.pl -ions <type>:<num> 1di2.complex.solvated.pdb > 1di2.complex.solvions.pdb
            You can check whether the resulting system is neutral with:
      enerCHARMM.pl -par nodeoxy -charge 1di2.complex.solvions.pdb
            Finally, we should have a fully solvated system that is ready as a starting point for simulations.
 
  
 |