Exercise 1



Computer Accounts
Hint
Using Unix
Activity 1
The text editor jot
Activity 2
Use of the World Wide Web
Databases
File Format Interconversions
PDB data files
Activity 3
Activity 4
Report for this Exercise

Computer Accounts

If you do not already have an account on the CCBL computer cluster, one will be created for you as a regular user. Logging into the system will be demonstrated during the first laboratory session. You will have to give your user name and password each time you log in.

Minimize the Console window that comes up after you log on, and do not use it for your work unless really necessary. System error messages appear in this window, and they will be hard to locate if you are also using it heavily for your own work.

You are allocated enough disk space to store 100 MB of data. If you exceed this limit (easy to do) for more than a week, the system will not let you store additional files. Good housekeeping is essential: periodically remove files that you no longer need.
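
One way to keep an eye on your disk usage is sketched below. It is only an illustration: the names TARGET and LIMIT_KB are made up for this example, and the way the cluster actually accounts for your 100 MB may differ from what du reports.

```shell
# Sketch only: TARGET and LIMIT_KB are illustrative names, and the real
# quota accounting on the cluster may count space differently.
TARGET="${TARGET:-${HOME:-.}}"
LIMIT_KB=$((100 * 1024))        # 100 MB expressed in kilobytes

# du -sk prints the total size of a directory tree in kilobytes.
used_kb=$(du -sk "$TARGET" 2>/dev/null | awk '{ print $1 }')
echo "Using $used_kb KB of $LIMIT_KB KB"

if [ "$used_kb" -gt "$LIMIT_KB" ]; then
    echo "Over the limit: remove files you no longer need."
fi

# List the five largest items directly under TARGET, biggest first.
du -sk "$TARGET"/* 2>/dev/null | sort -rn | head -5
```

The last command is usually the quickest way to find what to clean up first.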

Workstations in the CCBL are networked. Any unit in the room can be used and will operate identically to any other.

Changing your password: A simple password was used when your computer account was first established. It is a good idea to change this password immediately and then to change it frequently during the course of the quarter. Passwords must consist of 6 or more characters, including at least two alphabetic characters and at least one non-numeric character. Passwords can (and should) be a mix of upper- and lower-case letters. Most important, choose a password that you can remember!

To change your password, open a UNIX window and type 'yppasswd' at the system prompt. Respond to the questions that come up. You will be asked to enter the new password twice.

HINT

Consider keeping a laboratory notebook as you work through the laboratory exercises and your project.
 

Activity 1: Go to your home directory. Create a new directory called ch145 there. Change to the new directory (ch145) and create the following new directories there: exer1, exer2, exer3, exer4, exer5, exer6, project. Now change to the directory exer1 and type pwd. Note that the path to the directory is given before the UNIX prompt. Confirm that this directory is empty (contains no files). The set of directories you have created can be used to segregate and store the results obtained with each of the class exercises.
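
The steps in Activity 1 might look something like the following at the command line. This is a sketch, not the only way to do it; BASE is a made-up variable that defaults to your home directory, and mkdir -p is used so that the commands can be repeated without errors.

```shell
# Sketch only: BASE is a hypothetical variable; by default it is your
# home directory.
BASE="${BASE:-$HOME}"

cd "$BASE" || exit 1
mkdir -p ch145            # -p: no error if the directory already exists
cd ch145 || exit 1
mkdir -p exer1 exer2 exer3 exer4 exer5 exer6 project

cd exer1 || exit 1
pwd                       # prints the full path of the current directory
ls -a                     # a new, empty directory shows only . and ..
```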

The text editor jot

Various text files, including the Cartesian coordinates of structures, are the input data for the software that will be used for this course. These files invariably have to be changed or edited in some way to remove unneeded information. The SGI operating system IRIX includes a rather nice text editor called jot. You can start the editor by opening a UNIX shell window (Toolchest -> Desktop -> Open Unix Shell) and then typing jot at the command line prompt. If you want to edit a particular file, type jot name_of_file. You will need to be able to edit text files rapidly and reliably, so spend some time becoming familiar with the capabilities of jot.

Activity 2: Go to the directory you have called exer1. Start the text editor. Check out the functions under the File, Edit, View, Select, and Options buttons. (One of the choices under Options is Long Menus. If this is selected, many more choices appear under these menu buttons, including several search-and-replace functions.) Holding down the left mouse button and dragging lets you highlight text. Clicking the right mouse button then gives you a choice of what to do with the highlighted text.

Type some text into the first few lines. Store the text file in the directory exer1. Go to a UNIX window and confirm that the file has been stored as expected. Use a UNIX command to list the contents of the file and to determine how large the stored file is (in bytes). Then use UNIX to copy the text file to the directory exer4.
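
The UNIX side of this activity can be sketched as follows. The file name notes.txt and the variable CH145 are made up for the example, and the printf line merely stands in for the text you would type and save in jot.

```shell
# Sketch only: CH145 and notes.txt are hypothetical names; the directory
# tree is assumed to be the one created in Activity 1.
CH145="${CH145:-$HOME/ch145}"
mkdir -p "$CH145/exer1" "$CH145/exer4"   # harmless if they already exist
cd "$CH145/exer1" || exit 1

# Stand-in for the text typed into jot and saved from there.
printf 'first line\nsecond line\n' > notes.txt

ls -l notes.txt          # long listing; one column is the size in bytes
cat notes.txt            # print the contents of the file
wc -c < notes.txt        # print only the byte count
cp notes.txt ../exer4/   # copy the file into the exer4 directory
ls -l ../exer4           # confirm that the copy arrived
```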

Use of the World Wide Web

The Netscape web browser is available as a pull-down from the Desktop menu item. If by some chance you have managed not to become familiar with searching the World Wide Web, learn how to access and use various search engines (Google, LookSmart, Lycos and the like). Learn how to connect to various web pages, since information to be used for the exercises will have to be obtained through them. Learn how to access the class web page. You can find tutorials and other information on many aspects of the course on the web. If you come across a particularly interesting, relevant, or useful site, let your fellow classmates know about it.

Activity 3: "Bookmarks" in the Netscape browser allow you to switch rapidly to web sites of particular interest. Erase all out-of-date or irrelevant Bookmarks from your copy of Netscape. Then open the following file: /usr/local/spec/sybyl/TriposBookshelf/index.html. This is the starting point for the on-line descriptions and tutorials associated with the SYBYL software that will be used for the course. Establish a Bookmark to this file.

Establish bookmarks for the following web resources:

Protein Data Bank
NIH Molecular Modeling homepage
National Center for Biotechnology Information
MBS (Molecular Biology Shortcuts) website

Databases

One highly useful aspect of the WWW is the ready availability of structural data. Structural databases are updated frequently and are growing rapidly. In addition to the PDB, mentioned above, other structural databases include the Cambridge Structural Database (CSD) and the Nucleic Acid Database. (The CSD is not available on the WWW but rather is licensed to the department for an annual fee. A tutorial on using the CSD is available in the course reader.)

File Format Interconversions

Molecular structures used in computational studies of biochemical systems are almost always represented by a set of Cartesian coordinates. The coordinates generally are stored as text files. Unfortunately, there is little agreement on the format of these files, and converting from one file format to another may be necessary before a particular program can be used. BABEL (http://smog.com/chem/babel) is a freeware program that interconverts about 50 different file formats. It has been installed on the CCBL system and can be accessed by typing babel at the command line.

The commercial program mol2mol (http://www.compuchem.com/mol2mol.htm) offers a fancier interface and a different set of format interconversion possibilities.

PDB data files

The Protein Data Bank (PDB) is an archive of three-dimensional structures of biological macromolecules, intended to serve researchers, educators, and students. Each structure archived is represented by a file that contains atomic coordinates, bibliographic citations, primary and secondary structure information, as well as crystallographic structure factors or experimental NMR data.

PDB files have a definite format and use a specific nomenclature for amino acids and nucleosides. These are described in detail at http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html and in the paper first announcing this format (J. Mol. Biol. 112, 535-542 (1977)). The information in the PDB files is line oriented. Each line in the file starts with a descriptor, which is followed by data of some sort. PDB files are text files and can be modified by a text editor like jot. However, if a PDB file is modified, the critical spacing or alignment of data must be retained in the modified file.
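
To see why the alignment matters, consider the small sketch below. The file sample.pdb and its coordinates are invented for illustration; the column positions used (record name in columns 1-6, residue name in 18-20, x, y, and z in 31-38, 39-46, and 47-54) are those of the coordinate records in the PDB format guide. Fields are read by column position rather than by splitting on spaces, because neighboring fields can run together.

```shell
# Sketch only: sample.pdb and its coordinates are made up for illustration.
# printf with fixed field widths guarantees the column alignment.
fmt='%-6s%5s %-4s%1s%3s %1s%4s%1s   %8s%8s%8s%6s%6s\n'
{
    printf "$fmt" ATOM   1 " N"  "" ALA A 1 "" 11.104 6.134 -6.504 1.00 20.00
    printf "$fmt" ATOM   2 " CA" "" ALA A 1 "" 11.639 6.071 -5.147 1.00 20.00
    printf "$fmt" HETATM 3 " O"  "" HOH A 2 "" 3.500 -1.250 8.000 1.00 30.00
} > sample.pdb

# Count how many lines carry each descriptor (columns 1-6).
cut -c1-6 sample.pdb | sort | uniq -c

# Read the residue name and the coordinates by column position.
awk '{
    res = substr($0, 18, 3)
    x = substr($0, 31, 8) + 0
    y = substr($0, 39, 8) + 0
    z = substr($0, 47, 8) + 0
    print res, x, y, z
}' sample.pdb
```

If an edit shifts a line by even one character, the substr calls above pick up the wrong columns, which is exactly what happens inside programs that read PDB files.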

Standard residue names used in PDB entries are the following: amino acids: ALA, ARG, ASN, ASP, CYS, GLN, GLU, GLY, HIS, ILE, LEU, LYS, MET, PHE, PRO, SER, THR, TRP, TYR, VAL, ASX, GLX; nucleic acids: A, C, G, T, U, I, +A, +C, +G, +T, +U, +I; other structural components: UNK (unknown); atoms not part of a standard amino acid or nucleoside: HETERO.

Inspection of PDB files will show that they usually contain the coordinates of other crystallographically distinguishable moieties, such as water, cofactors, and ions present during the crystallization process, in addition to the coordinates of the heavy atoms of the structure of interest.

Activity 4: Find and download the PDB file for "Dickerson's dodecamer". Open the downloaded (text) file in the text editor jot. Determine how many crystallographically distinct water molecules were found in the structure of this compound. What are the HETERO atoms in this structure? Edit out the water molecules and write a new coordinate file for this structure, with a different name from the original, into the directory exer2.
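
Once you have done the editing by hand in jot, you may want to cross-check your result from the command line. The sketch below assumes that each water molecule appears as a single HETATM line with the residue name HOH in columns 18-20 (one oxygen per water, no hydrogens); confirm this by inspecting your own file. The file names and the few records written here are invented stand-ins, not data from the real entry, and other records that refer to atom numbers are not adjusted.

```shell
# Sketch only: the file names and records below are made-up stand-ins for
# the downloaded file; they are not taken from the real PDB entry.
fmt='%-6s%5s %-4s%1s%3s %1s%4s%1s   %8s%8s%8s%6s%6s\n'
{
    printf 'REMARK   illustrative sample, not real data\n'
    printf "$fmt" ATOM   1 " P"  "" "G" A 2  "" 10.000 20.000 30.000 1.00 40.00
    printf "$fmt" ATOM   2 " N1" "" "C" A 1  "" 11.000 21.000 31.000 1.00 40.00
    printf "$fmt" HETATM 3 " O"  "" HOH A 25 "" 1.000 2.000 3.000 1.00 50.00
    printf "$fmt" HETATM 4 " O"  "" HOH A 26 "" 4.000 5.000 6.000 1.00 50.00
    printf 'END\n'
} > dodecamer_sample.pdb

# Count HETATM lines whose residue name (columns 18-20) is HOH.
n_water=$(awk 'substr($0, 1, 6) == "HETATM" && substr($0, 18, 3) == "HOH" { n++ }
               END { print n + 0 }' dodecamer_sample.pdb)
echo "water records: $n_water"

# Write every line that is not a water record to a new file.
awk '!(substr($0, 1, 6) == "HETATM" && substr($0, 18, 3) == "HOH")' \
    dodecamer_sample.pdb > dodecamer_nowater.pdb
```

Because whole lines are kept or dropped unchanged, the column alignment of the remaining records is preserved.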

Report for This Exercise

Answers to the following questions are to be turned in by January 14.

1. Find and print a description of the program Raster3D on the web. (Do not download or install this program on the CCBL system; this has already been done.)

2. Find and report the URL for a program that will help with the prediction of receptor-ligand interactions.

3. Give a complete literature citation for the paper that reported the structure of Dickerson's dodecamer. How many crystallographic water molecules are found in the structure reported in that paper?

Note: Your system directories will be inspected to determine if the activities outlined above have been carried out.
