Server Guide

The PSIPRED Protein Structure Analysis Workbench aggregates several UCL Bioinformatics Group prediction methods into one location. This document briefly describes the services and how to use them, and summarises the results that each analysis produces.

This guide is divided into three main sections. The first two sections explain the Input Form and the Results pages; the last section redirects to our Tutorials page, where a few cases are examined in more detail. You can view the input form at the main web page for the PSIPRED Server. You can also click here to view a fully interactive mock version of a typical results page.


Input Data

The input form allows users to select the analyses they wish to perform and input their query sequence. There are a number of mandatory fields.

Select input data type

Users need to select whether their input data is a protein sequence (either a single sequence or an MSA) or a PDB structure file (in the old format).

Choose Method

You must choose at least one predictive method to run.

Input Sequence

If you selected "Sequence Data" then enter your AMINO ACID sequence or MSA here. Please do not try to enter a nucleic acid sequence. We recommend that you enter your sequence as a plain single-letter string like this:

ALGSNLNTPVEQLHAALKAISQLSNTHLVTTSSFYKSKPLGPQDQPDYVNAVAKIETEL

Alternatively, you can enter your sequence in FASTA format, but the description text will be ignored by the server. Note that MSA data must be in FASTA format.

There is an upper limit of 1,500 residues on the length of sequences that can be submitted. If your sequence is longer than this, try breaking it into likely domains before submitting it.
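
The input rules above can be checked locally before submitting. The following is a minimal sketch, not the server's own validation code: the function names are hypothetical, and it assumes only the 20 standard amino acid letters are acceptable and that a sequence made up solely of A, C, G, T, U and N is probably nucleic acid.

```python
# Assumption: only the 20 standard amino acids are accepted (no ambiguity codes).
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LENGTH = 1500  # upper limit stated in this guide


def clean_sequence(text):
    """Drop FASTA description lines and whitespace; return an upper-case string."""
    lines = [ln.strip() for ln in text.splitlines()]
    residues = "".join(ln for ln in lines if ln and not ln.startswith(">"))
    return residues.upper()


def check_sequence(text):
    """Return (sequence, problems) for a single query sequence."""
    seq = clean_sequence(text)
    problems = []
    if not seq:
        problems.append("empty sequence")
    unexpected = sorted(set(seq) - VALID_RESIDUES)
    if unexpected:
        problems.append("unexpected characters: " + "".join(unexpected))
    # Heuristic (assumption): nothing but nucleotide letters suggests DNA/RNA.
    if seq and set(seq) <= set("ACGTUN"):
        problems.append("looks like a nucleic acid sequence")
    if len(seq) > MAX_LENGTH:
        problems.append(f"{len(seq)} residues exceeds the {MAX_LENGTH}-residue limit")
    return seq, problems
```

An empty problem list means the sequence passes these simple checks; it does not guarantee that the server will accept the job.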

Input PDB

If you selected "PDB Structure" then use the Browse button to find the location of the PDB file you wish to predict.

Submission Details

Email Address

Enter your e-mail address here. Results will be returned as soon as they are available, usually within 40 minutes, though sometimes longer depending on the server load. Bear in mind that if you enter an incorrect e-mail address, or do not provide an e-mail address, there is no way the server can contact you! Also check that your anti-spam software isn't rejecting the messages from our server. E-mail addresses are held only for as long as it takes to complete the predictions and send the e-mails.

Short Identifier

Use this field to assign a short memorable name to your prediction job. This is useful so that you can identify particular jobs in your mailbox. This is particularly important because PSIPRED will not necessarily return your results in the order you submitted them! Generally speaking, shorter jobs will be returned first. The name you specify will be included in the subject line of the e-mail messages sent to you from the server.

Advanced Options

Dompred

PSI-BLAST e-value cutoff:

Optimisation of the PSI-BLAST sequence alignment for domain prediction showed that an E-value cut-off of 0.01 gives the best trade-off between the sensitivity and selectivity of domain boundary prediction. Decreasing the E-value cut-off (i.e. reducing the number of 'significant' aligned sequences) was found to reduce the sensitivity but increase the selectivity of domain boundary prediction.

Input number of PSI-BLAST iterations (default 5)

The default number of PSI-BLAST iterations is 5. Decreasing the number of iterations may speed up the PSI-BLAST search, but may also result in a failure to identify more distant homologues. Be aware that the higher the number of iterations, the higher the risk of introducing profile wander into the PSI-BLAST sequence search.

Bioserf and Domserf

Modeller Key

BioSerf is a fully automated homology modelling pipeline which uses MODELLER to construct a final homology model. Because of the licence terms, if you select a BioSerf job you are required to provide the MODELLER key, which is available from the Sali Lab.

FFPred

SVM Library

You can choose an SVM search library optimised for either Human or Fly genes.

Metsite

Select Metal

Select one of six coordinating atoms for prediction

Select Chain ID

For the submitted PDB file you must select the ID of the chain you wish to make the prediction for.

Select False Positive Rate

Use this to select the prediction sensitivity.

Memembed

Search Type

Select the search algorithm for minimising the energy of embedding the structure in the membrane

Target is Beta Barrel

Select whether the structure provided is a beta barrel fold

N-Terminal Location

Select the cellular location of the N-Terminus of the protein.

HSPred

Chain Selection

Select the chain IDs for the chains you wish to analyse

Results Page

Heading

At the top of the page a title bar shows you the name of your job and provides an easy button to copy the URL for your results page.

Sequence Annotation Panel

The sequence plot shows you the sequence you submitted. If you have run a PSIPRED job, residues will be annotated according to the predicted secondary structure. If you have run a MEMSAT, MEMSATSVM or MEMPACK job, residues will be annotated according to the location of predicted TM helices. If you have run both types of analysis you can toggle between these annotations with the appropriate buttons. Also note that if a DISOPRED or DOMPRED job has been run, predicted disordered residues and any putative domain boundaries will be marked. Please note that all domain boundaries will be annotated; this does not imply that they are all simultaneously applicable.

Downloads Panel

When your job completes, links to all the results files are available in this panel.

Resubmission panel

The resubmission panel allows you to resubmit your sequence, or a subsequence of it, for further analysis. First use the boxes to select the sequence region you wish to resubmit. Then select a method. Finally, click the "Resubmit" button to submit a new job to the server. One obvious use would be to resubmit domain subsequences after running a DOMPRED job.
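
To prepare domain subsequences offline, for example after reading boundary positions from a DOMPRED result, something like the sketch below may help. It is illustrative only: the function name is hypothetical, and it assumes boundaries are 1-based residue positions where a boundary at position b ends one region at residue b and starts the next at residue b + 1.

```python
def split_at_boundaries(seq, boundaries):
    """Split seq into (start, end, subsequence) regions at boundary positions.

    Positions are 1-based and inclusive. Assumption: a boundary at b ends a
    region at residue b and the next region starts at residue b + 1.
    """
    if not seq:
        return []
    # Ignore duplicate boundaries and any that fall outside the sequence.
    cuts = sorted(b for b in set(boundaries) if 1 <= b < len(seq))
    regions = []
    start = 1
    for end in cuts + [len(seq)]:
        regions.append((start, end, seq[start - 1:end]))
        start = end + 1
    return regions
```

Each returned subsequence can then be pasted into the input form, or its start and end positions entered in the resubmission boxes.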

PSIPRED Results

PSIPRED returns the results of a neural network based 3-state Secondary Structure predictor. It is widely regarded as one of the best performing predictors in this field.

PSIPRED Cartoon

The diagrams annotate the query sequence with secondary structure cartoons and confidence value at each position in the alignment. The confidence is given as a series of blue bar graphs.

DISOPRED Results

Disopred Plot

The graph plots the DISOPRED3 disorder confidence level at each sequence position as a solid blue line. These are the confidence values output by the predictor for disordered residues. Values great enough to match the selected false positive rate are added to the sequence annotation graph.

To read background information about DISOPRED2 and some of the motivation behind disorder prediction you can view the DISOPRED Overview pages.
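
Per-residue confidence values of this kind are commonly turned into disordered regions by thresholding. The sketch below shows that idea only; the function name is hypothetical and the default threshold of 0.5 is an assumption, since on the server the cut-off follows from the selected false positive rate.

```python
def disordered_segments(scores, threshold=0.5):
    """Return 1-based inclusive (start, end) runs where score >= threshold.

    Assumption: threshold=0.5 is a placeholder, not the server's cut-off.
    """
    segments = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos  # a new disordered run begins here
        elif start is not None:
            segments.append((start, pos - 1))  # the run ended at the previous residue
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # run extends to the C-terminus
    return segments
```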

MEMSAT-SVM Results

MEMSAT Schematics

The first diagram shows a cartoon of the MEMSATSVM and MEMSAT3 TM helix predictions. MEMSATSVM predictions now include a prediction of pore-lining helices. The key for the schematic can be found at the bottom of the diagram. Below the schematic are the traces for the assorted SVM outputs that the MEMSATSVM prediction was based on. Further down the page are a series of cartoon diagrams of the membrane topology annotated with the predicted helix coordinates

GenTHREADER, pGenTHREADER and pDomTHREADER Results

Genthreader Table

Each table shows the structural hits for the query sequence. These are full PDB chains for GenTHREADER and pGenTHREADER, and CATH domains for pDomTHREADER. For each structure, the first portion of the table gives summary statistics:

  • Conf. : The hit confidence category based on p-value; GUESS (<1), LOW (<=0.1), MEDIUM (<=0.01), HIGH (<=0.001), CERT (<=0.0001)
  • Net Score: The GenTHREADER raw score
  • P-Value : The p-value
  • Pair E: The Pairwise Energy
  • Solv E: The solvation Energy
  • Aln Score: The Pairwise alignment score
  • Aln Len: The length of the alignment
  • Str Len: The length of the structural hit
  • Seq Len: The length of the query sequence
  • Domain Start: The start of the domain (pDomTHREADER only)
  • Domain End: The end of the domain (pDomTHREADER only)
  • Domain Code: The CATH code for the domain hit (pDomTHREADER only)
The latter portion of the table links out to other resources and has the following columns
  • View Alignment: A button that opens an annotated alignment. Known ligand binding residues are annotated on the hit
  • SCOP Codes: A link that searches SCOP for the PDB chain (GenTHREADER and pGenTHREADER only)
  • CATH Codes: A link that searches CATH for the PDB chain (GenTHREADER and pGenTHREADER only)
  • Structure: A thumbnail image of the hit; clicking the link will take you to PDBSum
  • CATH Entry: A link that searches CATH web services to summarise the hit.
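
The confidence categories in the Conf. column can be reproduced from the p-value using the thresholds listed above. This is a small sketch with a hypothetical function name, not code taken from the threading programs themselves; it returns None for values outside the listed ranges.

```python
# Thresholds as listed in this guide, from most to least stringent.
CATEGORIES = [
    ("CERT", 0.0001),
    ("HIGH", 0.001),
    ("MEDIUM", 0.01),
    ("LOW", 0.1),
]


def confidence_category(p_value):
    """Map a p-value to the confidence label used in the results table."""
    if p_value < 0:
        return None
    for label, cutoff in CATEGORIES:
        if p_value <= cutoff:
            return label
    # Everything else below 1 falls into the weakest category.
    return "GUESS" if p_value < 1 else None
```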

MetaPSICOV Results

Contact Map

MetaPSICOV produces a contact map: a 2D binary matrix of contacts. The query protein sequence is laid out along both the X and Y axes, and contacting residues are indicated by a 1 in the matrix. In the diagram contacts are shown in green, and the self-self cells are omitted and coloured black. The matrix is symmetric, so the lower and upper halves display the same data.
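
As an illustration of the data structure described here, the sketch below builds a symmetric binary matrix from a list of contacting residue pairs. The function name and the input format (1-based residue pairs) are assumptions for the example and do not describe the MetaPSICOV output file.

```python
def contact_map(length, contacts):
    """Build a symmetric length x length binary matrix from 1-based residue pairs."""
    matrix = [[0] * length for _ in range(length)]
    for i, j in contacts:
        if i == j:
            continue  # self-self cells are never marked as contacts
        matrix[i - 1][j - 1] = 1
        matrix[j - 1][i - 1] = 1  # mirror the contact to keep the matrix symmetric
    return matrix
```

Because every contact is mirrored, reading either the upper or the lower triangle gives the full set of predicted contacts.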

MEMPACK Results

Packing Diagram

Running a MEMPACK job will also run a MEMSATSVM job, and those results will also be available. The MEMPACK output shows a top-down diagram of the possible packing of the predicted transmembrane helices. Possible residue contacts are predicted between each pair of helices, then the helices are arranged and oriented to maximise the number of helix contacts that face one another.

DomPred Results

Boundary Graph

The DOMPRED output shows the graph produced by the PSI-BLAST aligned termini algorithm. The graph is annotated with secondary structure regions. Peaks in the aligned termini profile indicate regions that may form a structural domain boundary, and the putative domain boundaries correspond to the highest peaks in the graph. You can read the DomPred Overview for background and further details.
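
The idea of reading boundaries off the highest peaks can be sketched as a simple local-maximum search. This is not the DomPred algorithm: the function name, the peak definition and the top_n parameter are assumptions made purely for illustration.

```python
def profile_peaks(profile, top_n=3):
    """Return 1-based positions of the top_n highest local maxima in a profile."""
    peaks = []
    for i in range(1, len(profile) - 1):
        # A local maximum rises from the left and does not rise to the right.
        if profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]:
            peaks.append((profile[i], i + 1))
    # Highest peaks first; ties are broken by sequence position.
    peaks.sort(key=lambda peak: (-peak[0], peak[1]))
    return [position for _, position in peaks[:top_n]]
```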

Bioserf and Domserf Results

Homology Model

If you provide a valid MODELLER key you are able to run a Bioserf or Domserf job. Bioserf is a fully automated homology modelling service which integrates PSI-BLAST, HHBlits, PSIPRED, pGenTHREADER and MODELLER. The final output is a PDB file which can be viewed by clicking the Bioserf tab on the results page. Bioserf produces a single homology model over the whole query protein chain.

Domserf is a structural domain modelling protocol integrating PSI-BLAST, HHBlits and pDomTHREADER searches of the CATH domain database. One or more models are produced, and users can switch between them using the buttons, which indicate the sequence region each model applies to.

FFPred Results

GO Term Table

FFPred attempts to predict GO terms for eukaryotic proteins using a series of Support Vector Machines (SVMs) optimised for either Fly or Human genes. The top of the page gives three tables which summarise these predictions, one table for each Gene Ontology domain (Biological Process, Molecular Function, Cellular Component). The tables provide the scoring for each GO term, equal to the posterior probability for the query protein to be annotated with that GO term. Also, note that predictions obtained using less reliable SVMs are shown at the bottom of each table over a red background. SVMs are regarded as reliable when their MCC, sensitivity, specificity and precision are jointly above a given threshold.

Below the tables are summaries of the features that were calculated for the incoming query sequence, and were used by the SVMs to obtain the predictions.

HSPred Results

Structure

The HSPred output indicates where hotspot residues lie on either side of the defined protein-protein interface. Hot spots are here defined as those residues for which ΔΔG > 2 kcal/mol (where ΔΔG is the change in binding free energy due to an alanine mutation at that position). The output presents the two analysed PDB chains. Blue residues are not involved in the interface, white residues make a minor contribution to the change in free energy, and red residues are the main hotspot residues.
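
The hot spot definition amounts to a simple threshold on the change in binding free energy. The sketch below applies that threshold to a set of per-residue values; the function name, the residue labels and the numbers in the usage note are hypothetical, and HSPred itself reports predictor scores rather than measured values.

```python
HOTSPOT_THRESHOLD = 2.0  # kcal/mol, from the definition in this guide


def hotspots(ddg_by_residue):
    """Return (residue, ddG) pairs with ddG > 2 kcal/mol, largest first."""
    hits = [
        (residue, ddg)
        for residue, ddg in ddg_by_residue.items()
        if ddg > HOTSPOT_THRESHOLD  # strictly greater than the threshold
    ]
    return sorted(hits, key=lambda item: -item[1])
```

For example, calling hotspots({"A:45": 2.6, "A:47": 0.4, "B:12": 3.1}) would keep only the residues labelled A:45 and B:12.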

Table

The table lists the possible hotspot residues, ranked by the predictor's raw score.

MEMEMBED Results

Membrane Embedding diagram

The results show the PDB structure the user provided embedded in a simple model of the lipid bilayer, positioned such that a simple energy potential is minimised.

MetSite Results

Structure Diagram

The MetSite output shows the location of possible metal binding residues. Each residue that may bind the selected metal ion is listed in the table at the bottom of the page and is annotated in the structure. Chains which were not used in the prediction are coloured black, and metal binding residues are coloured red.