Bioinformatics

Molecular biology on the Internet

    Prediction of protein localization:

    With the sequencing of numerous genomes and the proteomes derived from them, the question of the function of these proteins inevitably arises. In some cases, database comparisons reveal homologous proteins whose function is known. Alternatively, being able to determine or predict a protein's localization would already be a valuable clue to its physiological role.

    PSORT is a program that makes such predictions by combining several prediction algorithms (e.g. for identifying membrane-spanning protein segments or signal sequences for secretion across the cytoplasmic membrane). iPSORT is a further development of this approach and is based on the recognition of N-terminal sorting signals. The PSORT site is also recommended for its excellent collection of links!

    A different approach is implemented in the program NNPSL, which uses a neural network to infer a protein's localization from its amino acid composition alone. In contrast to the PSORT algorithm, a successful prediction does not require the presence of targeting signals. This program therefore also works with sequences from gene prediction algorithms, in which the N-terminus may not have been predicted correctly.
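
    To make the idea of a composition-based input concrete, the following is a minimal sketch of how a 20-dimensional amino acid composition vector could be computed from a sequence. It is not NNPSL's actual code: the function name, the residue ordering and the toy sequence are assumptions made for illustration, and the neural network that would consume such a vector is not shown.

```python
from collections import Counter

# Assumption: the 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in `sequence`.

    Non-standard characters (e.g. X, gaps) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(res for res in seq if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not a real protein.
features = amino_acid_composition("MAAKKLLVVG")
```

    Because the vector depends only on residue counts, it is unaffected by a wrongly predicted start codon as long as most of the sequence is correct, which is the property described above.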

    A prediction algorithm from the laboratory of Gunnar von Heijne, TargetP, is based on the assumption that most export and import pathways recognize N-terminal peptide sequences as localization signals. This program is also based on a neural network. It is one of the best prediction programs currently available.
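
    As a rough illustration of what "looking at the N-terminus" can mean in practice, the sketch below extracts an N-terminal window and derives two simple descriptors from it. This is not TargetP's input encoding: the window length of 30 residues, the choice of hydrophobic and positively charged residue sets, and the function name are all assumptions made for this example.

```python
# Assumption: a simple, hand-picked set of hydrophobic residues.
HYDROPHOBIC = set("AVLIMFWC")
# Lysine and arginine carry a positive charge at physiological pH.
POSITIVE = set("KR")


def n_terminal_features(sequence, window=30):
    """Describe the first `window` residues of a protein sequence."""
    n_term = sequence.upper()[:window]
    if not n_term:
        raise ValueError("empty sequence")
    length = len(n_term)
    return {
        "length": length,
        "hydrophobic_fraction": sum(r in HYDROPHOBIC for r in n_term) / length,
        "positive_count": sum(r in POSITIVE for r in n_term),
    }


# Hypothetical toy sequence: a short N-terminal stretch followed by filler.
print(n_terminal_features("MKKLLAAG" + "D" * 50, window=8))
```

    Descriptors of this kind only make sense if the N-terminus is correct, which is why signal-based predictors are sensitive to errors in gene prediction.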

    Further specialized programs (organelle targeting, nuclear localization sequences) can be found in the following list and under "References".

    Further Internet sites for localization prediction:

  • PSORT derivatives:

    Prokaryotic subcellular localization predictors:

  • P-Classifier (Wang et al., 2005) predicts the protein subcellular localization for Gram-negative bacteria using amino acid subalphabets and a combination of multiple support vector machines.
  • PSLpred (Bhasin et al., 2005) is a localization prediction tool for Gram-negative bacteria that uses a support vector machine and PSI-BLAST to generate predictions for 5 localization sites.
  • LOCnet and LOCtarget (Nair and Rost, 2004). LOCnet is a eukaryotic and prokaryotic localization prediction tool that uses several of CUBIC's services to generate a prediction. LOCtarget is a database of predictions generated using LOCnet for eukaryotic structural genomics targets.
  • CELLO (subCELlular LOcalization predictive system) (Yu et al., 2004) uses a support vector machine based on n-peptide composition to assign a Gram-negative protein to the cytoplasm, inner membrane, periplasm, outer membrane or extracellular space.
  • LOCtree (Nair and Rost, 2005). LOCtree is a eukaryotic and prokaryotic localization prediction tool available at the CUBIC site. Databases of localization predictions made by CUBIC's servers are also available at this site.
  • SignalP (Bendtsen et al., 2004) predicts traditional N-terminal signal peptides in both prokaryotic and eukaryotic proteins.
  • Proteome Analyst's Subcellular Localization Server (Lu et al., 2004). The specialized server available at the PENCE Proteome Analyst site is able to classify Gram-negative, Gram-positive, fungi, plant and animal proteins to many localization sites.
  • SubLoc (Hua and Sun, 2001) uses a support vector machine to assign a prokaryotic protein to the cytoplasmic, periplasmic, or extracellular sites, and a eukaryotic protein to the cytoplasmic, mitochondrial, nuclear, or extracellular sites. A modified version of SubLoc was used in PSORT-B v.1.1 to differentiate cytoplasmic and non-cytoplasmic proteins.
  • NNPSL (Reinhardt and Hubbard, 1998) uses amino acid composition to assign a prokaryotic protein to the cytoplasmic, periplasmic, or extracellular sites, and a eukaryotic protein to the cytoplasmic, mitochondrial, nuclear, or extracellular sites.
  • ProtLock (Cedano et al., 1997) performs a correlation analysis of the amino acid composition and predicts the cellular location of a protein. The statistical analysis discriminates among the following five protein classes: integral membrane proteins, anchored membrane proteins, extracellular proteins, intracellular proteins and nuclear proteins.
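
    Several of the tools listed above work from sequence composition rather than from targeting signals. As a minimal sketch of the kind of n-peptide composition feature mentioned for CELLO, the function below counts overlapping n-residue windows (dipeptides for n = 2). It is an illustration only: the function name, the normalization and the toy input are assumptions, and the support vector machine that would be trained on such features is not shown.

```python
from itertools import product

# Assumption: the 20 standard residues in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def n_peptide_composition(sequence, n=2):
    """Return the relative frequency of every possible n-peptide.

    Windows overlap; windows containing non-standard residues are skipped.
    """
    seq = sequence.upper()
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    total = 0
    for i in range(len(seq) - n + 1):
        peptide = seq[i:i + n]
        if peptide in counts:
            counts[peptide] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence is too short or has no standard residues")
    return {key: count / total for key, count in counts.items()}


# Hypothetical toy sequence, not a real protein.
composition = n_peptide_composition("AAKA")
```

    For n = 2 the feature vector has 400 entries, so the dimensionality grows quickly with n; this is one reason such methods are usually restricted to short n-peptides.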

    Eukaryotic subcellular localization predictors:

  • PeroxiP predicts peroxisomal proteins and Pfam domains.
  • pTARGET (Guda and Subramaniam, 2005) uses amino acid composition and localization-specific Pfam domains to assign a eukaryotic protein to one of nine localization sites.
  • LOCSVMPSI (Xie et al., 2005) is a eukaryotic localization prediction method that incorporates evolutionary information into its predictions. The method uses PSI-BLAST and a support vector machine to generate predictions for up to 12 localization sites.
  • pSLIP (Sarda et al., 2005) uses a support vector machine and multiple physicochemical properties of amino acids to assign a eukaryotic protein to one of six localization sites.
  • Protein Prowler (Boden and Hawkins, 2005) classifies eukaryotic targeting signals as secretory, mitochondrion, chloroplast or other.
  • LOCtree (Nair and Rost, 2005). LOCtree is a eukaryotic and prokaryotic localization prediction tool available at the CUBIC site. Databases of localization predictions made by CUBIC's servers are also available at this site.
  • HSLpred (Bhasin et al., 2005) is a localization prediction tool for human proteins that uses a support vector machine and PSI-BLAST to generate predictions for 4 localization sites.
  • PSLT (Scott et al., 2004) is a Bayesian network-based method that predicts human protein localization based on motif/domain co-occurrence. The tool is not yet available online; however, its predictions for 9793 human proteins in SWISS-PROT are available for download from the PSLT site.
  • LOCnet and LOCtarget (Nair and Rost, 2004). LOCnet is a eukaryotic and prokaryotic localization prediction tool that uses several of CUBIC's services to generate a prediction. LOCtarget is a database of predictions generated using LOCnet for eukaryotic structural genomics targets.
  • ESLPred (Bhasin and Raghava, 2004) uses a support vector machine and PSI-BLAST to assign eukaryotic proteins to the nucleus, mitochondrion, cytoplasm, or extracellular space.
  • SignalP (Bendtsen et al., 2004) predicts traditional N-terminal signal peptides in both prokaryotic and eukaryotic proteins.
  • Esub8 (Cui et al., 2004) predicts protein subcellular localizations (8 compartments) in eukaryotic organisms based on amino acid composition and using a support vector machine.
  • SecretomeP (Bendtsen et al., 2004) predicts eukaryotic proteins which are secreted via a non-traditional secretory mechanism.
  • Proteome Analyst's Subcellular Localization Server (Lu et al., 2004). The specialized server available at the PENCE Proteome Analyst site is able to classify Gram-negative, Gram-positive, fungi, plant and animal proteins to many localization sites.
  • NucPred (Heddad et al., 2004) uses the presence of nuclear localization signals identified through a genetic programming algorithm as the basis of its classification method.
  • LumenP (Westerlund et al., 2003) is a neural network predictor for protein localization in the thylakoid lumen. The program is available on request from the authors.
  • SubLoc (Hua and Sun, 2001) uses a support vector machine to assign a prokaryotic protein to the cytoplasmic, periplasmic, or extracellular sites, and a eukaryotic protein to the cytoplasmic, mitochondrial, nuclear, or extracellular sites. A modified version of SubLoc was used in PSORT-B v.1.1 to differentiate cytoplasmic and non-cytoplasmic proteins.
  • TargetP (Emanuelsson et al., 2000) predicts the presence of signal peptides, chloroplast transit peptides, and mitochondrial targeting peptides for plant proteins, and the presence of signal peptides and mitochondrial targeting peptides for eukaryotic proteins.
  • predictNLS (Cokol et al., 2000) uses nuclear localization signal motifs to predict whether a protein might be localized to the nucleus.
  • ACNpredictor (Jagla and Schuchhardt, 2000) is an artificial neural network (ACN) for recognition of sequence patterns. The method is applied to the prediction of signal peptide cleavage sites in human secretory proteins.
  • ChloroP (Emanuelsson et al., 1999) is a neural network based method for identifying chloroplast transit peptides and their cleavage sites.
  • NNPSL (Reinhardt and Hubbard, 1998) uses amino acid composition to assign a prokaryotic protein to the cytoplasmic, periplasmic, or extracellular sites, and a eukaryotic protein to the cytoplasmic, mitochondrial, nuclear, or extracellular sites.
  • ProtLock (Cedano et al., 1997) performs a correlation analysis of the amino acid composition and predicts the cellular location of a protein. The statistical analysis discriminates among the following five protein classes: integral membrane proteins, anchored membrane proteins, extracellular proteins, intracellular proteins and nuclear proteins.
  • Predotar is a neural-network-based prediction program capable of identifying ER signal peptides and mitochondrial or plastid transit peptides.
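
    Motif-based tools in this list, such as predictNLS, search a sequence for nuclear localization signal (NLS) patterns. The sketch below shows the general idea with a single regular expression. It does not reproduce predictNLS or its motif collection: the pattern is the simplified classical monopartite consensus K-(K/R)-X-(K/R), and the function name is an assumption made for this example.

```python
import re

# Assumption: simplified classical monopartite NLS consensus K-(K/R)-X-(K/R).
# The lookahead group lets overlapping candidates be reported separately.
NLS_PATTERN = re.compile(r"(?=(K[KR].[KR]))")


def find_nls_candidates(sequence):
    """Return (1-based start position, matched residues) for each candidate."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in NLS_PATTERN.finditer(seq)]


# The NLS of the SV40 large T antigen, PKKKRKV, serves as a positive example.
print(find_nls_candidates("PKKKRKV"))
```

    A pattern this short also matches many non-nuclear proteins, so a hit should be read as a candidate signal rather than as a localization prediction.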

    Protein localization databases:

  • PSORTdb, a database of bacterial protein subcellular localizations (Rey et al., 2005).
  • NMP-db, a database of nuclear matrix associated proteins (Mika and Rost, unpublished).
  • LOChom, a database of subcellular localization predictions based on sequence homology to experimentally annotated proteins (Nair and Rost, unpublished).
  • LOCtarget, a database of predicted subcellular localization for potential targets for structural genomics from TargetDb (Nair and Rost, 2004).
  • ER-GolgiDB, a database of predictions for Endoplasmic Reticulum and Golgi Apparatus localization based on sequence homology to experimentally annotated proteins (Wrzeszczynski and Rost, 2004).
  • LOC3D, a database of predicted subcellular localization for eukaryotic proteins of known 3D structure (Nair and Rost, 2003).
  • NLSdb, a database of nuclear localization signals (NLSs) and of nuclear proteins targeted to the nucleus by NLS motifs (Nair et al., 2003).
  • LOCkey, a database for predicted subcellular localization for entire proteomes using LOCtree (Nair and Rost, 2002).

    Example sequences:

  • Amino acid sequence database via Entrez

  • Bacteriorhodopsin from Halobacterium salinarium: 7-helix bundle protein
  • LacY: lactose permease from Escherichia coli
  • LamB: maltoporin from Escherichia coli
  • MalE: maltose-binding protein from Escherichia coli: protein with an N-terminal signal sequence for secretion into the periplasm
  • MalF: subunit of the maltose permease from Escherichia coli
  • MalG: subunit of the maltose permease from Escherichia coli
  • MalK: ATPase subunit of the maltose transport system from Escherichia coli
  • OmpA from Escherichia coli: two-domain protein: the N-terminal domain is embedded in the outer membrane as an 8-stranded beta-barrel; the C-terminal domain is located in the periplasm
  • TonB from Escherichia coli: protein with an N-terminal transmembrane helix
  • TolA from Escherichia coli: protein with an N-terminal transmembrane helix

 

Latest update of content: October 3, 2005


Ralf Koebnik
Institut de recherche pour le développement
UMR 5096, CNRS-UP-IRD
911, Avenue Agropolis, BP 64501
34394 Montpellier, Cedex 5
FRANCE
Phone: +33 (0)4 67 41 62 28
Fax: +33 (0)4 67 41 61 81
Email: koebnik(at)gmx.de
Please replace (at) by @.

