This is an archive copy of the SinCris component of the IUCr web site dating from 2008. For current content please visit the RESOURCES section of https://www.iucr.org.

ProAnWin - Protein Analyst for Windows

State Research Center of Virology and Biotechnology
Koltsovo, Novosibirsk Region, 633159 Russia
and
Irina Pika, Anatoly Frolov, Vladimir Ivanisenko with Alexey Eroshkin
are pleased to announce the availability of a new MS Windows application for multiple protein sequence alignment, comparative sequence analysis, the study of protein structure-activity (property/phenotype) relationships, and the design of site-directed mutagenesis experiments.

DESCRIPTION:

ProAnWin studies the relationships between protein/peptide activity (or a property or related phenotype) and the characteristics of regions in the primary or tertiary structure of these molecules. Structure-activity analysis is based on the sequences of a protein family, data on protein activity (pK, ED50, Km or any other measure) and, if available, the 3D structure of one of these proteins (assuming a common 3D fold for all homologs). The main aim is to identify the factors responsible for the variation in protein activity: the location of the activity-modulating site and the important structural characteristics of that site.

The program provides the following: input of sequences in several formats (SWISS-PROT, PIR, FASTA, GCG, CLUSTAL) and of a 3D structure in PDB format; flexible multiple alignment of protein sequences and threading of sequences onto a known 3D structure (ClustalV plus manual alignment); input of user-defined protein activities, properties or related phenotypes (with the option to transform the activity: log(x), 1/x, etc.); calculation of many characteristics (hydrophobicity, amphipathicity, etc.) of linear and spatial protein sites; fast multiple linear regression analysis of structure-activity relationships (up to eight independent factors); activity prediction for untested or mutated proteins; data visualization (regression plots, 3D pictures with sites highlighted, multiple alignments); and display of the sites found on the sequences and the 3D structure. The program has two main linked windows, one with the protein sequences and one with the 3D structure; any site highlighted in the sequences is also highlighted in the 3D structure and vice versa.
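The core of this workflow (transform the activity, compute a site characteristic, fit a regression) can be illustrated with a short Python sketch. This is not ProAnWin's code. The Kyte-Doolittle hydropathy scale is an assumed choice of characteristic, the site boundaries and the function names `site_mean_hydropathy` and `fit_line` are hypothetical, and a single-factor least-squares fit stands in for the program's multiple regression. The four peptides and their activities are taken from data example 2 below.

```python
import math

# Kyte-Doolittle hydropathy scale. This is an assumed choice: the
# announcement does not say which scales ProAnWin actually uses.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def site_mean_hydropathy(seq, start, end):
    """Mean hydropathy of the linear site seq[start:end] (0-based, end exclusive)."""
    site = seq[start:end]
    return sum(KD[aa] for aa in site) / len(site)

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Four peptides with their MIC values (mcg/ml) from data example 2.
peptides = {
    "Analog A2": ("GIHYLSHKSFSKFFAGVGKFTNS", 400),
    "Analog B1": ("GIHYLSHKSFSKFFKGVQKFTNS", 40),
    "Analog C1": ("GIHKLSHKSFSKFFKGVQKFTNS", 20),
    "Analog M2": ("KIHKLAHKLLKKLLKAVKKLAKA", 3),
}

# Hypothetical site: residues 4-15 (0-based slice 3:15).
xs = [site_mean_hydropathy(seq, 3, 15) for seq, _ in peptides.values()]
# Activity transformation: log10(x), one of the transforms mentioned above.
ys = [math.log10(mic) for _, mic in peptides.values()]

a, b, r = fit_line(xs, ys)
print(f"log10(MIC) = {a:.2f} + {b:.2f} * hydropathy  (r = {r:.2f})")
```

A real analysis would use the whole data set, scan many candidate sites and characteristics, and check the significance of each fit. The sketch only shows how the pieces connect.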

ProAnWin can align the complete set of sequences, a subset, or any selected block. This makes iterative alignment possible while preserving blocks that were found previously or imposed by biological data (active center, catalytic residues).

The program can be applied to the analysis of various protein-related biological data, to the prediction of the activity (phenotype) of newly sequenced proteins, and to the simulation of protein-engineering experiments.
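One way to picture the simulation of a protein-engineering experiment is to take an already fitted linear model and score every single substitution within a site. The Python sketch below does this under stated assumptions: the Kyte-Doolittle scale, the site boundaries, the model coefficients and the function name `scan_single_mutants` are all hypothetical and are not taken from ProAnWin.

```python
# Kyte-Doolittle hydropathy scale (assumed characteristic, not confirmed
# by the announcement).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def mean_hydropathy(site):
    """Mean Kyte-Doolittle hydropathy of a stretch of residues."""
    return sum(KD[aa] for aa in site) / len(site)

def scan_single_mutants(seq, start, end, intercept, slope):
    """Predict activity for every single substitution in seq[start:end].

    The model is predicted = intercept + slope * mean_hydropathy(site).
    Returns (label, predicted) pairs sorted by predicted value, where the
    label uses 1-based numbering, e.g. 'G1A'.
    """
    results = []
    for pos in range(start, end):
        for aa in KD:
            if aa == seq[pos]:
                continue  # skip the wild-type residue
            mutant = seq[:pos] + aa + seq[pos + 1:]
            x = mean_hydropathy(mutant[start:end])
            results.append((f"{seq[pos]}{pos + 1}{aa}", intercept + slope * x))
    return sorted(results, key=lambda item: item[1])

# Magainin 2 from data example 2; the site (first five residues) and the
# coefficients are made up purely for illustration.
ranked = scan_single_mutants("GIGKFLHSAKKFGKAFVGEIMNS", 0, 5,
                             intercept=2.0, slope=-0.5)
for label, predicted in ranked[:5]:
    print(label, round(predicted, 3))
```

The ranking is only as good as the underlying regression, so it suggests candidates for mutagenesis rather than replacing the experiment.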

DATA EXAMPLES:

1. The family of disintegrins (proteins from snake venom) with tested activity.
Name                   Sequence (part)                     Activity*
                 41        51        61        71        81
Trigramin alpha  QCGEGLCCDQCSFIEEGTVCRIARGDDLDDYCNGRSAGCPRNP  130
Albolabrin       .............MKK..I..R............I........  222
Elegantin        ..AD.......R.KKKR.I..R....NP..R.T.Q..D....G  136
Flavoridin       ..AD.......R.KKKTGI.......FP..R.T.L.ND...WN  100
Batroxastatin    ..A........R.KGA.KI..R....NP..R.T.Q..D....R  133
Applagin         ..A........L.MK.....-R.....VN.....I........   50
Kistrin          ........E..K.SRA.KI...P...MP..R.T.Q..D...YH  128
Echistatin alpha E.ES.P..RN.L.LK...I.LR.....M......LTCP.....   56
Bitistatin       ..NH.E.....K.KKAR.........WN....T.K.SD..W.H  237
Bitan alpha      ..NH.E.....R.KKA..........WN....T.K.SD..W.H  108

* - Activity is measured as the concentration of protein (in nM) required for 50% attenuation of platelet-rich plasma aggregation stimulated by adenosine diphosphate.
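The table above uses a compact notation. Assuming the usual reading, in which a dot means "same residue as the first row" and a dash marks an alignment gap, a row can be expanded to its full sequence as in this Python sketch (the function name `expand_row` is hypothetical):

```python
def expand_row(reference, row):
    """Replace each '.' in an aligned row with the reference residue.

    Gap characters ('-') and explicit residues are kept unchanged.
    """
    if len(reference) != len(row):
        raise ValueError("reference and row must have the same aligned length")
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))

# First two rows of the disintegrin table.
trigramin_alpha = "QCGEGLCCDQCSFIEEGTVCRIARGDDLDDYCNGRSAGCPRNP"
albolabrin_row = ".............MKK..I..R............I........"

albolabrin = expand_row(trigramin_alpha, albolabrin_row)
print(albolabrin)                   # aligned sequence, gaps kept
print(albolabrin.replace("-", ""))  # ungapped sequence
```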

2. The set of synthetic peptides with tested antimicrobial activity

Name         Activity*    Peptide sequence

Analog A2      400    GIHYLSHKSFSKFFAGVGKFTNS
Analog A1      100    GIHYLSHKSFSKFFAGVQKFTNS
Antisense P     60    GIHYLSHKSFSKFFCGVQKFTNS
Analog B1       40    GIHYLSHKSFSKFFKGVQKFTNS
Analog B2       40    GIHYLSHKSFSKFFKGVGKFTNS
Magainin 2      20    GIGKFLHSAKKFGKAFVGEIMNS
Analog C1       20    GIHKLSHKSFSKFFKGVQKFTNS
Analog C2       20    GIHKLSHKSFSKFFKGVGKFTNS
Analog P1       10    AIHNFAHKSFAKFFRAVKKFANA
Analog P2        5    AIHNLAHKSLAKLLRAVKKLANA
Analog P3        5    GIHNFAHKSFAKFFRAVKKFANS
Analog M2        3    KIHKLAHKLLKKLLKAVKKLAKA

* - Minimal inhibitory concentration (in mcg/ml) against E. coli.

3. The set of unrelated peptides with tested immunogenicity.

-------------------------------------------------
Protein  Oncogene         Sequence      Immuno-
region                                  genicity*
-------------------------------------------------
409-425  C-SRC        RLIEDNEYTARQGAKFP     4
468-482  C-SRC        NREVLDQVERGYRMP       4
499-508  C-SRC        WRRDPEERPT            4
001-018  V-KI-RAS     MTEYKLVVVGASGVGKSA    5
119-135  V-KI-RAS     DLPSRTVDTKQAQELAR     5
161-175  V-KI-RAS     REIRQYRLKKISKEE       2
001-018  V-HA-RAS     MTEYKLVVVGARGVGKSA    4
001-018  C-HA(EJ)-RAS MTEYKLVVVGAVGVGKSA    3
001-018  C-HA-RAS     MTEYKLVVVGAGGVGKSA    5
029-044  V-HA-RAS     VDEYDPTIEDSYRKQV      4
091-108  V-HA-RAS     EDIHQYREQIKRVKDSDD    4
126-136  V-HA-RAS     ESRQAQALARS           4
146-155  V-HA-RAS     AKTRQGVEDA            5
160-179  V-HA-RAS     VREIRQHKLRKLNPPDESGP  5
011-024  V-MYB        PQESSKAGPPSGTT        4
033-047  V-MYB        MAFAHNPPAGPLPGA       3
146-162  V-MYB        DNTRTSGDNAPVSCLGE     4
168-186  V-MYB        PSPPVDHGCLPEESASPAR   4
170-185  V-MYB        PPVDHGCLPEESASPA      2
247-260  V-MYB        PFHKDQTFTEYRKM        4
247-265  V-MYB        PFHKDQTFTEYRKMHGGAV   4
541-555  V-FES        RHSTSSSEQEREGGR       4
584-593  V-FES        PEVQKPLHEQ            4
782-796  V-FES        FLRTEGARLRMKTLL       4
840-846  V-FES        SREAADG               0
893-905  V-FES        ASPYPNLSNQQTR         3
901-913  V-FES        NQQTREFVEKGGR         4
222-234  V-MYC        PPTTSSDSEEEQE         0
323-334  V-MYC        RTLDSEENDKRR          4
340-350  V-MYC        ERQRRNELKLR           4
363-371  V-MYC        NNEKAPKVV             1
389-403  V-MYC        RLIAEKEQLRRRREQ       4
395-405  V-MYC        EQLRRRREQLK           4
400-406  V-MYC        RREQLKH               0

* - Logarithm of the antipeptide antibody titer.

4. Phenotype-genotype correlations. Influenza A virus M2 protein from strains sensitive (labeled "sen") and resistant to amantadine or rimantadine ("res").

Strain  Sensitivity    Sequence  (N-terminal part only)

PR8-34   res  MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLILWILDR
MON88    res  ....................D.................T......
LEN3-83  res  ..........................T...........T......
MOS88    res  ..........................T...........T......
MON86    res  ..........................T...........T......
SVER82   res  ..........................T...........T......
WS33     res  ....................D.....V..................
LEN85    res  ....................D.....VV.................
WSN33    res  ....................D....FV..................
LEN49    res  ....................D.....VV..........T......
LEN6-83  res  ....................D...S.VV..S..............
SWONT81  res  ....................D.....VA..S..............
SW29-37  res  ....................D.....VA..S..............
SWIA30   res  ..........T.........D.....VA..S..............
SWWIS61  res  ..........T.S.......D.....VA..S..............
SWIA88   res  .................K..D.....VAV.S..............
AA60     sen  ....................D.....VV..S.............H
KOREA68  sen  ....................D.....VV..S......F.......
BANG79   sen  ....................D.....VV..S..............
FW50     sen  ....................D.....VV..S..............
MEM88    sen  ....................D.....VV..S..............
USSR77   sen  .............Q......D.....VV..S..............
PINALB79 sen  ..........T..G.E.K.SD.....V...S..............
SWHK82   sen  ..........T..G.E.K.SD.....V...S..............
SWNED85  sen  ..........T..G....FSD.....V...S..............
FPVR34   sen  ..........T..G.E....D.....I...S............N.
MLRDNY78 sen  ..........T..G.E.K.SD.....V...S..............
TYMN81   sen  ..........T..G.E.K.SD.....V...S..............
TYMN80   sen  ..........T..G.E.K.SD.....V...S..............
CKVIC85  sen  ..........T..G.E.K.SD.....V...S..............
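A simple first look at such phenotype-genotype data is to ask which alignment columns never share a residue between the two classes. The Python sketch below shows one way to do this. It is an illustration only, not ProAnWin's method: the function names are hypothetical, the dot notation is assumed to mean "same as the first row", and the five-residue sequences are toy data rather than rows from the table.

```python
from collections import defaultdict

def expand_row(reference, row):
    """Replace each '.' in an aligned row with the reference residue."""
    return "".join(ref if ch == "." else ch for ref, ch in zip(reference, row))

def discriminating_columns(reference, labelled_rows):
    """Return 1-based columns whose residue sets are disjoint between classes.

    labelled_rows is a list of (label, aligned_row) pairs, for example
    ("res", "..T..") or ("sen", "....."), in dot notation.
    """
    residues = defaultdict(lambda: [set() for _ in reference])
    for label, row in labelled_rows:
        for i, aa in enumerate(expand_row(reference, row)):
            residues[label][i].add(aa)
    classes = list(residues.values())
    if len(classes) < 2:
        raise ValueError("need at least two classes to compare")
    columns = []
    for i in range(len(reference)):
        sets = [cls[i] for cls in classes]
        # Keep the column only if no residue occurs in more than one class.
        if all(a.isdisjoint(b) for k, a in enumerate(sets) for b in sets[k + 1:]):
            columns.append(i + 1)
    return columns

# Toy alignment (hypothetical data); the first row is the reference.
reference = "ACDEF"
rows = [
    ("res", "....."),
    ("res", "..N.."),
    ("sen", "..S.."),
    ("sen", "..S.Y"),
]
print(discriminating_columns(reference, rows))
```

Strict disjointness is a crude criterion. With real data a single atypical strain can hide an informative position, which is one reason a regression-based analysis is more robust.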

ProAnWin IS USEFUL IN:

AVAILABILITY:

ProAnWin is available (as a self-extracting archive) from the EBI software library. This version is limited in the number of sequences it can analyze.

INSTALLATION:

The files required to run ProAnWin are distributed as a single compressed file. Create a directory "PROANWIN" on your hard disk (for example, on drive C). Copy the file into that directory, run it from the DOS prompt, and answer Yes to all questions. To start the program, run PROAWIN.EXE from Windows.

PROGRAM CONTENT:

Directory:

PUBLICATIONS:

  1. Eroshkin A.M., Zhilkin P.A., Fomin V.I. Algorithm and computer program PROANAL for analysis of relationship between structure and activity in a family of proteins or peptides. CABIOS, 1993, 9, 491-497.
  2. Eroshkin A.M., Minenkova O.O., Fomin V.A., Ivanisenko V.A., Ilyichev A.A. Analysis of peptide fragment insertions into major coat protein of bacteriophages M13, f1 and fd. Relation of protein structural characteristics and viability of mutant phages. Molec. Biology (Russia), 1993, 27, 1345-1355.
  3. Eroshkin A.M., Fomin V.I., Zhilkin P.A., Ivanisenko V.A., Kondrakhin Y.V. PROANAL version 2: multifunctional program for analysis of multiple protein sequence alignments and studying structure-activity relationships in protein families. CABIOS, 1995, 11, 39-44.
  4. Morozov B.M., Ivanisenko V.A., Eroshkin A.M., Ugarova N.N. Analysis of relations between bioluminescence color and the structure of beetle luciferases: identification of the sites influencing bioluminescence color. Molec. Biology (Russia), in press.

Comments, bug reports, and suggestions for new features are welcome and should be sent by e-mail to Alexey Eroshkin.

OTHER TOOLS AVAILABLE:

ProAnalyst, multifunctional analysis of protein sequences and structures (MS-DOS version of ProAnWin with additional functionality: motif searching, physico-chemical plots, alphabetical and physico-chemical analysis of protein sequence variation, structure-activity determination profile, etc.):
IUBio archive: ftp://iubio.bio.indiana.edu/molbio/ibmpc/panalys1
EMBL library: ftp://ftp.ebi.ac.uk/pub/software/dos/proanalyst
NSC library: ftp://ftp.bionet.nsc.ru/pub/biology/vector/proanaly.dem/panalys$

ProMSED, Protein Multiple Sequences EDitor for MS Windows 3.x/95 ("a la" Word for Windows style + ClustalV + manual alignment + amino acid coloring + more):
EMBL library: ftp://ftp.ebi.ac.uk/pub/software/dos/promsed
NSC library: ftp://ftp.bionet.nsc.ru/pub/biology/vector/promsed.dem/promsed$
IUBio archive: ftp://iubio.bio.indiana.edu/molbio/ibmpc/promsed1


Dr. Alexey Eroshkin
Institute of Molecular Biology
State Research Center of Virology and Biotechnology "Vector"
Koltsovo, Novosibirsk Region 633159
Russia
E-mail: eroshkin@vector.nsk.su
Tel: +7 (3832) 647774
Fax: +7 (3832) 328831

Please send your comments and suggestions to Yves Epelboin, epelboin@lmcp.jussieu.fr.

Last update: November 04 1996, Y.E.
This service is made available through a grant from CNRS and Ministère de l'Education Nationale.