This is an archive copy of the IUCr web site dating from 2008. For current content please visit https://www.iucr.org.

Re: STRUCT_REF_SEQ_DIF

Paula Fitzgerald (paula_fitzgerald@Merck.Com)
Tue, 19 Mar 96 14:14:47 EST


Hello again -

Herb Bernstein writes:

> It gets kind of messy to handle deletions purely by alignment of the
> segments that do align, since you then don't have an aligned segment
> to refer to to define what was deleted.  You can cure this
> implicitly when there are aligned segments before and after the
> deletion, but you are kind of stuck if the deletions are at one
> end or the other, and really stuck when those deletions have outlying
> insertions with no further outlying alignments.   All of this
> gets trivial when you are able to specify what is going on residue
> by residue and are allowed to match up a given residue on either
> side against "."  I am not saying it is impossible to do it the
> other way, just more difficult and in need of a lot of clarifying
> semantics.

We talked about this when John, Helen and I met last night, and weren't able
to come to any conclusions that we were sure were sound.  It is true that
when these sequence alignment categories were set up, we were only thinking
in terms of the alignment of stretches of sequence of equal length.  We did
provide for point differences, but not for insertions and/or deletions.

I floated the idea that we might do something on the order of the following
to deal with the issues that Herb raised (not taking great care to get the
syntax exactly right):

_struct_ref_seq.db_align_beg      26
_struct_ref_seq.db_align_end      32
_struct_ref_seq.seq_align_beg     11
_struct_ref_seq.seq_align_end     18

loop_
_struct_ref_seq_dif.db_seq_num
_struct_ref_seq_dif.seq_num
_struct_ref_seq_dif.db_mon_id
_struct_ref_seq_dif.mon_id
   .  11    .  ala
   .  12    .  glu
   .  15    .  tyr
  31   .  his    .
  32   .  arg    .

which would clarify the alignment in a theoretical case like:

db:    .  . 26 27  . 28 29 30 31 32

seq:  11 12 13 14 15 16 17 18  .  .

This would mean adding the token _struct_ref_seq_dif.db_seq_num, which is
probably a good idea anyway, and then using the '.' to indicate the 
positions of insertions and deletions.
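
As a rough illustration of how a reader of such a file might rebuild the
gapped alignment from the two ranges plus the '.' rows, here is a minimal
Python sketch.  The function name expand_alignment, the use of None to stand
for '.', and the rule that a sequence-only position is emitted before a
database-only position when the two are adjacent are all assumptions made for
this example; none of them is part of the dictionary or of the proposal above.

```python
def expand_alignment(db_beg, db_end, seq_beg, seq_end, dif_rows):
    """Rebuild alignment columns as (db_seq_num, seq_num) pairs.

    dif_rows holds (db_seq_num, seq_num) pairs taken from the proposed
    _struct_ref_seq_dif loop, with None standing in for '.'.
    """
    # Positions present only in the sample sequence (no database partner).
    seq_only = {s for d, s in dif_rows if d is None and s is not None}
    # Positions present only in the database sequence (no sample partner).
    db_only = {d for d, s in dif_rows if s is None and d is not None}

    columns = []
    i, j = db_beg, seq_beg
    while i <= db_end or j <= seq_end:
        if j <= seq_end and j in seq_only:
            # Assumed tie-break: sequence-only positions come first.
            columns.append((None, j))
            j += 1
        elif i <= db_end and i in db_only:
            columns.append((i, None))
            i += 1
        elif i <= db_end and j <= seq_end:
            # Ordinary aligned pair.
            columns.append((i, j))
            i += 1
            j += 1
        else:
            # One range is exhausted and the leftover residue has no '.' row.
            raise ValueError("ranges and '.' rows are inconsistent")
    return columns


def format_row(label, values):
    """Render one line of the alignment, writing '.' for a missing partner."""
    cells = " ".join(f"{'.' if v is None else v:>2}" for v in values)
    return f"{label:<5}{cells}"


# The theoretical case from this message.
rows = [(None, 11), (None, 12), (None, 15), (31, None), (32, None)]
cols = expand_alignment(26, 32, 11, 18, rows)
print(format_row("db:", [d for d, _ in cols]))
print(format_row("seq:", [s for _, s in cols]))
```

Run on the values above, this should reproduce the db/seq layout shown in the
theoretical case.  The tie-break is the one place where the sketch has to
guess, which is the sort of clarifying semantics Herb was asking about.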

This proposal was not met with a great burst of enthusiasm, so we decided
to defer this issue for more discussion.  What do you all think?  Not only
about how to provide for insertions and deletions, but about whether we ought
to be providing for them. 

Paula

********************************************************************************
 Dr. Paula M. D. Fitzgerald  ______________ voice and FAX: (908) 594-5510
   Merck Research Laboratories ______________ email: paula_fitzgerald@merck.com
     P.O. Box 2000, Ry50-105     ______________ or bean@merck.com           
       Rahway, NJ 07065  USA 
         (for express mail use 126 E. Lincoln Ave. instead of P. O. Box 2000)  
********************************************************************************