Hello again -

Herb Bernstein writes:

> It gets kind of messy to handle deletions purely by alignment of the
> segments that do align, since you then don't have an aligned segment
> to refer to to define what was deleted. You can cure this
> implicitly when there are aligned segments before and after the
> deletion, but you are kind of stuck if the deletions are at one
> end or the other, and really stuck when those deletions have outlying
> insertions with no further outlying alignments. All of this
> gets trivial when you are able to specify what is going on residue
> by residue and are allowed to match up a given residue on either
> side against "." I am not saying it is impossible to do it the
> other way, just more difficult and in need of a lot of clarifying
> semantics.

We talked about this when John, Helen and I met last night, and weren't
able to come to any conclusions that we were sure were sound. It is
true that when these sequence alignment categories were set up, we were
only thinking in terms of the alignment of stretches of sequence of
equal length. We did provide for point differences, but not for
insertions and/or deletions.

I floated the idea that we might do something on the order of the
following to deal with the issues that Herb raised (not taking great
care to get the syntax exactly right):

    _struct_ref_seq.db_align_beg     26
    _struct_ref_seq.db_align_end     32
    _struct_ref_seq.seq_align_beg    11
    _struct_ref_seq.seq_align_end    18

    loop_
    _struct_ref_seq_dif.db_seq_num
    _struct_ref_seq_dif.seq_num
    _struct_ref_seq_dif.db_mon_id
    _struct_ref_seq_dif.mon_id
      .   11    .   ala
      .   12    .   glu
      .   15    .   tyr
     31    .   his   .
     32    .   arg   .

which would clarify the alignment in a theoretical case like:

    db:    .   .  26  27   .  28  29  30  31  32
    seq:  11  12  13  14  15  16  17  18   .   .

This would mean adding the token _struct_ref_seq_dif.db_seq_num, which
is probably a good idea anyway, and then using the '.' to indicate the
positions of insertions and deletions.
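To make the intended reading of the '.' rows concrete, here is a minimal Python sketch of how a program might rebuild the column-by-column alignment from the begin/end values plus the difference rows. The function name expand_alignment and the (db_seq_num, seq_num) tuple representation are hypothetical, chosen only for illustration. The rule used to order gap columns (a residue flagged as unmatched is emitted as soon as it is reached, with the seq side checked first) is an assumption, not something the proposal specifies.

```python
def expand_alignment(db_beg, db_end, seq_beg, seq_end, difs):
    """Rebuild alignment columns from two ranges plus '.'-marked rows.

    difs is an iterable of (db_seq_num, seq_num) pairs taken from the
    difference loop, where '.' marks the side that has no residue.
    Rows with a number on both sides (point differences) are ignored
    here because they do not change which residues pair up.
    """
    # Residues present only in the sequence (db side is '.').
    seq_only = {int(s) for d, s in difs if d == "." and s != "."}
    # Residues present only in the database entry (seq side is '.').
    db_only = {int(d) for d, s in difs if s == "." and d != "."}

    columns = []
    db, seq = db_beg, seq_beg
    while db <= db_end or seq <= seq_end:
        if seq <= seq_end and (seq in seq_only or db > db_end):
            columns.append((".", seq))   # no db partner for this residue
            seq += 1
        elif db <= db_end and (db in db_only or seq > seq_end):
            columns.append((db, "."))    # no seq partner for this residue
            db += 1
        else:
            columns.append((db, seq))    # ordinary aligned pair
            db += 1
            seq += 1
    return columns


if __name__ == "__main__":
    # Values copied from the example in this message.
    difs = [(".", "11"), (".", "12"), (".", "15"), ("31", "."), ("32", ".")]
    cols = expand_alignment(26, 32, 11, 18, difs)
    print("db: ", " ".join(f"{d:>3}" for d, _ in cols))
    print("seq:", " ".join(f"{s:>3}" for _, s in cols))
```

With the numbers from the example, this walk yields the same ten columns as the db/seq layout above. If the two ranges and the '.' rows disagree, the sketch simply pads the shorter side with '.' at the end rather than reporting an error.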

This proposal was not met with a great burst of enthusiasm, so we
decided to defer this issue for more discussion.

What do you all think? Not only about how to provide for insertions
and deletions, but about whether we ought to be providing for them.

Paula

********************************************************************************
Dr. Paula M. D. Fitzgerald          voice and FAX: (908) 594-5510
Merck Research Laboratories         email: paula_fitzgerald@merck.com
P.O. Box 2000, Ry50-105                or  bean@merck.com
Rahway, NJ 07065 USA
(for express mail use 126 E. Lincoln Ave. instead of P. O. Box 2000)
********************************************************************************