Discussion List Archives

Re: [ddlm-group] The Grazulis eliding proposal: how to incorporate into CIF?. .. .. .

Dear Simon,

Think of those in-the-lab users who actually like CIF and are brave enough 
to hand-edit them.  For 2 decades we have trained them in the CIF string 
conventions, e.g. that 'O''' is the same thing as "O''".  Now we are 
telling them that the first version is wrong under CIF2.  Think how they
will react to being told that we are changing parsers to reject 'O''', not
because we don't know what it means, but because some people who work with
lexers are uncomfortable with the context-sensitivity of the first form
and have decided to end every quoted string at the first appearance of the
quote mark, instead of at the first one followed by white space.  And even
though we are doing that, we are still requiring white space after the
terminal quote mark.  This sort of pointless fussiness does not help anybody
get more crystal structures done; it just creates confusion and wastes time.

I now believe we made a mistake in disallowing the CIF1 style quoted 
strings.  Allowing them now will not invalidate any CIF2 documents created 
according to the more restrictive spec.  It will just avoid invalidating 
perfectly good CIF1 style constructs.
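
To make the difference concrete, here is a minimal Python sketch of the two
scanning rules contrasted above.  The function names scan_cif1 and
scan_cif2_draft are illustrative only and are not taken from any existing CIF
library.  Each handles a single-line quoted string, assumes the character at
the start position is the opening quote, and ignores the rest of the grammar
(text fields, bracketed constructs, comments).

```python
WHITESPACE = " \t\r\n"


def scan_cif1(line, start=0):
    """CIF1-style scan: the string ends at the first matching quote that is
    followed by white space (or by the end of the line).

    Assumes line[start] is the opening quote character.
    Returns (value, index just past the closing quote).
    """
    quote = line[start]
    for i in range(start + 1, len(line)):
        at_end = i + 1 == len(line)
        if line[i] == quote and (at_end or line[i + 1] in WHITESPACE):
            return line[start + 1:i], i + 1
    raise ValueError("unterminated quoted string")


def scan_cif2_draft(line, start=0):
    """Restrictive scan as described in this message: the string ends at the
    first matching quote, and that quote must still be followed by white
    space (or by the end of the line).

    Assumes line[start] is the opening quote character.
    Returns (value, index just past the closing quote).
    """
    quote = line[start]
    end = line.find(quote, start + 1)
    if end == -1:
        raise ValueError("unterminated quoted string")
    if end + 1 < len(line) and line[end + 1] not in WHITESPACE:
        raise ValueError("closing quote must be followed by white space")
    return line[start + 1:end], end + 1


if __name__ == "__main__":
    # The two spellings discussed above, each followed by another value.
    for text in ("'O''' 1.0", '"O\'\'" 1.0'):
        for scan in (scan_cif1, scan_cif2_draft):
            try:
                print(scan.__name__, repr(text), "->", scan(text))
            except ValueError as err:
                print(scan.__name__, repr(text), "-> rejected:", err)
```

Under the first rule both spellings yield the same three-character value;
under the second, the single-quoted spelling is rejected even though its
meaning is unambiguous, which is the objection raised here.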

Along the same lines, the restriction on the contents of blank-delimited
strings and tags goes too far.  We should back it off to allow as many
existing CIF1 constructs as possible.

It is bad enough that we have alienated the macromolecular community with
the complexity of CIF1 mmCIF.  Do we really want to risk alienating the
small molecule community with complex changes in what users can do that we
don't really need to impose on them?

Regards,
   Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Wed, 29 Jun 2011, SIMON WESTRIP wrote:

> Totally agree with your perception of how CIF is viewed by the two
> communities - small molecule versus macromolecular.
> 
> I can't claim to speak for software creators nor journal publishers, but as I
> happen to create CIF-processing software
> for a journal publisher, I would like to think that my contributions to
> these discussions reflect to some extent
> the practice/requirements/experience of that field. Granted, I am not a
> trained developer or crystallographer,
> and my contributions to this forum sometimes lack clarity, but I would have
> thought that therein lies the merit
> of my involvement in these discussions. Furthermore, as an editor, I have,
> for better or worse, worked with CIF since the
> days when the IUCr used to 'hand-craft' CIFs based on submitted manuscripts,
> and dealt then, and still now, with
> those 'users in the lab' and their disinterest or frustration with CIF (and
> occasionally their refreshing enthusiasm).
> 
> Wider views should be solicited - it would be nice to think that they will
> champion CIF to the same extent as the IUCr.
> 
> Cheers
> 
> Simon
> 
> ____________________________________________________________________________
> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> Sent: Wednesday, 29 June, 2011 22:41:52
> Subject: Re: [ddlm-group] The Grazulis eliding proposal: how to incorporate
> into CIF?. .. .. .
> 
> The feedback I get from small molecule people about CIF is the exact
> opposite of what I get from macromolecular people.  CIF in its current
> 1.1 form is a huge success with small molecule people, and they would
> be very unhappy if anything we did broke a very successful process.
> 
> The macromolecular people really don't like CIF in its current form.
> They prefer the PDB format for coordinates.  They find mmCIF much
> too complex, so anything we do that sounds like a simplification
> might draw some support.
> 
> Overall, I suspect most users in the lab neither know of nor care
> about CIF string conventions.  Our audience has to be the software
> creators, archive managers and journal publishers.  I have not heard
> anything in response to CIF2 from those communities.  Has anybody?
> 
> =====================================================
> Herbert J. Bernstein, Professor of Computer Science
>   Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
> 
>                 +1-631-244-3035
>                 yaya@dowling.edu
> =====================================================
> 
> On Wed, 29 Jun 2011, SIMON WESTRIP wrote:
> 
> > OK - can't find the positive feedback in the public archives - must have been
> > a 'personal communication'...
> >
> > To be honest, no change in the quoting rules would make life easier, both
> > for current developers and current users. Furthermore, now that CIF has
> > departed company from STAR, perhaps we should reflect on some of the changes
> > and why they came to be in the first place (that said, I would still like to
> > borrow from STAR with respect to its cif/block/dataitem referencing
> > mechanism... :-)
> >
> > Cheers
> >
> > Simon
> >
> >____________________________________________________________________________
> > From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> > To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> > Sent: Wednesday, 29 June, 2011 13:28:38
> > Subject: Re: [ddlm-group] The Grazulis eliding proposal: how to incorporate
> > into CIF?. .. .. .
> >
> > I seem to have missed seeing the positive feedback on 'simplification'
> > of the quoting rules.  Could you point me at the messages?
> >
> >
> > At 1:24 PM +0100 6/29/11, SIMON WESTRIP wrote:
> > >Hmmm ... and there was I thinking that we may still have some sort of
> > >backslash-based elide system for the quoted strings...
> > >
> > >Of the CIF changes, I seem to recall that the 'simplification' of
> > >the quoting rules actually received positive feedback when the changes
> > >document was made more widely available, so I would be surprised and
> > >reluctant to see a reversal at this stage.
> > >
> > >Cheers
> > >
> > >Simon
> > >
> > >
> > >
> > >From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> > >To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> > >Sent: Wednesday, 29 June, 2011 12:22:39
> > >Subject: Re: [ddlm-group] The Grazulis eliding proposal: how to
> > >incorporate into CIF?. .. .. .
> > >
> > >Dear Colleagues,
> > >
> > >Having slept on this, I have a proposal -- that we return string
> > >handling as completely as possible to CIF1 conventions in supporting DDLm.
> > >
> > >  Please recall the following from the original proposal on this
> > >form of elide:
> > >
> > >"5. If the prefixed text fields are implemented, arbitrary values can
> > >be represented in CIFs at least as conveniently as can text fields in the
> > >current CIF1.1 format. Thus, there is strictly speaking no need for the
> > >"""/''' strings, and one could simplify CIF2.x by omitting them
> > >altogether. However, the proposed method is orthogonal to the """/'''
> > >string format, and thus both can be implemented simultaneously if
> > >necessary."
> > >
> > >In other words, part of the proposal was to drop the treble quotes
> > >entirely, i.e. to return at least in part to CIF1 string handling.
> > >
> > >I suggest we complete the process and restore the CIF1 parse for all
> > >quoted strings, i.e., that we _not_ terminate a quoted string scan for
> > >the terminal quote on the first occurrence of the terminal quote,
> > >but only on the first occurrence of the terminal quote followed by
> > >white space.  The only place in DDLm where this causes a problem is
> > >within bracketed constructs in the handling of the terminal bracket,
> > >the comma or the colon immediately after a terminal quote mark or
> > >in dealing with an unquoted string.    I propose that within the
> > >bracketed constructs _only_ we terminate the scan for a closing quote
> > >delimiter on the combination of the quote delimiter followed by any of:
> > >
> > >    whitespace
> > >    comma
> > >    the closing bracket
> > >    colon
> > >
> > >This will preserve almost all existing CIFs (e.g. the ones with 'O''')
> > >as valid, unchanged, and limit our parser changes for CIF2 primarily
> > >to the handling of the new bracketed constructs and of this new elide
> > >convention.  There is some small risk of invalidating existing CIFs
> > >in adopting the new elide convention at the parser level rather than
> > >at the semantic convention level, but, after sleeping on it, I agree
> > >that that risk is minimal.
> > >
> > >Regards,
> > >  Herbert
> > >
> > >
> > >
> > >At 6:17 PM -0400 6/28/11, Herbert J. Bernstein wrote:
> > >>Dear John,
> > >>
> > >>    There is nothing terribly wrong with any one aspect of the
> > >>CIF2 document.  My problem is that for me it does not hang
> > >>together as a coherent whole especially on the issue of
> > >>string representation.  Right now to me it is an uncomfortable
> > >>mixture of CIF1 and Python string handling, and, as I have
> > >>repeatedly stated, I would prefer to change to being entirely
> > >>consistent with Python.  If that is not to be, I would prefer
> > >>to go back to CIF1 string handling.  That is and has been my
> > >>position for a very long time.
> > >>
> > >>    If something I have said in the past discussions is not clear,
> > >>I will be happy to amplify, but right now I don't know what
> > >>I can say that would not have me repeating myself.
> > >>
> > >>    Regards,
> > >>      Herbert
> > >>=====================================================
> > >>  Herbert J. Bernstein, Professor of Computer Science
> > >>    Dowling College, Kramer Science Center, KSC 121
> > >>          Idle Hour Blvd, Oakdale, NY, 11769
> > >>
> > >>                  +1-631-244-3035
> > >>                  yaya@dowling.edu
> > >>=====================================================
> > >>
> > >>On Tue, 28 Jun 2011, Bollinger, John C wrote:
> > >>
> > >>>  Dear Herbert,
> > >>>
> > >>>  On Tuesday, June 28, 2011 1:46 PM, Herbert J. Bernstein wrote:
> > >>>
> > >>>>    I fear we are not communicating very effectively.
> > >>>
> > >>>
> > >>>  Evidently not.
> > >>>
> > >>>
> > >>>>  I am
> > >>>>  _not_ comfortable with the current state of the CIF2
> > >>>>  document, and I do not find the current emendation to
> > >>>>  be an improvement.
> > >>>
> > >>>
> > >>>  Your opinions about the document and the proposed change are entirely at
> > >>>  your discretion, of course, but it does come as a surprise to me that
> > >>>  you think the entire changes document unsatisfactory.  If there are
> > >>>  issues with it that you have not previously brought before the group
> > >>>  then I would greatly appreciate the opportunity to hear and perhaps
> > >>>  comment on them before Madrid.  Even those who will actually be present
> > >>>  in Madrid might appreciate the opportunity to consider those issues
> > >>>  ahead of time.
> > >>>
> > >>>  As far as I am aware, the only point of CIF 2.0 syntax remaining open
> > >>>  for discussion is the (in-)ability of CIF 2.0 to represent arbitrary
> > >>>  strings.  Inasmuch as both this working group and COMCIFS already
> > >>>  approved the changes document, I would be very reluctant to reopen it
> > >>>  for general changes.  Nevertheless, if you have discovered serious flaws
> > >>>  in it then better to fix them sooner than later.  Otherwise, it is
> > >>>  unproductive to criticize proposals for being based on the only document
> > >>>  available to base them on.
> > >>>
> > >>>
> > >>>>  Much as I would dearly love to have
> > >>>>  the current line-folding protocol in CIF2, I think it
> > >>>>  is much more important to work on making CIF2 into
> > >>>>  something clear and coherent.  I for one find the
> > >>>>  either-or approach to prefixes and line-folding unnecessary
> > >>>>  and confusing.
> > >>>
> > >>>
> > >>>  Line folding and prefixes are compatible.  Saulius pointed this out in
> > >>>  his initial description of the protocol, as a group we commented on it
> > >>>  in our subsequent discussion, and my formal proposal from earlier today
> > >>>  specifically and explicitly provides for them to work together.  I'm not
> > >>>  sure how you acquired an impression to the contrary, but if it was from
> > >>>  my text then please explain so that I can improve it.
> > >>>
> > >>>
> > >>>>  When working with old Fortran compilers,
> > >>>>  I _need_ the line folding protocol.  If the prefixes
> > >>>>  are being introduced, I need a way to deal with both the
> > >>>>  prefixes _and_ the line-folding protocol, not have it
> > >>>>  be either-or.  I understand that most people don't
> > >>>>  see a problem, but I work with software both on new computers
> > >>>>  and very, very old computers (e.g. I just brought an Indigo
> > >>>>  back to life).
> > >>>
> > >>>
> > >>>  I don't see a problem specific to line-folding and prefixes.  Not even
> > >>>  for very old compilers.  I do appreciate that some old compilers present
> > >>>  issues for text processing in general, and that these manifest in the
> > >>>  context of CIF.  I think most of us do, else CIF 2.0 would be different.
> > >>>
> > >>>
> > >>>>    I repeat my suggestion that we need to meet and talk things
> > >>>>  out.  Maybe then I will understand what the rest of you are trying
> > >>>>  to do, and maybe I will be able to explain what I am trying
> > >>>>  to do.
> > >>>
> > >>>
> > >>>  It was my understanding that there would indeed be opportunities for
> > >>>  those who attend Madrid to meet, and I hope that at least one such
> > >>>  meeting happens.  It will bear more fruit the better its participants
> > >>>  can prepare for it, however.  If I were going to be present, then I
> > >>>  would want to have as specific an idea as possible of the topics to be
> > >>>  covered.  Bringing those up here, in advance, would have the added
> > >>>  advantage of including, at least to some extent, group members who
> > >>>  otherwise would be shut out.
> > >>>
> > >>>
> > >>>  Regards,
> > >>>
> > >>>  John
> > >>>
> > >>>  --
> > >>>  John C. Bollinger, Ph.D.
> > >>>  Department of Structural Biology
> > >>>  St. Jude Children's Research Hospital
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>  Email Disclaimer: www.stjude.org/emaildisclaimer
> > >>>
> > >>>  _______________________________________________
> > >>>  ddlm-group mailing list
> > >>>  ddlm-group@iucr.org
> > >>>  http://scripts.iucr.org/mailman/listinfo/ddlm-group
> > >>>
> > >
> > >
> > >--
> > >=====================================================
> > >  Herbert J. Bernstein, Professor of Computer Science
> > >    Dowling College, Kramer Science Center, KSC 121
> > >        Idle Hour Blvd, Oakdale, NY, 11769
> > >
> > >                  +1-631-244-3035
> > >                  yaya@dowling.edu
> > >=====================================================
> >
> >
> > --
> > =====================================================
> >   Herbert J. Bernstein, Professor of Computer Science
> >     Dowling College, Kramer Science Center, KSC 121
> >         Idle Hour Blvd, Oakdale, NY, 11769
> >
> >                   +1-631-244-3035
> >                   yaya@dowling.edu
> > =====================================================
> >
> >
> 
>
