Re: [ddlm-group] The Grazulis eliding proposal: how to incorporate into CIF?

Dear Simon,

   While the approach you suggest is feasible to implement,
and indeed is one that has already been implemented in
CBFlib for an earlier version of DDLm support, I urge
you to support adoption of the currently posted syntax
document with no further changes, but only for the
limited purpose of allowing progress on the creation of
a new core CIF dictionary and supporting software
for its use in validation of data CIFs in the IUCr
journal flows.  Syd Hall and Nick Spadaccini have a
new DDLm proposal, which, when coupled to the new syntax
proposal, should allow significant forward progress
on this important effort.  Please note that a
dictionary created following the proposed delimiter
rules plus the whitespace requirement would also
comply with the approach you suggest, while a
dictionary created following the approach you suggest
might not conform to the current proposal, depending
on the particular strings quoted.  Therefore adopting
the current proposal for purposes of getting the
new core dictionary moving forward does not have
a negative impact.   Certainly much greater caution
in this regard will be needed in moving forward
with the proposed syntax change for user data files,
but having the new dictionaries and prototype software
and experience in working with them in the IUCr
journal operations should allow further discussion
in this area to be grounded in meaningful
experience.
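
   To make the compatibility argument concrete, here is a
minimal sketch in Python of the two termination rules. It is
an illustration only, not code from CBFlib or any existing
parser: the function names are hypothetical, the "proposed"
rule is my reading of the delimiter rules plus the whitespace
requirement as described in this thread, and each input is
assumed to be a single line.

```python
WHITESPACE = " \t\r\n"


def scan_cif1(text, start):
    """CIF1-style scan (sketch): the string opened at text[start]
    ends at the first matching quote that is followed by
    whitespace or by the end of the input."""
    quote = text[start]
    i = start + 1
    while i < len(text):
        at_end = i + 1 == len(text)
        if text[i] == quote and (at_end or text[i + 1] in WHITESPACE):
            return text[start + 1:i], i + 1
        i += 1
    raise ValueError("unterminated quoted string")


def scan_proposed(text, start):
    """Proposed-syntax scan (assumed reading): the string ends at
    the first matching quote, and that quote must be followed by
    whitespace or by the end of the input."""
    quote = text[start]
    end = text.find(quote, start + 1)
    if end == -1:
        raise ValueError("unterminated quoted string")
    if end + 1 < len(text) and text[end + 1] not in WHITESPACE:
        raise ValueError("closing quote not followed by whitespace")
    return text[start + 1:end], end + 1
```

Under these assumptions, any string accepted by scan_proposed
yields the same value from scan_cif1, whereas a CIF1 string
such as 'don't' is accepted by scan_cif1 and rejected by
scan_proposed.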

   Regards,
     Herbert


=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Sun, 28 Aug 2011, Saulius Gražulis wrote:

> Dear Colleagues,
>
> I have looked more carefully into the CIF2.0 syntax proposal, and would
> like to present some thoughts as a manager of a crystallographic data
> repository (COD in my case).
>
> Most changes are indeed very welcome and useful. I am especially
> enthusiastic about adoption of UTF8!
>
> I am happy to see my prefixing proposal being considered for inclusion.
>
> There is one change, however, that would cause major problems for me as
> a CIF user: this is the proposal to change the single- and double-quoted
> string handling in a way that is incompatible with CIF 1.x
>
> No matter how "good" or "bad" any of the string handling conventions
> are, we need to face the fact that CIF 1.x is already used, widespread
> and there exists a large number of archival CIF 1.x records that need to
> be maintained. Changing interpretation of a closing quote, and
> forbidding quotes in the similarly-quoted strings would cause major
> problems for archives:
>
> a) the old CIF1.x records will need a different parser, or a different
> parser mode. This mode cannot always be determined automatically from
> the file contents (the #\#CIF... magic sequence is a comment, so it can
> be discarded and we cannot rely on it; besides, it can be plainly wrong).
>
> b) Both archivers and users will need to care about which kind of CIF
> they have. Ideally, every existing CIF1.1 CIF should become a valid
> CIF2.x CIF.
>
> c) Most people will find it extremely confusing if CIF1 rules for
> strings are substantially different from CIF2. I have put some effort into
> teaching people how to quote CIF strings; imagine they will now have to
> learn *both* CIF1 and CIF2 rules, which are arbitrarily incompatible.
>
> d) The current Change 6 ("delimited string"), IMHO, does not solve any
> problems. The CIF string handling was made different from the string
> handling in the C programming language -- an unusual but definitely a
> working construct. And it is definitely quite good -- it just works.
> Yes, one needs to put some effort to learn it -- but the same holds for
> the C-style strings, or for the new CIF2 proposal.
>
> Given the enormous amount of useless work that the new quoted string
> syntax would put on CIF authors and maintainers, I would be extremely
> happy to see the CIF2 development implemented along the lines suggested
> by Herbert, as quoted below:
>
> Herbert J. Bernstein wrote:
>
>> I suggest we complete the process and restore the CIF1 parse for all
>> quoted strings, i.e., that we _not_ terminate a quoted string scan for
>> the terminal quote on the first occurrence of the terminal quote,
>> but only on the first occurrence of the terminal quote followed by
>> white space.
>>
>> The only place in DDLm where this causes a problem is
>> within bracketed constructs in the handling of the terminal bracket,
>> the comma or the colon immediately after a terminal quote mark or
>> in dealing with an unquoted string.    I propose that within the
>> bracketed constructs _only_ we terminate the scan for a closing quote
>> delimiter on the combination of the quote delimiter followed by any of:
>>
>>      whitespace
>>      comma
>>      the closing bracket
>>      colon
>
> Indeed, the delimited string termination criteria can be easily updated
> to handle ['one'','two''',three''''] style of strings without breaking
> CIF1 backwards compatibility.
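
The bracketed-construct rule quoted above can be sketched as
follows. This is a minimal Python illustration, not code from
CBFlib or any CIF parser: the function name and the `closer`
parameter are hypothetical, and unquoted items such as
three'''' are deliberately not handled.

```python
WHITESPACE = " \t\r\n"


def scan_quoted_in_brackets(text, start, closer="]"):
    """Inside a bracketed construct (sketch): the string opened at
    text[start] ends at the first matching quote followed by
    whitespace, a comma, the closing bracket, a colon, or the
    end of the input."""
    terminators = WHITESPACE + ",:" + closer
    quote = text[start]
    i = start + 1
    while i < len(text):
        at_end = i + 1 == len(text)
        if text[i] == quote and (at_end or text[i + 1] in terminators):
            return text[start + 1:i], i + 1
        i += 1
    raise ValueError("unterminated quoted string")


# Scan the two quoted items of a small list.
text = "['one'','two''']"
first, pos = scan_quoted_in_brackets(text, 1)         # starts at 'one''
second, pos = scan_quoted_in_brackets(text, pos + 1)  # skip the comma
```

With this rule an embedded quote is kept as part of the value
unless it happens to be followed by one of the terminators, so
a value containing a quote directly before a comma or colon
would still end early.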
>
> Sincerely yours,
> Saulius
>
> -- 
> Dr. Saulius Gražulis
> Institute of Biotechnology, Graiciuno 8
> LT-02241 Vilnius, Lietuva (Lithuania)
> fax: (+370-5)-2602116 / phone (office): (+370-5)-2602556
> mobile: (+370-684)-49802, (+370-614)-36366
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group