Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.

My preference would be to do as we have done in the past and keep the
CIF data files themselves as upward compatible as possible, so that
applications can work with essentially one lexer with a few minor flags
and a single parser with some major state flags.  This is what we have
done with both CIFtbx and CBFlib, and it has worked reasonably well
so far.  For this reason I would suggest minimal changes to the definition
of non-delimited strings.  -- Herbert
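
As an illustration of the "one lexer with a few minor flags" idea, here is
a minimal Python sketch. It is not taken from CIFtbx or CBFlib. The names
Dialect and is_non_delimited are hypothetical, the CIF 1.1 rule is
simplified to a check on the leading character, and the CIF 1.2 rule
(no brackets or braces anywhere in the string) is only an assumption about
what the proposal might restrict.

```python
from enum import Enum


class Dialect(Enum):
    CIF_1_1 = "1.1"
    CIF_1_2 = "1.2"  # proposed dialect; its rule below is an assumption


# Simplified CIF 1.1 rule: characters that may not START a non-delimited
# string. The special case of ';' at the start of a line and the reserved
# words (data_, loop_, ...) are deliberately left out of this sketch.
FORBIDDEN_FIRST = set("_#$'\"[]")

# ASSUMED CIF 1.2 rule: these characters may not appear anywhere.
FORBIDDEN_ANYWHERE_1_2 = set("[]{}")


def is_non_delimited(token: str, dialect: Dialect) -> bool:
    """Return True if token is acceptable as a non-delimited string."""
    if not token or any(ch.isspace() for ch in token):
        return False
    if token[0] in FORBIDDEN_FIRST:
        return False
    # The only dialect-specific branch: one flag, one extra check.
    if dialect is Dialect.CIF_1_2:
        if any(ch in FORBIDDEN_ANYWHERE_1_2 for ch in token):
            return False
    return True
```

Under these assumptions a value such as a[1] is accepted when the flag is
Dialect.CIF_1_1 and rejected when it is Dialect.CIF_1_2, which is the kind
of difference the dialect flag has to absorb.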

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Sun, 4 Oct 2009, James Hester wrote:

> Please note that the example in my previous post of validating against
> DDL1/2 dictionaries instead of DDLm dictionaries was purely
> illustrative, and as far as I know has never been advocated as a
> requirement, so applications would only have to deal with DDL1/2
> dictionaries insofar as no equivalent DDLm dictionary is available.
> Otherwise, I believe Herbert's description is correct.  Herbert:
> perhaps you could describe your preferred alternative situation to the
> one you have just described?
>
>
> On 10/3/09, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
>> Dear Colleagues,
>>
>>    If I understand correctly, then, what is proposed is the following:
>>
>>    There will be three dialects of CIF:  CIF 1.0, CIF 1.1 and CIF 1.2, all of
>> which it will be possible to process against DDLm dictionaries, which
>> themselves will conform to CIF 1.2.  Many applications will have to have
>> two distinct lexical scanners -- one to scan CIF 1.0 and CIF 1.1, which
>> follow essentially the same lexical rules, and one to scan CIF 1.2, which
>> will follow somewhat different lexical rules -- and will have to deal with
>> either two or three types of dictionaries:  DDL1, DDL2 and DDLm.
>>
>>    I still think this is unwise, and indeed that it will help to increase
>> the general distaste for CIF in the macromolecular community, but if that
>> is the will of the committee, I will cope with it.
>>
>>    I would suggest that we take a straw vote, settle the matter one way or
>> another, announce the decision and move on to the next issue.
>>
>>    Regards,
>>      Herbert
>>
>> On Sat, 3 Oct 2009, James Hester wrote:
>>
>>> The IUCr CIF website states in bold: "No changes are required in
>>> existing archival data files in order to apply domain dictionaries
>>> written in DDLm."
>>>
>>> I take this to mean that the same datanames would be defined in DDLm
>>> dictionaries as in previous dictionaries, with essentially identical
>>> definitions and types.  Among other things, this means that the DDLm
>>> methods could be applied 'retrospectively' to files produced with
>>> reference to DDL1/2 domain dictionaries.
>>>
>>> How do the proposed changes in non-delimited string content affect
>>> this behaviour?  Not at all, as far as I can see.  A domain dictionary
>>> can be 'applied' either at software construction time, by hard-coding
>>> in the datanames and properties, or dynamically, by reading domain
>>> dictionaries at execution time.  In the former case, an application
>>> may continue to use a CIF1.1 parser to deal with existing archival
>>> data files, and no parser is used for the dictionaries (indeed, given
>>> the guarantee on the website, nothing changes from the current
>>> situation as far as existing files are concerned).  In the latter
>>> case, a CIF1.2 parser is needed to read in the domain dictionaries,
>>> but a CIF1.1 parser could continue to be used to read in existing
>>> archival data files.  I conclude that we are not breaking any promises
>>> with our non-delimited string proposal.
>>>
>>> Perhaps it bears emphasising that the CIF1.2 syntax and DDLm are
>>> entirely different things.  There is exactly one link between them:
>>> the CIF1.2 list syntax is necessary in order to support specification
>>> of list structures in DDLm.  A file written with CIF1.2 syntax does
>>> *not* require definitions to be written in DDLm: indeed, all existing
>>> archival data files could be converted to use the proposed
>>> non-delimited string syntax and still continue to validate against
>>> DDL1/2 domain dictionaries.  As a corollary of the guarantee on the
>>> IUCr website, even a CIF1.2 data file which uses datanames with
>>> bracketed data values can be validated against DDL1/2 domain
>>> dictionaries, as the list-valued datanames will not appear in those
>>> DDL1/2 dictionaries and will therefore be ignored, and the common
>>> datanames are guaranteed to be defined in the same way.
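
The validation argument in the paragraph above can be sketched in Python
as follows. This is only an illustration under stated assumptions: the
dictionary is modelled as a plain mapping from dataname to a checking
function, looks_numeric is a deliberately crude stand-in for a real type
check, and _example_vector is a made-up list-valued dataname that the
legacy dictionary does not define.

```python
def looks_numeric(value):
    """Crude stand-in for a dictionary type check (assumption)."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


# Hypothetical stand-in for a DDL1/2 domain dictionary.
LEGACY_DICTIONARY = {"_cell_length_a": looks_numeric}


def validate(data_block, dictionary):
    """Return a list of error strings for the items the dictionary defines.

    Datanames that the dictionary does not define are skipped, so
    list-valued items unknown to a DDL1/2 dictionary cause no errors.
    """
    errors = []
    for name, value in data_block.items():
        check = dictionary.get(name.lower())
        if check is None:
            continue  # not defined in this dictionary: ignored
        if not check(value):
            errors.append(f"{name}: invalid value {value!r}")
    return errors


# A block mixing a common dataname with a made-up list-valued one.
block = {"_cell_length_a": "5.43", "_example_vector": [1, 2, 3]}
problems = validate(block, LEGACY_DICTIONARY)
```

Here the list-valued item is never checked because the legacy dictionary
has no entry for it, while the common dataname is checked exactly as it
would be in a file without any bracketed values.
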
>>>
>>> NB one thing that we will need to back off on is simplification of the
>>> character set of datanames, as we will need to be able to match the
>>> current datanames in DDL1/2 character for character.
>>>
>>> Herbert has provided two examples where the CIF1.1 syntax and the
>>> proposed CIF1.2 syntax differ, and has stated that he doesn't want to
>>> turn these perfectly reasonable non-delimited strings into errors when
>>> parsing.  I would suggest that he continue as before parsing these
>>> CIF1.1 files as CIF1.1 files (no errors), and develop a new CIF1.2
>>> parser (which will be necessary anyway) to deal with those data files
>>> that will contain bracketed expressions etc.  I guess the point is
>>> that promulgation of a new syntax standard will not automatically make
>>> previous standards disappear or become invalid, and as a new parser
>>> will need to be developed anyway it is a good time to clean up the
>>> standard.
>>>
>>>
>>> --
>>> T +61 (02) 9717 9907
>>> F +61 (02) 9717 3145
>>> M +61 (04) 0249 4148
>>> _______________________________________________
>>> ddlm-group mailing list
>>> ddlm-group@iucr.org
>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>
>
>
> -- 
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
