Discussion List Archives

Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.

Dear Colleagues,

   If I understand correctly, then, what is proposed is the following:

   There will be three dialects of CIF:  CIF 1.0, CIF 1.1 and CIF 1.2, all of
which it will be possible to process against DDLm dictionaries, which
themselves will conform to CIF 1.2.  Many applications will need
two distinct lexical scanners -- one to scan CIF 1.0 and CIF 1.1, which
follow essentially the same lexical rules, and one to scan CIF 1.2, which
will follow somewhat different lexical rules.  They will also have to deal
with either two or three types of dictionaries:  DDL1, DDL2 and DDLm.

   I still think this is unwise, and indeed that it will help to increase
the general distaste for CIF in the macromolecular community, but if that
is the will of the committee, I will cope with it.

   I would suggest that we take a straw vote, settle the matter one way or
another, announce the decision and move on to the next issue.

   Regards,
     Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Sat, 3 Oct 2009, James Hester wrote:

> The IUCr CIF website states in bold: "No changes are required in
> existing archival data files in order to apply domain dictionaries
> written in DDLm."
>
> I take this to mean that the same datanames would be defined in DDLm
> dictionaries as in previous dictionaries, with essentially identical
> definitions and types.  Among other things, this means that the DDLm
> methods could be applied 'retrospectively' to files produced with
> reference to DDL1/2 domain dictionaries.
>
> How do the proposed changes in non-delimited string content affect
> this behaviour?  Not at all, as far as I can see.  A domain dictionary
> can be 'applied' either at software construction time, by hard-coding
> in the datanames and properties, or dynamically, by reading domain
> dictionaries at execution time.  In the former case, an application
> may continue to use a CIF1.1 parser to deal with existing archival
> data files, and no parser is used for the dictionaries (indeed, given
> the guarantee on the website, nothing changes from the current
> situation as far as existing files are concerned).  In the latter
> case, a CIF1.2 parser is needed to read in the domain dictionaries,
> but a CIF1.1 parser could continue to be used to read in existing
> archival data files.  I conclude that we are not breaking any promises
> with our non-delimited string proposal.
>
> Perhaps it bears emphasising that the CIF1.2 syntax and DDLm are
> entirely different things.  There is exactly one link between them:
> the CIF1.2 list syntax is necessary in order to support specification
> of list structures in DDLm.  A file written with CIF1.2 syntax does
> *not* require definitions to be written in DDLm: indeed, all existing
> archival data files could be converted to use the proposed
> non-delimited string syntax and still continue to validate against
> DDL1/2 domain dictionaries.  As a corollary of the guarantee on the
> IUCr website, even a CIF1.2 data file which uses datanames with
> bracketed data values can be validated against DDL1/2 domain
> dictionaries, as the list-valued datanames will not appear in those
> DDL1/2 dictionaries and will therefore be ignored, and the common
> datanames are guaranteed to be defined in the same way.
>
> NB one thing that we will need to back off on is simplification of the
> character set of datanames, as we will need to be able to match the
> current datanames in DDL1/2 character for character.
>
> Herbert has provided two examples where the CIF1.1 syntax and the
> proposed CIF1.2 syntax differ, and has stated that he doesn't want to
> turn these perfectly reasonable non-delimited strings into errors when
> parsing.  I would suggest that he continue as before parsing these
> CIF1.1 files as CIF1.1 files (no errors), and develop a new CIF1.2
> parser (which will be necessary anyway) to deal with those data files
> that will contain bracketed expressions etc.  I guess the point is
> that promulgation of a new syntax standard will not automatically make
> previous standards disappear or become invalid, and as a new parser
> will need to be developed anyway it is a good time to clean up the
> standard.
>
>
> -- 
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
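
As a rough illustration of the arrangement described in the quoted message,
the sketch below picks a scanner from a declared syntax version and validates
parsed items against a DDL1/2-style dictionary, ignoring any dataname the
dictionary does not define.  Everything here is an assumption made for
illustration: the scanner labels, the function names, the reduction of a
dictionary to a name-to-type mapping, the simplified "numb"/"char" check and
the list-valued dataname `_hypothetical_list` are not taken from any CIF
software or from the proposal itself.

```python
from typing import Dict, List, Union

# A data value is either a plain string or, in the proposed CIF 1.2 syntax,
# a bracketed list (modelled here, as an assumption, as a list of strings).
Value = Union[str, List[str]]


def scanner_for(syntax_version: str) -> str:
    """Return a (hypothetical) scanner label for a declared syntax version."""
    # CIF 1.0 and CIF 1.1 follow essentially the same lexical rules,
    # so one scanner is assumed to serve both.
    if syntax_version in ("1.0", "1.1"):
        return "cif1-scanner"
    if syntax_version == "1.2":
        return "cif1.2-scanner"
    raise ValueError(f"unknown CIF syntax version: {syntax_version!r}")


def conforms(value: Value, expected_type: str) -> bool:
    """Very simplified type check; real dictionaries carry far more detail."""
    if not isinstance(value, str):
        return False  # a list value cannot satisfy these scalar types
    if expected_type == "numb":
        try:
            float(value)
        except ValueError:
            return False
        return True
    return expected_type == "char"


def validate(items: Dict[str, Value], dictionary: Dict[str, str]) -> Dict[str, str]:
    """Validate items; datanames absent from the dictionary are ignored."""
    report = {}
    for name, value in items.items():
        # Datanames are compared case-insensitively; dictionary keys are
        # assumed to be stored in lower case.
        expected_type = dictionary.get(name.lower())
        if expected_type is None:
            report[name] = "ignored"
        elif conforms(value, expected_type):
            report[name] = "ok"
        else:
            report[name] = "invalid"
    return report


if __name__ == "__main__":
    ddl1_like = {"_cell_length_a": "numb", "_cell_length_b": "numb"}
    parsed = {
        "_cell_length_a": "5.43",
        "_cell_length_b": "abc",
        "_hypothetical_list": ["1", "2"],  # list-valued, not in the dictionary
    }
    print(scanner_for("1.1"), validate(parsed, ddl1_like))
```

The point of the sketch is only that the choice of scanner and the choice of
dictionary are independent steps: a file scanned with the newer lexical rules
can still be checked against an older dictionary, with undefined datanames
passed over rather than reported as errors.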
