Discussion List Archives

Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.

The IUCr CIF website states in bold: "No changes are required in
existing archival data files in order to apply domain dictionaries
written in DDLm."

I take this to mean that the same datanames would be defined in DDLm
dictionaries as in previous dictionaries, with essentially identical
definitions and types.  Among other things, this means that the DDLm
methods could be applied 'retrospectively' to files produced with
reference to DDL1/2 domain dictionaries.

How do the proposed changes in non-delimited string content affect
this behaviour?  Not at all, as far as I can see.  A domain dictionary
can be 'applied' either at software construction time, by hard-coding
in the datanames and properties, or dynamically, by reading domain
dictionaries at execution time.  In the former case, an application
may continue to use a CIF1.1 parser to deal with existing archival
data files, and no parser is used for the dictionaries (indeed, given
the guarantee on the website, nothing changes from the current
situation as far as existing files are concerned).  In the latter
case, a CIF1.2 parser is needed to read in the domain dictionaries,
but a CIF1.1 parser could continue to be used to read in existing
archival data files.  I conclude that we are not breaking any promises
with our non-delimited string proposal.
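
To make the two cases concrete, here is a minimal sketch in Python of
which syntax each kind of file would be parsed with.  The function
name and the dictionary keys are hypothetical and purely illustrative;
nothing here is an existing API, and the strings merely label the
syntax versions discussed above.

```python
from typing import Dict, Optional


def parsers_needed(dictionaries_read_at_runtime: bool) -> Dict[str, Optional[str]]:
    """Return the syntax each kind of file is parsed with (None = not parsed)."""
    # Existing archival data files stay on the CIF1.1 parser in both cases.
    plan: Dict[str, Optional[str]] = {"archival data files": "CIF1.1"}
    # Dictionaries are only parsed when they are read at execution time;
    # if datanames are hard-coded, no dictionary parser is involved at all.
    plan["domain dictionaries"] = "CIF1.2" if dictionaries_read_at_runtime else None
    return plan


if __name__ == "__main__":
    print(parsers_needed(dictionaries_read_at_runtime=False))
    print(parsers_needed(dictionaries_read_at_runtime=True))
```

In neither case does the parser used for existing archival data files
change.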

Perhaps it bears emphasising that the CIF1.2 syntax and DDLm are
entirely different things.  There is exactly one link between them:
the CIF1.2 list syntax is necessary in order to support specification
of list structures in DDLm.  A file written with CIF1.2 syntax does
*not* require definitions to be written in DDLm: indeed, all existing
archival data files could be converted to use the proposed
non-delimited string syntax and still continue to validate against
DDL1/2 domain dictionaries.  As a corollary of the guarantee on the
IUCr website, even a CIF1.2 data file which uses datanames with
bracketed data values can be validated against DDL1/2 domain
dictionaries, as the list-valued datanames will not appear in those
DDL1/2 dictionaries and will therefore be ignored, and the common
datanames are guaranteed to be defined in the same way.
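
As a toy illustration of that corollary (a sketch under stated
assumptions, not how any real validator works): model a dictionary as
a mapping from dataname to a type check, and skip any dataname the
dictionary does not define.  The names validate and is_number, and the
list-valued dataname _example_vector, are hypothetical;
_cell_length_a stands in for a dataname common to both dictionary
generations.

```python
def is_number(value):
    """Very rough numeric check; does not handle standard uncertainties."""
    try:
        float(value)
    except (TypeError, ValueError):
        return False
    return True


def validate(block, dictionary):
    """Return the datanames in `block` that fail their dictionary check.

    Datanames that the dictionary does not define are ignored.
    """
    problems = []
    for name, value in block.items():
        check = dictionary.get(name.lower())  # datanames are case-insensitive
        if check is None:
            continue  # not defined in this dictionary: ignore
        if not check(value):
            problems.append(name)
    return problems


# Hypothetical DDL1/2-style dictionary that knows only one dataname.
ddl1_like = {"_cell_length_a": is_number}

# Hypothetical data block mixing a common dataname with a list-valued one.
block = {"_cell_length_a": "5.431", "_example_vector": ["1", "0", "0"]}

if __name__ == "__main__":
    print(validate(block, ddl1_like))
```

The list-valued item never reaches a type check because the
dictionary has no entry for it, so the block validates on the common
dataname alone.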

NB: one thing we will need to back off on is simplifying the
character set of datanames, since we must be able to match the
current DDL1/2 datanames character for character.

Herbert has provided two examples where the CIF1.1 syntax and the
proposed CIF1.2 syntax differ, and has stated that he doesn't want to
turn these perfectly reasonable non-delimited strings into errors when
parsing.  I would suggest that he continue, as before, to parse these
CIF1.1 files as CIF1.1 files (no errors), and develop a new CIF1.2
parser (which will be necessary anyway) to deal with those data files
that contain bracketed expressions etc.  I guess the point is
that promulgation of a new syntax standard will not automatically make
previous standards disappear or become invalid, and as a new parser
will need to be developed anyway it is a good time to clean up the
