Discussion List Archives


Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.


I have just returned from a cruise along the coast of Labrador which started before the current discussions began.  I have spent the last couple of days reading through all 74 contributions that have subsequently arrived in my computer.  Most of the discussion is a little outside my concern with dictionaries, but I have been trying to see if there are any obvious problems with what is proposed.  Perhaps someone can point out if I have misinterpreted anything.

I am adopting James' suggestion that the new standard is sufficiently different to require a major version change to CIF2.0 rather than CIF1.2.

For clarification, are two ' characters (i.e. '') the same as one "?  Some of the illustrations seem to indicate that this is so, but this may be the result of the fonts used, which do not distinguish between '' and ".  (This is why I am using Courier.)  If they are not the same, then '' is not a delimiter and ''red'' would be interpreted as three items, '' red '', whereas "red" would be interpreted as the single item red without any quotes.  Are both ''' and """ legal delimiters?
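
The two readings can be made concrete with a minimal sketch. It assumes a deliberately simplified tokenizer (the function name tokenize and its rules are hypothetical, not the CIF grammar): a ' or " opens a string that ends at the next identical character, and any other run of non-blank characters is an undelimited item.

```python
def tokenize(text: str) -> list[str]:
    """Split text into items under simplified, assumed quoting rules."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "'\"":
            # A quote opens a string that runs to the next identical quote.
            end = text.find(ch, i + 1)
            if end == -1:
                raise ValueError("unterminated quoted string")
            tokens.append(text[i + 1:end])
            i = end + 1
        else:
            # Undelimited item: runs until whitespace or a quote character.
            start = i
            while i < len(text) and not text[i].isspace() and text[i] not in "'\"":
                i += 1
            tokens.append(text[start:i])
    return tokens


print(tokenize("''red''"))  # ['', 'red', '']  two apostrophes on each side
print(tokenize('"red"'))    # ['red']          one double quote on each side
```

Under these assumed rules the two inputs, which look alike in many fonts, yield three items and one item respectively.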

I have been considering the problems of CIF 1.0 and 1.1 files being read by CIF2.0 applications and vice versa.

As Simon points out, the tags are all undelimited strings and are therefore restricted in their allowed character set in CIF2.0.  I don't think that any existing tags violate Nick's latest set of excluded characters.  Characters such as % and / do appear in DDL1, though they will probably be removed in DDLm.  If any illegal characters do appear, they present no problem as long as a CIF2 application recognizes that it is reading a CIF1.  Where the name appears as an alias in a DDLm dictionary, it can be made legal by being quoted.  Provided a CIF1 application can be taught to recognize _ and . as interchangeable, it should have no problem in reading CIF2 names, but it may not recognize names that are different or absent in the DDL1 dictionary.  This could result in a loss of information, which may or may not be important.  It would clearly have serious problems with arrays and other cases where the new delimiters were used.
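
One way a CIF1 application might be taught to treat _ and . as interchangeable is sketched below. The helper canonical_name and the example tags are hypothetical, and the sketch assumes that names are compared without regard to case.

```python
def canonical_name(tag: str) -> str:
    """Map a data name to a comparison key in which '.' and '_' are equivalent.

    Assumption: names are compared case-insensitively.
    """
    return tag.lower().replace(".", "_")


# A dotted (category.item) name and its underscore-only form compare equal.
same = canonical_name("_atom_site.label") == canonical_name("_atom_site_label")
print(same)
```

Such a mapping only helps with lookup; a name that is absent from the DDL1 dictionary would still go unrecognized.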

I assume that CIF2.0 applies to both the dictionaries and the CIFs themselves.  Are there conditions (like global_) that only apply to dictionaries?  CIFs prepared using the CIF2.0 standards are likely in the first instance to code matrices and vectors as separate elements.  Existing methods can combine these into arrays.  Eventually I foresee that such values will be coded directly as arrays as this is more efficient.  Methods will then be needed to decompose these arrays into their elements in case individual elements need to be retrieved.  I see no problem except that a CIF2.0 coded in this way clearly could not be read by a CIF1 application.
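
The kind of methods meant here, combining separate elements into an array and decomposing an array back into elements, might look like the following sketch. The tag prefix _example_matrix and the row/column suffix convention are hypothetical and chosen only for illustration.

```python
def combine(elements: dict[str, float], prefix: str, n: int) -> list[list[float]]:
    """Assemble an n x n matrix from items named <prefix>_<row><col>."""
    return [
        [elements[f"{prefix}_{i}{j}"] for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


def decompose(matrix: list[list[float]], prefix: str) -> dict[str, float]:
    """Split a matrix back into individually named elements."""
    return {
        f"{prefix}_{i}{j}": value
        for i, row in enumerate(matrix, start=1)
        for j, value in enumerate(row, start=1)
    }


separate = {
    "_example_matrix_11": 1.0, "_example_matrix_12": 0.0,
    "_example_matrix_21": 0.0, "_example_matrix_22": 1.0,
}
matrix = combine(separate, "_example_matrix", 2)
print(matrix)
print(decompose(matrix, "_example_matrix") == separate)
```

The round trip shows that no information is lost in either direction, so the choice between the two codings affects only which applications can read the file.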

The use of an expression such as #CIF2.0 as a magic number as the first string in a CIF could cause problems, since the CIF standard states that anything after # is not part of the CIF and can be stripped out without destroying the integrity of the CIF, i.e., anything following # has no bearing on either the syntax or the semantics of the CIF.  Have I missed something here?  Software designed, e.g., to strip out the comments in a template could easily strip out the magic number.  This is no problem if the file is a CIF1 file, but it would create an illegal file if it did this to a CIF2 file.  Some legacy software might not be sophisticated enough to recognize the problem.  In general I would strongly advocate using a different initial character for this string.
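
The risk can be illustrated with a sketch of a naive comment stripper of the sort a legacy tool might contain. The function strip_comments is hypothetical, the magic string is written as #CIF2.0 purely for illustration, and the sketch ignores # characters inside quoted strings.

```python
def strip_comments(text: str) -> str:
    """Drop everything from '#' to the end of each line, then drop empty lines.

    Simplification: a '#' inside a quoted string is not handled.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.split("#", 1)[0].rstrip()
        if stripped:
            kept.append(stripped)
    return "\n".join(kept)


cif2 = "#CIF2.0\ndata_example\n_cell.length_a 5.43 # angstroms\n"
result = strip_comments(cif2)
print(result)
```

The output begins with data_example: the magic string has been removed along with the ordinary comment, so a reader can no longer tell which version of the syntax the file was written in.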


fn:I.David Brown
org:McMaster University;Brockhouse Institute for Materials Research
adr:;;King St. W;Hamilton;Ontario;L8S 4M1;Canada
title:Professor Emeritus
tel;work:+905 525 9140 x 24710
tel;fax:+905 521 2773

ddlm-group mailing list
