Discussion List Archives


Re: CIF 2.0 syntax proposal for retaining backwards CIF 1.x compatibility

Dear Colleagues,

     The real issue here is not what CIF 2 will be.  That is settled.
The real issue is how CIF 1 will adapt to the new constructs introduced
with DDLm and dREL.  It appears that the PDB intends to keep the
macromolecular community in the CIF 1 world, even if the dictionaries move
up to DDLm, and we have committed to keeping all the core-cif CIF 1 files
not only readable, but validatable against the new DDLm dictionaries.

   The reality is that, in such a world, people are going to carry things
back and forth between CIF 1 and CIF 2.  If we don't recommend mappings,
we are going to end up with a lot of different ones and a lot of confusion
and mismapped data.

   The missing elements to keep CIF 1 viable in this context seem to be:

   1.  Handling UTF8
   2.  Handling bracketed constructs
   3.  Handling the different quoting and white-space conventions

In each case what is needed is a faithful translation from the relevant
CIF 2 constructs to valid CIF 1 and a faithful translation from CIF 1
constructs to valid CIF2.  With that available, those who wish to use
CIF 1 tools with CIF 2 files can work, as can those who wish to
use CIF 2 tools against CIF 1 files.

1.  Handling UTF8.  Most UTF8 files are ASCII files, so almost any of
the common encoding mechanisms will work -- e.g. the HTML approach
or the Python approach -- for carrying UTF8 characters in CIF 1 files.
The other direction is a non-problem.
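
As a rough illustration of point 1, here is a minimal Python sketch of the
two encodings mentioned above, using only the standard library.  The function
names are hypothetical, and the sketch assumes the original value contains no
literal backslashes or "&" entities of its own; a real mapping would need a
rule for escaping those so that the round trip stays faithful.

```python
import html

def to_cif1_html(text: str) -> str:
    """Replace non-ASCII characters with HTML numeric character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

def from_cif1_html(text: str) -> str:
    """Undo to_cif1_html (assumes no other '&...;' entities in the value)."""
    return html.unescape(text)

def to_cif1_python(text: str) -> str:
    """Replace non-ASCII characters with Python-style backslash escapes."""
    return text.encode("ascii", "backslashreplace").decode("ascii")

def from_cif1_python(text: str) -> str:
    """Undo to_cif1_python (assumes no other backslashes in the value)."""
    return text.encode("ascii").decode("unicode_escape")

sample = "1.54 \u00c5"            # "1.54 Å", a CIF 2 value with one non-ASCII char
print(to_cif1_html(sample))       # 1.54 &#197;
print(to_cif1_python(sample))     # 1.54 \xc5
```

Either form is plain ASCII and so can be carried in a CIF 1 file; which one
to recommend is exactly the kind of mapping decision discussed here.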

2.  Handling bracketed constructs.  Almost any quoting scheme will allow
a bracketed construct to be carried as an opaque value in a CIF 1 file.
I propose that we carry CIF 2 bracketed constructs in CIF 1 files as 
delimited quoted text, beginning either with \n;$\n (newline, semicolon,
dollar, newline) for non-line-folded versions or with \n;\\$\n (newline,
semicolon, backslash, dollar, newline) for line-folded versions.
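
A minimal Python sketch of the non-line-folded case of point 2 follows.  The
function names are hypothetical.  The proposal above only fixes the opening
sequence; the closing "\n;" and the check for lines starting with ";" are my
assumptions, taken from the way ordinary CIF 1 semicolon-delimited text
fields are terminated.

```python
OPEN = "\n;$\n"    # newline, semicolon, dollar, newline (from the proposal)
CLOSE = "\n;"      # assumed: the usual CIF 1 text-field terminator

def wrap_bracketed(value: str) -> str:
    """Carry a CIF 2 bracketed construct as an opaque CIF 1 text field."""
    # A line starting with ';' would end the text field early, so this
    # sketch refuses such values instead of inventing an escape rule.
    if any(line.startswith(";") for line in value.splitlines()):
        raise ValueError("value has a line starting with ';'")
    return OPEN + value + CLOSE

def unwrap_bracketed(field: str):
    """Return the bracketed construct, or None if the field is not marked."""
    if (field.startswith(OPEN) and field.endswith(CLOSE)
            and len(field) >= len(OPEN) + len(CLOSE)):
        return field[len(OPEN):-len(CLOSE)]
    return None

cif2_value = "[1, 2, [3, 4]]"
cif1_field = wrap_bracketed(cif2_value)
assert unwrap_bracketed(cif1_field) == cif2_value
```

A CIF 1 tool sees only an ordinary text field, while a CIF 2-aware tool can
recognise the "$" marker and restore the original construct.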

3.  Handling the different quoting and white-space conventions.  This will
require aggressive use of both the CIF 1 and CIF 2 quoting mechanisms, but
should be doable.


On 9/17/13 1:21 PM, Bollinger, John C wrote:
> On Sunday, September 15, 2013 9:59 PM, James Hester wrote:
>> Reply to Saulius's suggestions of altered syntax.
>> ===================================
> [...]
>> incompatibility with CIF1 is not, in itself, news and is not
>> sufficient to justify changes to a syntax that has been sweated over
>> for many years.
> [...]
>> It is to avoid such contortions that we agreed to allow
>> incompatibility between CIF1 and CIF2.
>> We need to avoid any further major syntax changes in CIF2. Closing
>> the book on syntax changes results in a precise understanding of
>> the differences between CIF1 and CIF2, so we can meaningfully explore
>> managing CIF1-CIF2 transitions in alternative ways, e.g. through
>> documentation and policy.
> The fundamental question Saulius raises is whether a new, backwards-incompatible version of CIF is relevant or desirable.  Are the costs of dropping backwards compatibility too high for the benefits we hope to gain?  From a higher perspective, those costs may include some or all of the following:
> - Loss of developer good will
> - Lack of community acceptance
> - Technical issues at various levels arising from confusing one format with the other
> - User confusion
> As James observed, COMCIFS and the DDLm-WG's position has been that those potential  costs and any others are indeed worth the benefits, so the threshold issue here is whether Saulius has given us sufficient reason to revisit *that* judgment.  I take James's and Herbert's responses to Saulius's proposal as "no" answers to that question.  Myself, I am undecided, though any way around I am not eager to reopen the syntax for changes.
> Supposing that we proceed with backward-incompatible CIF 2, as appears to be the momentum, perhaps we should consider ways to reduce the above costs / risks.  One of the main approaches that occur to me is to take all reasonable steps to position CIF 2 as an improved *alternative* to CIF 1 (as James describes it to be), as opposed to an advancement of it.  This would be mostly a promotion and labeling effort.  Steps along that path might include
> - relabeling the "Changes" document and the language therein to replace "changes" with "differences",
> - making sure to promote CIF 2 as an alternative rather than an evolutionary development when we talk about it with colleagues and in publications,
> - emphasizing that CIF1 and CIF2 are expected to coexist for an indeterminate time, and
> - promoting a different filename convention for CIF 2 files, such as extension ".cif2" instead of plain ".cif".
> Does it make sense to do something like that?
> John
> --
> John C. Bollinger, Ph.D.
> Computing and X-Ray Scientist
> Department of Structural Biology
> St. Jude Children's Research Hospital
> Email Disclaimer:  www.stjude.org/emaildisclaimer
> Consultation Disclaimer:  www.stjude.org/consultationdisclaimer
> _______________________________________________
> comcifs mailing list
> comcifs@iucr.org
> http://mailman.iucr.org/mailman/listinfo/comcifs

International Union of Crystallography
