
Re: CIF 2.0 syntax proposal for retaining backwards CIF 1.x compatibility

Dear Colleagues,

     The real issue here is not what CIF 2 will be.  That is settled.
The real issue is how CIF 1 will adapt to the new constructs introduced
with DDLm and dREL.  It appears that the PDB intends to keep the
macromolecular community files in the CIF 1 world, even if the
dictionaries move up to DDLm, and we have committed to keeping all the
core-cif CIF 1 files not only readable, but validatable against the new
DDLm dictionaries.

   The reality is that, in such a world, people are going to carry
things back and forth between CIF 1 and CIF 2.  If we don't recommend
mappings, we are going to end up with a lot of different ones and a lot
of confusion and mismapped data.

   The missing elements to keep CIF 1 viable in this context seem to be:

   1.  Handling UTF8
   2.  Handling bracketed constructs
   3.  Handling the different quoting and white-space conventions

In each case what is needed is a faithful translation from the relevant
CIF 2 constructs to valid CIF 1 and a faithful translation from CIF 1
constructs to valid CIF2.  With that available, those who wish to use
CIF 1 tools with CIF 2 files can work, as can those who wish to
use CIF 2 tools against CIF 1 files.

1.  Handling UTF8.  Most UTF8 files are ASCII files, so almost any of
the common encoding mechanisms will work -- e.g. the HTML approach
or the Python approach -- for carrying UTF8 characters in CIF1 files.
The other direction is a non-problem, since ASCII text is already
valid UTF8.
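
To make the "Python approach" concrete, here is a minimal sketch of one
possible mapping, not a recommended standard.  It assumes non-ASCII
characters are written as \uXXXX (or \UXXXXXXXX above U+FFFF) and that a
literal backslash is doubled so the mapping can be reversed.  The function
names are hypothetical, and any interaction with the existing CIF 1
backslash markup codes is deliberately ignored here.

```python
import re


def escape_non_ascii(text: str) -> str:
    """Return an ASCII-only copy of text using Python-style escapes.

    Assumed convention (illustrative only): a backslash is doubled,
    and each non-ASCII character becomes \\uXXXX or \\UXXXXXXXX.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "\\":
            out.append("\\\\")          # keep literal backslashes reversible
        elif cp < 0x80:
            out.append(ch)              # plain ASCII passes through
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04x}")
        else:
            out.append(f"\\U{cp:08x}")
    return "".join(out)


# Matches a doubled backslash or one of the two escape forms above.
_ESCAPE = re.compile(r"\\(\\|u[0-9a-fA-F]{4}|U[0-9a-fA-F]{8})")


def unescape_non_ascii(text: str) -> str:
    """Invert escape_non_ascii."""
    def repl(match: re.Match) -> str:
        body = match.group(1)
        if body == "\\":
            return "\\"
        return chr(int(body[1:], 16))
    return _ESCAPE.sub(repl, text)
```

A CIF 2 to CIF 1 converter would apply escape_non_ascii to each data value
on the way down and unescape_non_ascii on the way back up; the point of the
sketch is only that the round trip is lossless.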

2.  Handling bracketed constructs.  Almost any quoting scheme will allow
a bracketed construct to be carried as an opaque value in a CIF 1 file.
I propose that we carry CIF 2 bracketed constructs in CIF 1 files as
semicolon-delimited quoted text, beginning either with \n;$\n (newline,
semicolon, dollar, newline) for non-line-folded versions or with
\n;\\$\n (newline, semicolon, backslash, dollar, newline) for
line-folded versions.
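
As a rough illustration of the proposal in item 2, the sketch below wraps
the text of a CIF 2 bracketed construct in a CIF 1 semicolon-delimited
field with the proposed marker, and recognizes such a field on the way
back.  The function names are hypothetical.  The sketch treats the value
as opaque text, does not perform any actual line folding (the flag only
selects which marker is written), and simply rejects values that would
terminate the text field early.

```python
PLAIN_PREFIX = "\n;$\n"       # newline, semicolon, dollar, newline
FOLDED_PREFIX = "\n;\\$\n"    # newline, semicolon, backslash, dollar, newline
TERMINATOR = "\n;"            # closing delimiter of a CIF 1 text field


def wrap_bracketed(value: str, folded: bool = False) -> str:
    """Carry a CIF 2 bracketed construct as a marked CIF 1 text field."""
    # A semicolon at the start of any line would end the field early.
    if value.startswith(";") or "\n;" in value:
        raise ValueError("value contains a line starting with ';'")
    prefix = FOLDED_PREFIX if folded else PLAIN_PREFIX
    return prefix + value + TERMINATOR


def unwrap_bracketed(field: str):
    """Return (value, folded) for a marked field, or None otherwise."""
    for prefix, folded in ((FOLDED_PREFIX, True), (PLAIN_PREFIX, False)):
        if field.startswith(prefix) and field.endswith(TERMINATOR):
            return field[len(prefix):-len(TERMINATOR)], folded
    return None
```

For example, wrap_bracketed("[1 2 3]") yields a text field whose first
line is ";$", which a CIF 1 parser reads as ordinary text and a
CIF 2-aware tool can recognize and convert back into a list.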

3.  Handling the different quoting and white-space conventions.  This will
require aggressive use of both the CIF 1 and CIF 2 quoting mechanisms, but
should be doable.

Regards,
   Herbert





On 9/17/13 1:21 PM, Bollinger, John C wrote:
> On Sunday, September 15, 2013 9:59 PM, James Hester wrote:
>    
>> Reply to Saulius's suggestions of altered syntax.
>> ===================================
>>      
> [...]
>    
>> incompatibility with CIF1 is not, in itself, news and is not
>> sufficient to justify changes to a syntax that has been sweated over
>> for many years.
>>      
> [...]
>    
>> It is to avoid such contortions that we agreed to allow
>> incompatibility between CIF1 and CIF2.
>>
>> We need to avoid any further major syntax changes in CIF2. Closing
>> the book on syntax changes results in a precise understanding of
>> the differences between CIF1 and CIF2, so we can meaningfully explore
>> managing CIF1-CIF2 transitions in alternative ways, e.g. through
>> documentation and policy.
>>      
>
> The fundamental question Saulius raises is whether a new, backwards-incompatible version of CIF is relevant or desirable.  Are the costs of dropping backwards compatibility too high for the benefits we hope to gain?  From a higher perspective, those costs may include some or all of the following:
>
> - Loss of developer good will
> - Lack of community acceptance
> - Technical issues at various levels arising from confusing one format with the other
> - User confusion
>
> As James observed, COMCIFS and the DDLm-WG's position has been that those potential costs and any others are indeed worth the benefits, so the threshold issue here is whether Saulius has given us sufficient reason to revisit *that* judgment.  I take James's and Herbert's responses to Saulius's proposal as "no" answers to that question.  Myself, I am undecided, though any way around I am not eager to reopen the syntax for changes.
>
> Supposing that we proceed with backward-incompatible CIF 2, as appears to be the momentum, perhaps we should consider ways to reduce the above costs / risks.  One of the main approaches that occur to me is to take all reasonable steps to position CIF 2 as an improved *alternative* to CIF 1 (as James describes it to be), as opposed to an advancement of it.  This would be mostly a promotion and labeling effort.  Steps along that path might include
> - relabeling the "Changes" document and the language therein to replace "changes" with "differences",
> - making sure to promote CIF 2 as an alternative rather than an evolutionary development when we talk about it with colleagues and in publications,
> - emphasizing that CIF1 and CIF2 are expected to coexist for an indeterminate time, and
> - promoting a different filename convention for CIF 2 files, such as extension ".cif2" instead of plain ".cif".
>
> Does it make sense to do something like that?
>
>
> John
>
> --
> John C. Bollinger, Ph.D.
> Computing and X-Ray Scientist
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer:  www.stjude.org/emaildisclaimer
> Consultation Disclaimer:  www.stjude.org/consultationdisclaimer
> _______________________________________________
> comcifs mailing list
> comcifs@iucr.org
> http://mailman.iucr.org/mailman/listinfo/comcifs
>    

