Re: [ddlm-group] CIF 1.5

To: Group finalising DDLm and associated dictionaries <[email protected]>
Subject: Re: [ddlm-group] CIF 1.5
From: David Brown <[email protected]>
Date: Tue, 01 Dec 2009 10:19:56 -0500
In-Reply-To: <[email protected]>
References: <[email protected]> <[email protected]> <[email protected]><[email protected]>

Before we close off the discussion on CIF1.5 I just want to put in my final 2 cents worth. As I mentioned before, I see no point in introducing CIF1.5, which will only muddy the waters and lead to total confusion. CIF1.5 is quite unnecessary.

1. Legacy software will not be able to read CIF1.5 any more than it will be able to read CIF2.0 files, so we might as well go directly to CIF2.0. And how quickly will the legacy software be converted to output CIF1.5 files anyway? You need a 10 year lead time for changes in packages such as SHELX and another decade before people upload the latest version. Even then they will produce CIF1.1 output that they can load into their other programs.

2. For the foreseeable future DDLm applications will have to have a CIF1.1 lexer and a preparser to convert legacy files into CIF2.0 mode.

3. What dictionaries will be used for files written in CIF1.5? It will be difficult enough to find volunteers to convert DDL1 to DDLm, and I have no idea if there are any plans to convert DDL2 to DDLm dictionaries. If CIF1.5 will use DDLm then why not just go straight to CIF2.0?

4. CIF2.0 datafiles can look almost exactly like CIF1.1 datafiles except for a few datanames and some undelimited data values that include forbidden characters. Most people will not notice the difference between CIF2.0 files and regular old-fashioned CIFs. Indeed many CIF1.1 data files could probably be read in with CIF2.0 parsers without a problem. The biggest problem is the DDL2 datanames that contain 'U[1][2]', but these are not found in DDL1. Since the whole DDL2 data archive is centrally held, I assume it could be easily converted (if it was thought worthwhile). If there are problems in DDL1 they are confined to one or two datanames. Undelimited data values containing illegal characters could be a problem. The CIF1.1 lexer and preparser mentioned in 2 above will deal with all of these.

5. DDLm does not require that its lists, vectors and matrices be entered as arrays.
dREL allows all of these new CIF2.0 constructs to be reconstituted from their primitives as required.

The future as I foresee it will see everyone carrying on with current software and CIF1.1 datafiles as long as they want. CIF2.0 software will be developed to take advantage of the new features, but with a CIF1.1 front end to carry out the minimal required conversion to CIF2.0, such applications will be able to read all existing and future CIFs of every stripe. Eventually CIF1.1 legacy software will die or be converted to CIF2.0 and the rest of the world will painlessly convert to CIF2.0 data files, probably without the user even noticing.

I think we are imagining monsters lurking behind trees even in a treeless desert. CIF1.5 should be dropped and not resurrected, and I am prepared to debate this with Herbert privately (so as not to waste everyone else's time) if he is not convinced.

David

Brian McMahon wrote:

Dear Colleagues

I agree with James. The remit of this group was to finalise DDLm. An early conclusion was that this necessarily involved syntax changes at the STAR level, and the consequent discussions have revolved around the idea of providing a specification for CIF (essentially at the syntax level) that took advantage of these syntactic changes and allowed uniform handling of CIF data files and DDLm dictionaries.

For me, the immediate benefit of these discussions has been a much more complete account of what needs to be done upstream, at the STAR level, to accommodate the changes that are desirable downstream (in CIF and DDLm applications) at some point.
So, for example, the STAR spec needs formally to be revised to allow Unicode character sets (certainly UTF-8, which is what we settled on for CIF; as far as I recall, it's still possible that the STAR revision could allow other Unicode encodings that Herbert needs for imgCIF, and I'd be interested in knowing whether the new spec could also allow the inclusion of full binary data streams so that CBF could properly become one of the STAR family of formats). There must also be the new delimiter characters and formal rules for handling list items.

We've developed these conclusions by using various use cases and Gedankenexperimente, but we've not, in the main, been driven by the need to meet real problems currently difficult of solution in the community. Indeed, recent work with embedded visualisation scripts and incorporation of TeX mathematical fragments into CIFs destined for publication in Acta shows that there's much more that can still be achieved within the existing syntactic framework.

So let us complete the job of finalising the specifications (STAR++, DDLm, CIF2.0), and then involve the wider community in discussing how, when and if they are to be implemented.

Brian

On Tue, Dec 01, 2009 at 02:30:09PM +1100, James Hester wrote:

Dear Herbert and colleagues,

Little quibble: I wrote 'one more type' rather than 'more than one type'.

Anyway, I suggest that we concentrate on finalising CIF2.0 syntax, then put a draft out for discussion in the broader community, and if there is sufficient feedback to the effect of 'we need an intermediate format', then we can address the issue of CIF1.5. Addressing it now distracts us from the task of putting CIF2.0 to bed, which we will still need to do in any case.

On Tue, Dec 1, 2009 at 11:17 AM, Herbert J. Bernstein <[email protected]> wrote:

Dear James,

Please look at the following part of your first paragraph:

"with a commitment to support CIF1.1 for the long term and a guaranteed way to distinguish the two types of data files."
and please look at the following part of your second paragraph:

"Furthermore, they now have to support one more type of file going into the future."

I seem to be missing something. If we are going to support CIF 1.1 for the long term and we are going to have CIF 2 be a very different file type, then it is not CIF 1.5 that will cause software developers to have to support one more file type going into the future, but the fundamental decisions made by this group. If you support CIF 1.1 and a very different CIF 2, then you are going to end up with mixed files, i.e. multiple ad hoc CIF 1.5 (or actually CIF 1.55) files. All I am doing is proposing to formalize what is going to happen anyway.

I've had my say.

Regards,
Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
[email protected]
=====================================================

On Tue, 1 Dec 2009, James Hester wrote:

(Note to those reading this later: this continues a thread started within the 'space as list item separator' thread. I recommend reading those messages before continuing on here.)

(For those who came in late: We flirted with the idea of a minimally disruptive path from CIF1.1 to CIF2.0 back in the beginning of this group (late September/early October, I believe), and ended up choosing to define one maximally disruptive CIF2.0 standard together with a commitment to support CIF1.1 for the long term and a guaranteed way to distinguish the two types of data files.)

Picking up the CIF1.5 discussion...

Introducing CIF1.5 is a further source of confusion. Apart from this, it produces extra workload for software authors. Herb has essentially defined CIF1.5 as CIF1.1 plus new syntactical elements (or in other words CIF2.0 minus character set limitations and UTF8).
So in order to support CIF1.5, authors of both CIF reading and CIF writing software have to add this new syntax. Then when they decide to support CIF2.0, they have to once again revisit their software. I would have thought it far more sensible to ask them to update and distribute their software only once. Furthermore, they now have to support one more type of file going into the future.

I see absolutely no benefit in this idea.

On Tue, Dec 1, 2009 at 9:40 AM, Herbert J. Bernstein <[email protected]> wrote:

Dear James,

The point is that we will need to make it easy for people working with CIF 1 and CIF 1.1 based tools to cobble together valid CIF 2 data. The most important bit will be a way to include vectors and matrices in their data. This will allow them to do it.

Please note that it has taken several years to just get to the point where we are beginning to rigorously define CIF 2. If we are lucky, it will only take a few years to have a full set of tools to allow users and software writers to reliably produce true CIF 2 data.

Regards,
Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
[email protected]
=====================================================

On Tue, 1 Dec 2009, James Hester wrote:

Dear Herbert: as CIF 1.1 doesn't define lists, I'm not sure why you suggest that the example below is a valid tag.

On Tue, Dec 1, 2009 at 12:36 AM, Herbert J. Bernstein <[email protected]> wrote:

Sorry, something got lost in the prior message. It should have read:

Dear Colleagues,

Back to the question of commas. If you accept the desirability of having a CIF 1.5, commas in lists become very useful. Someone with a CIF 1.1 editor will be able to prepare a CIF 1.5 file for many useful cases by doing all lists with commas and no embedded blanks as long as they can make their lists fit on single lines.
In CIF 1.1 [[1,2,3],[4,5,6],[7,8,9]] is a valid value for a tag, but [[1 2 3] [4 5 6] [7 8 9]] is not.

No, neither example is a valid CIF 1.1 tag. CIF 1.1 explicitly excludes brackets as the first character of a non-delimited string.

Having the option of commas in lists will help to smooth the transition for at least some people.
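
The disagreement over the two bracketed examples can be checked mechanically. The following is only a rough sketch in Python, not a CIF 1.1 lexer: the function name is hypothetical, and it models just the two rules mentioned in this thread (a non-delimited string may not begin with a bracket, and it may not contain embedded blanks). Reserved words, quoting, text fields and the other restricted characters of the CIF 1.1 grammar are deliberately ignored.

```python
def is_plain_unquoted_value(text):
    """Rough check of one candidate non-delimited (unquoted) CIF 1.1 value.

    Assumption: only two rules from the thread are modelled here.
      1. The value must be a single whitespace-free token.
      2. The token must not start with '[' or ']'.
    Everything else in the CIF 1.1 grammar is out of scope for this sketch.
    """
    tokens = text.split()
    if len(tokens) != 1:
        # Empty input, or embedded blanks that would split the value
        # into several separate tokens.
        return False
    return tokens[0][0] not in "[]"


if __name__ == "__main__":
    for candidate in ("[[1,2,3],[4,5,6],[7,8,9]]",
                      "[[1 2 3] [4 5 6] [7 8 9]]",
                      "1.234"):
        print(repr(candidate), is_plain_unquoted_value(candidate))
```

Under these two rules both matrix examples are rejected: the comma form fails only on its leading bracket, while the blank-separated form additionally breaks into nine whitespace-separated tokens.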

begin:vcard
fn:I.David Brown
n:Brown;I.David
org:McMaster University;Brockhouse Institute for Materials Research
adr:;;King St. W;Hamilton;Ontario;L8S 4M1;Canada
email;internet:[email protected]
title:Professor Emeritus
tel;work:+905 525 9140 x 24710
tel;fax:+905 521 2773
version:2.1
end:vcard

_______________________________________________
ddlm-group mailing list
[email protected]
http://scripts.iucr.org/mailman/listinfo/ddlm-group
