Re: [ddlm-group] CIF-2 changes
- To: Nick.Spadaccini@uwa.edu.au, Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] CIF-2 changes
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Tue, 17 Nov 2009 08:11:52 -0500 (EST)
- In-Reply-To: <C72863BD.12491%nick@csse.uwa.edu.au>
- References: <C72863BD.12491%nick@csse.uwa.edu.au>
Dear Colleagues,

  We have now come full circle with no agreement on anything. I for one,
for the reasons outlined in many prior messages, do not think this latest
(=oldest) approach to be a good idea. Clearly, if we are ever to resolve
this, we need to get all the players into a meeting at one time and work
things out. I suspect we will not be able to arrange a timely physical
meeting. Perhaps some sort of an emeeting (Ajax, Skype or somesuch) would
work.

  Regards,
    Herbert

=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================

On Tue, 17 Nov 2009, Nick Spadaccini wrote:

> David's Option 3 is the simplest way forward, and actually revisits much of
> what was discussed back in 2007-08. Somehow those discussions were locked
> far back in my brain, only to be awakened by David's summary. Thanks for
> that.
>
> So now I return to the STAR syntax. DDLm is part of STAR and hence
> restrictions on data names, so that they can be parsed etc., are a STAR
> issue. I am brought around to Joe's idea that STAR accepts any 8-bit
> character sequence, since that is the most complete set, and that this will
> be restricted to UTF-8 within the CIF specification. Any other adopter of
> STAR can choose whatever restricted encoding they wish.
>
> I still need to treat data names as programming identifiers within dREL so
> accordingly I propose we restrict the data names in STAR (and all variants)
> to be ASCII [A-Za-z0-9_.] as we have used in the sample dictionaries, DDLm
> and dREL.
>
> The data values will be represented as discussed in previous threads, and
> the reverse solidus and the token delimiters discussed will be ASCII
> characters. We can now return to [] as the list delimiters, and {} as the
> associative array delimiters.
>
> Backward compatibility to CIF1 names is handled by exploiting the _alias
> attributes in the definition. A CIF2 parser with dictionary can handle
> everything. Any CIF1 parser can handle CIF1 data files (also CIF2 data files
> up to a point, but won't know what the data names mean unless they have
> hardcoded it).
>
> A CIF2 parser would like a leading comment to tell it what sort of file it
> is parsing. In the absence of that comment, a pre-scan will need to be done.
> The telltale indicators that it is a CIF1 data file are multiple
> occurrences of
>
> (1) data names that potentially contain [] or /
> (2) unquoted strings with illegal characters
> (3) quoted strings that result in parse failure (typically because they must
> have an embedded [but not elided] quote character as allowed in CIF1).
>
> It needs to be a pre-scan because all 3 of the above in an identified CIF2
> data file would result in something quite different since there are coercion
> rules for when the whitespace separator is missing.
>
> For instance IF I KNOW it is a CIF2 file and I read
>
> _name[1]
>
> then this can only be an error and I coerce it into
>
> _name [1]
>
> IF I DON'T KNOW the file type, the occurrence of _name[1] flags it as
> potentially a CIF1 file. If _name[1] is in an alias list, this reinforces
> the likelihood of CIF1. Multiple instances of these "errors" (or any others
> in the above list) indicate it is a CIF1 file (my only other conclusion
> would be it is a VERY BADLY written CIF2).
>
> I think this takes us back to a very simple rule set, and I don't think the
> restriction in the character set for data names will cause problems. For all
> the excitement of UTF-8 etc. I know of programming languages that support
> reading and writing data in such encodings but I haven't seen one that
> allows/encourages one to write programmes declaring identifiers in UTF-8
> character sets. (They may well exist, I just haven't seen them.)
>
>
> On 17/11/09 12:04 AM, "David Brown" <idbrown@mcmaster.ca> wrote:
>
>> James,
>>
>> There seems to be a lull in the discussions on CIF2 syntax so this would be a
>> good time for you, or an appointee chosen by you, to summarize where we are
>> at and propose a set of rules that we can work with as we move forward. I
>> realize that much of the work I have already done on dictionaries will need to
>> be revisited, and Herbert also seems anxious to have some decisions on the
>> various topics that have been discussed.
>>
>> I believe we have a consensus on a number of points, but these need to be
>> written down clearly and need our formal agreement so we can move ahead.
>>
>> David
>>
>>
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
> cheers
>
> Nick
>
> --------------------------------
> Associate Professor N. Spadaccini, PhD
> School of Computer Science & Software Engineering
>
> The University of Western Australia    t: +61 (0)8 6488 3452
> 35 Stirling Highway                    f: +61 (0)8 6488 1089
> CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
> MBDP M002
>
> CRICOS Provider Code: 00126G
>
> e: Nick.Spadaccini@uwa.edu.au
>
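
The data-name restriction, the whitespace coercion and the pre-scan described in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from any CIF or STAR implementation: the function names and the threshold of two suspicious names are hypothetical, input is assumed to be already split into whitespace-separated tokens, and only indicator (1) from the list (data names containing [] or /) is checked.

```python
import re

# Proposed ASCII subset for data names: [A-Za-z0-9_.].
# Assumption: a data name starts with an underscore, as in CIF.
CIF2_NAME = re.compile(r"_[A-Za-z0-9_.]+\Z")

# Hypothetical threshold for what counts as "multiple occurrences".
CIF1_THRESHOLD = 2


def is_restricted_name(token: str) -> bool:
    """True if token is a data name within the proposed ASCII subset."""
    return CIF2_NAME.match(token) is not None


def coerce_missing_space(token: str) -> str:
    """In a file known to be CIF2, rewrite '_name[1]' as '_name [1]'."""
    m = re.match(r"(_[A-Za-z0-9_.]+)(\[.*)\Z", token, re.DOTALL)
    return f"{m.group(1)} {m.group(2)}" if m else token


def looks_like_cif1(tokens) -> bool:
    """Pre-scan: count data names that contain '[', ']' or '/'."""
    suspicious = sum(
        1 for t in tokens
        if t.startswith("_") and any(c in t for c in "[]/")
    )
    return suspicious >= CIF1_THRESHOLD
```

A caller without a leading file-type comment would run looks_like_cif1 over the whole token stream first, and apply coerce_missing_space only if the file is judged to be CIF2. A fuller pre-scan would also weigh indicators (2) and (3) and consult the dictionary alias list.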
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
- Follow-Ups:
- Re: [ddlm-group] CIF-2 changes (Nick Spadaccini)
- References:
- Re: [ddlm-group] CIF-2 changes (Nick Spadaccini)