Re: [ddlm-group] CIF-2 changes
- To: Nick.Spadaccini@uwa.edu.au, Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] CIF-2 changes
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Tue, 17 Nov 2009 08:54:37 -0500 (EST)
- In-Reply-To: <C728C5E7.12498%nick@csse.uwa.edu.au>
- References: <C728C5E7.12498%nick@csse.uwa.edu.au>
Dear Nick,

The question on which we are circling is the valid data names. Look
over the chain of emails -- every possible combination is still on
the table. We need to get everybody to sign on to one clear, complete
and final specification. We need a meeting.

Regards,
Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================

On Tue, 17 Nov 2009, Nick Spadaccini wrote:

>
> Sorry Herb, how is this full circle with no agreement? I have suggested we
> adopt almost all of the changes we discussed except that with David's option
> we can now simply enforce a more limited character set on data names so that
> the parsing problems within dREL for names with included [] are now
> eliminated, while still making it possible to handle legacy names. As a
> consequence of not needing to support [] in names we can now revert back to
> using them for list delimiters.
>
> The latter is the only circle, otherwise what was generally agreed in
> discussion is still there. My discussion below was for STAR which is the
> superset of CIF. The CIF2 specific stuff is still on the table.
>
> Have I missed something?
>
> On 17/11/09 9:11 PM, "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
> wrote:
>
>> Dear Colleagues,
>>
>> We have now come full circle with no agreement on anything. I for one,
>> for the reasons outlined in many prior messages, do not think this
>> latest (=oldest) approach to be a good idea.
>>
>> Clearly, if we are ever to resolve this, we need to get all the players
>> into a meeting at one time and work things out. I suspect we will not be
>> able to arrange a timely physical meeting. Perhaps some sort of an
>> emeeting (Ajax, Skype or somesuch) would work.
>>
>> Regards,
>> Herbert
>>
>> =====================================================
>> Herbert J. Bernstein, Professor of Computer Science
>> Dowling College, Kramer Science Center, KSC 121
>> Idle Hour Blvd, Oakdale, NY, 11769
>>
>> +1-631-244-3035
>> yaya@dowling.edu
>> =====================================================
>>
>> On Tue, 17 Nov 2009, Nick Spadaccini wrote:
>>
>>> David's Option 3 is the simplest way forward, and actually revisits much of
>>> what was discussed back in 2007-08. Somehow those discussions were locked
>>> far back in my brain, only to be awakened by David's summary. Thanks for
>>> that.
>>>
>>> So now I return to the STAR syntax. DDLm is part of STAR and hence
>>> restrictions on data names so they can be parsed etc is a STAR issue. I am
>>> brought around to Joe's idea that STAR accepts any 8 bit character sequence
>>> since that is the most complete set and that this will be restricted to
>>> UTF-8 within the CIF specification. Any other adoptee of STAR can choose
>>> whatever restricted encoding they wish.
>>>
>>> I still need to treat data names as programming identifiers within dREL so
>>> accordingly I propose we restrict the data names in STAR (and all variants)
>>> to be ASCII [A-Za-z0-9_.] as we have used in the sample dictionaries, DDLm
>>> and dREL.
>>>
>>> The data values will be represented as discussed in previous threads and
>>> that the reverse solidus and the token delimiters discussed will be ASCII
>>> characters. We can now return to [] as the list delimiters, and {} as the
>>> associative array delimiters.
>>>
>>> Backward compatibility to CIF1 names is handled by exploiting the _alias
>>> attributes in the definition. A CIF2 parser with dictionary can handle
>>> everything. Any CIF1 parser can handle CIF1 data files (also CIF2 data files
>>> up to a point, but won't know what the data names mean unless they have
>>> hardcoded it).
>>>
>>> A CIF2 parser would like a leading comment to tell it what sort of file it
>>> is parsing. In the absence of that comment, a pre-scan will need to be done.
>>> The telltale indicators it is a CIF1 data file are multiple occurrences of,
>>>
>>> (1) data names that potentially contain [] or /
>>> (2) unquoted strings with illegal characters
>>> (3) quoted strings that result in parse failure (typically because they must
>>> have an embedded [but not elided] quote character as allowed in CIF1).
>>>
>>> It needs to be a pre-scan because all 3 of the above in an identified CIF2
>>> data file would result in something quite different since there are coercion
>>> rules for when the whitespace separator is missing.
>>>
>>> For instance IF I KNOW it is a CIF2 file and I read
>>>
>>>    _name[1]
>>>
>>> Then this can only be an error and I coerce into
>>>
>>>    _name [1]
>>>
>>> IF I DON'T KNOW the file type, the occurrence of _name[1] flags it as
>>> potentially a CIF1 file. If _name[1] is in an alias list, this reinforces
>>> the likelihood of CIF1. Multiple instances of these "errors" (or any others
>>> in the above list) indicate it is a CIF1 file (my only other conclusion
>>> would be it is a VERY BADLY written CIF2).
>>>
>>> I think this takes us back to a very simple rule set, and I don't think the
>>> restriction in the character set for data names will cause problems. For all
>>> the excitement of UTF-8 etc I know of programming languages that support
>>> reading and writing data in such encodings but I haven't seen one that
>>> allows/encourages one to write programmes declaring identifiers in UTF-8
>>> character sets. (They may well exist, I just haven't seen them).
>>>
>>>
>>> On 17/11/09 12:04 AM, "David Brown" <idbrown@mcmaster.ca> wrote:
>>>
>>>> James,
>>>>
>>>> There seems to be a lull in the discussions on CIF2 syntax so this would be
>>>> a good time for you, or someone chosen by you, to summarize where we are at
>>>> and propose a set of rules that we can work with as we move forward. I
>>>> realize that much of the work I have already done on dictionaries will need
>>>> to be revisited, and Herbert also seems anxious to have some decisions on
>>>> the various topics that have been discussed.
>>>>
>>>> I believe we have a consensus on a number of points, but these need to be
>>>> written down clearly and need our formal agreement so we can move ahead.
>>>>
>>>> David
>>>>
>>>>
>>>> _______________________________________________
>>>> ddlm-group mailing list
>>>> ddlm-group@iucr.org
>>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>
>>> cheers
>>>
>>> Nick
>>>
>>> --------------------------------
>>> Associate Professor N. Spadaccini, PhD
>>> School of Computer Science & Software Engineering
>>>
>>> The University of Western Australia    t: +61 (0)8 6488 3452
>>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>>> CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
>>> MBDP M002
>>>
>>> CRICOS Provider Code: 00126G
>>>
>>> e: Nick.Spadaccini@uwa.edu.au
>>>
>>>
>>>
>>>
>
> cheers
>
> Nick
>
> --------------------------------
> Associate Professor N. Spadaccini, PhD
> School of Computer Science & Software Engineering
>
> The University of Western Australia    t: +61 (0)8 6488 3452
> 35 Stirling Highway                    f: +61 (0)8 6488 1089
> CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
> MBDP M002
>
> CRICOS Provider Code: 00126G
>
> e: Nick.Spadaccini@uwa.edu.au
>
>
>
>
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
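
The pre-scan idea in Nick's quoted message (restricted data names, plus a count of CIF1 "telltale" names when no leading comment is present) might look roughly like the Python sketch below. It is an illustration only, not code from the thread. Assumptions: the marker string "#\#CIF_2.0", the threshold of two hits, and the function names are all hypothetical; tokens are split naively on whitespace, so quoted strings and text fields are not handled; and only indicator (1), data names containing [ ] or /, is checked, since indicators (2) and (3) would need a real lexer.

```python
import re

# Proposed restricted alphabet for data names: a leading underscore
# followed by ASCII [A-Za-z0-9_.] only.
NAME_OK = re.compile(r"_[A-Za-z0-9_.]+")

# Characters that mark a data name as a likely CIF1 legacy name.
LEGACY_CHARS = re.compile(r"[\[\]/]")


def is_restricted_name(token):
    """Return True if token is a data name in the restricted alphabet."""
    return NAME_OK.fullmatch(token) is not None


def count_legacy_names(text):
    """Count data-name tokens containing [, ] or / (indicator 1 only).

    Naive: splits on whitespace and skips whole-line comments, so it
    will miscount names that appear inside quoted strings or text fields.
    """
    count = 0
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        for token in line.split():
            if token.startswith("_") and LEGACY_CHARS.search(token):
                count += 1
    return count


def guess_dialect(text, marker="#\\#CIF_2.0", threshold=2):
    """Guess 'CIF2', 'CIF1' or 'undetermined' for a data file.

    marker and threshold are assumptions made for this sketch.
    """
    lines = text.splitlines()
    if lines and lines[0].startswith(marker):
        # A leading comment identifies the file; no pre-scan is needed.
        return "CIF2"
    if count_legacy_names(text) >= threshold:
        # Multiple legacy-style names suggest a CIF1 file.
        return "CIF1"
    return "undetermined"
```

With these definitions, `is_restricted_name("_name[1]")` is false, and a file with no leading comment that contains both `_name[1]` and `_other/ratio` is guessed to be CIF1. A single hit is deliberately left undetermined, in line with the point that one such name in a known CIF2 file would be coerced to `_name [1]` rather than treated as legacy.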
_______________________________________________ ddlm-group mailing list ddlm-group@iucr.org http://scripts.iucr.org/mailman/listinfo/ddlm-group
- Follow-Ups:
- Re: [ddlm-group] CIF-2 changes (David Brown)
- References:
- Re: [ddlm-group] CIF-2 changes (Nick Spadaccini)