Re: [ddlm-group] Revised version of syntax change summary document
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Revised version of syntax change summary document
- From: Nick Spadaccini <nick@csse.uwa.edu.au>
- Date: Thu, 10 Dec 2009 09:42:53 +0800
- In-Reply-To: <20091209140355.GA29341@emerald.iucr.org>
On 9/12/09 10:03 PM, "Brian McMahon" <bm@iucr.org> wrote:

> A few comments on the latest version of the CIF2 syntax changes
> summary document.
>
> I'm glad to see the explanation of tokens and separators. I was going
> to ask for something of the sort. The visual aid is quite a good way
> of doing this - and it does emphasise that the word "token" is a
> rather dangerous (i.e. potentially ambiguous) one, since it can
> apply promiscuously to a complete list or to lists contained in lists
> or - n'est-ce pas? - to the individual elements within a list.
>
> For the target audience for this document, this level of ambiguity,
> normally resolved by context, is probably OK, but we should be very
> careful in drafting the final complete specification document.
>
> In similar vein, a complete specification should probably define
> very carefully what is meant by phrases such as "lexical characters".
> Again, I don't think that degree of pedantry is necessary for the
> purposes of getting this out to the developer community.
>
> A few more specific points.
>
> 1. Permitted character set (under "Terminology" and/or "Encoding").
> CIF 1.1 explicitly EXCLUDES some of the characters in the ASCII set,
> usually thought of as 'control characters'. Specifically, the excluded
> characters are (decimal values) 00-08, 11, 12, 14-31 and 127. Should
> this be restated clearly in this document for clarity?

These were omitted for safety, and we are proposing to adopt the XML
character restrictions (believing the XML community has done its job).
What should be restated? Their omission, or the reasons for their omission?

> [Possibly relevant: what are the "additional 20 UNICODE characters
> that constitute whitespace" mentioned in the "Terminology section"?]

See http://en.wikipedia.org/wiki/Whitespace_(computer_science)

Under UNICODE.

> 2. Encoding.
> "UTF-8 directly supports an extensive range of printable objects that
> are not accessible through ASCII." Not strictly true: acceptance of a
> \uNNNN encoding would give you access to all of these using the ASCII
> character set. Just drop this sentence. I suggest dropping the next
> also. We haven't yet revisited my suggestion that the IUCr markup
> conventions be disallowed in CIF 2 - which, of course, isn't a
> syntactic issue at this level of discourse.

Done.

> 3. Character set for data names.
> States "A data name ... may be followed by any number of characters":
> currently there's an implementation limit of 74 (plus the initial
> underscore). I don't recall our discussing a proposal to change that,
> specifically.

Sorry, I was wearing my other hat at the time of writing. STAR has no such
restriction. CIF has a data name length restriction, presumably tied
closely to the 80-character line length restriction. Whether or not this
is revisited is up to the IUCr community. Do you want me to include the
restriction now, or is there a desire to discuss a change now?

I have no opinion on whether CIF should restrict data name length.

> [Typo in the "Reasoning" paragraph - should be "they ARE excluded"]

Thanks. Done.

> 4. Delimited strings. The descriptions of single- and double-quote
> delimited strings use the term "newline character" - would be better
> as "newline sequence" as used elsewhere.

Thanks, missed those.

> 5. List and Table data types. The phrase "In the context of being
> outside of data tokens" is cumbersome, and I'm not sure I understand
> how to parse it in an English grammatical sense. Would these
> descriptions read better (but also be correct) if rephrased as:
>
> A data value of type list is initiated by ... and terminated by ...
> A data value of type table is initiated by ... and terminated by ...

OK. Done.

> Perhaps a simple example would also be useful, given that these
> introduce the most disruptive syntax change, e.g.:
>
>     loop_
>       _colour_name
>       _colour_value_rgb
>       red    [1 0 0]
>       green  [0 1 0]

OK. Done.

> [In Change 8 there is a typo: "curly braces brackets" is redundant.]

Done.

> Best wishes
> Brian
>
> On Wed, Dec 09, 2009 at 10:02:52AM +0000, Brian McMahon wrote:
>> At Nick's request I have posted an updated version of the syntax
>> change document which should clarify a few things in light of the
>> most recent discussion. This is available at the URL
>>
>> http://www.iucr.org/__data/assets/pdf_file/0017/27224/syntaxchangesproposed20091209.pdf
>>
>> (Nick: perhaps an internal identifier - a date would do - would help to
>> differentiate future versions if one prints them out and sets them side
>> by side on one's desk?)
>>
>> Cheers
>> Brian
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth, WA 6009 AUSTRALIA     w3: www.csse.uwa.edu.au/~nick
MBDP M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au
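
The two CIF 1.1 restrictions raised in points 1 and 3 above (the excluded control characters, and the 74-character data-name limit after the initial underscore) could be checked along the following lines. This is only a rough sketch: the function and constant names are hypothetical, the numeric values are taken directly from the message above, and it does not attempt to cover the XML character restrictions proposed for CIF2 or any other data-name rule.

```python
# Decimal code points that CIF 1.1 excludes, as listed in point 1:
# 00-08, 11, 12, 14-31 and 127.
CIF11_EXCLUDED = frozenset(
    list(range(0, 9)) + [11, 12] + list(range(14, 32)) + [127]
)

# Implementation limit quoted in point 3: 74 characters,
# not counting the initial underscore.
MAX_DATA_NAME_BODY = 74


def excluded_positions(text):
    """Return the indices of characters in text that CIF 1.1 excludes."""
    return [i for i, ch in enumerate(text) if ord(ch) in CIF11_EXCLUDED]


def data_name_within_limit(name):
    """Check only the leading underscore and the length limit."""
    return name.startswith("_") and len(name) - 1 <= MAX_DATA_NAME_BODY
```

Tab (9), line feed (10) and carriage return (13) are not in the excluded set, so ordinary whitespace and newline sequences pass the first check.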