Re: [ddlm-group] What we have resolved so far
- To: [email protected], Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] What we have resolved so far
- From: James Hester <[email protected]>
- Date: Thu, 19 Nov 2009 13:55:18 +1100
- In-Reply-To: <C72AC749.124D2%[email protected]>
- References: <[email protected]><C72AC749.124D2%[email protected]>
We should split the unresolved issues into separate threads otherwise
we're going to have a hell of a time tracking it all. To those reading
this message: if I haven't managed to initiate the relevant threads by
the time you feel moved to reply, please do so yourself.

On Thu, Nov 19, 2009 at 12:58 PM, Nick Spadaccini <[email protected]> wrote:
> A quick response to JRH's email, to put things in context. My more detailed
> summary is still coming.
>
>
> On 19/11/09 8:22 AM, "James Hester" <[email protected]> wrote:
>
>> Nick's forthcoming email notwithstanding, here is a quick list of what
>> I think we have resolved and not resolved so far:
>>
>> RESOLVED:
>>
>> 1.  The new standard is called CIF2
>>
>> 2.  All files conforming to the new standard must have a header
>> containing something like the characters "#CIF2"
>>
>> 3.  Non-quote-delimited strings may not contain any
>> syntactically-significant characters (exact character set has been
>> specified by Nick, but before UTF-8 decision)
>>
>> 4.  Quote delimited strings may not contain instances of the
>> terminating character, regardless of following whitespace.
>
> Unless you do as in (5)
>
>> 5.  In a quote-delimited string, a reverse solidus escapes the
>> following character, if that character is otherwise syntactically
>> meaningful
>>
>> 6.  Files are UTF-8 encoded
>>
>> 7.  No tuples
>>
>> UNRESOLVED (with notes)
>>
>> 1.  Do we maintain the fixed line length restriction?
>>     - I will post something to the relevant thread to provoke a resolution
>
> Currently at 2048 bytes. I will propose maintaining this in deference to
> legacy and future Fortran programmes.
>
>> 2.  Is an escaping reverse solidus part of the datavalue?
>>     - This conversation didn't appear to resolve itself
>
> I will propose yes, that it is left to a downstream application. This is
> actually consistent with how Python works. My email timestamped
>
> Mon, 09 Nov 2009 10:35:41 +0800
>
> Explained my reasoning.
>
>>
>> 3.  Are square brackets permitted in datanames? (getting close to resolution)
>
> I will propose a character set restricted with only _ and . as allowed
> punctuation characters. All data names can be identifiers in dREL, and even
> those we assume won't be in dREL can be because someone writing a completely
> different dictionary can import our definitions and then add our data names
> to their dREL scripts.
>
> To simplify this issue I suggest avoiding the problem. Legacy CIF1 names
> will be aliased in CIF dictionaries so that when we read a CIF1 data name in
> a CIF1 file we can immediately map it to its CIF2 name (this avoids the need
> to remediate all existing CIF1 files).
>
>> 4.  Does STAR also adopt UTF-8 or go with straight binary? (This may
>> be up to Nick)
>
> I will propose binary. Any other application domain can then choose UTF-8,
> UTF-16, UCS2 or whatever encoding they wish. This will make Herb's imgCIF a
> legitimate STAR application while not a CIF2 application because of his
> binary component being in binUTF? binUCS?.
>
>> 5.  Can we use whitespace instead of comma as a list item delimiter?
>>     - not yet tackled seriously but deserves consideration
>
> I will propose it has to be a comma, but make the coercion rule that space
> separated values in a list-type object be coerced into comma separated
> values. That is, read spaces as you want, but don't encourage them.
>
>
>> 6.  Are braces only or square brackets + braces used to delimit lists
>> and associative arrays?
>>     - some consider this decision to be coupled to (3), obvious preference
>>       is for square brackets and braces if other issues are solved
>
> With my proposal for 3 acceptable, then I would propose returning to [] for
> lists and {} for associative arrays, making it possible to distinguish the
> two at the lexical level by reading the first character.
>
>> 7.  What is the exact form of the header comment (there was some
>> discussion of adding a second character such as % or !)?
>
> I think it should be the same as Unix shell headers.
>
>> 8.  Usage of triple-quoted strings: (a) do we need them? (b) do we
>> need both of them?
>
> (a) Yes if you want inline multiline strings. (b) Seems superfluous but
> makes encoding a """ in a ''' string much easier (and vice versa) without
> having to elide.
>
>> 9.  Are general unicode characters allowed in non-quote-delimited strings?
>
> You know my view on this. I want to discourage non-delimited strings and
> encourage delimited strings. But I can't see (for now) any reason that the
> characters sets have to be different.
>
> There is one thing about Unicode we have to clarify. The XML specification
> does not allow ALL Unicode characters because some of them (I think) break
> the parsing process. The exclusion set is small, but probably significant. I
> don't know the details but when we say Unicode characters we had better be
> explicit as to which. Herb, you seem to have a handle on the XML spec maybe
> you can explain what the exclusion set is and why. You can propose to this
> group what the Unicode set should be.
>
> cheers
>
> Nick
>
> --------------------------------
> Associate Professor N. Spadaccini, PhD
> School of Computer Science & Software Engineering
>
> The University of Western Australia    t: +61 (0)8 6488 3452
> 35 Stirling Highway                    f: +61 (0)8 6488 1089
> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
> MBDP  M002
>
> CRICOS Provider Code: 00126G
>
> e: [email protected]
>
> _______________________________________________
> ddlm-group mailing list
> [email protected]
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
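
Editorial illustration (not part of the original message): a minimal
Python sketch of four of the proposals discussed above, namely the
"#CIF2" header, the data name character set with only _ and . as
punctuation, telling lists from associative arrays by the first
character, and coercing space-separated list items to comma-separated
ones. All function names are hypothetical, the leading underscore and
the ASCII-only letters and digits in the data name pattern are
assumptions, and none of this should be read as the final CIF2 syntax.

```python
import re

# Hypothetical helpers illustrating proposals from this thread only.
# Assumption: a data name is an underscore followed by ASCII letters,
# digits, underscores and full stops (no square brackets).
DATA_NAME = re.compile(r"_[A-Za-z0-9_.]+")


def has_cif2_header(text):
    """True if the first line starts with the characters '#CIF2'."""
    first_line = text.split("\n", 1)[0]
    return first_line.startswith("#CIF2")


def is_proposed_data_name(name):
    """True if the name uses only the proposed restricted character set."""
    return DATA_NAME.fullmatch(name) is not None


def compound_kind(token):
    """Classify a value by its first character, as proposed for [] and {}."""
    if token.startswith("["):
        return "list"
    if token.startswith("{"):
        return "associative array"
    return "scalar"


def split_list_items(body):
    """Split the inside of a list on commas or whitespace.

    Naive on purpose: quoted items containing spaces or commas are not
    handled, which a real parser would have to do.
    """
    return [item for item in re.split(r"[,\s]+", body.strip()) if item]


def coerce_to_commas(body):
    """Rewrite space-separated list items as comma-separated ones."""
    return ",".join(split_list_items(body))
```

For example, coerce_to_commas("1 2, 3") yields "1,2,3", and a name
containing a square bracket is rejected by is_proposed_data_name, which
is the case the CIF1 aliasing proposal is meant to cover.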