Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.
- From: James Hester <jamesrhester@gmail.com>
- Date: Fri, 2 Oct 2009 16:57:33 +1000
- In-Reply-To: <20090930092332.H90159@epsilon.pair.com>
- References: <C6E123F5.11EB6%nick@csse.uwa.edu.au><20090924063136.D23301@epsilon.pair.com><279aad2a0909300514s2608eb59u851ed658352164b4@mail.gmail.com><20090930092332.H90159@epsilon.pair.com>
Herbert writes:

"Bottom line -- what is proposed is a very different language that will use a significantly different lexer and parser from the one used for DDL1 and DDL2 CIFS, guaranteeing to leave us with multiple dialects for a very long time. I think that is a shame -- rather than DDLm consolidating DDL1 and DDL2 and adding useful new features, we are simply going to end up with DDL1, DDL2 and DDL3 as three distinct dialects. I think this is unwise."

In order not to confuse matters, let us restrict the terms DDL1, DDL2 and DDL3 to dictionary definition languages, and not apply them to the syntax variations we are currently discussing. I believe Herbert has in mind CIF 1.0, 1.1 and 1.2.

I would like to explore his concern about the difference in the proposed CIF 1.2 parser. Some difference is inevitable, because we have added two new constructs: the triple-quote-delimited string and the bracketed list. Because of this, a CIF 1.1 parser will break on a CIF 1.2 file regardless of any changes to string content rules, so that is presumably not the main concern.

Perhaps the concern is that a CIF 1.2 parser will not be able to parse all files built according to previous CIF syntax versions? But this is always going to be the case, due to the (theoretical) possibility of a triple quote appearing as a value in a CIF 1.1 file: under CIF 1.1 rules it means a single quote character, but under CIF 1.2 it is the beginning of a string. Perhaps Herbert could expand on why this inability of a CIF 1.2 parser to parse a CIF 1.1 file is a problem.

To take the DDL1/DDL2/DDL3 comment at face value, these are by design three distinct dictionary languages, with DDL3 taking the best of DDL1 and DDL2. I don't see why this is a shame.

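To make the triple-quote case above concrete, here is a minimal sketch of the two readings. It assumes a simplified CIF 1.1 rule (a quoted string closes at the first quote followed by whitespace or end of input) and a hypothetical triple-quote rule (the string runs to the next triple quote); the function names are illustrative only, and the real CIF 1.2 rules may differ in detail.

```python
def read_cif11_quoted(text):
    """Simplified CIF 1.1 rule (assumption): a leading ' opens a string that
    closes at the first ' followed by whitespace or end of input."""
    if not text.startswith("'"):
        return None
    for i in range(1, len(text)):
        if text[i] == "'" and (i + 1 == len(text) or text[i + 1].isspace()):
            return text[1:i]
    return None  # unterminated string


def read_triple_quoted(text):
    """Hypothetical triple-quote rule (assumption): ''' opens a string that
    runs to the next '''."""
    if not text.startswith("'''"):
        return None
    end = text.find("'''", 3)
    return text[3:end] if end != -1 else None  # None means unterminated


value = "'''"  # three quote characters, as they might appear in a CIF 1.1 file
old_reading = read_cif11_quoted(value)   # the one-character string '
new_reading = read_triple_quoted(value)  # an opening delimiter with no close
print(repr(old_reading), repr(new_reading))
```

Under these assumptions the same three characters are a complete value for the older reader and an unterminated string for the newer one, which is the incompatibility described above.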
Herbert goes on to say:

"Just to be clear, I do think the restriction on character set of non-delimited strings is unwise -- of all the changes proposed, I believe that it is the one that invalidates the largest number of existing CIFS, and serves no useful purpose that could not be achieved by the simple exclusion of specific cases, as we have already done."

In what sense are existing CIFs 'invalidated'? They are all still valid CIF 1.1 files, and CIF 1.1 is a published standard. Perhaps Herbert or somebody else could expand on what real-world issues might arise from the proposed change?

Finally, Herbert writes:

"I would also consider all the printable UTF-8 characters as valid."

Herbert, could you please explain this proposal in more detail? Do you mean that only the one-byte printable UTF-8 characters (= ASCII) are included? Or do you mean that all of UTF-8 is included, i.e. characters may need up to 4 bytes to be represented? If the latter, are we proposing to accept all legal UTF-8 byte values, without using an intermediate representation? Is this use of UTF-8 restricted to delimited strings?

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148