Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] THREAD 3: The alphabet of non-delimited strings.
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Fri, 2 Oct 2009 07:00:42 -0400 (EDT)
- In-Reply-To: <279aad2a0910012357u31ffa9cbkd6dc8ddad277193a@mail.gmail.com>
- References: <C6E123F5.11EB6%nick@csse.uwa.edu.au><20090924063136.D23301@epsilon.pair.com><279aad2a0909300514s2608eb59u851ed658352164b4@mail.gmail.com><20090930092332.H90159@epsilon.pair.com><279aad2a0910012357u31ffa9cbkd6dc8ddad277193a@mail.gmail.com>
Dear Colleagues,

When we went from CIF 1.0 to CIF 1.1, we all tried very hard to make as many CIF 1.0 files as possible remain valid CIF 1.1 files without the need for any changes. When DDLm was introduced, a promise was made to the community that is still on the IUCr web site in bold face:

"No changes are required in existing archival data files in order to apply domain dictionaries written in DDLm."

If we are now breaking that promise, which it appears we are about to do if we are not very, very careful, then I believe we have an ethical obligation to make that clear to the community and invite them into the discussion.

I have to run to get ready to submit a proposal now. I will respond more directly to James Hester's questions about the details of how this change impacts existing CIFs later today, but please do take a look at what we said on http://www.iucr.org/resources/cif/ddl/ddlm

Regards,
Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================

On Fri, 2 Oct 2009, James Hester wrote:

> Herbert writes:
>
> " Bottom line -- what is proposed is a very different language
> that will use a significantly different lexer and parser from
> the one used for DDL1 and DDL2 CIFS, guaranteeing to leave us
> with multiple dialects for a very long time. I think that is
> a shame -- rather than DDLm consolidating DDL1 and DDL2 and
> adding useful new features, we are simply going to end up with
> DDL1, DDL2 and DDL3 as three distinct dialects.
>
> I think this is unwise."
>
> In order not to confuse matters, let us restrict the use of the terms
> DDL1, DDL2 and DDL3 to dictionary definition languages, not the syntax
> variations we are currently discussing. I believe Herbert has in mind
> CIF 1.0, 1.1 and 1.2. I would like to explore his concern about the
> difference in the proposed CIF 1.2 parser. Some difference is
> inevitable in that we have added two new constructs, the triple quote
> delimited string and the bracketed list. Because of this, a CIF 1.1
> parser will break on a CIF 1.2 file regardless of any changes to
> string content rules, so that is presumably not the main
> concern. Perhaps the concern is that a CIF1.2 parser will not be able
> to parse all files built according to previous CIF syntax versions?
> But this is always going to be the case due to the (theoretical)
> possibility of a triple quote appearing as a value in a CIF 1.1 file,
> which would mean a single quote under CIF1.1 rules, but the beginning
> of a string under CIF1.2. Perhaps Herbert could expand on why this
> inability of a CIF 1.2 parser to parse a CIF 1.1 file is a problem.
>
> To take the DDL1/DDL2/DDL3 comment at face value, these are by design
> three distinct dictionary languages, with DDL3 taking the best of DDL1
> and 2. I don't see why this is a shame.
>
> Herbert goes on to say:
>
> Just to be clear, I do think the restriction on character
> set of non-delimited strings is unwise -- of all the changes
> proposed, I believe that it is the one that invalidates the
> largest number of existing CIFS, and serves no useful
> purpose that could not be achieved by the simple exclusion
> of specific cases, as we have already done.
>
> In what sense are existing CIFs 'invalidated'? They are all still
> valid CIF1.1 files, which is a published standard. Perhaps Herbert or
> somebody could expand on what the real world issues might be because
> of the proposed change?
>
> Finally, Herbert writes:
>
> "I would also consider all the printable UTF-8 characters as valid."
>
> Herbert, could you please explain in more detail this proposal. Do
> you mean that only the one-byte printable UTF-8 characters (= ASCII)
> are included? Or do you mean that all of UTF-8 is included,
> i.e. characters may need up to 4 bytes to be represented? If the
> latter, then are we proposing to accept all legal UTF-8 byte values,
> without using an intermediate representation? Is this use of UTF8
> restricted to delimited strings?
>
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
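
The distinction James asks about, one-byte UTF-8 characters (which coincide with ASCII) versus characters that need up to 4 bytes, can be made concrete with a small Python sketch. The helper names `utf8_length` and `is_single_byte` and the sample characters are illustrative assumptions, not anything proposed in the thread, and the sketch takes no position on which characters a CIF parser should accept.

```python
# Illustrative only: shows how many bytes UTF-8 needs per character.
# Helper names and sample characters are assumptions made for this sketch.

def utf8_length(ch: str) -> int:
    """Return the number of bytes used to encode one character in UTF-8."""
    return len(ch.encode("utf-8"))

def is_single_byte(ch: str) -> bool:
    """True when the character is in the ASCII range (one UTF-8 byte)."""
    return ord(ch) < 0x80

# One sample character for each possible encoded length.
samples = [
    "A",           # U+0041, ASCII letter
    "\u00e9",      # U+00E9, Latin small letter e with acute
    "\u20ac",      # U+20AC, euro sign
    "\U0001d49c",  # U+1D49C, mathematical script capital A
]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s), "
          f"single-byte={is_single_byte(ch)}")
```

Only the first sample falls in the one-byte range; the others need 2, 3 and 4 bytes respectively, which is the case the "all of UTF-8" reading would have to handle.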