Re: [ddlm-group] CIF-2 changes
- To: [email protected], Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] CIF-2 changes
- From: "Herbert J. Bernstein" <[email protected]>
- Date: Tue, 17 Nov 2009 08:54:37 -0500 (EST)
- In-Reply-To: <C728C5E7.12498%[email protected]>
- References: <C728C5E7.12498%[email protected]>
Dear Nick,
The question on which we are circling is the valid data names. Look
over the chain of emails -- every possible combination is still on the
table. We need to get everybody to sign on to one clear, complete and
final specification.
We need a meeting.
Regards,
Herbert
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
[email protected]
=====================================================
On Tue, 17 Nov 2009, Nick Spadaccini wrote:
>
> Sorry Herb, how is this full circle with no agreement? I have suggested we
> adopt almost all of the changes we discussed except that with David's option
> we can now simply enforce a more limited character set on data names so that
> the parsing problems within dREL for names with included [] are now
> eliminated, while still making it possible to handle legacy names. As a
> consequence of not needing to support [] in names we can now revert back to
> using them for list delimiters.
>
> The latter is the only circle, otherwise what was generally agreed in
> discussion is still there. My discussion below was for STAR which is the
> superset of CIF. The CIF2 specific stuff is still on the table.
>
> Have I missed something?
>
> On 17/11/09 9:11 PM, "Herbert J. Bernstein" <[email protected]>
> wrote:
>
>> Dear Colleagues,
>>
>> We have now come full circle with no agreement on anything. I for one,
>> for the reasons outlined in many prior messages, do not think this
>> latest (=oldest) approach to be a good idea.
>>
>> Clearly, if we are ever to resolve this, we need to get all the players
>> into a meeting at one time and work things out. I suspect we will not be
>> able to arrange a timely physical meeting. Perhaps some sort of an
>> emeeting (Ajax, Skype or somesuch) would work.
>>
>> Regards,
>> Herbert
>>
>> =====================================================
>> Herbert J. Bernstein, Professor of Computer Science
>> Dowling College, Kramer Science Center, KSC 121
>> Idle Hour Blvd, Oakdale, NY, 11769
>>
>> +1-631-244-3035
>> [email protected]
>> =====================================================
>>
>> On Tue, 17 Nov 2009, Nick Spadaccini wrote:
>>
>>> David's Option 3 is the simplest way forward, and actually revisits much of
>>> what was discussed back in 2007-08. Somehow those discussions were locked
>>> far back in my brain, only to be awakened by David's summary. Thanks for
>>> that.
>>>
>>> So now I return to the STAR syntax. DDLm is part of STAR and hence
>>> restrictions on data names so they can be parsed etc is a STAR issue. I am
>>> brought around to Joe's idea that STAR accepts any 8 bit character sequence
>>> since that is the most complete set - and that this will be restricted to
>>> UTF-8 within the CIF specification. Any other adoptee of STAR can choose
>>> whatever restricted encoding they wish.
>>>
>>> I still need to treat data names as programming identifiers within dREL so
>>> accordingly I propose we restrict the data names in STAR (and all variants)
>>> to be ASCII [A-Za-z0-9_.] as we have used in the sample dictionaries, DDLm
>>> and dREL.
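
The proposed restriction could be checked with a short Python sketch along these lines. The leading underscore and the requirement of at least one following character are assumptions made here for illustration; the message itself only names the character set [A-Za-z0-9_.].

```python
import re

# Assumption: a data name is an underscore followed by one or more
# characters from the proposed set [A-Za-z0-9_.].
DATA_NAME = re.compile(r"_[A-Za-z0-9_.]+")


def is_valid_data_name(name: str) -> bool:
    """Return True if the whole string fits the restricted character set."""
    return DATA_NAME.fullmatch(name) is not None
```

Under these assumptions a name such as _atom_site.label passes, while _name[1] is rejected because of the brackets.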
>>>
>>> The data values will be represented as discussed in previous threads and
>>> that the reverse solidus and the token delimiters discussed will be ASCII
>>> characters. We can now return to [] as the list delimiters, and {} as the
>>> associative array delimiters.
>>>
>>> Backward compatibility to CIF1 names is handled by exploiting the _alias
>>> attributes in the definition. A CIF2 parser with dictionary can handle
>>> everything. Any CIF1 parser can handle CIF1 data files (also CIF2 data files
>>> up to a point, but won't know what the data names mean - unless they have
>>> hardcoded it).
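
One way to picture the alias mechanism is a lookup table built from the dictionary's alias entries. The Python sketch below is only illustrative: the table contents and both the legacy and the CIF2 names in it are hypothetical, and a real parser would populate the table from a dictionary rather than hard-code it.

```python
# Hypothetical alias table standing in for the alias attributes of a
# dictionary; none of these names come from a real dictionary.
ALIAS_TO_CIF2 = {
    "_legacy_name[1]": "_legacy.name_1",
    "_legacy_ratio/sigma": "_legacy.ratio_over_sigma",
}


def canonical_name(name: str) -> str:
    """Map a legacy (CIF1-style) name to its CIF2 name, if an alias exists."""
    return ALIAS_TO_CIF2.get(name, name)
```

Names without an alias entry are returned unchanged, so data names that are already valid pass straight through.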
>>>
>>> A CIF2 parser would like a leading comment to tell it what sort of file it
>>> is parsing. In the absence of that comment, a pre-scan will need to be done.
>>> The telltale indicators it is a CIF1 data file are multiple occurrences of,
>>>
>>> (1) data names that potentially contain [] or /
>>> (2) unquoted strings with illegal characters
>>> (3) quoted strings that result in parse failure (typically because they must
>>> have an embedded [but not elided] quote character as allowed in CIF1).
>>>
>>> It needs to be a pre-scan because all 3 of the above in an identified CIF2
>>> data file would result in something quite different since there are coercion
>>> rules for when the whitespace separator is missing.
>>>
>>> For instance IF I KNOW it is a CIF2 file and I read
>>>
>>> _name[1]
>>>
>>> Then this can only be an error and I coerce into
>>>
>>> _name [1]
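
The coercion in this example can be sketched in Python as splitting a token at the end of its longest valid data-name prefix. The message gives only the single _name[1] case, so the general rule below and the function name are assumptions.

```python
import re

# Assumption: same restricted name syntax as proposed for data names.
NAME_PREFIX = re.compile(r"_[A-Za-z0-9_.]+")


def coerce_missing_whitespace(token: str) -> str:
    """Insert the missing separator after the valid data-name prefix.

    Tokens that are already a complete valid name, or that do not start
    with a valid name at all, are returned unchanged.
    """
    match = NAME_PREFIX.match(token)
    if match is None or match.end() == len(token):
        return token
    return token[:match.end()] + " " + token[match.end():]
```

Applied to _name[1], the prefix _name is kept as the data name and the remainder [1] becomes a separate token.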
>>>
>>> IF I DON'T KNOW the file type, the occurrence of _name[1] flags it as
>>> potentially a CIF1 file. If _name[1] is in an alias list, this reinforces
>>> the likelihood of CIF1. Multiple instances of these "errors" (or any others
>>> in the above list) indicate it is a CIF1 file (my only other conclusion
>>> would be it is a VERY BADLY written CIF2).
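
A rough Python sketch of such a pre-scan follows. It covers only indicator (1), data names containing [ ], or /, because indicators (2) and (3) need a real lexer. The threshold of two hits for "multiple", the function name, and the shortcut of looking only at the first token on each line are all assumptions.

```python
import re

# A token that starts like a data name but contains [, ] or /.
LEGACY_NAME = re.compile(r"^_\S*[\[\]/]\S*$")


def looks_like_cif1(text: str, threshold: int = 2) -> bool:
    """Count legacy-looking data names; several of them suggest CIF1.

    Simplification: only the first whitespace-separated token of each
    line is inspected, so multi-line text fields may cause false hits.
    """
    hits = 0
    for line in text.splitlines():
        tokens = line.split()
        if tokens and LEGACY_NAME.match(tokens[0]):
            hits += 1
    return hits >= threshold
```

A single hit is treated as a possible CIF2 error rather than proof of CIF1, which mirrors the "multiple instances" wording above.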
>>>
>>> I think this takes us back to a very simple rule set, and I don't think the
>>> restriction in the character set for data names will cause problems. For all
>>> the excitement of UTF-8 etc I know of programming languages that support
>>> reading and writing data in such encodings but I haven't seen one that
>>> allows/encourages one to write programmes declaring identifiers in UTF-8
>>> character sets. (They may well exist, I just haven't seen them).
>>>
>>>
>>> On 17/11/09 12:04 AM, "David Brown" <[email protected]> wrote:
>>>
>>>> James,
>>>>
>>>> There seems to be a lull in the discussions on CIF2 syntax so this would be a
>>>> good time for you, or an appointee chosen by you, to summarize where we are at
>>>> and propose a set of rules that we can work with as we move forward. I
>>>> realize that much of the work I have already done on dictionaries will need to
>>>> be revisited, and Herbert also seems anxious to have some decisions on the
>>>> various topics that have been discussed.
>>>>
>>>> I believe we have a consensus on a number of points, but these need to be
>>>> written down clearly and need our formal agreement so we can move ahead.
>>>>
>>>> David
>>>>
>>>>
>>>> _______________________________________________
>>>> ddlm-group mailing list
>>>> [email protected]
>>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>
>>> cheers
>>>
>>> Nick
>>>
>>> --------------------------------
>>> Associate Professor N. Spadaccini, PhD
>>> School of Computer Science & Software Engineering
>>>
>>> The University of Western Australia t: +61 (0)8 6488 3452
>>> 35 Stirling Highway f: +61 (0)8 6488 1089
>>> CRAWLEY, Perth, WA 6009 AUSTRALIA w3: www.csse.uwa.edu.au/~nick
>>> MBDP M002
>>>
>>> CRICOS Provider Code: 00126G
>>>
>>> e: [email protected]
>>>
>>>
>>>
>>>
>
> cheers
>
> Nick
>
> --------------------------------
> Associate Professor N. Spadaccini, PhD
> School of Computer Science & Software Engineering
>
> The University of Western Australia t: +61 (0)8 6488 3452
> 35 Stirling Highway f: +61 (0)8 6488 1089
> CRAWLEY, Perth, WA 6009 AUSTRALIA w3: www.csse.uwa.edu.au/~nick
> MBDP M002
>
> CRICOS Provider Code: 00126G
>
> e: [email protected]
>
>
>
>
> _______________________________________________
> ddlm-group mailing list
> [email protected]
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>

