Re: [ddlm-group] Characterset of non-delimited strings inside compound data items
- To: Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] Characterset of non-delimited strings inside compound data items
- From: James Hester <[email protected]>
- Date: Wed, 10 Nov 2010 12:30:50 +1100
- In-Reply-To: <[email protected]>
- References: <[email protected]><[email protected]><[email protected]><[email protected]>
I support this suggestion. If no objections are received by the end
of the week, I propose changing 'Change 5' in the current draft to the
following text and then presenting the resulting draft to COMCIFS,
rather than going through a further round of voting in this group.
Revised wording for Change 5:
A data value in CIF2 may be a whitespace-delimited string of allowed
characters. A whitespace-delimited string cannot contain the characters
[ { ] }. Furthermore, the first character of a whitespace-delimited
string cannot be any of the ASCII characters " ' _ $. STAR keywords may
not appear as whitespace-delimited strings: loop_ global_ save_* stop_
data_* (case insensitive), where * refers to zero or more characters.
Reasoning: The above exclusions are required for CIF2 syntax to be unambiguous.
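To illustrate, the rules in the revised wording could be checked roughly as
follows. This is only a sketch, not part of the draft: the function name
is_valid_bare_string is hypothetical, "allowed characters" is simplified
here to mean any non-whitespace character, and I have read the "*" as
applying only to save_ and data_ (so loop_, global_ and stop_ are reserved
as whole tokens).

```python
import re

# Characters that may not appear anywhere in a whitespace-delimited string.
FORBIDDEN_ANYWHERE = set("[]{}")
# Characters that may not appear as the first character.
FORBIDDEN_FIRST = set("\"'_$")
# Assumption: loop_, global_ and stop_ are reserved as whole tokens,
# while save_* and data_* are reserved with any (possibly empty) suffix.
STAR_KEYWORD = re.compile(
    r"(?:loop_|global_|stop_|(?:save_|data_).*)",
    re.IGNORECASE | re.DOTALL,
)


def is_valid_bare_string(token: str) -> bool:
    """Return True if token satisfies the revised Change 5 wording."""
    # Must be non-empty and, being whitespace delimited, contain no whitespace.
    if not token or any(ch.isspace() for ch in token):
        return False
    if token[0] in FORBIDDEN_FIRST:
        return False
    if FORBIDDEN_ANYWHERE & set(token):
        return False
    # Reject STAR keywords, compared case-insensitively.
    return STAR_KEYWORD.fullmatch(token) is None
```

With these rules a token such as depth_1 is accepted, while a1[a2, _tag,
LOOP_ and data_block are all rejected, so a lexer never has to look ahead
to decide whether a bracket belongs to a string.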
On Wed, Nov 10, 2010 at 2:03 AM, SIMON WESTRIP
<[email protected]> wrote:
>
> I would suggest adding [ & { also:
>
> _t            [ depth_1[depth_1_still [ depth_2 x=a1[a2[a3[a4[i]]]];]]
>
>
> ?
>
> Simon
>
>
>
>
> ________________________________
> From: James Hester <[email protected]>
> To: Group finalising DDLm and associated dictionaries <[email protected]>
> Sent: Tuesday, 9 November, 2010 3:14:55
> Subject: Re: [ddlm-group] Characterset of non-delimited strings inside
> compound data items
>
> Indeed. I agree that "ease of use of flex" is not a good criterion. A
> better way of putting it would be "simplicity of implementation".
>
> Glad you don't object to excluding close brackets.
>
> James.
>
> On Tue, Nov 9, 2010 at 1:23 PM, Herbert J. Bernstein
> <[email protected]> wrote:
>> While I have no particular objection to excluding the close brackets from
>> non-delimited strings, personally I think making easy use of flex a
>> criterion for the design of CIF2 is not a good idea. -- Herbert
>> =====================================================
>>  Herbert J. Bernstein, Professor of Computer Science
>>    Dowling College, Kramer Science Center, KSC 121
>>         Idle Hour Blvd, Oakdale, NY, 11769
>>
>>                  +1-631-244-3035
>>                  [email protected]
>> =====================================================
>>
>> On Tue, 9 Nov 2010, James Hester wrote:
>>
>>> Dear DDLm group,
>>>
>>> John Bollinger has alerted me to a glitch in the current DDLm
>>> specification, to wit: (i) close bracket characters are allowed as
>>> non-final characters in a non-delimited string, and (ii) there is no
>>> requirement for whitespace between a datavalue and the close bracket
>>> symbol that denotes the end of a table or list value.
>>>
>>> This means that, in order to decide whether a close bracket character
>>> terminates a list or is just another character in the non-delimited
>>> string, the parser must look ahead, potentially many levels of
>>> nesting. For example:
>>>
>>> _t            [outer [inner1 inner2]]
>>>
>>>
>>> The parser does not know that the first close bracket closes the inner
>>> list until it has read past the second close bracket. Or even more
>>> confusingly:
>>>
>>> _t            [ depth_1 [ depth_2 [ depth_3 x=a1[a2[a3[a4[i]]]];]]]
>>>
>>>
>>> While this behaviour is not intractable, it is also not possible to
>>> use simple lexing tools (e.g. flex) to handle such syntax. I would
>>> therefore like to propose the following change to the current draft
>>> specification:
>>>
>>> "The characters ']' and '}' may not appear anywhere in a non-delimited
>>> datavalue"
>>>
>>> James.
>>> --
>>> T +61 (02) 9717 9907
>>> F +61 (02) 9717 3145
>>> M +61 (04) 0249 4148
>>> _______________________________________________
>>> ddlm-group mailing list
>>> [email protected]
>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
[email protected]
http://scripts.iucr.org/mailman/listinfo/ddlm-group

