Discussion List Archives


Re: [ddlm-group] Characterset of non-delimited strings inside compound data items

If the brackets in these constructs are treated as delimiters (though 'nestable', unlike the other delimiters), then perhaps both the opening and the closing delimiter
should be separated from a preceding or subsequent value (or 'parent' value) by whitespace. After all, the brackets delimit a *value*, even if that value is made up of other
values. To me this seems quite logical (and it allows a fairly simple description of the syntax at the lexical level, in terms of delimited and non-delimited
'strings' separated by whitespace), but I suspect this will not be received favourably :-)

Cheers

Simon
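
A minimal sketch of the lexical model suggested above, assuming every bracket must stand alone as a whitespace-separated token. The function name `parse_value` is hypothetical, only lists are handled (no tables, quoted strings or comments), and this is an illustration rather than the behaviour of any CIF2 draft:

```python
def parse_value(text):
    """Parse a nested list in which every token is whitespace-separated.

    Assumption: '[' and ']' are only valid as standalone tokens, so a
    plain split on whitespace is the whole lexer.
    """
    stack = [[]]  # stack[0] collects the top-level values
    for tok in text.split():
        if tok == "[":
            stack.append([])
        elif tok == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            finished = stack.pop()
            stack[-1].append(finished)
        elif any(c in "[]{}" for c in tok):
            # A bracket glued to other characters violates the assumed rule.
            raise ValueError(f"bracket not separated by whitespace in {tok!r}")
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unclosed '['")
    return stack[0]


print(parse_value("[ [ depth_2 ] depth_1 ]"))
```

Under this assumed rule, a value written as `[[depth_2] depth_1]` would be rejected, because its brackets are not separated from their neighbours by whitespace.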


From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Tuesday, 9 November, 2010 15:35:07
Subject: Re: [ddlm-group] Characterset of non-delimited strings inside compound data items

I would be fine with disallowing all of [, ], {, and } from appearing anywhere in a whitespace-delimited data value.  The [ and { do not at present cause lexical ambiguity, but there's something to be said for consistent treatment of opening and closing delimiters.  Do note, however, that the opening [ of a list or { of a table is still required to be separated from any preceding value by whitespace.  Thus

 

_t            [ depth_1[depth_1_still ] ]

 

would contain an error at the second [.  (There is still an error under the current spec or if only ] and } are forbidden in whitespace-delimited data values, but it is at the final ].)  This is still OK, though:

 

_t            [[depth_2] depth_1]

 

 

John
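
The two examples above can be checked with a short sketch. It tests only the one rule stated in the message, that an opening bracket must be separated from any preceding value by whitespace, and assumes that brackets never occur inside a whitespace-delimited value. The name `first_bracket_error` is hypothetical, and bracket balancing, quoted strings and table keys are deliberately ignored:

```python
OPENERS = "[{"


def first_bracket_error(text):
    """Return the index of the first misplaced opening bracket, or None.

    Assumption: an opener is valid only at the start of the input,
    after whitespace, or directly after another opener.
    """
    prev = None  # previous character; None at the start of the input
    for i, ch in enumerate(text):
        if ch in OPENERS:
            if not (prev is None or prev.isspace() or prev in OPENERS):
                return i
        prev = ch
    return None


print(first_bracket_error("[ depth_1[depth_1_still ] ]"))  # index of the second '['
print(first_bracket_error("[[depth_2] depth_1]"))          # no error found
```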

 

 

From: ddlm-group-bounces@iucr.org [mailto:ddlm-group-bounces@iucr.org] On Behalf Of SIMON WESTRIP
Sent: Tuesday, November 09, 2010 9:03 AM
To: Group finalising DDLm and associated dictionaries
Subject: Re: [ddlm-group] Characterset of non-delimited strings inside compound data items

 


I would suggest adding [ & { also:

_t            [ depth_1[depth_1_still [ depth_2 x=a1[a2[a3[a4[i]]]];]]


?

Simon


 


From: James Hester <jamesrhester@gmail.com>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Tuesday, 9 November, 2010 3:14:55
Subject: Re: [ddlm-group] Characterset of non-delimited strings inside compound data items

Indeed.  I agree that "ease of use of flex" is not a good criterion. A
better way of putting it would be "simplicity of implementation".

Glad you don't object to excluding close brackets.

James.
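
To illustrate the "simplicity of implementation" point, here is a rough sketch of a tokenizer under the rule proposed in the message quoted below, namely that ']' and '}' may not appear anywhere in a non-delimited datavalue. The function name `tokenize` is hypothetical, and quoted strings, comments and table keys are left out, so this is an assumption-laden illustration rather than a CIF2 lexer:

```python
CLOSERS = "]}"


def tokenize(text):
    """Split text into bracket tokens and non-delimited values.

    Assumed rule: a non-delimited value ends at whitespace or at the
    first ']' or '}', so each decision needs only the current character.
    """
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch in "[]{}":
            tokens.append(ch)
            i += 1
        else:
            start = i
            # Openers are still allowed inside a value under this rule.
            while (i < len(text) and not text[i].isspace()
                   and text[i] not in CLOSERS):
                i += 1
            tokens.append(text[start:i])
    return tokens


print(tokenize("[outer [inner1 inner2]]"))
```

With close brackets excluded from non-delimited values, the first `]` in `[outer [inner1 inner2]]` can be emitted as a delimiter as soon as it is seen, without reading ahead to the second one.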

On Tue, Nov 9, 2010 at 1:23 PM, Herbert J. Bernstein
<yaya@bernstein-plus-sons.com> wrote:
> While I have no particular objection to excluding the close brackets from
> non-delimited strings, personally I think making easy use of flex a
> criterion for the design of CIF2 is not a good idea. -- Herbert
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>    Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>
>                  +1-631-244-3035
>                  yaya@dowling.edu
> =====================================================
>
> On Tue, 9 Nov 2010, James Hester wrote:
>
>> Dear DDLm group,
>>
>> John Bollinger has alerted me to a glitch in the current DDLm
>> specification, to wit: (i) close bracket characters are allowed as
>> non-final characters in a non-delimited string, and (ii) there is no
>> requirement for whitespace between a datavalue and the close bracket
>> symbol that denotes the end of a table or list value.
>>
>> This means that, in order to decide whether a close bracket character
>> terminates a list or is just another character in the non-delimited
>> string, the parser must look ahead, potentially many levels of
>> nesting.  For example:
>>
>> _t            [outer [inner1 inner2]]
>>
>>
>> The parser does not know that the first close bracket closes the inner
>> list until it has read past the second close bracket.   Or even more
>> confusingly:
>>
>> _t            [ depth_1 [ depth_2 [ depth_3 x=a1[a2[a3[a4[i]]]];]]]
>>
>>
>> While this behaviour is not intractable, it is also not possible to
>> use simple lexing tools (e.g. flex) to handle such syntax.  I would
>> therefore like to propose the following change to the current draft
>> specification:
>>
>> "The characters ']' and '}' may not appear anywhere in a non-delimited
>> datavalue"
>>
>> James.
>> --
>> T +61 (02) 9717 9907
>> F +61 (02) 9717 3145
>> M +61 (04) 0249 4148
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>
>





International Union of Crystallography