[ddlm-group] Characterset of non-delimited strings inside compounddata items

  • To: ddlm-group <ddlm-group@iucr.org>
  • Subject: [ddlm-group] Characterset of non-delimited strings inside compounddata items
  • From: James Hester <jamesrhester@gmail.com>
  • Date: Tue, 9 Nov 2010 11:21:59 +1100
Dear DDLm group,

John Bollinger has alerted me to a glitch in the current DDLm
specification, to wit: (i) close bracket characters are allowed as
non-final characters in a non-delimited string, and (ii) there is no
requirement for whitespace between a datavalue and the close bracket
symbol that denotes the end of a table or list value.

This means that, in order to decide whether a close bracket character
terminates a list or is just another character in the non-delimited
string, the parser must look ahead, potentially through many levels of
nesting.  For example:

_t            [outer [inner1 inner2]]


The parser does not know that the first close bracket closes the inner
list until it has read past the second close bracket.  Or, even more
confusingly:

_t            [ depth_1 [ depth_2 [ depth_3 x=a1[a2[a3[a4[i]]]];]]]
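
To make the lookahead concrete, here is a rough Python sketch of how a
lexer working under the current rule might have to treat one
whitespace-delimited run of characters.  The function name split_run
is hypothetical, and the sketch assumes the run starts inside a
non-delimited string and ignores quoting and every other part of the
grammar:

```python
def split_run(run):
    """Split one whitespace-delimited run into (value, closers).

    Assumption: close brackets are ordinary characters unless they sit
    at the very end of the run, so nothing can be classified until the
    whole run has been read.
    """
    value = run.rstrip("]}")        # trailing close brackets are terminators
    closers = run[len(value):]      # everything stripped closes a list or table
    return value, closers


if __name__ == "__main__":
    print(split_run("inner2]]"))
    print(split_run("x=a1[a2[a3[a4[i]]]];]]]"))
```

In the second call the four close brackets before the semicolon stay
inside the value and only the last three are treated as closers, which
the lexer cannot know when it first meets a close bracket.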


While such input is not intractable to parse, it cannot be handled
with simple lexing tools (e.g. flex).  I would therefore like to
propose the following change to the current draft specification:

"The characters ']' and '}' may not appear anywhere in a non-delimited
datavalue"
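
As an illustration only, the following Python sketch shows the kind of
single-pass, regular-expression tokenizer that becomes sufficient once
close brackets are excluded from non-delimited strings.  The token
names and the tokenize function are hypothetical, and the pattern is a
simplified assumption rather than the DDLm grammar: it ignores quoted
strings and further assumes that a non-delimited string cannot begin
with an open bracket.

```python
import re

# Simplified, assumed token set (not the full DDLm grammar).
TOKEN = re.compile(r"""
    (?P<open>[\[{])                   # start of a list or table
  | (?P<close>[\]}])                  # end of a list or table
  | (?P<bare>[^\s\[\]{}][^\s\]}]*)    # non-delimited string: no ']' or '}'
  | (?P<ws>\s+)                       # whitespace separator
""", re.VERBOSE)


def tokenize(text):
    """Return (kind, text) pairs, deciding each token without lookahead."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:  # defensive; the pattern covers every character
            raise ValueError(f"unexpected character at offset {pos}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("[outer [inner1 inner2]]"):
        print(token)
```

With this restriction every close bracket is a token in its own right,
so the first close bracket after inner2 is known to close the inner
list as soon as it is read.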

James.
-- 
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148