[ddlm-group] Datanames and [] - the final(?) outstanding syntax issue.

The final outstanding issue is dataname syntax. As of the last draft, this is
where we stood:

In CIF2 the tags referred to as data names are comprised of characters only
from the ASCII set. A data name begins with an ASCII _ and may be followed
by any number of characters in the regular expression [A-Za-z0-9_.] (the .
is the explicit ASCII period character).
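As a rough illustration of the draft rule above, here is a minimal Python
sketch. The names DATANAME_RE and is_dataname are hypothetical, and I am
assuming that "any number of characters" includes zero; if a bare _ should
be rejected, change * to +.

```python
import re

# Assumed reading of the draft rule: a leading ASCII underscore followed by
# any number (possibly zero) of characters from [A-Za-z0-9_.].
DATANAME_RE = re.compile(r"_[A-Za-z0-9_.]*")


def is_dataname(tag: str) -> bool:
    """Return True if the whole tag matches the draft data name rule."""
    return DATANAME_RE.fullmatch(tag) is not None


print(is_dataname("_atom_site_aniso.U"))  # True
print(is_dataname("_blah_blah[2]"))       # False: [ and ] are not allowed
```

Under this rule a tag carrying square brackets is not a plain data name,
which is why the trailing [] syntax needs separate treatment.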

This solved the problem within dREL of having to map a dataname string to a
programming language identifier. But JW and others need support for a
trailing [] syntax. What we focussed on was that a CIF data file could
contain a qualified data name, defined as follows (borrowed and adapted
from JHR):

<qualified dataname> = <dataname> <square bracket expression>+
<square bracket expression> = <'['> <digits> <']'> |
                              <'['> <'\"'> <chars>+ <'\"'> <']'>
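To make the grammar above concrete, here is a minimal Python sketch that
splits a qualified data name into its data name and its bracketed
qualifiers. The function name parse_qualified_dataname is hypothetical. The
grammar does not define <chars>, so I am assuming it means any character
other than a double quote; the data name part uses the draft character
class [A-Za-z0-9_.].

```python
import re

# Data name per the draft rule: underscore plus characters from [A-Za-z0-9_.].
_DATANAME = re.compile(r"_[A-Za-z0-9_.]*")
# One square bracket expression: either [digits] or ["chars"].
# Assumption: <chars> is any character except a double quote.
_BRACKET = re.compile(r'\[(?:(?P<index>[0-9]+)|"(?P<key>[^"]+)")\]')


def parse_qualified_dataname(tag):
    """Split a qualified data name into (dataname, [qualifiers]).

    Returns None if the tag does not match the grammar.
    """
    head = _DATANAME.match(tag)
    if head is None:
        return None
    pos = head.end()
    qualifiers = []
    while pos < len(tag):
        m = _BRACKET.match(tag, pos)
        if m is None:
            return None
        if m.group("index") is not None:
            qualifiers.append(int(m.group("index")))
        else:
            qualifiers.append(m.group("key"))
        pos = m.end()
    if not qualifiers:
        return None  # the grammar requires at least one bracket expression
    return head.group(0), qualifiers


print(parse_qualified_dataname("_atom_site_aniso.U[1][1]"))
# ('_atom_site_aniso.U', [1, 1])
```

Note that nothing here consults a dictionary: the split is purely
syntactic, which matches the requirement that a file can be parsed whether
or not a dictionary is present.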

A qualified data name can exist in a data file, and the file can be parsed
irrespective of the presence or otherwise of a dictionary. I am not too sure
whether James is suggesting that we can attribute to this object the
property of being a list/table element.

What we agreed(?) on was that the qualified data name could not appear in a
dictionary, only the data name could, and it was implicit that the values of
the qualified data name were the elements of the parent data name object.

I am not sure how to word this in a syntax specification document, but we
will iterate until the wording hopefully means what we intend.

Finally, what was the initial list index? It was decided this could be
specified, and there were a number of approaches. The idea of specifying the
range of indices has appeal: you specify the first and the last (and
implicitly an increment of 1) and you have all you need. This is really just
an offset. HOWEVER (and James commented on this in a later post) I too
assume that this is a syntactic definition about appearance, and hence
_data_object[4] could be the first element of _data_object.

dREL itself is a fully defined language whose lists are indexed from 0.
Hence access to the object is achieved by

_data_object[index + offset] = valueof(data_object[index])

This means that while a CIF data file may contain the qualified data name
_atom_site_aniso.U[1][1], with offsets of -1 and -1, within dREL the
_atom_site_aniso.U compound object has its initial element at [0][0], and
access is controlled by the offsets.
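The offset arithmetic can be sketched in a few lines of Python. The
function name to_drel_indices is hypothetical, and I am assuming one offset
per bracketed index, added to the index as written in the file to give the
zero-based dREL index.

```python
def to_drel_indices(file_indices, offsets):
    """Map indices as written in a CIF file to zero-based dREL indices.

    Assumption: each dREL index is the file index plus the matching offset.
    """
    if len(file_indices) != len(offsets):
        raise ValueError("need exactly one offset per index")
    return tuple(i + o for i, o in zip(file_indices, offsets))


# _atom_site_aniso.U[1][1] with offsets of -1 and -1
print(to_drel_indices((1, 1), (-1, -1)))  # (0, 0)

# _data_object[4] as the first element implies an offset of -4
print(to_drel_indices((4,), (-4,)))       # (0,)
```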

Is this what you were saying, James?

On 22/12/09 11:19 AM, "James Hester" <jamesrhester@gmail.com> wrote:

> Rereading the email chain allows three alternative proposals for square
> bracket notation to be discerned:
> 1. "Implicit definition": Tags with trailing square brackets are implicitly
> defined when a list item is defined in a dictionary. Note that under this
> proposal the only syntax change is to allow square brackets in the list of
> allowed characters for data item tags. So from the parsers' point of view
> such tags have the same meaning as tags that do not contain square brackets.
> 2. "Dictionary-driven syntax change": Rereading Nick's proposal (as opposed to
> our real life conversation here in Sydney at the time) it appears that he is
> proposing that, if and only if an item is defined as a list in a dictionary,
> the parser would interpret square brackets at the end of a dataname tag and
> assign the value to the appropriate element of the root dataname, so
> _blah_blah[2] xyz
> to mean that element '2' of _blah_blah is equal to 'xyz'.
> 3. "Pure syntax": (what I thought Nick meant): as for (2), except this syntax
> applies regardless of dictionary contents.
> Variant 2 requires knowledge of the dictionary contents at lexing time in
> order to flag a syntax error. This is unworkable, because even the simplest
> CIF2 parser now needs access to all possible dictionaries applying to a given
> datablock, and that datablock must contain a complete list of relevant
> dictionaries. It destroys the model of separate stages for parsing and
> dictionary application. In light of this complexity, I expect most
> application writers will revert to (3), that is, using some stratagem to
> assign to elements of the root dataname, and then later comparing with a
> dictionary (if necessary).
> Variant (1) can be seen as 'dictionary driven', with the minimum change to the
> syntax specification, and variant (3) can be seen as syntax driven, with
> almost no changes to the dictionary.
> Would anyone like to express preferences for any of these three alternatives?



Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
MBDP  M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au
