Discussion List Archives

Re: [ddlm-group] CIF2 Syntax all wrapped up?

Rereading the email chain, I can discern three alternative proposals for square-bracket notation:

1. "Implicit definition": Tags with trailing square brackets are implicitly defined when a list item is defined in a dictionary.  Note that under this proposal the only syntax change is to allow square brackets in the list of allowed characters for data item tags.  So from the parsers' point of view such tags have the same meaning as tags that do not contain square brackets.

2. "Dictionary-driven syntax change": Rereading Nick's proposal (as opposed to our real life conversation here in Sydney at the time) it appears that he is proposing that, if and only if an item is defined as a list in a dictionary, the parser would interpret square brackets at the end of a dataname tag and assign the value to the appropriate element of the root dataname, so

_blah_blah[2]    xyz

to mean that element '2' of _blah_blah is equal to 'xyz'.

3. "Pure syntax": (what I thought Nick meant): as for (2), except this syntax applies regardless of dictionary contents.

Variant 2 requires knowledge of the dictionary contents at lexing time in order to flag a syntax error.  This is unworkable, because even the simplest CIF2 parser now needs access to all possible dictionaries applying to a given datablock, and that datablock must contain a complete list of relevant dictionaries.   It destroys the model of separate stages for parsing and dictionary application.  In light of this complexity, I expect most application writers will revert to (3), that is, using some stratagem to assign to elements of the root dataname, and then later comparing with a dictionary (if necessary).
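
As a rough illustration of the staged model argued for above, the following Python sketch treats a trailing bracket as pure syntax (variant 3) and defers any dictionary comparison to a separate, optional step. The function names (split_tag, assign, check_against_dictionary), the restriction to a single non-negative integer index, and the representation of a dictionary as a plain set of list-valued root names are all assumptions made for the example; none of them comes from the proposals themselves.

```python
import re

# Hypothetical pattern: a data name followed by exactly one trailing [integer].
_TAG_RE = re.compile(r"^(?P<root>_[^\s\[\]]+)\[(?P<index>\d+)\]$")

def split_tag(tag):
    """Return (root, index) for '_name[2]', or (tag, None) for a plain tag."""
    m = _TAG_RE.match(tag)
    if m is None:
        return tag, None
    return m.group("root"), int(m.group("index"))

def assign(block, tag, value):
    """Stage 1 (syntax only): store the value; indexed tags fill a dict of elements."""
    root, index = split_tag(tag)
    if index is None:
        block[root] = value
    else:
        block.setdefault(root, {})[index] = value

def check_against_dictionary(block, list_items):
    """Stage 2 (optional): return indexed roots that the dictionary does not define as lists."""
    return [root for root, stored in block.items()
            if isinstance(stored, dict) and root not in list_items]

block = {}
assign(block, "_blah_blah[2]", "xyz")   # element 2 of _blah_blah
assign(block, "_other", "abc")          # ordinary tag, untouched by stage 1
```

Stage 1 never consults a dictionary, so a parser written this way needs no knowledge of which dictionaries apply to the datablock; only the optional second stage does.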

Variant (1) can be seen as 'dictionary driven', with the minimum change to the syntax specification, and variant (3) can be seen as syntax driven, with almost no changes to the dictionary.

Would anyone like to express preferences for any of these three alternatives?

On Mon, Dec 21, 2009 at 10:13 PM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
In view of the many messages in that last thread that included both syntax and reasons for the choices, it might be a good idea to recap the syntax portion of that discussion:

1.  It began on 9 Dec with Brian asking us to reconsider allowing
punctuation characters in data names.
2.  I responded by suggesting allowing _all_ non-conflicting punctuation
in data names.
3.  David Brown focused the discussion on square brackets
4.  John Westbrook also asked for square bracket support
5.  James and Nick opposed including any additional characters
6.  After a long exchange about aliases, I suggested we simply
extend the definition of all array and list tags to include the
automatic definition of the tags referencing their elements as
a way to bring in the square brackets
7.  David and Joe agreed on the need for square brackets
8.  John agreed with the automatic definition of the tags for the
9.  There was a discussion of the initial index issue and how to specify
an initial index -- with a tag or with a range or both
10.   Nick initially opposed the idea of automatically defining the
element tags, but yielded, provided we did not define the element
tags explicitly in the dictionary
11.  James then supported the idea under the mistaken assumption
that Nick had proposed it.
12. Nick then opposed the idea of having a starting index

So where we are is:

There now seems to be general agreement that when we define an
array or list in the dictionary, all the element tags will
automatically be available to the users

We still have not settled on how/if to specify the necessary
starting index, with the following alternatives on the table:

1.  Don't specify the starting index array-by-array, just lock
it in at 0 or 1 for all arrays and lists; or
2.  Do specify the starting index with a range or with a separate
tag or both.

I support allowing an array-dimension by array-dimension specification
of a starting index.  I can live with either a range or a separate tag
or both.
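
To make the two alternatives concrete, here is a minimal Python sketch of how an application might turn the indices written in a tag into zero-based storage offsets. The function name to_zero_based and the idea of passing the starting indices as a tuple (one per dimension) are assumptions for illustration only; how a dictionary would actually record a starting index (range, separate tag, or both) is exactly the open question and is not modelled here.

```python
def to_zero_based(indices, starts):
    """Map indices as written in a tag to zero-based offsets, one per dimension.

    `starts` holds the (assumed) starting index of each dimension.
    """
    if len(indices) != len(starts):
        raise ValueError("need one starting index per dimension")
    offsets = []
    for index, start in zip(indices, starts):
        if index < start:
            raise IndexError(f"index {index} is below starting index {start}")
        offsets.append(index - start)
    return tuple(offsets)

# Alternative 1: one starting index locked in for every array (here, 1).
fixed = to_zero_based((2, 3), (1, 1))

# Alternative 2: a starting index chosen dimension by dimension.
per_dimension = to_zero_based((2, 0), (1, 0))
```

Under alternative 1 the `starts` tuple is a constant and can be dropped from the interface; under alternative 2 it has to be looked up for each item before any element can be addressed.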
 -- Herbert
 Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
       Idle Hour Blvd, Oakdale, NY, 11769


On Mon, 21 Dec 2009, James Hester wrote:

Dear DDLm-ers,

Am I correct in assuming that everyone is satisfied with the square-bracket
syntax recently proposed by Nick?  If so, I believe there is only one
significant outstanding issue, that of being allowed to put whitespace
between the key and the full colon, and the full colon and the value, in a
table.  I agree with Joe that no extra syntactic complexity is introduced by
allowing (not mandating) whitespace to appear in these locations.   If
nobody objects, I would like to suggest that we alter the standard to allow
such whitespace.
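
The claim that optional whitespace adds no real syntactic complexity can be sketched in Python with two regular expressions for a single table entry. This is a deliberately simplified model, not the CIF2 grammar: it assumes a quoted key with no embedded quote characters and an unquoted, whitespace-free value, and the names ENTRY_STRICT, ENTRY_RELAXED and parse_entry are invented for the example.

```python
import re

# Simplified entry: quoted key, colon, bare value, with no whitespace allowed.
ENTRY_STRICT = re.compile(r"""(?P<q>['"])(?P<key>[^'"]*)(?P=q):(?P<value>\S+)""")

# Same entry, but whitespace may (not must) surround the colon.
ENTRY_RELAXED = re.compile(r"""(?P<q>['"])(?P<key>[^'"]*)(?P=q)\s*:\s*(?P<value>\S+)""")

def parse_entry(text, pattern):
    """Return (key, value) if the whole text is one entry, else None."""
    m = pattern.fullmatch(text)
    if m is None:
        return None
    return m.group("key"), m.group("value")

compact = parse_entry("'a':1", ENTRY_RELAXED)      # still accepted
spaced = parse_entry("'a' : 1", ENTRY_RELAXED)     # newly accepted
rejected = parse_entry("'a' : 1", ENTRY_STRICT)    # strict form refuses it
```

In this toy model the only difference between the two patterns is the pair of `\s*` terms around the colon, and every entry accepted by the strict form is still accepted by the relaxed one.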

T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148

ddlm-group mailing list
