Discussion List Archives

Re: [ddlm-group] Space as a list item separator

Dear Herbert:

I was merely playing around with Simon's ideas to see how different they are to what we have already decided upon (a 'gap analysis' if you will).  Upon reflection, I can see that my proposed additional principle is still not sufficient to reproduce the basic CIF syntax we would expect (as [xxx]loop_ would be legitimate).  I think it is therefore best to drop the line of thought that I introduced this morning (my time) and instead wait for Simon to explain the advantages of what he is proposing.

James.

On Mon, Nov 30, 2009 at 3:27 PM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
Dear James,

 Could you please clarify:


"strings which have no meaning beyond their significance as tokens are not required to
be separated by whitespace from the preceding or succeeding strings"

we remove the requirement for whitespace around brackets, commas and 'loop_'.  Of
course, insofar as strings neighbouring these will require whitespace around them, this
does not spoil our grammar at all.  (Note that in lexing/parsing terms, the condition
that "strings are only significant as tokens" is supposed to be equivalent to discarding
the 'value' assigned to a token when it is returned by the lexing stage.) 

I am not aware of any prior discussion of discarding space around 'loop_', and I don't understand the implications of the comment on "strings are only significant as tokens" relative to the values of strings.

Also, could someone please state what problem we are trying to solve.  I hope we are just trying to design some useful tools to support crystallographic data management.  Perhaps we need to revert to basic software engineering practice and start by defining who our stakeholders are and trying to get them to agree on the major elements of a "user requirements document".


Regards,
 Herbert

=====================================================
 Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
       Idle Hour Blvd, Oakdale, NY, 11769

                +1-631-244-3035
                yaya@dowling.edu
=====================================================

On Mon, 30 Nov 2009, James Hester wrote:

OK: so could you take us through the advantages of what you are suggesting compared to
what we have come up with?  And perhaps why 'the man who writes the cheques' has nudged
you in this direction?

I would make the following point: if we add to your list the condition that:

"strings which have no meaning beyond their significance as tokens are not required to
be separated by whitespace from the preceding or succeeding strings"

we remove the requirement for whitespace around brackets, commas and 'loop_'.  Of
course, insofar as strings neighbouring these will require whitespace around them, this
does not spoil our grammar at all.  (Note that in lexing/parsing terms, the condition
that "strings are only significant as tokens" is supposed to be equivalent to discarding
the 'value' assigned to a token when it is returned by the lexing stage.) 
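A minimal sketch of how the condition above might look in a lexer, offered as an illustration only and not as anything the group has agreed. It assumes that the "pure" tokens are the brackets, braces, comma and 'loop_', ignores delimited strings entirely, and uses hypothetical names (TOKEN_RE, lex). Pure tokens are returned with no value; everything else keeps its text.

```python
import re

# Assumed token set: brackets, braces, comma and the reserved word loop_.
# Quoted (delimited) strings are deliberately left out of this sketch.
TOKEN_RE = re.compile(r"""
    (?P<punct>[\[\]{},])          # pure tokens, no whitespace needed around them
  | (?P<loop>loop_)               # reserved word, also treated as a pure token
  | (?P<value>[^\s\[\]{},]+)      # any other run of characters carries a value
""", re.VERBOSE)

def lex(text):
    """Return (kind, value) pairs; the value is discarded (None) for pure tokens."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        if match.lastgroup == "value":
            tokens.append(("value", match.group()))
        else:
            # Only the identity of the token matters, so no value is kept.
            tokens.append((match.group(), None))
    return tokens

print(lex("[1,2]"))
print(lex("[xxx]loop_"))
```

Under these assumptions "[1,2]" lexes the same way as "[ 1 , 2 ]", but "[xxx]loop_" is also accepted without any whitespace, so the condition on its own is quite permissive.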

The insight I'd draw out of this for our current discussion is that, by taking your
manifesto plus my above condition, we have a general statement of what we would like the
surface syntax of a CIF file to look like.  The only difference from our current
discussion is that we have restricted the character sets of the non-delimited string and
dataname tag more than strictly necessary - is there some part of that character set
discussion that you'd like to reopen...in a different thread?

On Sun, Nov 29, 2009 at 9:29 PM, SIMON WESTRIP <simonwestrip@btinternet.com> wrote:
     Yes that summarizes the differences. Unfortunately, the single-byte
     non-delimited strings have to be separated by
     white space in this approach, which is perhaps counter-intuitive and might
     have some legacy issues?

________________________________________________________________________________________
From: James Hester <jamesrhester@gmail.com>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Sunday, 29 November, 2009 3:45:18

Subject: Re: [ddlm-group] Space as a list item separator

Hi Simon: I'm trying to read between the lines here as to how the syntax we have
been discussing diverges from what you have described, and have come up with the
following list:

1. Presumably the []{} characters must be surrounded by whitespace in your version
2. We have restricted the character sets of the non-delimited strings and tags
more than strictly necessary.
3. Comma might be included in the single-byte non-delimited string list

Are there any other differences that you would identify?

On Sat, Nov 28, 2009 at 10:58 PM, SIMON WESTRIP <simonwestrip@btinternet.com>
wrote:
     Dear all

     I was chatting with the man who 'writes the cheques' yesterday about
     some of the
     changes he might expect with CIF2, and based on this I feel I ought to
     at least have
     a go at exploring a 'minimally disruptive' approach, so at the risk of
     being shouted at,
     here goes at a slightly different way of looking at CIF:

     CIF contains a list of strings separated by whitespace.

     A string can be nondelimited or delimited.

     Nondelimited strings have a restricted character set (minimally
     whitespace is excluded)

     A nondelimited string cannot start with any of the delimiters
     (obviously)

     Nondelimited strings can have special meaning governing what follows
     them:

         reserved words, e.g. loop_

         tags, e.g. data_ , _foo

         single-byte nondelimited strings, e.g. [ ] { } :

     All other strings are treated as raw data values
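A minimal sketch of the scheme listed above, assuming single and double quotes as the delimiters (the message does not name them) and using hypothetical names (RESERVED, SINGLE_BYTE, classify, tokenize). Strings are split on whitespace only, then classified by their form.

```python
import re

RESERVED = {"loop_"}        # only loop_ is given as an example above
SINGLE_BYTE = set("[]{}:")  # the single-byte nondelimited strings listed above

# A delimited string runs from a quote to the next matching quote that is
# followed by whitespace or end of input (assumed rule; no escapes handled).
# Anything else is a run of non-whitespace characters.
STRING_RE = re.compile(r"""'[^']*'(?=\s|$)|"[^"]*"(?=\s|$)|\S+""")

def classify(string):
    """Assign a role to one whitespace-separated string."""
    if string in SINGLE_BYTE:
        return "punct"
    if string.lower() in RESERVED:
        return "reserved"
    if string.startswith("_") or string.lower().startswith("data_"):
        return "tag"
    return "value"  # all other strings, delimited or not, are raw data values

def tokenize(text):
    return [(classify(s), s) for s in STRING_RE.findall(text)]

print(tokenize("data_x loop_ _foo [ 1 2 ] 'a b'"))
print(tokenize("[1 2]"))
```

In this sketch "[ 1 2 ]" yields two punctuation strings and two values, whereas "[1 2]" yields the two raw values "[1" and "2]", because the brackets are only recognised when they stand alone between whitespace.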


     There, at least I can say I tried :-)

     Cheers

     Simon

________________________________________________________________________________________




--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148


_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group




