Discussion List Archives

Re: [ddlm-group] A modest addition to the DDLm spec. .

In paragraph 46, the CIF 1.1 spec says:

46. White space comprises all appropriate combinations of spaces, tabs, 
ends of lines and comments, as well as the beginning of the file. 
<WhiteSpace> are the characters able to delimit the lexical tokens.

So, as things now stand, the proposed CIF2 spec _would_ allow
comments within bracketed constructs, because it explicitly permits
whitespace in bracketed constructs and whitespace includes comments.
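
To illustrate that reading, here is a minimal Python sketch of a list
parser that skips comments wherever it skips whitespace. It assumes
that comments run from "#" to the end of the line and that values are
either simple quoted strings or bare words. The names WHITESPACE, VALUE
and parse_list are hypothetical, and the token grammar is deliberately
simplified; it is not the CIF2 grammar.

```python
import re

# Assumption: whitespace is any run of spaces, tabs, line ends and
# '#'-to-end-of-line comments, following paragraph 46 of CIF 1.1.
WHITESPACE = re.compile(r"(?:[ \t\r\n]+|#[^\r\n]*)+")

# Hypothetical, simplified value token: a quoted string or a bare word.
VALUE = re.compile(r"'[^'\r\n]*'|\"[^\"\r\n]*\"|[^\s\[\]#'\"][^\s\[\]]*")


def parse_list(text, pos=0):
    """Parse a bracketed list starting at text[pos].

    Returns (items, next_pos), where nested lists become nested
    Python lists and all values are returned as strings.
    """
    if text[pos] != "[":
        raise ValueError("expected '[' at position %d" % pos)
    pos += 1
    items = []
    while True:
        # Comments are consumed here together with ordinary whitespace.
        m = WHITESPACE.match(text, pos)
        if m:
            pos = m.end()
        if pos >= len(text):
            raise ValueError("unterminated list")
        ch = text[pos]
        if ch == "]":
            return items, pos + 1
        if ch == "[":
            sub, pos = parse_list(text, pos)
            items.append(sub)
            continue
        m = VALUE.match(text, pos)
        if not m:
            raise ValueError("unexpected character at position %d" % pos)
        tok = m.group()
        if tok[0] in "'\"":
            tok = tok[1:-1]  # strip the surrounding quotes
        items.append(tok)
        pos = m.end()


example = "[1 2 # a comment inside the list\n  [3 'four'] # nested\n]"
items, end = parse_list(example)
print(items)
```

Because the comment rule lives inside the whitespace pattern, forbidding
comments in bracketed constructs would require a second, comment-free
whitespace pattern used only inside parse_list.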

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Thu, 30 Sep 2010, Herbert J. Bernstein wrote:

> Dear Simon,
>
>  Working from the original DDLm/dREL spec, I did not see anything
> precluding comments within bracketed constructs and have therefore
> programmed on the assumption that they have to be allowed.  The CIF2
> spec simply does not discuss the issue, but explicitly allows
> whitespace within bracketed constructs.
>
>  We should decide what we really want here and document it.
>
>  Regards,
>    Herbert
>
> On Thu, 30 Sep 2010, SIMON WESTRIP wrote:
>
>> Dear Herbert
>> 
>> I wasn't aware that "comments ... are allowed in the middle of bracketed
>> constructs".
>> Is this true?
>>
>> This is a genuine question (it's not readily apparent from the spec I've
>> been referring to, but I do seem to recall an earlier example that
>> suggested this).
>> 
>> On the matter of the string concatenation issue, despite the fact that
>> much of my last contribution was light-hearted (in the spirit of
>> celebrating the fact that the encoding issue might be nearing closure),
>> my opening sentiment was genuine, i.e. I can see a real use for such a
>> mechanism. The major drawback (apart from this whole thing being quite a
>> major change compared with CIF1) is the further restriction on the use
>> of + in non-delimited strings. My immediate thoughts are that this
>> doesn't respect the spirit of compromise that led to the solution to the
>> encoding issue, and would present a huge hurdle before even starting on
>> accepting the possibility of such a mechanism. I light-heartedly threw in
>> a single underscore as an alternative just because it was the first token
>> that sprang to mind (i.e. a character in an isolated state that has no
>> other meaning?), but I haven't investigated this further at this stage
>> and might well be 'talking rubbish'.
>> 
>> Anyway, I'd welcome some info on the use of comments in the bracketed
>> constructs.
>> 
>> Cheers
>> 
>> Simon
>> 
>> 
>> ____________________________________________________________________________
>> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>> Sent: Thursday, 30 September, 2010 18:23:02
>> Subject: Re: [ddlm-group] A modest addition to the DDLm spec. .
>> 
>> It reduces the incompatibility with CIF1 introduced by the change
>> in string quoting syntax, allowing the resulting CIF2 CIFs to
>> be much closer to their CIF1 originals, fills the gap
>> created by not dealing with elides for line folding in
>> a simpler way, and conforms to well-established practice in
>> multiple programming languages.  C manages to deal with this
>> using the blank as the concatenation operator at the C preprocessor
>> level, so we should be able to handle it at the lexical level.
>> 
>> On Thu, 30 Sep 2010, Bollinger, John C wrote:
>> 
>> >
>> > On Thursday, September 30, 2010 8:59 AM, Herbert J. Bernstein wrote:
>> >
>> >> The following issue came up during the encodings group discussion, but
>> >> is more properly a DDLm issue.  In order to simplify algorithmic
>> >> conversion of existing CIF1 quoted strings to valid CIF2 strings,
>> >> I propose the addition of the Python string concatenation operator, "+",
>> >> in CIF2 documents.  The main value of this addition is to permit a
>> >> simple algorithmic conversion of CIF1 strings with embedded quote
>> >> marks to CIF2 strings that end on the first occurrence of the initial
>> >> quote.  While the use of text fields will suffice in many cases,
>> >> for regular expressions it is clearer and simpler to just break the
>> >> string, insert the terminal quote mark, insert a "+" and then restart
>> >> the string with a different quote mark.
>> >>
>> >> Formally the proposal is:
>> >>
>> >> When a quoted string is given as a data value in a CIF2 document,
>> >> it may be presented as multiple quoted strings concatenated by the
>> >> "+" operator.  [...]
>> >
>> > Would this issue be addressed well enough by converting single-quoted
>> > strings to triple-quoted form?  I guess that wouldn't allow for breaking
>> > up regexes, so maybe it's addressed by the remark about text fields.
>> >
>> > I recognize that from time to time it is convenient to break up long,
>> > single-line values, but I'm not yet persuaded that that is sufficient
>> > justification for this feature.  Adopting it would add an incremental
>> > complication to CIF parsing, and would add another incompatibility with
>> > CIF1, so the benefit should offset those costs.
>> >
>> > If breaking up regexes in particular is the motivation for this
>> > suggestion, then could that objective adequately be met by having DDLm
>> > use a regex language that allows non-significant whitespace, as Perl's
>> > comments mode does?
>> >
>> >
>> > Regards,
>> >
>> > John
>> > --
>> > John C. Bollinger, Ph.D.
>> > Department of Structural Biology
>> > St. Jude Children's Research Hospital
>> >
>> >
>> > Email Disclaimer:  www.stjude.org/emaildisclaimer
>> >
>> > _______________________________________________
>> > ddlm-group mailing list
>> > ddlm-group@iucr.org
>> > http://scripts.iucr.org/mailman/listinfo/ddlm-group
