Discussion List Archives

Re: [ddlm-group] A modest addition to the DDLm spec. .. .

Try working with some of the longer regexes for a while and you may
come to appreciate having something like the plus or the line-folding
backslash.  Either one allows you to present what is really one very long
single-line character string, with lots of funny stuff in it,
over a series of lines, broken up with some whitespace that is
not part of the string.  To take one which does not involve quote
marks:


"[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?([+-][0-5][0-9]:[0-5][0-9]))?"

is much harder to read and understand than


"[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]"+
"((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?"+
"([+-][0-5][0-9]:[0-5][0-9]))?"

which I normally do with the CIF1 line-folding protocol, and here the
treble quotes add nothing of value.  To return to quotes,

"""[][ \n\t()_,.;:"&<>/\{}'`~!@#$%?+=*A-Za-z0-9|^-]*"""

is not as clear to me as

"[][ \n\t()_,.;:" + '"&<>/\{}' + "'`~!@#$%?+=*"+
"A-Za-z0-9|^-]*"

which helps to organize the data for me and calls out the
troublesome case.
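
For anyone who wants to check that the split forms really are the same
strings as the long ones, here is a minimal sketch in Python.  Python's
"+" and its re module are only stand-ins for the proposed CIF2 operator
and for whatever regex dialect DDLm settles on.  The names single_line,
folded, treble, pieces and same_and_valid are mine, not from any spec,
and the single-line date-time pattern is written on the assumption that
it is meant to equal the three-piece form exactly.

```python
import re

# Date-time pattern as one long literal (assumed identical in intent to
# the three-piece form below).
single_line = "[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?([+-][0-5][0-9]:[0-5][0-9]))?"

# The same pattern split into date, time, and zone-offset pieces.
folded = (
    "[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]" +
    "((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?" +
    "([+-][0-5][0-9]:[0-5][0-9]))?"
)

# Character-class pattern: one treble-quoted literal versus pieces that
# switch quote style around the embedded double and single quotes.
# Raw strings keep the backslashes exactly as typed.
treble = r"""[][ \n\t()_,.;:"&<>/\{}'`~!@#$%?+=*A-Za-z0-9|^-]*"""
pieces = (
    r"[][ \n\t()_,.;:" + r'"&<>/\{}' + r"'`~!@#$%?+=*" +
    r"A-Za-z0-9|^-]*"
)


def same_and_valid(a: str, b: str) -> bool:
    """Return True if a and b are the same string and a compiles as a regex."""
    re.compile(a)  # raises re.error if the pattern is malformed
    return a == b


if __name__ == "__main__":
    print(same_and_valid(single_line, folded))
    print(same_and_valid(treble, pieces))
    print(bool(re.fullmatch(folded, "2010-09-30T12:23:00-05:00")))
```

The concatenated form costs nothing at match time in this sketch, since
the pieces are joined into one string before the pattern is compiled.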

However, the question is not one of your taste or mine or Nic's, nor
even whether the feature is useful to everybody, but whether
it is useful to some reasonable number of people and
whether having it causes some kind of problem for other people.

I assume we all agree that CIF is intended to be a useful tool to get
work involved with crystallography done.  The string concatenation 
operator is one of the most generally useful operators, and there is no 
sound reason not to support it top to bottom, from the top level CIF
data down to the dREL language itself.  It is in dREL right now
(see 7.1.2), so we are going to have to handle it down there.
(Indeed, dREL has 2 string concatenations, one with blank for
string literals and one with "+" for string objects).
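
As an analogy only (this is Python, not dREL, and the variable names
are made up for the illustration), Python offers the same pair of
forms: adjacent string literals are joined by the compiler, and string
objects are joined with "+".

```python
# Adjacent string literals, separated only by blanks, become one literal.
from_literals = "O'Donnell said, " '"Pshaw"'

# String objects are joined with "+" when the expression is evaluated.
speaker = "O'Donnell said, "
quotation = '"Pshaw"'
from_objects = speaker + quotation

# Both routes produce the same string.
assert from_literals == from_objects
```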

Why is it such a big issue to also handle the "+" at the top level?




=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Thu, 30 Sep 2010, Bollinger, John C wrote:

>
> On Thursday, September 30, 2010 12:23 PM, Herbert J. Bernstein wrote:
>
>> It reduces the incompatibility with CIF1 introduced by the change
>> in string quoting syntax, allowing the resulting CIF2 CIFs to
>> be much closer to their CIF1 originals,
>
> I don't buy that one.  Which is closer to 'O'Donnell said, "Pshaw"'?
>
> '''O'Donnell said, "Pshaw"'''
>
> or
>
> "O'Donnell said, " + '"Pshaw"'
>
> For me, it's the former, and that becomes much more the case the more concatenations are involved.  Consider, for example, how the concatenation approach would look for this slight variation: 'O'Donnell said, "P'shaw"'.
>
>> fills that gap
>> created by not dealing with elides for line folding in
>> a simpler way,
>
> Are you saying that it provides a line-folding approach superior to the one already used with CIF1?  I'll have to think about that.  Does your argument apply in general, or only for particular cases such as regex?
>
> Is a line-folding mechanism that is incompatible with CIF1 even relevant?  I have always thought that the most important reason for having a line-folding protocol at all was for compatibility with CIF readers that implement the old 80-character line limit.  An alternative that is incompatible with CIF1 is useless for that purpose.
>
> Any way around, I'm hesitant to promote line folding from a semantic consideration to a syntactic one.
>
>> and conforms to well-established practice in
>> multiple programming languages.
>
> That is not germane as far as I am concerned.  CIF is not a programming language, and its audience (as opposed to the audience of the specification) contains many non-programmers.
>
>>  C manages to deal with this
>> using the blank as the concatenation operator at the C preprocessor
>> level, so we should be able to handle it at the lexical level.
>
> Certainly we *can* handle it.  And doing so will make our code a little bit more complex, and a little bit more difficult to maintain.
>
> Also, this proposal would either make CIF diverge (further?) from STAR, or would require STAR to adopt the same change.  If we want the latter then we cannot settle the question here.
>
> I continue to reserve judgment, but right now it looks like more down side than up side to me.  Furthermore, it still seems that this could be addressed as well or better at the DDL and/or dictionary level, but perhaps that would have impacts that I do not presently appreciate.
>
>
> Regards,
>
> John
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>