Discussion List Archives


Re: [ddlm-group] A modest addition to the DDLm spec. .. .

I don't think anything here is any more readable than anything else. But
then I am assuming no-one is actually *reading* this stuff. I have
applications that do all this for me.

In writing this stuff for a dictionary, it will always be hard to do
syntactically when these regexes are so long. But then again you are only
going to do it once for each data item that needs it.

I don't know what the great advantages of this approach are that Simon
alludes to, but remember that CIF files are a syntax for defining values.
The overloading of the + operator in programming languages is used for the
*construction* of a string value, invariably at runtime, when the components
aren't known until then.

Hence this all makes sense in dREL (which supports it) but not in a
declaration of a string literal as is done in CIF data files.

I certainly do not teach my Java students to declare a string literal as the
concatenation of several string literals.
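
For anyone who wants to check the equivalence being argued about, here is a
minimal sketch in Python (not CIF or dREL, and the names SINGLE, JOINED and
DATE_RE are only illustrative). It assumes the three-piece form of the
date-time regex quoted below is the intended one, and shows that joining the
pieces with "+" yields exactly the same value as the single long literal.

```python
import re

# The value written as one long literal (pieces taken from the
# three-line form quoted below; assumed to be the intended pattern).
SINGLE = r"[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?([+-][0-5][0-9]:[0-5][0-9]))?"

# The same value built by concatenating three shorter literals.
JOINED = (
    r"[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]"
    + r"((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?"
    + r"([+-][0-5][0-9]:[0-5][0-9]))?"
)

# Both spellings denote one and the same string value.
print(JOINED == SINGLE)

# The resulting pattern accepts a bare date and a date with time and offset.
DATE_RE = re.compile(JOINED)
for sample in ("2010-10-01", "2010-10-01T05:04:00+08:00"):
    print(sample, DATE_RE.fullmatch(sample) is not None)
```

Either spelling gives an application the same value, so the difference is
only in how the literal is laid out in the file.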


On 1/10/10 5:04 AM, "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
wrote:

> Try working with some of the longer regexes for a while and you may
> come to appreciate having something like the plus or the line-folding
> backslash to allow you to present what is really one very long
> single-line character string with lots of funny stuff in it
> over a series of lines, broken up with some whitespace that is
> not part of the string.  To take one which does not involve quote
> marks:
> 
> 
> "[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?\([+-][0-5][0-9]:[0-5][0-9]))?"
> 
> is much harder to read and understand than
> 
> 
> "[0-9]?[0-9]?[0-9][0-9]-[0-9]?[0-9]-[0-9]?[0-9]"+
> "((T[0-2][0-9](:[0-5][0-9](:[0-5][0-9](.[0-9]+)?)?)?)?"+
> "([+-][0-5][0-9]:[0-5][0-9]))?"
> 
> which I normally do with the CIF1 line-folding protocol, and the
> treble quotes add nothing of value.  And, to return to quotes,
> 
> """[][ \n\t()_,.;:"&<>/\{}'`~!@#$%?+=*A-Za-z0-9|^-]*"""
> 
> is not as clear to me as
> 
> "[][ \n\t()_,.;:" + '"&<>/\{}' + "'`~!@#$%?+=*"+
> "A-Za-z0-9|^-]*"
> 
> which helps to organize the data for me and calls out the
> troublesome case.
> 
> However, the question is not one of your taste or mine or Nic's, nor
> even whether the feature is useful to everybody, but whether
> it is useful to some reasonable number of people and
> whether having it causes some kind of problem for other people.
> 
> I assume we all agree that CIF is intended to be a useful tool to get
> work involved with crystallography done.  The string concatenation
> operator is one of the most generally useful operators, and there is no
> sound reason not to support it top to bottom, from the top level CIF
> data down to the dREL language itself.  It is in dREL right now
> (see 7.1.2), so we are going to have to handle it down there.
> (Indeed, dREL has 2 string concatenations, one with blank for
> string literals and one with "+" for string objects).
> 
> Why is it such a big issue to also handle the "+" at the top level?
> 
> 
> 
> 
> =====================================================
>   Herbert J. Bernstein, Professor of Computer Science
>     Dowling College, Kramer Science Center, KSC 121
>          Idle Hour Blvd, Oakdale, NY, 11769
> 
>                   +1-631-244-3035
>                   yaya@dowling.edu
> =====================================================
> 
> On Thu, 30 Sep 2010, Bollinger, John C wrote:
> 
>> 
>> On Thursday, September 30, 2010 12:23 PM, Herbert J. Bernstein wrote:
>> 
>>> It reduces the incompatibility with CIF1 introduced by the change
>>> in string quoting syntax, allowing the resulting CIF2 CIFS to
>>> be much closer to their CIF1 originals,
>> 
>> I don't buy that one.  Which is closer to 'O'Donnell said, "Pshaw"'?
>> 
>> '''O'Donnell said, "Pshaw"'''
>> 
>> or
>> 
>> "O'Donnell said, " + '"Pshaw"'
>> 
>> For me, it's the former, and that becomes much more the case the more
>> concatenations are involved.  Consider, for example, how the concatenation
>> approach would look for this slight variation: 'O'Donnell said, "P'shaw"'.
>> 
>>> fills that gap
>>> created by not dealing with elides for line folding in
>>> a simpler way,
>> 
>> Are you saying that it provides a better line-folding approach than the one
>> already used with CIF1?  I'll have to think about that.  Does your argument
>> apply in general, or only for particular cases such as regex?
>> 
>> Is a line-folding mechanism that is incompatible with CIF1 even relevant?  I
>> have always thought that the most important reason for having a line-folding
>> protocol at all was for compatibility with CIF readers that implement the old
>> 80-character line limit.  An alternative that is incompatible with CIF1 is
>> useless for that purpose.
>> 
>> Any way around, I'm hesitant to promote line folding from a semantic
>> consideration to a syntactic one.
>> 
>>> and conforms to well-established practice in
>>> multiple programming languages.
>> 
>> That is not germane as far as I am concerned.  CIF is not a programming
>> language, and its audience (as opposed to the audience of the specification)
>> contains many non-programmers.
>> 
>>>  C manages to deal with this
>>> using the blank as the concatenation operator at the C preprocessor
>>> level, so we should be able to handle it at the lexical level.
>> 
>> Certainly we *can* handle it.  And doing so will make our code a little bit
>> more complex, and a little bit more difficult to maintain.
>> 
>> Also, this proposal would either make CIF diverge (further?) from STAR, or
>> would require STAR to adopt the same change.  If we want the latter then we
>> cannot settle the question here.
>> 
>> I continue to reserve judgment, but right now it looks like more down side
>> than up side to me.  Furthermore, it still seems that this could be addressed
>> as well or better at the DDL and/or dictionary level, but perhaps that would
>> have impacts that I do not presently appreciate.
>> 
>> 
>> Regards,
>> 
>> John
>> --
>> John C. Bollinger, Ph.D.
>> Department of Structural Biology
>> St. Jude Children's Research Hospital
>> 
>> 
>> 
>> 
>> Email Disclaimer:  www.stjude.org/emaildisclaimer
>> 
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>> 

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
MBDP  M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au



