Re: [ddlm-group] String concatenation operator in CIF2

Dear All,

On Thursday, October 14, 2010 11:11 AM, SIMON WESTRIP wrote:

>I believe that a string concatenation operator should meet the following criteria:
>1) totally unambiguous (not open to interpretation as any other CIF element)
>2) not require further restrictions on character sets etc.

I have similar preferences.

>To this end, I think we are left with a couple of options:
>i) work with the characters that currently have no syntactic meaning
>ii) introduce a new keyword
>Approach (i) boils down to using characters that cannot commence a non-delimited string.
>Obviously the delimiter characters ' " are of no use, nor the [ { list delimiters - these have syntactic meaning.
>This leaves the underscore or a dollar to commence the operator.
>If we use an underscore followed by any other characters it could be read as a dataname
>(note that *dictionaries* place restrictions on the character sets of datanames - i.e.
>at a higher level - beyond syntax).
>So that leaves us with the 'lonely' underscore (which to my mind works unquestionably).

I agree with that analysis.

>If we use the dollar followed by other characters, we do open up the possibility of defining as
>many operators as we like (I've mentioned before that I have plans for the $ in this respect, though
>more along the lines of its perhaps familiar role in identifying variables :-).

The dollar sign is available only if we cease to reserve strings beginning with it for possible future use as save frame references.  Even if we don't foresee ever allowing save frame references, I would prefer to sustain the reservation to avoid divergence from STAR in this area.  Perhaps, however, the operator could still be a lone dollar sign, which, though currently not a valid data value, also cannot be a save frame reference.
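To make the token analysis so far concrete, here is a minimal sketch of how a single whitespace-delimited token might be sorted by its leading character. It is illustrative only: the function name `classify_token` and the category labels are hypothetical, it assumes a non-empty token, and it covers just the cases raised in this thread rather than the full CIF grammar.

```python
def classify_token(token: str) -> str:
    """Classify one whitespace-delimited token by its leading character.

    The category names are hypothetical labels for this discussion,
    not terms from any CIF specification.
    """
    if token == "_":
        # Solitary underscore: cannot be a dataname, so it is free
        # to serve as an operator (option (a) in the quoted message).
        return "lone-underscore"
    if token.startswith("_"):
        # Underscore followed by other characters reads as a dataname.
        return "dataname"
    if token == "$":
        # Solitary dollar: not a valid data value, and not a save
        # frame reference either.
        return "lone-dollar"
    if token.startswith("$"):
        # Reserved for possible future use as a save frame reference.
        return "reserved-dollar"
    if token[:1] in ("'", '"'):
        # String delimiters already carry syntactic meaning.
        return "quoted-string-start"
    if token[:1] in ("[", "{"):
        # List delimiters already carry syntactic meaning.
        return "list-delimiter-start"
    return "unquoted-string"


for tok in ["_", "_cell_length_a", "$", "$frame", "'text'", "[1", "abc"]:
    print(tok, "->", classify_token(tok))
```

Only the two "lone" categories are candidates for a new operator without colliding with an existing token class.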

>Approach (ii) again opens up possibilities to define all sorts of operators; however, I think
>there should be a distinction between 'keywords' in the traditional STAR/CIF sense and
>these operators (i.e. such an 'operator' does not really have the same fundamental significance as a
>So, as I see it, we're left with:
>(a) _ (i.e. solitary underscore)
>(b) $ (solitary)
>(c) $ followed by some other character(s) (e.g. $// ...)
>Options (b) and (c) still have the drawback that they may be valid CIF1 values - so if we use the dollar
>I would suggest using it as in (c), to create a token that is highly unlikely to be found in
>legacy CIFs (i.e. respecting that legacy as we have tried to do in many other aspects of CIF2).

Unquoted strings starting with the dollar sign are reserved in CIF1 as well, so options (b) and (c) at least would not risk colliding with valid CIF1 data values.

With that said, if we do add a concatenation operator then I like your option (a) much better than your options (b) and (c).  I've even decided that the lone underscore is somewhat mnemonic in that role, on the basis of it resembling a tag but not providing a name, which I can construe as something like 'additional data for the same name'.
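As a rough illustration of the 'additional data for the same name' reading, the sketch below joins values that are separated by a concatenation token. It rests on assumptions not settled in this thread: that the operator is infix between two values, and that a tokenizer has already tagged each token with a kind (so that a quoted '_' remains an ordinary value). The `Token` shape, the kind names "value" and "concat", and the function `join_values` are all hypothetical.

```python
from typing import List, Tuple

# (kind, text); the kinds "value" and "concat" are hypothetical labels
# assumed to come from an earlier tokenizing step.
Token = Tuple[str, str]


def join_values(tokens: List[Token]) -> List[str]:
    """Merge values separated by a concatenation token into one value."""
    values: List[str] = []
    pending_concat = False
    for kind, text in tokens:
        if kind == "concat":
            # The operator must follow a value and must not be doubled.
            if not values or pending_concat:
                raise ValueError("concatenation operator needs a value on its left")
            pending_concat = True
        elif kind == "value":
            if pending_concat:
                # Append to the previous value instead of starting a new one.
                values[-1] += text
                pending_concat = False
            else:
                values.append(text)
        else:
            raise ValueError(f"unexpected token kind: {kind!r}")
    if pending_concat:
        raise ValueError("concatenation operator needs a value on its right")
    return values


# 'abc' _ 'def' is read as a single value; a quoted '_' stays a plain value.
print(join_values([("value", "abc"), ("concat", "_"), ("value", "def")]))
print(join_values([("value", "a"), ("value", "_")]))
```

Tagging tokens by kind before joining is what keeps a quoted underscore from being mistaken for the operator.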


John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital
