Re: [ddlm-group] Data names

On 8/12/09 2:04 AM, "Joe Krahn" <krahn@niehs.nih.gov> wrote:

> When I first read the spec for STAR, my view was that the leading
> underscore for data item names is a prefix, the same as "data_" is a
> prefix and not part of the data block name. Obviously, CIF has taken the
> approach that the underscore is part of the name, so the leading
> underscores are everywhere, and not just in the CIF data-name tokens.
> 
> The CIF approach is unlikely to change, but perhaps dREL expressions
> would be easier to read if the leading underscore was optional. The
> relevance here is that this idea would require restrictions on the first
> character following the underscore not being a 2nd underscore.

James and I have discussed this and we both think that dREL is easier to
read and parse because of the leading underscore, not harder. The underscores
clearly mark identifiers that are CIF data names as opposed to local
identifiers. I don't mind restrictions if they make things easier, but
treating the leading _ as a prefix rather than part of the name leads to the
requirement that the second character can't be an _, and I can't see any
great benefit in that interpretation or the restrictions that follow from it.
 
> Also, some implementations may want to treat the period like a structure
> member access. If so, should the period require adjacent non-period
> characters? For example, should these be allowed:
> 
>    a...b
>    .
>    ..
> 
> I assume that these should be allowed, but perhaps it could cause
> problems in dREL?

These are not great data name choices, but on the other hand they don't
matter. I would definitely like certain guidelines to be followed in data
name choices, but guidelines would be guidelines, not rules.

In CIF2 we adopt the same stance as with CIF1. The character sequence that
makes up a data name, including any (multiple) embedded _ and . characters,
has no meaning. It may look like the . identifies member access, and indeed
parsers are probably implemented that way, but strictly it means nothing. The
category and attribute associated with the data name are explicitly defined
in DDLm. We have chosen the data name to exactly match the
category.attribute string, but it doesn't have to. That's how the aliases
work.

Example

save_cell.formula_units_Z
    _definition.id             '_cell.formula_units_Z'
    _name.category_id            cell
    _name.object_id              formula_units_Z

The two attributes _name.category_id and _name.object_id together tell me
the dREL access is cell.formula_units_Z.

The _definition.id indicates what the CIF data name will be. Here it is
exactly the string given by the _name.category_id and _name.object_id pair
(with the leading underscore), which we encourage but do not absolutely
require. The save frame name uses the same string, but doesn't have to and
can be completely different again.

Here is an equivalent definition

save_ABunchOfCharacters
    _definition.id             '_ADifferent.BunchOf..Characters'
    _name.category_id            cell
    _name.object_id              formula_units_Z

The dREL access remains cell.formula_units_Z. In the CIF file it will have
the data name _ADifferent.BunchOf..Characters. The save frame name is an
arbitrary string.
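
A minimal sketch, in Python, of the lookup implied by the two definitions
above: the dREL access is read from the definition, never from the characters
of the data name. The dictionary layout and the function name drel_access are
hypothetical, chosen only for illustration; they are not part of DDLm or of
any existing CIF library. Holding both definitions in one table is also just
a convenience for the comparison.

```python
# Hypothetical in-memory form of the two save frames above. Only the three
# attributes discussed in this message are kept.
definitions = [
    {
        "_definition.id": "_cell.formula_units_Z",
        "_name.category_id": "cell",
        "_name.object_id": "formula_units_Z",
    },
    {
        "_definition.id": "_ADifferent.BunchOf..Characters",
        "_name.category_id": "cell",
        "_name.object_id": "formula_units_Z",
    },
]

# Index by CIF data name. The name is used as an opaque key; its underscores
# and periods are never inspected.
by_data_name = {d["_definition.id"]: d for d in definitions}


def drel_access(data_name):
    """Return the dREL 'category.object' string for a CIF data name."""
    d = by_data_name[data_name]
    return d["_name.category_id"] + "." + d["_name.object_id"]


print(drel_access("_cell.formula_units_Z"))
print(drel_access("_ADifferent.BunchOf..Characters"))
```

Both calls give the same access string, cell.formula_units_Z, because both
definitions carry the same _name.category_id and _name.object_id.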

CLEARLY, I will argue that the guidelines state they should all match, as in
the first example, so that we can build parsers much more easily, at least in
the first instance.

Even my parser assumes _definition.id is exactly the same as the
_name.category_id and _name.object_id pair (but I am being lazy, in the first
instance).
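
To make the shortcut concrete, here is a rough Python sketch of what such a
"lazy" parser might do: strip the leading underscore and split on the first
period. The function name lazy_split is hypothetical and this is not the
actual parser code, only an illustration of the assumption and of where it
stops holding.

```python
def lazy_split(data_name):
    """Guess (category, object) from the characters of a CIF data name.

    This is only valid while every _definition.id follows the guideline of
    matching its _name.category_id and _name.object_id pair.
    """
    if not data_name.startswith("_"):
        raise ValueError("not a CIF data name: " + data_name)
    # Drop the leading underscore, then split on the first period only.
    category, sep, obj = data_name[1:].partition(".")
    if not sep:
        raise ValueError("no period in data name: " + data_name)
    return category, obj


# Matches the definition when the guideline is followed.
print(lazy_split("_cell.formula_units_Z"))

# Still returns a pair for a name that ignores the guideline, but the pair is
# taken from the name and not from the definition, so it would be wrong.
print(lazy_split("_ADifferent.BunchOf..Characters"))
```

For the second name the definition says cell and formula_units_Z, which no
amount of splitting can recover, so a strict parser has to consult the
dictionary.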

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
MBDP  M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au