Re: [ddlm-group] LOOP versus LIST

I generally agree with Joe's analysis. A scalar should never be assumed equivalent to a single-element list, and lists and loops are fundamentally different: lists have a specific order, whereas loops assign no significance to column or row ordering.
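
To make the ordering point concrete, here is a minimal Python sketch. The function name loops_equal and the representation of a loop as column names plus row tuples are assumptions chosen for illustration; they come from neither the CIF specification nor any existing parser.

```python
from collections import Counter

def loops_equal(cols_a, rows_a, cols_b, rows_b):
    """Compare two loops while ignoring both column order and row order.

    Hypothetical helper: each loop is a sequence of column names plus a
    sequence of row tuples aligned with those names.
    """
    def canonical(cols, rows):
        # Pair every value with its column name so column position is
        # irrelevant, then count rows as a multiset so row position is too.
        return Counter(frozenset(zip(cols, row)) for row in rows)
    return canonical(cols_a, rows_a) == canonical(cols_b, rows_b)

# Same loop content, written with columns and rows in a different order.
same_loop = loops_equal(
    ["_xxx.aaa", "_xxx.bbb"], [("1", "2"), ("3", "4")],
    ["_xxx.bbb", "_xxx.aaa"], [("4", "3"), ("2", "1")],
)

# A list value, by contrast, is ordered: reordering changes the value.
same_list = ["data1", "data2"] == ["data2", "data1"]

# A scalar is not a single-element list.
scalar_is_list = "data1" == ["data1"]
```

Under these assumptions same_loop is True, while same_list and scalar_is_list are both False.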

Some comments:

On Tue, Jan 12, 2010 at 4:37 AM, Joe Krahn <krahn@niehs.nih.gov> wrote:
It makes sense to distinguish a scalar from a loop of size=1 because it
is easier in a programming context for loop items to always be stored in
an array. However, handling data in a programming context requires a
DDL, which can define how the data is stored.

I don't understand the logic behind that last sentence. The syntax specification itself provides enough information to convert an input file into an abstract data structure, with no recourse to a DDL. As you note below, the DDL cannot (in a reasonable world) override the logical structure derived from the syntax specification, so it cannot dictate how the data file is stored.

I believe that these issues are properly dealt with under the heading of 'infoset' (a term borrowed from XML) and can be developed in tandem with the DDL. The DDL is restricted to manipulations that are consistent with the infoset.
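
As a rough sketch of that division of labour: the infoset is produced from syntax alone, and a DDL layer may only inspect it and report problems. Everything named here is hypothetical (the dictionary layout, the "loop_only" and "scalar_only" rule labels, the function check_loop_rules), and the sketch assumes a syntax in which a scalar and a one-row loop are already distinguishable.

```python
from types import MappingProxyType

def check_loop_rules(infoset, ddl_rules):
    """Return a list of violation messages without modifying `infoset`.

    Hypothetical layout: `infoset` maps a tag to {"kind": ..., "values": ...},
    where "kind" is "scalar" or "loop" as decided by the parser alone.
    `ddl_rules` maps a tag to "loop_only" or "scalar_only".
    """
    violations = []
    for tag, entry in infoset.items():
        rule = ddl_rules.get(tag)
        if rule == "loop_only" and entry["kind"] != "loop":
            violations.append(f"{tag}: must be looped")
        elif rule == "scalar_only" and entry["kind"] != "scalar":
            violations.append(f"{tag}: must not be looped")
    return violations

# The parser's output is exposed read-only, so the DDL layer cannot restructure it.
infoset = MappingProxyType({
    "_xxx.aaa": {"kind": "scalar", "values": ("data1",)},
    "_xxx.bbb": {"kind": "loop", "values": ("data2",)},
})
rules = {"_xxx.aaa": "loop_only", "_xxx.bbb": "loop_only"}

messages = check_loop_rules(infoset, rules)
```

Here the DDL can flag _xxx.aaa as violating its rule, but the structure derived from the syntax stays exactly as parsed.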

If the distinction between
a scalar and single-row loop is not made at the CIF syntax level, the
DDL should not be able to dictate the use of loops. For example, CIF
defines ordering of items as not significant, and DDL cannot override
this. It makes sense for a DDL to suggest a preferred ordering, but it
is only a suggestion, unless CIF format rules change.

If there is a desire for DDL to mandate loop and non-loop items, then
CIF2 should make an explicit distinction.

To avoid similar conflicts with list items, CIF2 should state that loop
and list items are not interchangeable, so that the following two pairs
are not equivalent:


example1:

loop_
item.name
data1
data2
data3

item.name [data1 data2 data3]


example2:

item.name [data1]

item.name data1


Of course, these are implementation details that can be worked out after
the lexing syntax is finalized.


I agree with the statement that these pairs are not equivalent.
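
One way to keep the pairs distinct in an in-memory model is to give each construct its own type. The sketch below is only an illustration: the class names Scalar, ListValue and LoopColumn are invented here and are not taken from any CIF2 draft or existing library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scalar:
    """A single data value."""
    text: str

@dataclass(frozen=True)
class ListValue:
    """An ordered list that is itself one data value."""
    items: tuple

@dataclass(frozen=True)
class LoopColumn:
    """The values of one tag across the rows of a loop."""
    values: tuple

data = tuple(Scalar(f"data{i}") for i in (1, 2, 3))

# example1: a three-row loop versus a single three-element list value.
example1_loop = LoopColumn(values=data)
example1_list = ListValue(items=data)

# example2: a one-element list versus a bare scalar.
example2_list = ListValue(items=(Scalar("data1"),))
example2_scalar = Scalar("data1")

# Dataclass equality only holds between instances of the same class,
# so neither pair compares equal.
pair1_equivalent = example1_loop == example1_list
pair2_equivalent = example2_list == example2_scalar
```

Both pair1_equivalent and pair2_equivalent are False in this model, which matches the statement that the two pairs are not interchangeable.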

Joe

Herbert J. Bernstein wrote:
> A list with lists nested to arbitrary depth can be a single data
> value either in a loop or just for a single tag.
>
> DDL2 makes no distinction between a one-element loop and the same unlooped
> tag with the same value.  DDL1 (see _list) and DDLm (see
> _definition.class) try to make distinctions among things that are
> and are not permitted to be looped.
>
> I do not understand why it is desirable to make such a distinction for
> a single-row table. Following the DDL2 approach of allowing it
> to be handled as either
>
>    _xxx.aaa data1
>    _xxx.bbb data2
>    _xxx.ccc data3
>
> or
>
> loop_
>    _xxx.aaa
>    _xxx.bbb
>    _xxx.ccc
>
>    data1  data2  data3
>
> seems harmless to me, but DDL1 and DDLm make the distinction and a proper
> parser should note violations of what was specified for the category.
>
> An index key is not a name, but a string, so I think it reasonable to
> accept the empty string as a table index value.
>
> Case sensitivity is an interesting question.  I would prefer case sensitive
> table indices, but I suppose that matter should be discussed.
>
>
>
> At 12:41 PM -0500 1/5/10, Joe Krahn wrote:
>> I assume that a list of items defined via a loop is distinct from a list
>> of items defined by a list. Is that correct?
>>
>> Likewise, is a list of one item distinct from a scalar value?
>>
>> Currently, CIF files don't differentiate between a one-element loop and
>> a scalar. For example, RCSB components.cif does not use loops for atom
>> data when there is only one atom. Is this stated anywhere?
>>
>> Also, is an empty string a valid TABLE index? Other CIF names require at
>> least one character, but my understanding is that a TABLE index is any
>> valid string, which includes an empty string. Strings are also
>> case-sensitive, so I assume that TABLE indices are also case-sensitive.
>>
>> Thanks,
>> Joe Krahn
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
>

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148