Re: [ddlm-group] LOOP versus LIST

To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Subject: Re: [ddlm-group] LOOP versus LIST
From: Nick Spadaccini <nick@csse.uwa.edu.au>
Date: Mon, 08 Feb 2010 10:53:13 +0800
In-Reply-To: <279aad2a1002022227u1e20ae54u915a520df66a4770@mail.gmail.com>

I agree here with Joe’s and James’ analyses. Looped items for which there is only one packet (row) are different from unlooped items.

In fact there have been cases where looped categories with only one packet were stored as unlooped data items. With the new DDL there are strong schema requirements, much like the relational data model. If a category is defined as loopable, then it must appear as such AND must have a unique key. I know it can appear simpler to delete the loop altogether, but this in fact creates massive problems for access.
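
A minimal sketch of that schema requirement, in Python. The function name, its arguments, and the `_atom_site` names are illustrative assumptions, not taken from any DDLm tool. It reports a loopable category that arrives unlooped, and any packet whose key is missing or repeated.

```python
def check_loopable_category(packets, key_names, was_looped):
    """Return a list of problems for a category the dictionary declares loopable.

    packets    -- list of dicts, one per packet (row), mapping data name -> value
    key_names  -- data names that together form the category key (assumed known)
    was_looped -- True if the data file presented the category inside loop_
    """
    problems = []
    if not was_looped:
        problems.append("category is loopable but was given as unlooped items")
    seen = set()
    for packet in packets:
        missing = [name for name in key_names if name not in packet]
        if missing:
            problems.append(f"packet lacks key item(s): {missing}")
            continue
        key = tuple(packet[name] for name in key_names)
        if key in seen:
            problems.append(f"duplicate key value: {key}")
        seen.add(key)
    return problems


# A single packet is still expected to arrive as a loop with a key.
one_packet = [{"_atom_site.label": "C1", "_atom_site.fract_x": "0.25"}]
print(check_loopable_category(one_packet, ["_atom_site.label"], was_looped=False))
```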

As for implementation issues coming from the dictionary: these are not specified in the dictionary, only the abstract data types are. A loopable category is not implemented as an indexed array in my systems because its behaviour is that of a Table (associative array, or relational table). It will have a unique key that identifies each packet, and that key will not be a number. Of course at an implementation level you can implement this as an indexed array, but that would not change its behaviour.
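
To illustrate the Table behaviour, here is a rough Python sketch. The packet contents and the `as_table` helper are hypothetical, and a plain `dict` is only one possible backing store. The point is that access goes through the key value, so packet order carries no meaning.

```python
# Hypothetical packets for a loopable category whose key is the label.
packets = [
    {"label": "O1", "x": 0.10},
    {"label": "C1", "x": 0.25},
]


def as_table(packets, key_name):
    """Build an associative array from packets, keyed on the category key."""
    table = {}
    for packet in packets:
        key = packet[key_name]
        if key in table:
            raise ValueError(f"duplicate key {key!r}")
        table[key] = packet
    return table


table = as_table(packets, "label")
same_table = as_table(list(reversed(packets)), "label")

# Access is by key value, so reordering the packets changes nothing.
assert table["C1"]["x"] == same_table["C1"]["x"]
assert table == same_table
```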

Of course^2, this becomes much more complicated as DDLm is extended to support categories for which row ordering is relevant (JW requested this), and for cases where data name ordering has a certain precedence (HB requested this). All doable, but it requires a shift in implementation.

On 3/02/10 2:27 PM, "James Hester" <jamesrhester@gmail.com> wrote:

I generally agree with Joe's analysis. A scalar should never be assumed equivalent to a single-element list, and lists and loops are fundamentally different in that lists have a specific order, unlike loops which assign no significance to column or row ordering.
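
The ordering point can be sketched in Python as follows. The `loop_signature` helper and the `_x.*` names are made up for illustration, and the sketch assumes values are hashable and that packets are unique. A loop is reduced to a set of packets, so permuting rows or columns leaves it unchanged, whereas a list compares element by element in order.

```python
def loop_signature(column_names, rows):
    """Order-insensitive form of a loop: a set of packets,
    each packet being a set of (data name, value) pairs."""
    return frozenset(frozenset(zip(column_names, row)) for row in rows)


loop_a = loop_signature(["_x.id", "_x.value"], [("a", "1"), ("b", "2")])
# The same packets with both the rows and the columns permuted.
loop_b = loop_signature(["_x.value", "_x.id"], [("2", "b"), ("1", "a")])
assert loop_a == loop_b

# A list is ordered, so permuting its elements gives a different value.
assert ["a", "b"] != ["b", "a"]
```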

Some comments:

On Tue, Jan 12, 2010 at 4:37 AM, Joe Krahn <krahn@niehs.nih.gov> wrote:

It makes sense to distinguish a scalar from a loop of size=1 because it
is easier in a programming context for loop items to always be stored in
an array. However, handling data in a programming context requires a
DDL, which can define how the data is stored.

I don't understand the logic behind that last sentence. The syntax specification itself provides enough information to convert an input file into an abstract data structure, with no recourse to a DDL. As you note below, the DDL cannot (in a reasonable world) override the logical structure derived from the syntax specification, so it cannot dictate how the data file is stored.

I believe that these issues are properly dealt with under the heading of 'infoset' (borrowed that term from XML) and can be developed in tandem with the DDL. The DDL is restricted to manipulations that are consistent with the infoset.

If the distinction between
a scalar and single-row loop is not made at the CIF syntax level, the
DDL should not be able to dictate the use of loops. For example, CIF
defines ordering of items as not significant, and DDL cannot override
this. It makes sense for a DDL to suggest a preferred ordering, but it
is only a suggestion, unless CIF format rules change.

If there is a desire for DDL to mandate loop and non-loop items, then
CIF2 should make an explicit distinction.

To avoid similar conflicts with list items, CIF2 should state that loop
and list items are not interchangeable, so that the following two pairs
are not equivalent:

example1:

loop_
_item.name
data1
data2
data3

_item.name [data1 data2 data3]

example2:

_item.name [data1]

_item.name data1

Of course, these are implementation details that can be worked out after
the lexing syntax is finalized.

I agree with the statement that these pairs are not equivalent.
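
One way to keep the two pairs apart in an abstract data structure is to tag each kind of value with its own type. This is only a sketch in Python; the class names `Scalar`, `ListValue` and `LoopColumn` are invented here and do not come from the CIF2 or DDLm drafts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scalar:
    value: str


@dataclass(frozen=True)
class ListValue:
    elements: tuple  # ordered elements of one list-valued data value


@dataclass(frozen=True)
class LoopColumn:
    packets: tuple  # one value per packet (row) of the loop


# example1: a three-packet loop versus one list-valued item
looped = LoopColumn((Scalar("data1"), Scalar("data2"), Scalar("data3")))
listed = ListValue((Scalar("data1"), Scalar("data2"), Scalar("data3")))
assert looped != listed

# example2: a one-element list versus a bare scalar
assert ListValue((Scalar("data1"),)) != Scalar("data1")
```

Because dataclass equality only holds between instances of the same class, neither pair can compare equal by accident, whatever the contained values are.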

Joe

Herbert J. Bernstein wrote:
> A list with lists nested to arbitrary depth can be a single data
> value either in a loop or just for a single tag.
>
> DDL2 makes no distinction between a one-element loop and the same unlooped
> tag with the same value. DDL1 (see _list) and DDLm (see
> _definition.class) try to make distinctions among things that are
> and are not permitted to be looped.
>
> I do not understand why it is desirable to make such a distinction for
> a single-row table. Following the DDL2 approach of allowing it
> to be handled as either
>
>    _xxx.aaa data1
>    _xxx.bbb data2
>    _xxx.ccc data3
>
> or
>
> loop_
>    _xxx.aaa
>    _xxx.bbb
>    _xxx.ccc
>
>    data1  data2  data3
>
> seems harmless to me, but DDL1 and DDLm make the distinction and a proper
> parser should note violations of what was specified for the category.
>
> An index key is not a name, but a string, so I think it reasonable to
> accept the empty string as a table index value.
>
> Case sensitivity is an interesting question. I would prefer case sensitive
> table indices, but I suppose that matter should be discussed.
>
>
>
> At 12:41 PM -0500 1/5/10, Joe Krahn wrote:
>> I assume that a list of items defined via a loop is distinct from a list
>> of items defined by a list. Is that correct?
>>
>> Likewise, is a list of one item distinct from a scalar value?
>>
>> Currently, CIF files don't differentiate between a one-element loop and
>> a scalar. For example, RCSB components.cif does not use loops for atom
>> data when there is only one atom. Is this stated anywhere?
>>
>> Also, is an empty string a valid TABLE index? Other CIF names require at
>> least one character, but my understanding is that a TABLE index is any
>> valid string, which includes an empty string. Strings are also
>> case-sensitive, so I assume that TABLE indices are also case-sensitive.
>>
>> Thanks,
>> Joe Krahn
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
>

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth, WA 6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
MBDP M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au
