Discussion List Archives

Re: [ddlm-group] Finalizing DDLm

Just for the record, all I wanted to know is whether we have to use the loop structure for list categories when there is only one row or whether it is optional.  I was open-minded when we started this discussion, but now I think it safer always to use the loop structure.  What Herbert is talking about is an implementation that reads earlier CIFs, and clearly it has to be much more flexible, but this does not mean that we should not aim for a simple and more rigorous standard for DDLm.  And I assume that

_refln.wavelength  1.56

loop_
_refln_refln.h
_refln_refln.k
_refln_refln.l
_refln_refln.intensity
1 0 0 1543.27

implies that the wavelength belongs to the listed reflections.  I guess this is really defined in the description of _refln.wavelength but there would seem to be an implicit relationship that would not be there if the name of the first item were _refln_wavelength.actual since it would then not be in an ancestral category.

David



Herbert J. Bernstein wrote:
The advantage is allowing existing data CIFs to be used without the
need to re-edit them for a purely mechanical change of presentation.

As for the increase in program complexity -- the only place where any
program changes is in preparing data for use by DDLm for validation,
and even in that case, inasmuch as under Nick's rules we would be
required to detect the case, all we are doing is changing from
issuing a fatal error to issuing a warning and then using the
data we have just analyzed (or analysed, an example of the same
process) in a very obvious way.

I understand this body has decided to be "maximally disruptive", but
can we not show at least a little consideration for users with existing data and do a few of the edits we are imposing for them?

=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================

On Wed, 31 Mar 2010, James Hester wrote:

DDLm and dREL as presented by Nick and Syd is a carefully thought out,
consistent model for describing ontologies.  I don't think anyone is
proposing to actually change this model.  All Herbert is advocating is
altering the *presentation* of single-row loops so that they appear to be
separate tag-value pairs, without actually altering their dictionary
classification.  However, as I have argued previously, providing this
flexibility (i) requires increased program complexity to reassemble the loop
structure internally, and (ii) this is only possible if a dictionary is
available.

We so far lack any concrete example of a benefit to be gained from this
increase in complexity and ambiguity, beyond perhaps a slightly more flowing
presentation for the human reader of some dictionary content (where
single-row loops are common, especially in 'example' stanzas).
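
As a rough illustration of the reassembly step discussed above, the following Python sketch coerces unlooped tag-value pairs into a one-row loop and issues a warning rather than a fatal error. It is only a sketch under stated assumptions: the function names, the `LIST_CATEGORIES` set standing in for a real dictionary, and the use of dotted DDLm-style tags are all hypothetical and are not part of DDLm or of any existing CIF library.

```python
import warnings

# Hypothetical stand-in for dictionary knowledge: categories declared as List.
LIST_CATEGORIES = {"atom_site"}


def category_of(tag):
    """Return the category part of a dotted tag such as '_atom_site.fract_x'."""
    return tag.lstrip("_").split(".", 1)[0]


def coerce_pairs_to_loops(pairs, list_categories=LIST_CATEGORIES):
    """Split unlooped tag-value pairs into Set items and one-row loops.

    pairs: dict mapping tag -> value, as read from unlooped CIF text.
    Returns (set_items, loops), where loops maps category -> (tags, rows).
    """
    set_items = {}
    loops = {}
    for tag, value in pairs.items():
        cat = category_of(tag)
        if cat in list_categories:
            # The dictionary says this item belongs in a loop: rebuild the row.
            tags, rows = loops.setdefault(cat, ([], [[]]))
            tags.append(tag)
            rows[0].append(value)
        else:
            set_items[tag] = value
    for cat in loops:
        # A warning rather than a fatal error, as suggested in the thread.
        warnings.warn(f"category '{cat}' is a List; coerced to a one-row loop")
    return set_items, loops
```

Note that the coercion depends entirely on `list_categories`: without that dictionary-derived information the function cannot tell which pairs belong in a loop.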

On Wed, Mar 31, 2010 at 2:22 PM, Nick Spadaccini <nick@csse.uwa.edu.au>
wrote:



      On 31/03/10 10:51 AM, "Herbert J. Bernstein"
      <yaya@bernstein-plus-sons.com>
      wrote:

      > I really don't understand why
      >
      > _atom_site_label Cu
      > _atom_site.fract_x 0.0
      > _atom_site.fract_y 0.0
      > _atom_site.fract_z 0.0
      >
      > is a problem.  It looks very clear and is easily coerced to
      >
      > loop_
      > _atom_site_label
      > _atom_site.fract_x
      > _atom_site.fract_y
      > _atom_site.fract_z
      >   Cu 0.0 0.0 0.0
      >
      > when a dictionary is available to tell you that this is supposed
      > to be a looped list.

While I am a big fan of using dictionaries I am repeatedly told there are
many users who don't and won't. Those people will not know it is a List
category in the former example.
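
To make the point about dictionary-free reading concrete, here is a minimal Python sketch of a reader that has no dictionary at all. It is an illustration only: `read_fragment` is a hypothetical name, and the tokenizer assumes whitespace-separated values with no quoting, comments, or data blocks, which is far simpler than real CIF syntax.

```python
def read_fragment(text):
    """Parse a tiny CIF fragment (whitespace-separated tokens only, no quoting).

    Returns (pairs, loops): pairs maps tag -> value for unlooped items,
    loops is a list of (tags, rows), one entry per loop_ block.
    """
    tokens = text.split()
    pairs, loops = {}, []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.lower() == "loop_":
            i += 1
            # Collect the loop header: consecutive tags.
            tags = []
            while i < len(tokens) and tokens[i].startswith("_"):
                tags.append(tokens[i])
                i += 1
            # Collect values until the next tag or loop_ keyword.
            values = []
            while (i < len(tokens) and not tokens[i].startswith("_")
                   and tokens[i].lower() != "loop_"):
                values.append(tokens[i])
                i += 1
            if not tags or not values or len(values) % len(tags) != 0:
                raise ValueError("loop values do not fill whole rows")
            rows = [values[j:j + len(tags)]
                    for j in range(0, len(values), len(tags))]
            loops.append((tags, rows))
        elif tok.startswith("_"):
            if i + 1 >= len(tokens):
                raise ValueError(f"tag {tok} has no value")
            pairs[tok] = tokens[i + 1]
            i += 2
        else:
            raise ValueError(f"unexpected token {tok}")
    return pairs, loops
```

With the looped form the items come back in `loops`, so their List nature is visible from the syntax alone; with the unrolled form the same items come back in `pairs`, indistinguishable from Set items unless a dictionary is consulted.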

David's other example below that you refer to is just de-normalising the
data. Considering that DB courses all over the world spend so much time on
normalisation, it raises the question of why it is attractive to undo all
that. But if people want to repeat data unnecessarily, so be it.

But I think what David wants is a consequence of the category-subcategory
semantics built in to DDLm. In the specification category-subcategory loops
can appear separately or as an outer/inner join. The former makes mm people
happy, the latter makes small molecule people happy.

Because we have focused on List categories there is a key on which to join.
Though Syd and I didn't consider the case, a semantically consistent view
for a Set subcategory of a List parent category would be a Cartesian
Product. This would repeat the Set data for every row of the List. In some
sense this is already catered for within the semantics of
category-subcategory relationships.

However the inverse case, unrolling List data into Set data, is not part of
DDLm. But as I have repeatedly stated, the IUCr is free to extend the
specification of DDLm to suit its purposes for local implementation.
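
The Cartesian product described above can be sketched in a few lines of Python. This is a hedged illustration, not DDLm or dREL behaviour: `join_set_with_list` is a hypothetical helper, the tag names are taken from David's example at the top of this message, and the second reflection row is made up purely so that the repetition is visible.

```python
def join_set_with_list(set_items, tags, rows):
    """Repeat every Set value on each row of a List loop (Cartesian product).

    set_items: dict of tag -> value for the single-valued (Set) items.
    tags, rows: header and rows of the List loop.
    """
    set_tags = list(set_items)
    joined_tags = set_tags + list(tags)
    joined_rows = [[set_items[t] for t in set_tags] + list(row) for row in rows]
    return joined_tags, joined_rows


set_items = {"_refln.wavelength": "1.56"}
tags = ["_refln_refln.h", "_refln_refln.k", "_refln_refln.l",
        "_refln_refln.intensity"]
rows = [
    ["1", "0", "0", "1543.27"],
    ["1", "1", "0", "872.10"],   # invented row, for illustration only
]
joined_tags, joined_rows = join_set_with_list(set_items, tags, rows)
```

Because the Set side has exactly one row, the product has as many rows as the List, each carrying a copy of the wavelength. This is the de-normalised view; the sketch says nothing about whether the formal specification should permit it.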


>
> And, while
>
> _atom_site_label Cu
> _atom_site.fract_x 0.0
>
> loop_
> _atom_site.fract_y 0.0
> _atom_site.fract_z 0.0
>
> is a little strange it is just as clear.  The more interesting case is
> related to what David suggested, of mixing a single tag value pair with a
> loop with more than one row.  That seems very useful and echoes some of
> what was already done in the mmCIF dictionary.
>
> If a sensible coercion is clear from the dictionary, why not just do it?
>
> On Wed, 31 Mar 2010, Nick Spadaccini wrote:
>
>>
>>
>>
>> On 31/03/10 12:23 AM, "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
>> wrote:
>>
>>> 1.  Existing DDL2 style dictionaries have many looped categories with
>>> subcategories.
>>
>> As do the dictionaries written in the DDLm.
>>
>>> 2.  I do not understand why there is any special need to forbid the
>>> presentation of a single row list category as separate tags and
>>> values, nor why there is any special need to forbid the presentation
>>> of an unlooped category as a single row loop.  Any necessary coercion
>>> is easily done in either case, and deserves at most a warning, if even
>>> that.
>>
>> The reasoning is that you are enforcing its "type" specification. It is a
>> List category object. ALL List category objects MUST be syntactically
>> presented by the loop_ keyword, followed by a sequence of tags, followed by
>> a list of values whose type then matches the tag type. Anybody reading the
>> data in the absence of a dictionary will immediately know it is a List
>> object.
>>
>> I must say Syd's and my reasoning is pretty clear as to why we enforce it
>> the way we do. However I can't see the reasoning behind your and David's
>> desire to allow for,
>>
>> _atom_site_label Cu
>> _atom_site.fract_x 0.0
>>
>> loop_
>> _atom_site.fract_y 0.0
>>
>> _atom_site.fract_z 0.0
>>
>> The above is a logical consequence of what is being suggested. You may not
>> intend it, but "when you'se open the can, you'se eat the worms".
>>
>> Again (and again and again) I repeat IF the IUCr wishes this to be an
>> extension in its implementation of DDLm that is for it to decide.
>>
>>
>>>
>>> On Tue, 30 Mar 2010, David Brown wrote:
>>>
>>>> James seems to have summarized matters pretty well.  The implication is
>>>> that a list category must be the end of the line - it cannot have a
>>>> subcategory.
>>>> My real question was whether a list category must be explicitly included
>>>> as a loop, or whether the loop structure is unnecessary if it only
>>>> contains a single row.  It is easy enough to be safe by always including
>>>> the loop, and I will probably arrange to do this in the dictionaries.
>>>> There are likely to be several places where single-row loops appear,
>>>> e.g., in examples or aliases.
>>>>
>>>> David
>>>>
>>>>
>>>>
>>>> James Hester wrote:
>>>>       Thanks Nick for clarifying this.  We then return to David's
>>>>       question.  If we assume that a 'Set' category cannot be a child
>>>>       of a 'List' category (I hope this is written down somewhere if
>>>>       it is the case) then my originally proposed solution would be
>>>>       impossible.  Therefore, what David should do is to put the
>>>>       invariant items into a *parent* 'Set' category and state that
>>>>       the child 'List' category.  That would solve the immediate issue
>>>>       of separating out looped and unlooped datanames.  If some
>>>>       convenience is desired for dREL processing, the child 'List'
>>>>       category could be made joinable to that parent category, thereby
>>>>       making both invariant and looped items available in shorthand
>>>>       form by looping over the parent category.  Of course, even if
>>>>       the child 'List' category is not explicitly joined to the parent
>>>>       'Set' category, the parent category can be explicitly referenced
>>>>       in any dREL method using the full dataname.
>>>>
>>>>       Nick may wish to confirm that I have correctly understood the
>>>>       proposed behaviour of DDLm.
>>>>
>>>>       James.
>>>>
>>>>       On Thu, Mar 25, 2010 at 3:18 PM, Nick Spadaccini
>>>>       <nick@csse.uwa.edu.au> wrote:
>>>>             I have not had any time to respond to David's
>>>>             original email or the
>>>>             subsequent discussions. However I have discussed it
>>>>             with Syd.
>>>>
>>>>             DDLm defines loopable categories strictly and
>>>>             sub-categories of those have
>>>>             strictly enforced outer joins (given we have ONLY
>>>>             considered sub-categories
>>>>             that are List). This is how we overcome the split
>>>>             versus non-split versions
>>>>             of atom_site and atom_site_aniso loops. List is
>>>>             strictly looped, Set is
>>>>             strictly non-looped - contrary to your reading, and
>>>>             possibly Syd's original
>>>>             text. He has since re-read what he wrote and
>>>>             clarified the ambiguity. I will
>>>>             send you that re-write shortly.
>>>>
>>>>             We had not considered a SET category being a
>>>>             sub-category of a List
>>>>             category, but if it is allowed then it would not be
>>>>             an outer-join as you
>>>>             suggested in your previous contribution but a
>>>>             relational Cartesian product
>>>>             (which is very different).
>>>>
>>>>             The DDLm specification has that List category data
>>>>             MUST appear in a loop
>>>>             (irrespective of how many rows there are), and SET
>>>>             categories are strictly
>>>>             singular (non-looped data). The formal specification
>>>>             will formally remain
>>>>             that way.
>>>>
>>>>             HOWEVER the IUCr is free to "extend" the
>>>>             specification of the DDLm for its
>>>>             own internal and private use, so long as you
>>>>             appreciate that the FORMAL
>>>>             published specification of what Syd and I have
>>>>             created can't include it.
>>>>
>>>>             You might explain as to why you feel you have
>>>>             trouble with
>>>>
>>>>             loop_
>>>>              _atom_site.label
>>>>              _atom_site.fract_x
>>>>              _atom_site.fract_y
>>>>              _atom_site.fract_z
>>>>              Cu 0 0 0
>>>>
>>>>             And yet
>>>>
>>>>              _atom_site.label  Cu
>>>>              _atom_site.fract_x  0.
>>>>              _atom_site.fract_y  0.
>>>>              _atom_site.fract_z  0.
>>>>
>>>>             Is so much more obvious? Given that people
>>>>             understand what the loop is, I
>>>>             can't see what they would gain from the unrolled
>>>>             version (apart from
>>>>             confusion). The real danger is those less
>>>>             experienced who DON'T read a
>>>>             dictionary and read the latter form may be
>>>>             encouraged to replicate it when
>>>>             there is more than 1 atom (thus corrupting the CIF
>>>>             structure).
>>>>
>>>>             However these are just personal observations, and if
>>>>             the IUCr wants to
>>>>             qualify the use of DDLm with its own tweaks there is
>>>>             nothing stopping them
>>>>             from doing so.
>>>>
>>>>
>>>>
>>>>>>>>
>>>>>>>> At 10:35 AM -0500 3/11/10, David Brown wrote:
>>>>>>>>>
>>>>>>>>> Dear Colleagues,
>>>>>>>>>
>>>>>>>>> I assume that we are essentially finished in resolving syntax
>>>>>>>>> problems, but in that discussion some items were identified as being
>>>>>>>>> related to DDLm rather than syntax, so before we settle into serious
>>>>>>>>> dictionary writing we need to understand the DDLm rules.
>>>>>>>>>
>>>>>>>>> One item that I believe was raised under this heading was whether,
>>>>>>>>> if a loop contained a single set of items, it was necessary to
>>>>>>>>> formally include this in a loop structure.  If this is deemed to be
>>>>>>>>> necessary, then there has to be some way of identifying the items
>>>>>>>>> that must appear in a loop.  The presence in the dictionary of a
>>>>>>>>> _category_key.* item would seem to flag this, but it is applied at
>>>>>>>>> the level of the category rather than at the level of an individual
>>>>>>>>> item.  If the requirement is that the loop structure must always be
>>>>>>>>> used, then all the items in the category must be loopable, i.e., the
>>>>>>>>> category cannot include items that would not normally be included in
>>>>>>>>> the loop, items for example that apply equally to all the listed
>>>>>>>>> items such as a scale factor that is the same for all the structure
>>>>>>>>> factors in a loop.  This seems to be workable, but I am not sure how
>>>>>>>>> the legacy CIFs would fit in, since categories may include some
>>>>>>>>> listable items and some non-listable items, and I am sure the
>>>>>>>>> listable items do not always appear in a loop if there is only one
>>>>>>>>> set of such items reported in the CIF.
>>>>>>>>>
>>>>>>>>> Is this something that can be clarified fairly easily?  It has an
>>>>>>>>> important bearing on how the CIF dictionaries are written.
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> ddlm-group mailing list
>>>>>>>>> ddlm-group@iucr.org
>>>>>>>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>>>>> --
>>>>>>> T +61 (02) 9717 9907
>>>>>>> F +61 (02) 9717 3145
>>>>>>> M +61 (04) 0249 4148
>>>>
>>>> cheers
>>>>
>>>> Nick
>>>>
>>>> --------------------------------
>>>> Associate Professor N. Spadaccini, PhD
>>>> School of Computer Science & Software Engineering
>>>>
>>>> The University of Western Australia    t: +61 (0)8 6488 3452
>>>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>>>> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
>>>> MBDP  M002
>>>>
>>>> CRICOS Provider Code: 00126G
>>>>
>>>> e: Nick.Spadaccini@uwa.edu.au
>>
>> cheers
>>
>> Nick

cheers

Nick

begin:vcard
fn:I.David Brown
n:Brown;I.David
org:McMaster University;Brockhouse Institute for Materials Research
adr:;;King St. W;Hamilton;Ontario;L8S 4M1;Canada
email;internet:idbrown@mcmaster.ca
title:Professor Emeritus
tel;work:+905 525 9140 x 24710
tel;fax:+905 521 2773
version:2.1
end:vcard

_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group

International Union of Crystallography
