Discussion List Archives


Re: [ddlm-group] (no subject)

Hi Simon,

 

The _enumeration.default values I presented are neither 'null'  nor 'undefined'.  They are empty strings, which is entirely different (unless you’re Larry Ellison).  Another value could be chosen instead, as long as the ones chosen for the parent and child keys match; I selected the empty string on the basis that it was unlikely to collide with or be confused with any key presented explicitly.  Even if a given data file used that key value explicitly, however, that would not in itself cause that file to be invalid.

 

In any case, there is no magic here.  A default value for an item is the value the item takes if it is not presented explicitly.  By assigning a default value for a category key, that category key need not be explicitly presented when there is no need to differentiate instances / rows / packets in the same category (i.e. when there is only one).  That works fine for the usual case of only one space group being presented in a given data block, and it does not rely on any DDLm changes.  Dictionary-driven software (for DDLm) should handle all existing data files just fine with those definitions, whether they express a single space group or several, and, in the single space group case, whether they explicitly express the category key or not.  Non-dictionary-driven software written against the given definitions is likewise informed how to handle the no-explicit-key case, though it must implement that handling itself.
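
A minimal sketch of that default-value rule, written in Python purely for illustration.  The `DEFAULTS` table and the `get_value` helper are hypothetical and not part of any real CIF library; the packets are plain dicts standing in for parsed data, and the non-key data names are chosen only as examples.  It shows the parent key (_space_group.id) and the child key (_space_group_symop.sg_id) both falling back to the same empty-string default, so the symmetry operations still link to the single space group when neither key is presented.

```python
# Hypothetical stand-in for the _enumeration.default values in the
# proposed definitions; not a real dictionary-handling API.
DEFAULTS = {
    "_space_group.id": "",
    "_space_group_symop.sg_id": "",
}


def get_value(packet, name):
    """Return the explicit value if presented, else the dictionary default."""
    if name in packet:
        return packet[name]
    if name in DEFAULTS:
        return DEFAULTS[name]
    raise KeyError(name)


# A data block with one space group and no explicit keys.
space_group = {"_space_group.name_H-M_alt": "P 21/c"}
symops = [
    {"_space_group_symop.id": "1",
     "_space_group_symop.operation_xyz": "x,y,z"},
    {"_space_group_symop.id": "2",
     "_space_group_symop.operation_xyz": "-x,1/2+y,1/2-z"},
]

parent_key = get_value(space_group, "_space_group.id")
child_keys = [get_value(s, "_space_group_symop.sg_id") for s in symops]

# Every symop resolves to the same key as the lone space group.
all_linked = all(k == parent_key for k in child_keys)
```

Any other default would serve equally well here, provided the parent and child defaults match.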

 

I like your metaphor: providing default key values indeed affords some flexibility to data files, and the (non-exclusive) alternative is to provide a means to change the shape of data definitions.  As for metaphorical pegs and holes, however, if a DDL1 or DDL2 peg does not fit the corresponding DDLm hole then that is an inherent problem.  The definitions in the DDLm core dictionary define the _same items_ that are defined by our DDL1 core dictionary and by a subset of mmCIF.  Because they disagree, either the definition of the SPACE_GROUP category in the DDLm version of the core is wrong, or those in the DDL1 core and mmCIF are wrong.

 

Note also that this is all related to the old SYMMETRY and SYMMETRY_EQUIV_POS categories, which had already been deprecated in favor of SPACE_GROUP and SPACE_GROUP_SYMOP when ITVG was published in 2005.  SYMMETRY has no category key and cannot be looped, whereas SPACE_GROUP does have a category key and can be looped (in the DDL1 core and mmCIF and symCIF), so although data represented via the old categories can also be represented via the new, the switch, now more than ten years ago, opened the possibility for data file misinterpretation that has now captured our attention.  It is because the DDLm core fails to faithfully reproduce the DDL1 core’s pre-existing definitions of those categories that I assert that a change along the lines I presented needs to be made to the DDLm core (not to DDLm itself), regardless of how we decide to handle future key problems.

 

 

Cheers,

 

John

 

 

 

 

From: ddlm-group [mailto:ddlm-group-bounces@iucr.org] On Behalf Of SIMON WESTRIP
Sent: Thursday, June 16, 2016 1:13 PM
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Subject: Re: [ddlm-group] (no subject)

 

Hi John

 

Please bear with me as an 'observer' (rather than someone who can speak with any authority about ddlm/drel).

I assume the use of 'enumeration.default' as 'null'  or 'undefined' is basically enabling an 'implicit/explicit' approach to key definitions for a looped category?

 

If this is something that can be readily worked into ddlm and (perhaps most importantly) drel, then I would welcome it.

 

Currently, it seems to me that what we are trying to do is fit a square peg (i.e. ddl1 space_group) into a round hole (ddlm space_group) - James's proposal enables the hole to become bigger to accommodate the peg... but requires a chisel (audit.schema) to do it, while an 'implicit/explicit' approach adds elasticity to the materials we're dealing with...

 

(please forgive the metaphors - long day!)

 

Thanks

 

Simon

 

 

 


From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Thursday, June 16, 2016 5:19 PM
Subject: Re: [ddlm-group] (no subject)


Dear James and Colleagues,

I’m going to respond to several aspects of this discussion in separate messages to (I hope) make the individual threads easier to follow.  This first response is directed to these comments of James's:

> I don't actually think default values for keys are necessary until multiple-packet loops are set up. Also, I would need to see a clear formulation of how you would propose to convert Set to Loop in order to comment sensibly in light of all of the other constraints we are operating under. I do appreciate that I am favouring a particular application by giving 'Set' categories such significance. Apart from being bound to do this to keep compatibility with legacy applications, there are non-trivial efficiencies available by being able to make certain values 'global', and the DDL1, DDL2 entry.id, and 'Set' behaviour provides a neat way to do this.

With respect to efficiency, it is not clear to me how a Set with a category key would be any more efficient than a Loop, and it is also not clear whether any efficiency gain that did accrue would be sufficient to justify the required change to DDLm.  I would be interested in hearing a fuller explanation.

In any case, whether it is classified as a Set or a Loop, if a category has a category key, then surely it is necessary for every instance / packet / row of that category to have a value for that key.  If one does not want to oblige data files to provide an *explicit* value for the category key, then the only alternative is to permit them to rely on a default value.  If we are going to do that -- as we must do to avoid changes such as we are discussing from invalidating existing data files -- then I don't see what is gained by making a special rule allowing the key to be defaulted in certain cases, as opposed to simply defining default values for keys where that is warranted, so that existing dictionary semantics provide for data files that present only one packet to omit the key.  In particular, that does not interfere with requiring the key to be presented explicitly when multiple packets are presented, for if a file that presents multiple packets allows them to all take the key's default value then the resulting duplicate keys make the file invalid.
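
The duplicate-key argument can be sketched as follows.  This is an illustration in Python under stated assumptions, not how any particular validator works: the `duplicate_keys` function is hypothetical, packets are modelled as plain dicts, and the empty string is assumed as the key's default value.

```python
from collections import Counter

# Assumed default for _space_group.id (the empty string).
DEFAULT_KEY = ""


def duplicate_keys(packets, key_name="_space_group.id", default=DEFAULT_KEY):
    """Return the key values shared by more than one packet.

    A packet that omits the key takes the default, so two or more
    such packets collide on the default value.
    """
    keys = [p.get(key_name, default) for p in packets]
    return sorted(k for k, n in Counter(keys).items() if n > 1)


# One packet, key omitted: valid, the default is unique.
single = [{"_space_group.name_H-M_alt": "P 1"}]

# Two packets with explicit, distinct keys: valid.
explicit = [
    {"_space_group.id": "a", "_space_group.name_H-M_alt": "P 1"},
    {"_space_group.id": "b", "_space_group.name_H-M_alt": "P -1"},
]

# Two packets, both omitting the key: both take the default and collide.
omitted = [
    {"_space_group.name_H-M_alt": "P 1"},
    {"_space_group.name_H-M_alt": "P -1"},
]

single_dups = duplicate_keys(single)
explicit_dups = duplicate_keys(explicit)
omitted_dups = duplicate_keys(omitted)
```

Under these assumptions no special rule is needed: ordinary key uniqueness already rejects a multi-packet loop that leaves the key out.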

Let's consider the SPACE_GROUP category, since it sparked this whole discussion.  I append a cut at what I think we should do with it (only frames containing modifications are presented); I think I have marked all the changes and additions within via CIF comments.  I rarely wrangle dictionaries, so I apologize for any errors I have committed.  The key defaulting presented within formalizes how, when, and why SPACE_GROUP's category key and the associated child key in SPACE_GROUP_SYMOP can be omitted from data files.  To the best of my knowledge, nothing within relies on any DDLm changes.

Note, by the way, that I think the particular changes presented, or something very like them, are needed regardless of what we choose for the general case, because the DDL1 core and mmCIF are already structured this way.


Best regards,

John

----

save_SPACE_GROUP

_definition.id                          SPACE_GROUP
_definition.scope                      Category
_definition.class                      Loop        # CHANGED
_definition.update                      2016-06-16  # CHANGED
_description.text
;
    The CATEGORY of data items used to specify space group
    information about the crystal used in the diffraction measurements.
;
_name.category_id                      EXPTL
_name.object_id                        SPACE_GROUP

####
# ADDED:

_category.key_id                        '_space_group.key'
loop_
  _category_key.name
        '_space_group.id'

# ... end of additions
####

save_

save__space_group.id

_definition.id                          '_space_group.id'
loop_
  _alias.definition_id
        '_space_group.id'
        '_space_group_id'
_definition.update                      2016-06-16  # CHANGED
_description.text
;
    Code identifying a space group if multiple symmetries.
    See _exptl_crystals.key.
;
_name.category_id                      space_group
_name.object_id                        id
_type.purpose                          Encode
_type.source                            Assigned
_type.container                        Single
_type.contents                          Code

# Take note of this (ADDED):
_enumeration.default                    ''

save_

####
# ADDED:

save__space_group.key

_definition.id                          '_space_group.key'
loop_
  _alias.definition_id
        '_space_group.key'
_definition.update                      2016-06-16
_description.text
;
    Value is a unique key to a set of space_group items
    in a looped list.
;
_name.category_id                      space_group
_name.object_id                        key
_type.purpose                          Key
_type.source                            Related
_type.container                        Single
_type.contents                          Code
loop_
  _method.purpose
  _method.expression
        Evaluation          '_space_group.key = _space_group.id'

save_

# ... end of additions
####

save_SPACE_GROUP_SYMOP

_definition.id                          SPACE_GROUP_SYMOP
_definition.scope                      Category
_definition.class                      Loop
_definition.update                      2013-09-08
_description.text
;
    The CATEGORY of data items used to describe symmetry equivalent sites
    in the crystal unit cell.
;
_name.category_id                      SPACE_GROUP
_name.object_id                        SPACE_GROUP_SYMOP
_category.key_id                        '_space_group_symop.key'
loop_
  _category_key.name
        '_space_group_symop.sg_id'    # ADDED
        '_space_group_symop.id'

save_

save__space_group_symop.key

_definition.id                          '_space_group_symop.key'
loop_
  _alias.definition_id
        '_space_group_symop.key'
_definition.update                      2016-06-16    # CHANGED
_description.text
;
    Value is a unique key to a set of space_group_symop items
    in a looped list.
;
_name.category_id                      space_group_symop
_name.object_id                        key
_type.purpose                          Key
_type.source                            Related
_type.container                        List          # CHANGED
_type.contents                          'Code,Index'  # CHANGED
loop_
  _method.purpose
  _method.expression
        # CHANGED:
        Evaluation          '_space_group_symop.key = [_space_group_symop.sg_id, _space_group_symop.id]'

save_

####
# ADDED:
# Note: this item is needed in any case because mmCIF and
# the DDL1 core define it

save__space_group_symop.sg_id

_definition.id                          '_space_group_symop.sg_id'
loop_
  _alias.definition_id
        '_space_group_symop.sg_id'
        '_space_group_symop_sg_id'
_definition.update                      2016-06-16
_description.text
# copied from mmCIF:
;
  This must match a particular value of _space_group.id, allowing
  the symmetry operation to be identified with a particular space
  group.
;
_name.category_id                      space_group_symop
_name.object_id                        sg_id
_name.linked_item_id                    '_space_group.id'
_type.purpose                          Link
_type.source                            Related
_type.container                        Single
_type.contents                          Code

# Take note of this:
_enumeration.default                    ''

save_

# ... end of additions
####



_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/ddlm-group
