Discussion List Archives


Re: [ddlm-group] Clarifying semantics of _type.purpose 'Key'

Appreciate the push for clarity and simplicity. I seem to have been lax in my descriptions. The distinction between 'Key' and 'Encode' is that a 'Key' dataname is significant *only* as a single-dataname key (a "surrogate key" in relational terms), with all of the machine-actionability that this brings. 'Encode' describes a dataname whose value carries machine-actionable information internally, separate from any information available from the uniqueness of the value. Below are candidate definitions for these items; note that 'Encode' has not changed from the current ddl.dic.

              Encode
;                  Used to type items with values that are text or codes
                   that are formatted to be machine parsable.
;


              Key
;                  Used to type an item with a value that is unique within
                   the looped list of these items, and does not contain encoded
                   information. 
;

For reference, the original definition of Key was:

              Key
;                  Used to type an item with a value that is unique within
                   the looped list of these items, and may be used as a
                   reference "key" to identify a specific packet of items
                   within the category.
;

Herbert, I don't see your reasoning for deprecation and starting over. Please explain why you think deprecation is a better course of action than the above change to the 'Key' definition.

thanks
James.


On Wed, 9 Jan 2019 at 23:54, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
I beg to differ. It is extremely difficult to ensure the negative purpose of _not_ containing any machine-interpretable information, especially when the very word "key" does normally imply an important machine-interpretable function. I would suggest deprecating this tag and starting over with a clear affirmative statement of the purposes. We should try for clarity and simplicity.

On Tue, Jan 8, 2019 at 10:58 PM James Hester <jamesrhester@gmail.com> wrote:
Dear DDLm-group,

It has been proposed (see https://github.com/COMCIFS/cif_core/issues/108) that the meanings of the values 'Key' and 'Encode' for _type.purpose be slightly clarified as follows: any dataname with a '_type.purpose' of 'Key' is considered to be strictly an opaque key that does not carry any machine-interpretable information. So, for example, _atom_site.label, which encodes both atom type and number, would not have a _type.purpose of 'Key', even if it were unique in its loop. Instead, its '_type.purpose' would be 'Encode' (as it is at the moment). The vital information that it does act as a key within its loop is still conveyed by the _category_key.name information in the atom_site category definition. On the other hand, _exptl_crystal.id, which has no machine-extractable information, would be better described as 'Key' instead of 'Encode'.
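
To illustrate the distinction, here is a minimal Python sketch of what "carries machine-interpretable information" could mean for a label-like value. The pattern (an element symbol followed by an optional number) and the function name parse_atom_label are assumptions made for illustration only; real _atom_site.label values may be more varied, and nothing here is taken from the dictionaries themselves.

```python
import re

# Hypothetical label pattern: an element symbol, then an optional number.
# This is an illustrative assumption, not a rule from cif_core.
LABEL_PATTERN = re.compile(r"^([A-Z][a-z]?)(\d+)?")


def parse_atom_label(label):
    """Return (element_symbol, number) extracted from a label, or None.

    The number is None when the label has no digits after the symbol.
    """
    match = LABEL_PATTERN.match(label)
    if match is None:
        return None
    symbol, number = match.groups()
    return symbol, (int(number) if number is not None else None)


# An 'Encode'-style value yields information when parsed.
carbon = parse_atom_label("C12")
iron = parse_atom_label("Fe3")

# An opaque 'Key'-style value has nothing to extract; it is only ever
# compared for equality with other key values.
opaque = parse_atom_label("xtal_1")
```

Under the proposed reading, a processor would only attempt this kind of parsing on 'Encode' datanames and would treat 'Key' values purely as identifiers.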

The advantages of this change are:
(1) CIF processors would know that datanames identified as 'Key' can be missing or ignored when only a single row in a category is present.
(2) Currently, both 'Encode' and 'Key' can be reasonable descriptions of a dataname's function. This ambiguity is removed.
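
Advantage (1) might be used by a processor roughly as in the following Python sketch. The function name datanames_to_require and the data layout (rows as dictionaries, purposes as a dataname-to-purpose mapping) are hypothetical, and the purpose assigned to _exptl_crystal.colour in the usage lines is illustrative rather than quoted from the dictionary.

```python
def datanames_to_require(rows, purposes):
    """Return the datanames a processor should insist on for a category.

    rows:     list of dicts, one per loop row (dataname -> value)
    purposes: dict mapping each dataname to its _type.purpose value

    With at most one row there is nothing to distinguish, so datanames
    whose purpose is 'Key' may be absent or ignored.
    """
    if len(rows) <= 1:
        return sorted(name for name in purposes if purposes[name] != "Key")
    return sorted(purposes)


# Illustrative purposes only; not copied from cif_core.
purposes = {
    "_exptl_crystal.id": "Key",
    "_exptl_crystal.colour": "Describe",
}

single_row = [{"_exptl_crystal.colour": "red"}]
two_rows = [
    {"_exptl_crystal.id": "a", "_exptl_crystal.colour": "red"},
    {"_exptl_crystal.id": "b", "_exptl_crystal.colour": "blue"},
]

needed_for_single = datanames_to_require(single_row, purposes)
needed_for_two = datanames_to_require(two_rows, purposes)
```

This only works if 'Key' is reserved for opaque keys: a dataname such as _atom_site.label still carries information in a single-row loop, so it could not safely be dropped.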

The original cif_core dictionary provided by the Perth group appears to have used _type.purpose='Key' purely for artificially-constructed, single-dataname keys to be used by the dREL system for loop access using the category[keyvalue] construction.  This role has been taken over now by _category_key.name, so the present change is not relevant to dREL, except insofar as dREL systems can take advantage of the information that a dataname value is irrelevant for single-row loops.
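
The category[keyvalue] style of loop access can be pictured with a small Python sketch (this is not dREL). It assumes the key dataname has already been read from _category_key.name; the function name index_by_key and the row layout are hypothetical.

```python
def index_by_key(rows, key_name):
    """Build a lookup table from key value to row for one category.

    rows:     list of dicts, one per loop row (dataname -> value)
    key_name: the single key dataname, assumed here to have been
              obtained from _category_key.name for the category
    Raises ValueError if a key value is repeated, since a key must be
    unique within its loop.
    """
    index = {}
    for row in rows:
        key_value = row[key_name]
        if key_value in index:
            raise ValueError(f"duplicate key value: {key_value!r}")
        index[key_value] = row
    return index


atom_site_rows = [
    {"_atom_site.label": "C1", "_atom_site.type_symbol": "C"},
    {"_atom_site.label": "O1", "_atom_site.type_symbol": "O"},
]

atom_site = index_by_key(atom_site_rows, "_atom_site.label")
oxygen_row = atom_site["O1"]
```

Note that the lookup depends only on which dataname is the category key, not on its _type.purpose, which is why the proposed change does not affect this kind of access.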

I believe that there are no downsides, as the current core dictionary uses both 'Encode' and 'Key' for key datanames, and so there would be no software that relies on values of _type.purpose = 'Key' to make decisions.

James. 
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/ddlm-group


International Union of Crystallography
