Discussion List Archives


Re: [ddlm-group] Clarifying semantics of _type.purpose 'Key'

Will we have a meeting at the ACA or at the ECM this year?  -- Herbert

On Wed, Jan 30, 2019 at 12:33 AM James Hester <jamesrhester@gmail.com> wrote:
Dear All,

As no scenarios have been provided where this change would cause any issues, I will proceed with the change. Herbert's objections are noted.

James.

On Thu, 10 Jan 2019 at 14:58, James Hester <jamesrhester@gmail.com> wrote:
I agree that such changes must be approached with caution. However, given that we are dealing with DDLm attribute tags and not domain dictionary datanames, we are affecting a much smaller group of users: those who write code to machine-process DDLm dictionaries. Given the current overlap between the meanings of 'Key' and 'Encode', these users cannot rely on these values to deduce anything, so adjustments to the meaning of 'Key' have even lower impact. Indeed, the suggestion for this change came from one such user who saw the room for improvement. Note that no actual dataname meanings will change; those datanames that really are opaque identifiers simply become 'Key', and those that can be pulled apart for information become or remain 'Encode'. The actual composition of the category key is not determined by _type.purpose, but by _category_key.name, so that vital relational aspect is untouched.

Perhaps I am missing something. Can you or anybody else conceive a scenario where the particular change being proposed will cause a problem?



On Thu, 10 Jan 2019 at 13:31, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
My objection is that you are changing the meaning of an existing value of an existing tag in an incompatible manner. Either the existing tag or the existing value or both should be deprecated to avoid misunderstandings. If you really need to keep the tag, at least change the value, say to "content_free_key", but keeping the existing value of "key" with an incompatible meaning is, in my opinion, a very bad idea, especially when the meaning being replaced is actually the common existing meaning.

On Wed, Jan 9, 2019 at 6:05 PM James Hester <jamesrhester@gmail.com> wrote:
Appreciate the push for clarity and simplicity. I seem to have been lax in my descriptions. The distinction between 'Key' and 'Encode' is that a 'Key' dataname is significant *only* as a single-dataname key (a "surrogate key" in relational terms), with all of the machine-actionability that that brings with it. 'Encode' describes a dataname whose value carries machine-actionable information internally, separate from that available due to any uniqueness of the value.  Below are candidate definitions for these items; note that 'Encode' has not changed from current ddl.dic.

              Encode
;                  Used to type items with values that are text or codes
                   that are formatted to be machine parsable.
;


              Key
;                  Used to type an item with a value that is unique within
                   the looped list of these items, and does not contain encoded
                   information. 
;

For reference, the original definition of Key was:

              Key
;                  Used to type an item with a value that is unique within
                   the looped list of these items, and may be used as a
                   reference "key" to identify a specific packet of items
                   within the category.
;
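Both definitions of 'Key' share the uniqueness requirement, and that is the part a processor can check mechanically. The following is only a minimal sketch of such a check, assuming a loop has already been parsed into a list of row dictionaries; the function name and the sample dataname values are hypothetical and not taken from any CIF library. The second clause of the candidate definition (no encoded information) is not something this code attempts to verify.

```python
from collections import Counter


def duplicated_values(rows, dataname):
    """Return the values of `dataname` that occur more than once in a loop.

    `rows` is a hypothetical in-memory form of a CIF loop: one dict per
    packet, mapping dataname -> value. An empty result means the dataname
    satisfies the uniqueness part of the 'Key' definition for this loop.
    """
    counts = Counter(row[dataname] for row in rows)
    return sorted(value for value, n in counts.items() if n > 1)


# Illustrative data only.
loop = [
    {"_exptl_crystal.id": "a"},
    {"_exptl_crystal.id": "b"},
    {"_exptl_crystal.id": "a"},
]
print(duplicated_values(loop, "_exptl_crystal.id"))
```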

Herbert, I don't see your reasoning for deprecation and starting over. Please explain why you think deprecation is a better course of action than the above change to the 'Key' definition.

thanks
James.


On Wed, 9 Jan 2019 at 23:54, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
I beg to differ. It is extremely difficult to ensure the negative purpose of _not_ containing any machine-interpretable information, especially when the very word key does normally imply an important machine-interpretable function. I would suggest deprecating this tag and starting over with a clear affirmative statement of the purposes. We should try for clarity and simplicity.

On Tue, Jan 8, 2019 at 10:58 PM James Hester <jamesrhester@gmail.com> wrote:
Dear DDLm-group,

It has been proposed (see https://github.com/COMCIFS/cif_core/issues/108) that the values 'Key' and 'Encode' for _type.purpose would have their meanings slightly clarified as follows: any dataname with a '_type.purpose' of 'Key' is considered to be strictly an opaque key that does not carry any machine-interpretable information. So, for example, _atom_site.label, which encodes both atom type and number, would not have a _type.purpose of 'Key', even if it were unique in its loop. Instead '_type.purpose' would be 'Encode' (as it is at the moment). The vital information that it does act as a key within its loop is still conveyed by the _category_key.name information in the atom_site category definition. On the other hand, _exptl_crystal.id, which has no machine-extractable information, would be better described as 'Key' instead of 'Encode'.

The advantages of this change are:
(1) CIF processors would know that datanames identified as 'Key' can be missing or ignored when only a single row in a category is present.
(2) Currently, both 'Encode' and 'Key' can be reasonable descriptions of a dataname's function. This ambiguity is removed.
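Advantage (1) could look roughly like this in a CIF processor. This is a sketch under stated assumptions, not an existing implementation: the TYPE_PURPOSE table is a hand-written, hypothetical extract standing in for values read from a DDLm dictionary, the values shown for the example datanames follow the proposal rather than any released dictionary, and the function name is made up for illustration.

```python
# Hypothetical extract of _type.purpose values, as they would read under
# the proposal. Not loaded from a real dictionary file.
TYPE_PURPOSE = {
    "_exptl_crystal.id": "Key",           # opaque identifier
    "_atom_site.label": "Encode",         # encodes atom type and number
    "_exptl_crystal.colour": "Describe",  # assumed value, for contrast
}


def ignorable_datanames(rows, type_purpose):
    """Return the datanames a processor could ignore for this loop.

    Under the proposed meaning, a 'Key' value is opaque, so it carries no
    information when the category holds only a single row.
    """
    if len(rows) != 1:
        return set()
    return {name for name in rows[0] if type_purpose.get(name) == "Key"}


single_row = [{"_exptl_crystal.id": "1", "_exptl_crystal.colour": "red"}]
print(ignorable_datanames(single_row, TYPE_PURPOSE))
```

An 'Encode' dataname such as _atom_site.label is never returned, even in a single-row loop, because its value can still be pulled apart for information.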

The original cif_core dictionary provided by the Perth group appears to have used _type.purpose='Key' purely for artificially-constructed, single-dataname keys to be used by the dREL system for loop access using the category[keyvalue] construction.  This role has been taken over now by _category_key.name, so the present change is not relevant to dREL, except insofar as dREL systems can take advantage of the information that a dataname value is irrelevant for single-row loops.
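To illustrate the point that key-based loop access depends on _category_key.name and not on _type.purpose, here is a rough Python analogue of the category[keyvalue] construction. It is not dREL and not taken from any dREL implementation; CATEGORY_KEYS is a hypothetical, hand-written stand-in for the _category_key.name entries of a dictionary, and the rows are made-up data.

```python
# Hypothetical extract: category name -> datanames listed under
# _category_key.name in that category's definition.
CATEGORY_KEYS = {"atom_site": ["_atom_site.label"]}


def index_by_category_key(category, rows, category_keys):
    """Build a lookup table for a loop using only the category key names.

    _type.purpose is never consulted here: whether the key dataname is
    typed 'Key' or 'Encode' makes no difference to the lookup.
    """
    key_names = category_keys[category]
    index = {}
    for row in rows:
        key = tuple(row[name] for name in key_names)
        if key in index:
            raise ValueError(f"duplicate key {key!r} in category {category}")
        index[key] = row
    return index


rows = [
    {"_atom_site.label": "C1", "_atom_site.type_symbol": "C"},
    {"_atom_site.label": "O1", "_atom_site.type_symbol": "O"},
]
atom_site = index_by_category_key("atom_site", rows, CATEGORY_KEYS)
print(atom_site[("O1",)]["_atom_site.type_symbol"])  # prints "O"
```

Keys are stored as tuples so that the same sketch would also cover categories whose key is composed of more than one dataname.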

I believe that there are no downsides, as the current core dictionary uses both 'Encode' and 'Key' for key datanames, and so there would be no software that relies on values of _type.purpose = 'Key' to make decisions.

James. 
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/ddlm-group


