Re: [ddlm-group] Second proposal to allow looping of 'Set' categories

Dear All,

 

Here are what I see as the essential issues we are wrangling over with respect to the Set / Loop problem:

 

1. Whether we require a solution that prevents future data files from being misinterpreted by current software.

 

Inasmuch as proposal #2 is not such a solution, we seem to have settled on "no".

 

 

2. Whether we require a solution that allows software to insulate itself against future Set / Loop changes, and if so, how.

 

That this is a desirable characteristic seems uncontroversial, but the "how" part is not settled.

 

In particular, proposal #2 does not provide a complete solution to this issue.  It provides for declaring which Set categories have been or may have been presented with multiple values, but it does not provide for defining the dimension(s) along which the values vary, and there could be more than one alternative for that.  The existing audit_conform category does offer a complete solution (or would do if changed to a Loop to match its mmCIF and DDL1 Core analogs), but it is not as precise as proposal #2’s _audit.schema would be.
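
As a rough illustration of what "insulating" software could mean in practice, here is a minimal sketch of a purely defensive check that does not rely on either declaration mechanism. It assumes a hypothetical in-memory form of a parsed data block (category name mapped to a list of packets); the function name and the category names cat_a and cat_b are invented for the example and come from no dictionary or library.

```python
from typing import Dict, List, Set

# Hypothetical in-memory form of a parsed data block (an assumption):
# category name -> list of packets, each mapping item name -> value.
DataBlock = Dict[str, List[Dict[str, str]]]


def multivalued_sets(block: DataBlock, assumed_sets: Set[str]) -> List[str]:
    """Return the categories this software treats as Sets but which
    appear in the data block with more than one packet."""
    return sorted(
        name for name in assumed_sets
        if len(block.get(name, [])) > 1
    )


# Illustrative data only: cat_b carries two packets.
block = {
    "cat_a": [{"x": "1"}],
    "cat_b": [{"y": "1"}, {"y": "2"}],
}
offending = multivalued_sets(block, {"cat_a", "cat_b"})
if offending:
    print("refusing to interpret as Sets:", ", ".join(offending))
```

A check like this only detects that values vary; like proposal #2's declaration, it says nothing about the dimension along which they vary.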

 

 

3. Whether we want to provide for Sets of items that can take multiple values, or whether we must convert Set categories to Loops to enable their items to take multiple values.

 

This is to some extent a philosophical difference; it is not particularly relevant to actually writing or reading data files, though it does bear on the next issue.  Having a category key is a defining characteristic of Loop categories, as evidenced by DDLm’s definitions of _definition.class, _category.key_id, and _category_key.name.  Having at most one value per item is a defining characteristic of Set categories.  I disfavor changing that, especially to support a use case expected to be uncommon, and I see no particular need to do so.  I would rather convert Sets to Loops, either as-needed or proactively.

 

James has argued that keeping current Set categories as Sets but giving them category keys where needed would make the implicit assertion that providing multiple values for the items in such categories is exceptional.  I don’t disagree with that, but I think the same assertion is implicit in defining a default value for the keys of such categories, which presumably we would want to do whether we convert Sets to Loops or not.  Moreover, making assumptions about what is or is not normal or expected is exactly how we got into this situation.  If we are going to double down on that, then I think we need to first formulate a clearer strategy on when and how to make such assumptions.
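
To make the role of a default key value concrete, the following is a minimal sketch, assuming packets are represented as plain mappings. The function as_loop, the key name condition_id, the item name length_a, and the default value "default" are all hypothetical. It shows how a single key-less packet, as found in existing files, can be read as a one-row Loop once the key has a default.

```python
def as_loop(packets, key_name, default_key):
    """View packets as Loop rows indexed by key value.

    A packet that omits the key item is assigned the default key value,
    so a key-less single packet becomes a one-row Loop.
    """
    rows = {}
    for packet in packets:
        key = packet.get(key_name, default_key)
        if key in rows:
            # Two packets with the same (possibly defaulted) key are ambiguous.
            raise ValueError(f"duplicate key value {key!r} for {key_name}")
        rows[key] = packet
    return rows


# A legacy-style single packet with no key item (illustrative names only).
legacy = [{"length_a": "10.0"}]
legacy_rows = as_loop(legacy, "condition_id", "default")

# Multiple packets must distinguish themselves by explicit key values.
looped = [
    {"condition_id": "c1", "length_a": "10.0"},
    {"condition_id": "c2", "length_a": "10.2"},
]
looped_rows = as_loop(looped, "condition_id", "default")
```

The default is what marks the single-packet case as the expected one, whichever class the category is declared to have.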

 

 

4. Whether we need to change DDLm itself, or whether the needed changes can be restricted to dictionaries.

 

It’s not clear to me that we can resolve the issue without modifying DDLm, but I would prefer to avoid modifying it if that is possible.

 

 

5. Whether all category attributes need to be global.

 

In particular, I raised the possibility that some category attributes, especially keys and therefore the nature of some of the relationships among categories, could be specified on a per-dictionary basis instead of globally.  This approach promotes dictionaries over individual definitions as the vehicle for addressing the problem, and it’s not so far away from proposal #2 in that prop2 also involves multiple dictionaries in providing full definitions for each category.  The main difference is that under prop2 there is (only) a single aggregate definition for each category, and that definition contains all possible keys, whereas per-dictionary category relationships allow for a subset appropriate to the data domain to be selected and used, simply by choice of dictionary.
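
The difference between the two approaches can be sketched as follows. This is only an illustration under stated assumptions: the dictionary names (dict_single, dict_multi, dict_other), the category cat_x, and the key items condition_id and sample_id are hypothetical, and the table stands in for whatever the dictionaries themselves would declare.

```python
# Hypothetical key declarations: dictionary name -> category -> key items.
KEYS_BY_DICTIONARY = {
    "dict_single": {"cat_x": []},                # cat_x behaves as a Set
    "dict_multi": {"cat_x": ["condition_id"]},   # cat_x looped over one dimension
    "dict_other": {"cat_x": ["sample_id"]},      # cat_x looped over another
}


def keys_for(dictionary, category):
    """Per-dictionary view: only the keys the chosen dictionary declares."""
    return list(KEYS_BY_DICTIONARY[dictionary].get(category, []))


def aggregate_keys(category):
    """Aggregate view: every key declared for the category by any dictionary."""
    seen = []
    for categories in KEYS_BY_DICTIONARY.values():
        for key in categories.get(category, []):
            if key not in seen:
                seen.append(key)
    return seen


per_dictionary = keys_for("dict_multi", "cat_x")
aggregate = aggregate_keys("cat_x")
```

Under the per-dictionary view, software that chooses dict_single never has to deal with keys it does not need, whereas the aggregate view exposes every possible key to every reader.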

 

 

Regards,

 

John

 

--

John C. Bollinger, Ph.D.

Computing and X-Ray Scientist

Department of Structural Biology

St. Jude Children's Research Hospital

John.Bollinger@StJude.org

(901) 595-3166 [office]

www.stjude.org

 

 

 


