Discussion List Archives


Re: [ddlm-group] _enumerated_set.table_id

Hi James,


I agree that _enumeration_set.table_id seems a misfit.  Moreover, I observe that it is not documented in the 2008 DDLm paper.  That paper is aging a bit, but I take the attribute’s omission as an additional signal that it does not serve a role of any major import.


I also agree that the particular usage you found is troublesome.  It might well be sensible to describe the allowed keys of a particular table via an enumerated set, but in that case those keys would be the *values* (states) expressed by the enumeration, so the table_id attribute is superfluous.  (I guess that’s pretty much what you said, too; please bear with me as I get my DDLm brain engaged.)


More generally, I agree that there should be a mechanism for DDLm dictionaries to constrain, on a per-item basis, the form that tables may take.  The greatest expressive power in that area would involve being able to specify which keys are allowed (including the possibility of free-form keys), which of those are required, and what type of value must be associated with each key. To do that in full generality would require allowing the types of values inside a table to be defined in terms of other _type definitions in the dictionary, or something equivalently powerful.  Inasmuch as keys must be strings, I think the existing enumeration facility is probably strong enough to express constraints on keys.
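
To make the three kinds of constraint concrete (allowed keys, required keys, and per-key value types), here is a minimal Python sketch. It is not DDLm and not a proposal for dictionary syntax; `TableSpec`, `is_code`, and the example keys are hypothetical names chosen only for illustration, and `is_code` is a rough stand-in for a 'Code'-like type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Checker = Callable[[object], bool]


def is_code(value: object) -> bool:
    """Hypothetical stand-in for a 'Code' type: non-empty string, no whitespace."""
    return (isinstance(value, str) and value != ""
            and not any(c.isspace() for c in value))


@dataclass
class TableSpec:
    """Hypothetical description of the form a table-valued item may take."""
    value_types: Dict[str, Checker]          # named keys and their value checkers
    required: frozenset = frozenset()        # subset of keys that must be present
    free_form: Optional[Checker] = None      # checker for unlisted keys; None forbids them

    def errors(self, table: dict) -> List[str]:
        problems = []
        # Required keys that are absent from the table.
        for key in sorted(self.required - set(table)):
            problems.append(f"missing required key {key!r}")
        # Every key present must be allowed and carry a value of the right type.
        for key, value in table.items():
            checker = self.value_types.get(key, self.free_form)
            if checker is None:
                problems.append(f"key {key!r} is not allowed")
            elif not checker(value):
                problems.append(f"value for key {key!r} has the wrong type")
        return problems


spec = TableSpec(
    value_types={"file": is_code, "save": is_code, "mode": is_code},
    required=frozenset({"file"}),
)
print(spec.errors({"file": "core.dic", "save": "atom_site"}))
print(spec.errors({"save": "x y", "bogus": "1"}))
```

Setting `free_form` to a checker would admit arbitrary additional keys whose values pass that checker, which corresponds to the free-form case mentioned above.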


Doug’s suggestion doesn’t provide the full expressiveness described above, but it may be reasonable and sufficient for the requirements of any dictionary we currently contemplate supporting.  It is limited at least in that it can express only mappings that _must_ be present or mappings that _may_ be present, but not both.  It appears also to be somewhat limited with regard to the constraints it allows to be placed on values in the defined table type.  Those may be limitations we can live with.


As a practical matter, though, does DDLm have a way to define that the value for item _type.contents is either a table or a member of an enumeration_set?  In other words, can we write a definition of the proposed extended _type.contents item that DDLm can validate, without changing or adding other definitions?  If not, then perhaps that’s a good reason to consider a more comprehensive solution.






John C. Bollinger, Ph.D.

Computing and X-Ray Scientist

Department of Structural Biology

St. Jude Children's Research Hospital


(901) 595-3166 [office]




From: ddlm-group [mailto:ddlm-group-bounces@iucr.org] On Behalf Of James Hester
Sent: Sunday, April 19, 2015 9:45 PM
To: ddlm-group
Subject: [ddlm-group] _enumerated_set.table_id


Dear DDLm group,

(originally sent Feb 5th)

I have been going through ddl.dic with an eye to writing automated dictionary checking routines and came across _enumerated_set.table_id.  This attribute is used precisely once in all the draft DDLm dictionaries (which include all of the previous DDL1 dictionaries), namely in ddl.dic itself, in the definition for the DDLm _import.get attribute.  The table_id attribute is intended to specify in a machine-readable way the possible values of CIF2 Table keys. In this particular case the CIF2 tables are themselves within a List:

    _type.purpose                Import
    _type.source                 Assigned
    _type.container              List
    _type.contents               Table(Code)
    _type.dimension              [{}]
         1    'filename/URI of source dictionary'       file
         2    'save framecode of source definition'     save
         3    'mode for including save frames'          mode
         4    'option for duplicate entries'            dupl
         5    'option for missing duplicate entries'    miss
    With i as import

    _import.get = [{"file":i.file_id, "save":i.frame_id, "mode":i.mode,
                    "dupl":i.if_dupl, "miss":i.if_miss}]

Because table_id is in the _enumerated_set category, the category key _enumerated_set.state must be present whenever these table keys are listed. Here, however, _enumerated_set.state does not list actual permitted values; it contains meaningless dummy values. table_id then lists the table keys themselves, so no restraints are placed on the values associated with those keys. This looks like an abuse of the enumerated_set category when the natural solution, as proposed by Doug du Boulay, is simply to enhance _type.contents, i.e.


_type.contents = {"file":URL "save":Code "mode":Code "dupl":Code "miss":Code}

Note that _type.contents is implicitly interpreted (in the demonstration DDLm dictionaries) to describe the elements of a List rather than the List as a whole, so the above use is in line with this. I therefore suggest that we drop _enumerated_set.table_id from DDLm completely, as it has no use case.
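
As a rough illustration of how a checking routine might apply such a _type.contents table to a List of Tables, here is a Python sketch. Everything in it is an assumption made for illustration: `TYPE_CHECKS` holds crude stand-ins for the 'URL' and 'Code' content types, `check_list_of_tables` is a hypothetical helper, and the `all_required` switch is just one possible reading of whether the listed keys must or may be present.

```python
from typing import Callable, Dict, List

# Crude, hypothetical stand-ins for the content types named in the proposal.
TYPE_CHECKS: Dict[str, Callable[[object], bool]] = {
    "Code": lambda v: isinstance(v, str) and v != ""
    and not any(c.isspace() for c in v),
    "URL": lambda v: isinstance(v, str) and v != "",
}

# The proposed _type.contents value, transcribed as a Python dict.
IMPORT_GET_CONTENTS = {"file": "URL", "save": "Code", "mode": "Code",
                       "dupl": "Code", "miss": "Code"}


def check_list_of_tables(value: List[dict], contents: Dict[str, str],
                         all_required: bool = False) -> List[str]:
    """Return the problems found when `contents` is applied to each List element.

    `all_required` is a single switch for the whole table: either every
    listed key must be present, or every listed key is optional.
    """
    problems = []
    for index, table in enumerate(value):
        for key, item in table.items():
            type_name = contents.get(key)
            if type_name is None:
                problems.append(f"element {index}: unknown key {key!r}")
            elif not TYPE_CHECKS[type_name](item):
                problems.append(
                    f"element {index}: {key!r} is not a valid {type_name}")
        if all_required:
            for key in contents:
                if key not in table:
                    problems.append(f"element {index}: missing key {key!r}")
    return problems


# Made-up example value for _import.get; the strings are placeholders.
example = [{"file": "core.dic", "save": "atom_site", "mode": "Full"}]
print(check_list_of_tables(example, IMPORT_GET_CONTENTS))
print(check_list_of_tables(example, IMPORT_GET_CONTENTS, all_required=True))
```

The key-to-type mapping carries the information that table_id was meant to supply, and it additionally constrains the values, which the dummy states could not.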

Are we in agreement on this?



T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
