
Re: [ddlm-group] Managing deprecation in DDLm

Dear DDLm group,

There having been no objections, I have incorporated the new '_definition.replaced_by' dataname into the DDLm dictionary (see https://github.com/COMCIFS/cif_core/blob/cif2-conversion/ddl.dic) and immediately put it to use to deprecate _symmetry.cell_setting in cif_core.

James.

On 15 May 2017 at 17:01, James Hester <jamesrhester@gmail.com> wrote:
Dear DDLm-group,

Please see below a proposed definition for a new DDLm attribute to flag deprecation.

save_definition.replaced_by
_definition.id    '_definition.replaced_by'
_name.category_id  definition
_name.object_id    replaced_by
_description.text
;
     A dataname that should be used instead of the defined dataname. The defined dataname
     is deprecated.
;
_type.container  Single
_type.purpose    Encode
_type.contents   Name
_type.source     Assigned

save_

Note that this definition assumes that a deprecated dataname will have a clear replacement.  The only current use case is for 'symmetry_cell_setting', which allowed 8 different settings and appears to mix the concepts of 'crystal system' (7 members) and 'Bravais system' (7 members), leading to potential ambiguity as to which space groups belong to 'hexagonal'. No dataname is currently a direct substitute for 'symmetry_cell_setting', but 'space_group.crystal_system' appears closest in meaning.
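
As an illustration only, the following Python sketch shows how validation software might consume '_definition.replaced_by' when reading a legacy datafile. The in-memory dictionary layout and the function name `deprecation_notices` are assumptions made for this example, as is the choice of '_space_group.crystal_system' as the replacement; none of these are part of the proposed definition.

```python
# Hypothetical in-memory form of a DDLm dictionary: dataname -> attributes.
DICTIONARY = {
    "_symmetry.cell_setting": {
        # Assumed replacement: the "closest in meaning" candidate named above.
        "_definition.replaced_by": "_space_group.crystal_system",
    },
    "_space_group.crystal_system": {},
}


def deprecation_notices(datanames, dictionary):
    """Return (deprecated_name, replacement) pairs for flagged datanames."""
    notices = []
    for name in datanames:
        # CIF datanames are case-insensitive, so look them up in lower case.
        definition = dictionary.get(name.lower(), {})
        replacement = definition.get("_definition.replaced_by")
        if replacement is not None:
            notices.append((name, replacement))
    return notices


legacy_block = ["_cell.length_a", "_symmetry.cell_setting"]
for old, new in deprecation_notices(legacy_block, DICTIONARY):
    print(f"{old} is deprecated; consider {new}")
```

Because the deprecated definition stays in the dictionary, its original type information remains available for validating old datafiles.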

An alternative approach would add another enumerated state to type.purpose (e.g. 'Retired'), but this would mean that (i) the original purpose would be obscured, which may interfere with validation routines applied to old datafiles, and (ii) there would be no opportunity to suggest a replacement.

Please comment,

James.




On 28 April 2017 at 22:15, john.westbrook@rcsb.org <john.westbrook@rcsb.org> wrote:


On 4/27/17 9:11 PM, James Hester wrote:
Dear DDLm-group,

In the case of direct dataname equivalents, '_alias.deprecation_date' is suitable as a way of flagging deprecation.  However, if
there is no one-for-one substitution, there is no easy way to deal with deprecation. For example, on the cif_core discussion list we
have been talking about how to deprecate _symmetry_cell_setting, which has no direct equivalent in the new core dictionary.  Having
no equivalent clearly requires that we keep the dataname in the dictionary in order to interpret legacy files. For such definitions,
it would be good to (i) have an attribute that directly flags deprecation and (ii) be able to specify an algorithm, where one exists,
that converts values to an alternative dataname (e.g. unit conversion). While such deprecation happens very rarely, it
would seem prudent to allow for occasional mistakes in dataname definition.

Note that in DDL2 the _item_related.function_code dataname has values that indicate deprecation (Vol G table 2.6.5.1) and conversion
by multiplication: "replaces", "replacedby", "conversion_constant", "conversion_arbitrary".  This is not a particularly good match
for us, as simple replacement is already accomplished by aliases, and simple constant multiplication is not always sufficient.  We
also have dREL at our disposal for describing arbitrary transformations.

Aliases as used in DDL2 provide correspondences for semantically identical items.  The role of replaces/replacedby is to identify
cases of deprecation and/or preferred usage, which seems to be what you are seeking to represent.

Regards,

John

I propose the following:
(i) a new DDLm attribute '_definition.replaced_by' which would have the value of a dataname that should be used instead (or default
value 'None').
(ii) a new DDLm '_method.purpose' tag 'FromDeprecated' which could be used in the definition of the dataname that replaces the
deprecated definition. The method associated with this purpose would calculate the value of the new dataname from the old dataname
(and any other datanames that are necessary).
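
To make point (ii) concrete, here is a rough Python sketch of what a 'FromDeprecated' method would accomplish, using the unit-conversion case mentioned above. In DDLm the method itself would presumably be written in dREL; the registry layout, the function name `fill_from_deprecated` and both '_example.*' datanames are hypothetical and exist in no real dictionary.

```python
# Hypothetical registry: new dataname -> (deprecated dataname, conversion).
FROM_DEPRECATED = {
    # Illustrative names only; 1 angstrom = 100 picometres.
    "_example.length_angstrom": ("_example.length_pm", lambda pm: pm / 100.0),
}


def fill_from_deprecated(block, registry):
    """Return a copy of block with new datanames computed from deprecated ones.

    Values already present under the new dataname are left untouched.
    """
    result = dict(block)
    for new_name, (old_name, convert) in registry.items():
        if new_name not in result and old_name in result:
            result[new_name] = convert(result[old_name])
    return result


legacy = {"_example.length_pm": 250.0}
print(fill_from_deprecated(legacy, FROM_DEPRECATED))
```

The conversion is attached to the new dataname, so the deprecated definition needs only the '_definition.replaced_by' pointer.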

Does this scheme seem reasonable to you?  If so, I will work up a proper definition.

all the best,
James.

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148


_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/ddlm-group


--
John Westbrook, Ph.D.
RCSB, Protein Data Bank
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087
e-mail: john.westbrook@rcsb.org
Ph: (848) 445-4290 Fax: (732) 445-4320
