Discussion List Archives


Re: A managed phase-out of DDL1 dictionaries

Thanks David for the detailed explanation.

Part of the problem arises because DDL1 allows "both" as a description of the looping behaviour of a dataname.  The issues around the need to better describe the meaning of "both" resulted in the '_audit.schema' proposal last year. That proposal allows items that are not looped in the DDLm core dictionary to become looped in a controlled fashion, and I'd refer any interested parties to the finally approved proposal at http://comcifs.github.io/looping_proposal. 

Given that DDLm forces us to clarify our thinking ("Set" or "Loop", but not "both"), the DDL1 dictionaries can be brought into line by also disallowing "both". Uses of '_list' = 'both' in the CIF core dictionary (exptl_crystal_*, publ_author_*, audit_conform_*, space_group_*) can be inferred to have the meaning "looping is allowed, but if there is only one item you don't have to loop it". Where looping doesn't have carry-over effects on other categories, these can become 'Loop' categories (publ_author, audit_conform). Where looping does have an effect (multiple space groups mean that multiple cells and multiple sets of atomic positions are also in theory required) we restrict these to 'Set' categories in DDLm and define a new '_audit.schema' to cover the looped situation.
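
The rule described above can be written down as a small sketch. This is only an illustration: the function name `ddlm_class_for` and the `has_carry_over` flag are hypothetical and do not come from any COMCIFS dictionary or tool. The only facts assumed are that DDL1 '_list' takes the values 'yes', 'no' or 'both', and that DDLm categories are either 'Set' or 'Loop'.

```python
# Hypothetical helper: the name and the carry-over flag are illustrative only.

def ddlm_class_for(list_value: str, has_carry_over: bool = False) -> str:
    """Map a DDL1 '_list' value onto a DDLm category class ('Set' or 'Loop')."""
    if list_value == "yes":
        return "Loop"
    if list_value == "no":
        return "Set"
    if list_value == "both":
        # 'both' is read as "looping allowed, optional for a single item".
        # If looping would force other categories to be looped as well,
        # keep the category as 'Set' and leave the looped case to an
        # '_audit.schema' declaration.
        return "Set" if has_carry_over else "Loop"
    raise ValueError(f"unexpected _list value: {list_value!r}")


# Categories named in the message: publ_author and audit_conform have no
# carry-over effects, whereas space_group does.
examples = {
    "publ_author": ddlm_class_for("both", has_carry_over=False),
    "audit_conform": ddlm_class_for("both", has_carry_over=False),
    "space_group": ddlm_class_for("both", has_carry_over=True),
}
```

Whether a given category has carry-over effects is a judgement made per category when the dictionary is drafted, so in this sketch it is passed in rather than computed.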

space_group_name_H-M_alt presents an additional issue, as it can be looped even for a single space group. It should therefore be a separate sub-category of space_group, and is probably not suitable for the core dictionary in any case, as it appears from David's comment to be of purely theoretical interest. I will adjust the forthcoming revamped DDLm symmetry dictionary accordingly.

I hope it is clear from this that we are not disenfranchising the theoretical community, but simply making sure that we don't stomp on each other's toes.

James.



On 6 May 2017 at 05:31, Brown, David <idbrown@mcmaster.ca> wrote:
Perhaps an explanation why "space_group_name_H-M_alt" was defined as loopable is in order. When we put together the symmetry dictionary we were accommodating two schools with very different interests in symmetry. For most crystallographers symmetry is a convenient shortcut for defining all the atomic positions in the unit cell when symmetry is present, but there is another school that is interested in the theory of symmetry as a rigorous mathematical subject, and those of us who tend to be a little sloppy in our use of symmetry (see _symmetry_cell_setting as an egregious example) do well to pay attention to what the theorists tell us. They needed a unique reference setting to be defined and this is given in "_space_group_name_H-M_ref". The choice of reference setting is necessarily arbitrary and is defined in its enumeration list.

Structure solvers choose the setting that is most natural to the structure they are describing. They are the people most likely to use CIF, but we wanted to make sure that the theorists could, for example, list all the settings for a given space group in the same CIF, as explained in the last sentence of section 3.8.2 in the original version of ITG. We therefore included "space_group_name_H-M_alt" as an opportunity for one or more non-reference settings to be given. This would require "space_group_name_H-M_alt" to be looped. It is hardly surprising that one does not find it looped in the Acta Cryst. database of structures, but I would be sorry to see its potential to describe symmetry in a more general and theoretical way excluded from CIF2. Its inclusion as a loopable item was deliberate and should be respected.

David

I. David Brown
Professor Emeritus
Department of Physics and Astronomy
McMaster University
Hamilton, Ontario, Canada
From: comcifs [comcifs-bounces@iucr.org] on behalf of James Hester [jamesrhester@gmail.com]
Sent: May 3, 2017 01:02
To: Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)
Subject: Re: A managed phase-out of DDL1 dictionaries

Dear Matthew,

We definitely need to take into account the significant investment of various groups in software based on the DDL1 framework. COMCIFS have guaranteed that datanames defined in DDL1 dictionaries retain the same meaning when appearing in DDLm dictionaries, so software that works with DDL1 datanames can function unchanged. If changes to the legacy dictionaries actually required you to spend any time rewriting software, then they are highly unlikely to be acceptable.  All changes will be drafted and published first in the Github repository (https://github.com/COMCIFS/DDL1-legacy-dictionaries), and I'd encourage you to monitor that and participate in discussions, particularly if any proposed changes would have a negative impact on your activities.

A typical example of the types of changes being entertained would be "space_group_name_H-M_alt" being defined as loopable in the DDL1 core dictionary but strictly unlooped in the DDLm dictionary (discussed at https://github.com/COMCIFS/cif_core/issues/20).  Examination of public CIF corpora (e.g. IUCr, COD) suggests that nobody has ever looped this dataname, so aligning it with the more rigorous DDLm definition would seem feasible.
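
A check of this kind could be sketched roughly as follows. This is not the procedure that was actually used on the IUCr or COD files; it is a deliberately naive scanner that splits on whitespace and ignores CIF quoting rules and multi-line text fields, so a real survey would use a proper CIF parser. The function name `looped_tags` and the sample data block (including its space-group symbols) are made up for illustration.

```python
def looped_tags(cif_text: str) -> set[str]:
    """Return the lower-cased datanames found in any loop_ header."""
    tags = set()
    in_header = False
    for raw in cif_text.splitlines():
        if raw.lstrip().startswith("#"):
            continue  # skip whole-line comments
        for token in raw.split():
            if token.lower() == "loop_":
                in_header = True           # a loop header starts here
            elif in_header and token.startswith("_"):
                tags.add(token.lower())    # dataname inside the header
            else:
                in_header = False          # first value ends the header
    return tags


# Made-up sample: one unlooped item and one looped item.
SAMPLE = """\
data_example
_space_group_name_H-M_ref 'P 21/c'
loop_
_space_group_name_H-M_alt
'P 21/n'
'P 21/a'
"""

found = looped_tags(SAMPLE)
is_alt_looped = "_space_group_name_h-m_alt" in found
```

Datanames are compared in lower case because CIF datanames are case-insensitive.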

best wishes,
James.


On 3 May 2017 at 01:24, Matthew Towler <towler@ccdc.cam.ac.uk> wrote:

Dear James

Thanks for the more detailed explanation.  I agree with seeing what the state of the community is in two years rather than deciding now.

I remain slightly worried about the DDLm matching changes to DDL1, as code that reads historic CIF already has to cope with quite a few different ways of specifying space groups (for example) and adding further improved methods will increase complexity.  We may reach a point where parsing a CIF is simple, but writing code that will reliably interpret what is intended by the values in a majority of extant CIFs requires quite a steep learning curve involving many previous versions of the dictionaries.

I am interested in understanding the benefits of back-porting DDLm changes to DDL1, and the trade-off of these against the cost of change.  What I am wondering is whether it would be better to have DDL1 remain as-is, and keep the better representations only in DDLm; the advantage being that if DDLm support is being added to existing code, then that would also be a good point to add support for improvements in semantics.  Back-porting to DDL1 risks imposing an otherwise unrelated change ahead of the need to add DDLm support, which might actually detract from the effort required to add DDLm support.

Best wishes,

Matthew


_______________________________________________
comcifs mailing list
comcifs@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/comcifs




--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
