Re: [ddlm-group] Dictionary conformance (was Re: Second proposal to allow looping of 'Set' categories)
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Dictionary conformance (was Re: Second proposal to allow looping of 'Set' categories)
- From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
- Date: Wed, 15 Jun 2016 15:03:59 +0000
- In-Reply-To: <1573396202.6231045.1465999233495.JavaMail.yahoo@mail.yahoo.com>
- References: <CAM+dB2e-p0gORJGKUXMf8m+h4jhqEaGKbVgQuTPKG1KQobysew@mail.gmail.com><1699752094.6098741.1465993722575.JavaMail.yahoo@mail.yahoo.com><BY2PR0401MB09361117BEC19164C92892B2E0550@BY2PR0401MB0936.namprd04.prod.outlook.com><1573396202.6231045.1465999233495.JavaMail.yahoo@mail.yahoo.com>
Yes. It is a bug in the DDLm version of the core dictionary that its definition of the audit_conform category is inconsistent with mmCIF and the DDL1 core. That bug should be fixed.

Regards,

John

From: ddlm-group [mailto:ddlm-group-bounces@iucr.org] On Behalf Of SIMON WESTRIP

So you are in favour of making audit_conform a Loop in DDLm?

Cheers

Simon

From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>

Dear All,

Remember that audit_conform is not a DDLm category but rather a core CIF category. I don't see why the availability of _import.get in DDLm has any bearing on whether definitions in the DDLm core dictionary should be consistent with definitions of the same items in the mmCIF and DDL1 core dictionaries. In the case of audit_conform, the DDLm core disagrees with the others, so there can be data files that are valid against the DDL1 core or against mmCIF that are not valid against the DDLm core. Few CIFs actually use audit_conform, so there probably aren't many for which such a validity mismatch occurs, but the disagreement is nevertheless undesirable and inconsistent with our intent to keep definitions stable.

In any case, although with DDLm you can indeed use _import.get to form a dictionary that suits you, ad hoc dictionaries formed in this manner do not benefit from well-known names or version codes. The only thing one can do with an audit_conform entry that references an unknown dictionary is load that dictionary and validate against it. That serves only a functional purpose, whereas with well-known dictionary names, audit_conform also serves an informational purpose.

John

From: ddlm-group [mailto:ddlm-group-bounces@iucr.org] On Behalf Of SIMON WESTRIP

Thanks James

Nothing really to discuss here, except perhaps whether audit_conform is Set or Loop, but as you point out, using the import mechanism one can create a dictionary that in turn imports any number of dictionaries. So whereas many DDL1 CIFs actually conform to cif_core and the iucr and/or ccdc 'local' dictionaries, which are rarely (never) declared in the CIF instance, moving forward we might look at creating a dictionary that pulls these in as well as cif_core. On the other hand, perhaps it would be cleaner to declare such dictionaries separately in an audit_conform loop so that readers can fetch them if they're really interested, rather than fetching an unfamiliar dictionary only to find out it's basically cif_core but with a bunch of extra items they are not interested in anyway. I'm inclined toward the latter approach.

Anyway, thanks again for your clarifications - all very useful

Cheers

Simon

PS I've added a couple of trivial comments below...

From: James Hester <jamesrhester@gmail.com>

Hi Simon,

I'll have a go at answering your questions. This is an interesting line of questions, but a bit to the side of the other discussion, so I've put them into a different thread.

On 15 June 2016 at 09:16, SIMON WESTRIP <simonwestrip@btinternet.com> wrote:
The scope would be everything that COMCIFS managed. To anticipate your question below, we could/should have a very short 'IUCr core' dictionary with datanames
that apply regardless of the particular domain, and which all DDLm dictionaries import. Probably cif_core would import it, and all others would import cif_core.
SPW: that sounds logical
Excuse my pedantry here, but I want to make sure we understand each other. DDLm is a set of attributes which you can use to define the meaning of datanames. A CIF (i.e. datafile) does not conform to a DDLm dictionary, it conforms to a dictionary. That dictionary is written in a DDL. So we can write a dataname, and it has the same meaning (perhaps after application of aliases) regardless of the particular DDL in which that meaning is described. If I have misunderstood you, and you simply meant, "is there a dictionary that all datafiles should draw on?", then the current answer is "no". For example, mmCIF datanames completely replace datanames in cif_core. The intention of proposal #2 is that _audit.schema would be a universal dataname, ideally defined in the separate 'IUCr core' dictionary described above.
SPW: "is there a dictionary that all datafiles should draw on?" - yes, that's what I meant
You can't. A fundamental assumption, that COMCIFS attempts to fulfill, is that the meaning of datanames does not change. In particular, the meaning
of the dictionary conformance datanames does not change, so the programmer can hard-code a dataname to output dictionary conformance and not worry that somehow in the future this dataname will have a different meaning. Hopefully this underlines that the *primary*
audience of our dictionaries is the human software programmer, *not* the software itself. The programmer (*not* the program) is the one that has to read the text definition to work out e.g. what dataname contains the atomic positions. The "dictionary driven
software" part can only relate to those things that software can understand and use and which *do not* relate to a change in meaning (because we're not supposed to change the meaning) - aliases and dREL come to mind. All of the other machine-readable stuff
can only be used for validation, which is why I assert that a lot of casual CIF-reading software doesn't bother with dictionaries at execution time - the information available at software creation time is guaranteed to be sufficient to 'get the job done' now
and in future.
No, it will not. What you need is for CIF authors to include the already-defined datanames specifying dictionary conformance. The _audit.schema proposal will also not help. I suggest that each time (or once a day, to stay sane) that you find a file that does not contain the dataname _audit_conform_dict_name (old cif_core) or _audit_conform.dict_name (mmCIF, and in the draft DDLm core dictionary), you send a gentle message to the software author (if you can identify them), asking them to include one of these datanames in the output template that they distribute with their software (or adjust their code). You could even provide a line in the email for them to cut and paste. Perhaps you can get checkCIF to issue a level C alert if they are missing (if it doesn't already) with a suggestion to include a particular line in the CIF file to make the alert go away. Perhaps we can wait until DDLm cif_core is accepted before pushing this.
SPW: I think I'd better make sure my own CIF writing software starts doing a more robust job in this respect before asking others to! The most I do is look for the audit_conform items - I don't add them if not there.

On a side note, DDLm includes the dictionary import formalism, which means that there is only one 'master' dictionary that imports all the rest (so e.g. pdCIF would internally import 'cif_core'). This improves on the old dictionary merging formalism I discussed before, and is why audit_conform is a 'Set' category in DDLm but a multi-packet loop category in DDL2. The 'Set' designation for _audit_conform may be worth revisiting.
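
For illustration, the check suggested earlier in this message (does a file contain _audit_conform_dict_name or _audit_conform.dict_name, and if not, what line could be pasted in) might be sketched roughly as follows. This is only a minimal sketch, not how checkCIF or any existing tool works: the function names are hypothetical, the default dictionary name cif_core.dic is an assumption, and a plain text scan like this does not skip comments or multi-line text fields the way a real CIF parser would.

```python
import re

# The two dictionary-conformance datanames named in the message.
# CIF datanames are case-insensitive, so matching ignores case.
CONFORMANCE_NAMES = ("_audit_conform_dict_name", "_audit_conform.dict_name")


def find_conformance_names(cif_text):
    """Return the conformance datanames that occur as tokens in cif_text."""
    found = []
    for name in CONFORMANCE_NAMES:
        # Match the name only as a whole whitespace-delimited token.
        pattern = r"(?<!\S)" + re.escape(name) + r"(?!\S)"
        if re.search(pattern, cif_text, flags=re.IGNORECASE):
            found.append(name)
    return found


def suggest_line(cif_text, dict_name="cif_core.dic"):
    """Return a line to paste into the CIF, or None if one is already there.

    dict_name is an assumed placeholder; the writer of the file has to
    supply the dictionary it actually conforms to.
    """
    if find_conformance_names(cif_text):
        return None
    return "_audit_conform_dict_name  " + dict_name
```

A file that already carries either dataname yields None, so nothing is added; otherwise the returned string is the kind of cut-and-paste line mentioned above.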
_______________________________________________ ddlm-group mailing list ddlm-group@iucr.org http://mailman.iucr.org/cgi-bin/mailman/listinfo/ddlm-group