Re: [ddlm-group] Dictionary conformance (was Re: Second proposal to allow looping of 'Set' categories)

  • To: James Hester <jamesrhester@gmail.com>, Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
  • Subject: Re: [ddlm-group] Dictionary conformance (was Re: Second proposal to allow looping of 'Set' categories)
  • From: SIMON WESTRIP <simonwestrip@btinternet.com>
  • Date: Wed, 15 Jun 2016 12:28:42 +0000 (UTC)
  • In-Reply-To: <CAM+dB2e-p0gORJGKUXMf8m+h4jhqEaGKbVgQuTPKG1KQobysew@mail.gmail.com>
  • References: <CAM+dB2e-p0gORJGKUXMf8m+h4jhqEaGKbVgQuTPKG1KQobysew@mail.gmail.com>
Thanks James

Nothing really to discuss here, except perhaps whether audit_conform is Set or Loop,
but as you point out, using the import mechanism one can create a dictionary that in turn
imports any number of dictionaries. So whereas many ddl1 CIFs actually conform to cif_core
and the iucr and/or ccdc 'local' dictionaries, which are rarely (never) declared in the CIF instance,
moving forward we might look at creating a dictionary that pulls these in as well as cif_core.
On the other hand, perhaps it would be cleaner to declare such dictionaries separately in an audit_conform loop
so that readers can fetch them if they're really interested, rather than fetching an unfamiliar dictionary only to find out it's basically cif_core but with a bunch of extra items they are not interested in anyway.
I'm inclined toward the latter approach.
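
For illustration only, the separate-declaration approach could look like the loop produced by the sketch below. It assumes the dotted dataname form (_audit_conform.dict_name, _audit_conform.dict_version); the helper name audit_conform_loop, the dictionary name local_example.dic and the version strings are hypothetical placeholders, not real releases.

```python
def audit_conform_loop(dictionaries):
    """Format (name, version) pairs as one audit_conform loop in CIF syntax.

    Assumption: values contain no whitespace, so no CIF quoting is applied.
    """
    lines = ["loop_", "_audit_conform.dict_name", "_audit_conform.dict_version"]
    for name, version in dictionaries:
        lines.append(f"{name} {version}")
    return "\n".join(lines)


# Placeholder entries: one row per dictionary the file claims to conform to.
print(audit_conform_loop([
    ("cif_core.dic", "0.0.0"),        # version string is a placeholder
    ("local_example.dic", "0.0.0"),   # hypothetical 'local' dictionary
]))
```

Each dictionary gets its own row, so a reader can fetch only the ones they care about.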


Anyway, thanks again for your clarifications - all very useful

Cheers

Simon
PS I've added a couple of trivial comments below...


From: James Hester <jamesrhester@gmail.com>
To: SIMON WESTRIP <simonwestrip@btinternet.com>; Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Wednesday, June 15, 2016 1:45 AM
Subject: Dictionary conformance (was Re: Second proposal to allow looping of 'Set' categories)

Hi Simon,

I'll have a go at answering your questions.  This is an interesting line of questions, but a bit to the side of the other discussion, so I've put them into a different thread.

On 15 June 2016 at 09:16, SIMON WESTRIP <simonwestrip@btinternet.com> wrote:
Dear John et al.

Struggling to keep up again - so going back to basics:

1) I assume the scope of the proposed 'schema' definition is restricted to the dictionary in which its definition appears
(i.e. directs to alternative definitions that are already in that particular dictionary)?

The scope would be everything that COMCIFS managed. To anticipate your question below, we could/should have a very short 'IUCr core' dictionary with datanames that apply regardless of the particular domain, and which all DDLm dictionaries import.  Probably cif_core would import it, and all others would import cif_core.

SPW: that sounds logical

2) Is the intention that all CIFs (whatever domain) conform to a 'global' ddlm dictionary?

Excuse my pedantry here, but I want to make sure we understand each other. DDLm is a set of attributes which you can use to define the meaning of datanames.  A CIF (i.e. datafile) does not conform to a DDLm dictionary; it conforms to a dictionary, and that dictionary is written in a DDL.  So we can write a dataname, and it has the same meaning (perhaps after application of aliases) regardless of the particular DDL in which that meaning is described.  If I have misunderstood you, and you simply meant, "is there a dictionary that all datafiles should draw on?", then the current answer is "no".  For example, mmCIF datanames completely replace datanames in cif_core.  The intention of proposal #2 is that _audit.schema would be a universal dataname, ideally defined in the separate 'IUCr core' dictionary described above.

SPW: "is there a dictionary that all datafiles should draw on?" - yes, that's what I meant

3) How does one declare dictionary conformance in a CIF instance without using a dictionary-defined dataname?

You can't.  A fundamental assumption, that COMCIFS attempts to fulfill, is that the meaning of datanames does not change.  In particular, the meaning of the dictionary conformance datanames does not change, so the programmer can hard-code a dataname to output dictionary conformance and not worry that somehow in the future this dataname will have a different meaning.  Hopefully this underlines that the *primary* audience of our dictionaries is the human software programmer, *not* the software itself. The programmer (*not* the program)  is the one that has to read the text definition to work out e.g. what dataname contains the atomic positions. The "dictionary driven software" part can only relate to those things that software can understand and use and which *do not* relate to a change in meaning (because we're not supposed to change the meaning) - aliases and dREL come to mind.  All of the other machine-readable stuff can only be used for validation, which is why I assert that a lot of casual CIF-reading software doesn't bother with dictionaries at execution time - the information available at software creation time is guaranteed to be sufficient to 'get the job done' now and in future.

Basically I am certain that I am not alone in having to rely on heuristics based on prior knowledge just to identify that e.g. my molecular graphics program is dealing with a ddl2 CIF (mmCIF) rather than a ddl1 CIF, and whether it's a pdCIF, msCIF, rhoCIF... is ddlm/CIF2 going to help at this rather fundamental level?

No, it will not.  What you need is for CIF authors to include the already-defined datanames specifying dictionary conformance.  The _audit.schema proposal will also not help. 

I suggest that each time (or once a day, to stay sane) that you find a file that does not contain the dataname _audit_conform_dict_name (old cif_core) or _audit_conform.dict_name (mmCIF, and in draft DDLm core dictionary), you send a gentle message to the software author (if you can identify them), asking them to include one of these datanames in the output template that they distribute with their software (or adjust their code).  You could even provide a line in the email for them to cut and paste.  Perhaps you can get checkCIF to issue a level C alert if they are missing (if it doesn't already) with a suggestion to include a particular line in the CIF file to make the alert go away.  Perhaps we can wait until DDLm cif_core is accepted before pushing this.
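
As a rough sketch of the check described above, the following scans CIF text for either of the two conformance datanames. It is a naive line-based heuristic, not a CIF parser: it assumes the dataname is the first token on its line and it does not skip multi-line text fields. The function name declared_conformance_names is hypothetical.

```python
# The two datanames mentioned above: old cif_core (DDL1) form, and the
# mmCIF / draft DDLm form. CIF datanames are case-insensitive.
CONFORM_NAMES = ("_audit_conform_dict_name", "_audit_conform.dict_name")


def declared_conformance_names(cif_text):
    """Return the conformance datanames found at the start of any line."""
    found = []
    for line in cif_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        first_token = stripped.split()[0].lower()
        if first_token in CONFORM_NAMES and first_token not in found:
            found.append(first_token)
    return found


sample = "data_example\n_cell_length_a 5.0\n"
if not declared_conformance_names(sample):
    print("No dictionary conformance declared; consider adding _audit_conform.dict_name")
```

An empty result is the case that would prompt the gentle message to the software author.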

SPW: I think I'd better make sure my own CIF writing software starts doing a more robust job in this respect before asking others to! The most I do is look for the audit_conform items - I don't add them if not there.
 
On a side note, DDLm includes the dictionary import formalism, which means that there is only one 'master' dictionary that imports all the rest (so e.g. pdCIF would internally import 'cif_core'). This improves on the old dictionary merging formalism I discussed before, and is why audit_conform is a 'Set' category in DDLm but a multi-packet loop category in DDL2.  The 'Set' designation for _audit_conform may be worth revisiting.