
[ddlm-group] Dictionary conformance (was Re: Second proposal to allow looping of 'Set' categories)

  • To: SIMON WESTRIP <simonwestrip@btinternet.com>, Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
  • Subject: [ddlm-group] Dictionary conformance (was Re: Second proposal to allow looping of 'Set' categories)
  • From: James Hester <jamesrhester@gmail.com>
  • Date: Wed, 15 Jun 2016 10:45:39 +1000
Hi Simon,

I'll have a go at answering your questions.  This is an interesting line of questions, but a bit to the side of the other discussion, so I've put them into a different thread.

On 15 June 2016 at 09:16, SIMON WESTRIP <simonwestrip@btinternet.com> wrote:
Dear John et al.

Struggling to keep up again - so going back to basics:

1) I assume the scope of the proposed 'schema' definition is restricted to the dictionary in which its definition appears
(i.e. directs to alternative definitions that are already in that particular dictionary)?

The scope would be everything that COMCIFS managed. To anticipate your question below, we could/should have a very short 'IUCr core' dictionary with datanames that apply regardless of the particular domain, and which all DDLm dictionaries import.  Probably cif_core would import it, and all others would import cif_core.

2) Is the intention that all CIFs (whatever domain) conform to a 'global' ddlm dictionary?

Excuse my pedantry here, but I want to make sure we understand each other. DDLm is a set of attributes which you can use to define the meaning of datanames.  A CIF (i.e. datafile) does not conform to a DDLm dictionary, it conforms to a dictionary. That dictionary is written in a DDL.  So we can write a dataname, and it has the same meaning (perhaps after application of aliases) regardless of the particular DDL in which that meaning is described.  If I have misunderstood you, and you simply meant, "is there a dictionary that all datafiles should draw on?", then the current answer is "no".  For example, mmCIF datanames completely replace datanames in cif_core.  The intention of proposal #2 is that _audit.schema would be a universal dataname, ideally defined in the separate 'IUCr core' dictionary described above.

3) How does one declare dictionary conformance in a CIF instance without using a dictionary-defined dataname?

You can't.  A fundamental assumption, that COMCIFS attempts to fulfill, is that the meaning of datanames does not change.  In particular, the meaning of the dictionary conformance datanames does not change, so the programmer can hard-code a dataname to output dictionary conformance and not worry that somehow in the future this dataname will have a different meaning.  Hopefully this underlines that the *primary* audience of our dictionaries is the human software programmer, *not* the software itself. The programmer (*not* the program)  is the one that has to read the text definition to work out e.g. what dataname contains the atomic positions. The "dictionary driven software" part can only relate to those things that software can understand and use and which *do not* relate to a change in meaning (because we're not supposed to change the meaning) - aliases and dREL come to mind.  All of the other machine-readable stuff can only be used for validation, which is why I assert that a lot of casual CIF-reading software doesn't bother with dictionaries at execution time - the information available at software creation time is guaranteed to be sufficient to 'get the job done' now and in future.
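As a rough illustration of the hard-coding point above, a program that writes CIF can emit the conformance dataname as a fixed string. This is only a sketch in Python: the function name `conformance_line` and its `dotted` flag are hypothetical, and the dictionary name passed in is just an example value.

```python
# The datanames are fixed by the dictionaries and their meaning is not
# expected to change, so they can live in the program as constants.
DOTTED_NAME = "_audit_conform.dict_name"   # mmCIF and draft DDLm core
LEGACY_NAME = "_audit_conform_dict_name"   # old cif_core


def conformance_line(dict_name, dotted=True):
    """Return one CIF line declaring dictionary conformance.

    Assumption: dict_name contains no whitespace, so it needs no quoting.
    """
    if not dict_name or any(ch.isspace() for ch in dict_name):
        raise ValueError("dict_name must be a non-empty token without whitespace")
    name = DOTTED_NAME if dotted else LEGACY_NAME
    return f"{name}  {dict_name}"


# Example value only; use whichever dictionary the file actually follows.
print(conformance_line("cif_core.dic", dotted=False))
```

Nothing here needs a dictionary at run time: the programmer read the definition once, and the program only writes the name.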

Basically I am certain that I am not alone in having to rely on heuristics based on prior knowledge just to identify that e.g. my molecular graphics program is dealing with a DDL2 CIF (mmCIF) rather than a DDL1 CIF, and whether it's a pdCIF, msCIF, rhoCIF... Is DDLm/CIF2 going to help at this rather fundamental level?

No, it will not.  What you need is for CIF authors to include the already-defined datanames specifying dictionary conformance.  The _audit.schema proposal will also not help. 

I suggest that each time (or once a day, to stay sane) that you find a file that does not contain the dataname _audit_conform_dict_name (old cif_core) or _audit_conform.dict_name (mmCIF, and in draft DDLm core dictionary), you send a gentle message to the software author (if you can identify them), asking them to include one of these datanames in the output template that they distribute with their software (or adjust their code).  You could even provide a line in the email for them to cut and paste.  Perhaps you can get checkCIF to issue a level C alert if they are missing (if it doesn't already) with a suggestion to include a particular line in the CIF file to make the alert go away.  Perhaps we can wait until DDLm cif_core is accepted before pushing this.
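The check for a missing conformance dataname could be sketched as follows in Python. This is an assumption-laden illustration, not a CIF parser: it compares whitespace-delimited tokens, skips comments, and does not treat quoted strings or multi-line text fields specially. The function name `find_conformance_names` and the sample file contents are hypothetical.

```python
# Datanames named in the message above. CIF datanames are case-insensitive,
# so tokens are lower-cased before comparison.
CONFORMANCE_NAMES = (
    "_audit_conform_dict_name",   # old cif_core
    "_audit_conform.dict_name",   # mmCIF and draft DDLm core
)


def find_conformance_names(cif_text):
    """Return the conformance datanames that appear as tokens in cif_text."""
    found = []
    for line in cif_text.splitlines():
        for token in line.split():
            if token.startswith("#"):
                break  # rest of the line is a comment
            lowered = token.lower()
            if lowered in CONFORMANCE_NAMES and lowered not in found:
                found.append(lowered)
    return found


# Illustrative file contents only.
sample_with = """data_example
_audit_conform_dict_name  cif_core.dic
_cell_length_a 5.43
"""
sample_without = """data_example
_cell_length_a 5.43
"""

for label, text in (("with", sample_with), ("without", sample_without)):
    names = find_conformance_names(text)
    print(label, "->", names if names else "no conformance dataname found")
```

An empty result is the case that would prompt the gentle message to the software author.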
 
On a side note, DDLm includes the dictionary import formalism, which means that there is only one 'master' dictionary that imports all the rest (so e.g. pdCIF would internally import 'cif_core'). This improves on the old dictionary merging formalism I discussed before, and is why audit_conform is a 'Set' category in DDLm but a multi-packet loop category in DDL2.  The 'Set' designation for _audit_conform may be worth revisiting.

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148