Re: [ddlm-group] (no subject)

  • To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
  • Subject: Re: [ddlm-group] (no subject)
  • From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
  • Date: Thu, 16 Jun 2016 22:47:20 +0000
  • In-Reply-To: <CAM+dB2fdHvdZ-UUMuy6rbGXWvY567b769nO4R4c35kLvAYXyJg@mail.gmail.com>
  • References: <CAM+dB2fdHvdZ-UUMuy6rbGXWvY567b769nO4R4c35kLvAYXyJg@mail.gmail.com>
Dear Colleagues,

Comments in-line.

On Wednesday, June 15, 2016 6:39 PM, James Hester wrote:
> [...]
> I disagree that proposal #2 is not complete (see previous email). Proposal #2 is indeed more precise, and this is important. Many dictionaries do not alter 'Set' categories. So software written with cif_core in mind could actually handle datafiles written in accordance with pd_CIF, ms_CIF, and the future magCIF dictionaries just fine.  Should this software reject a perfectly good crystal structure found in a file that conforms to pdCIF?  No. What about some future dictionary that just adds some infra-red measurements to the structure?  Again, probably not, but you can't at software creation time specify that your software will accept this dictionary, because it doesn't yet exist. Thus audit_conform.dictionary is not workable because the software author, at creation time, cannot know how future (or local) dictionaries will or will not fiddle with Set categories.  Of course, the author could write software (or find a library) that actually downloaded and analysed the dictionaries given in _audit_conform, but hopefully the examples in my previous email or common sense would suggest that the bulk of CIF reading authors will never program a complete dictionary parser and analyse the key and loop structure just to get a few atom sites.

I acknowledge that I may have misunderstood proposal #2.  In light of James's subsequent comments I'm now interpreting it to hinge in part on the assertion that undefined categories are necessarily Sets, so that if they are initially defined as Loops or as Sets-with-key then that constitutes a change that may need to be advertised via _audit.schema.  I trust that if I still have it wrong then James will supply further clarification.  Whether I accept that undefined categories are Sets or not, I can accept that P2 provides a more complete solution than I previously gave it credit for, provided we find a suitable specification for which category names should be listed in _audit.schema.

James raises some good points about audit_conform.  My previous comments about dictionary versioning are applicable here, but they do not constitute a rebuttal.

On the other hand, a similar objection applies to the proposed usage model for _audit.schema: if a data file declares via _audit.schema that a certain category has been modified, and a given piece of software does not have that category in its list of acceptable schema changes, then the software will reject the data, even if it does not use the modified category in any way.  This is analogous to the example of software rejecting a data file that uses _audit_conform.dictionary to express conformance with pd_CIF, even though that does not change the interpretation of the items the software actually uses.  I believe this is the same issue James acknowledged:


> The audit.schema system requires some extra work in non-default cases: software cannot trivially determine that an unknown (at software creation time) value of audit.schema was acceptable for some given datablock contents and use cases. This case can be dealt with if the software is prepared to parse and analyse the dictionaries provided in _audit_conform, or if the software is prepared to run a generic conversion utility to transform to its own schema.
> [...]
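
A minimal sketch of the allow-list usage model may make the over-rejection concrete. Everything in it is an assumption made for illustration: the value format of _audit.schema is still under discussion, so a declaration is modelled simply as a set of changed category names, and the names ACCEPTED_CHANGES, accepts_allow_list and ir_measurement are hypothetical.

```python
# Hypothetical sketch: a schema declaration is modelled as a set of the
# category names whose Set/Loop structure the data file says it has changed.

# Changes the software author knew about, and could handle, at creation time.
ACCEPTED_CHANGES = frozenset({"atom_site"})


def accepts_allow_list(declared_changes, accepted=ACCEPTED_CHANGES):
    """Accept a data block only if every declared change is on the allow-list."""
    return set(declared_changes) <= accepted


# No declared changes: accepted.
print(accepts_allow_list(set()))
# A change to a category the software never reads (the made-up
# 'ir_measurement') is not on the list, so the whole block is rejected.
print(accepts_allow_list({"ir_measurement"}))
```

Under these assumptions the second call returns False although nothing the software uses has changed, which is the same kind of false rejection described for _audit_conform.dictionary.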


I considered whether that problem could be solved by inverting the usage model for _audit.schema: instead of listing the values they can accept, applications would list the values they must reject.  That would not be as easy to use, but I don't think it would be prohibitive.  It isn't sufficient either, however, because it doesn't provide for rejecting some changes to a given category but not others.  As James suggests, the problem could be solved instead by analyzing dictionaries or converting schemas, but if it comes to that then most of the practical utility of _audit.schema has been lost.  An application prepared to do that could more simply start with the dictionary analysis straightaway.
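
The limitation of the inverted model can be sketched as follows. This is illustrative only: a schema declaration is assumed to reduce to a set of changed category names, and MUST_REJECT, accepts_deny_list and ir_measurement are hypothetical names, not part of any proposal.

```python
# Hypothetical sketch of the inverted (deny-list) usage model. A schema
# declaration is modelled as a set of changed category names.

# Categories the software actually reads, so changes to them are refused.
MUST_REJECT = frozenset({"atom_site"})


def accepts_deny_list(declared_changes, must_reject=MUST_REJECT):
    """Reject a data block only if a declared change hits the deny-list."""
    return not (set(declared_changes) & must_reject)


# A change to an unrelated, previously unknown category is now tolerated.
print(accepts_deny_list({"ir_measurement"}))
# Any change to 'atom_site' is refused. The declaration names only the
# category, so a harmless change cannot be told apart from a breaking one.
print(accepts_deny_list({"atom_site"}))
```

With only category names to go on, the decision for atom_site is all-or-nothing; making it finer-grained would need information that, in this sketch, only the dictionaries themselves carry.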

Ultimately, I would not oppose adding _audit.schema to the DDLm core dictionary with a definition that makes it advisory, rather than prescriptive.  Inasmuch as we seem to agree that _audit.schema cannot replace consulting the relevant dictionaries, I think that's the most reasonable form for its definition to take if we do define it.  Although I think it would be possible to define items that express schema characteristics in sufficient detail for applications to determine whether they are prepared to understand the file, such items would not have any of the ease of use that _audit.schema enjoys.

That does not, however, constitute agreement to Proposal #2 overall.


John

