Discussion List Archives

Adding datanames covering database information

  • To: Distribution list of the IUCr COMCIFS Core Dictionary Maintenance Group <coredmg@iucr.org>
  • Subject: Adding datanames covering database information
  • From: James Hester <jamesrhester@gmail.com>
  • Date: Thu, 12 Apr 2018 15:59:49 +1000

Dear Core CIF users and experts,

The current core CIF provides the DATABASE and DATABASE_CODE categories for identifying a database entry corresponding to the structure contained in the data block, for a variety of pre-determined databases.  These are both Set categories, that is, their datanames can only take a single value in a single data block.  This restriction is reasonable if the database content for that entry is seen as coincident with the data block contents, as has been the case for structural databases.

However, it is possible for multiple entries from a single database to be more broadly relevant to the contents of a data block. For example, multiple structures may correspond to a single topology.  So I would like you to consider the creation of a (looped) DATABASE_RELATED category that would simply list entry codes for databases in the same way as CITATION simply lists literature references.  Other categories in other dictionaries may then reference these entries for their own uses.  This is not intended to replace the current DATABASE categories, which would still be preferred for use by structural databases upon deposition and delivery of CIF files.  The new category would instead align with the mmCIF DATABASE_2 category.

The proposed data names are as follows, with short summaries of their meanings:

_database_related.id               'An arbitrary identifier for this entry'
_database_related.database_id      'An identifier for the database, from an enumerated list (e.g. CCDC, PDB, ICSD, COD ...)'
_database_related.reference        'A code used by the database given in _database_related.database_id'
_database_related.relation         'The way in which the database entry is related to the contents of the data block, from an enumerated list. Initial suggestions include "identical", "component", "derived", "common source"'
_database_related.special_details  'Optional free-form description of the relationship between this entry and the data block contents'

An example of use in a data file would then be:

1    COD     1234       identical          'As deposited structure'
2    COD     6789       'common source'    'Curated version of this structure'
3    CCDC    qrst-12    'common source'    'Curated version of this structure'
4    ICSD    lll-ppp    .                  'An earlier version of the structure with missing H atoms'
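
For anyone wishing to experiment, the example rows can be written out as a complete CIF loop with a few lines of Python. This is only a rough sketch under stated assumptions: the column order is assumed to follow the order in which the data names are listed above, the data names themselves are proposals and not yet part of any dictionary, and the helper names (quote, check_relations, to_cif_loop) are hypothetical and do not come from any existing CIF library.

```python
# Illustrative sketch only: the _database_related.* data names are proposed,
# not yet defined in any dictionary; all function names are hypothetical.

# Assumed column order: the order in which the data names are listed above.
COLUMNS = ("id", "database_id", "reference", "relation", "special_details")

# Initial suggestions for the enumerated list of relation values.
RELATIONS = {"identical", "component", "derived", "common source"}

# The four example rows from the proposal, unchanged.
ROWS = [
    ("1", "COD", "1234", "identical", "As deposited structure"),
    ("2", "COD", "6789", "common source", "Curated version of this structure"),
    ("3", "CCDC", "qrst-12", "common source", "Curated version of this structure"),
    ("4", "ICSD", "lll-ppp", ".",
     "An earlier version of the structure with missing H atoms"),
]


def quote(value):
    """Single-quote values containing whitespace; leave others bare.

    Simplified: values that themselves contain quote characters are not handled.
    """
    return f"'{value}'" if any(ch.isspace() for ch in value) else value


def check_relations(rows):
    """Return the ids of rows whose relation is neither '.' nor a suggested value."""
    return [row[0] for row in rows if row[3] != "." and row[3] not in RELATIONS]


def to_cif_loop(rows):
    """Format the rows as a CIF loop_ with one data name per header line."""
    lines = ["loop_"]
    lines += [f"_database_related.{name}" for name in COLUMNS]
    lines += ["  ".join(quote(value) for value in row) for row in rows]
    return "\n".join(lines)


print(to_cif_loop(ROWS))
```

The check on relation values treats '.' as acceptable, since the fourth example row uses it; whether the final definition should permit that is one of the points open for discussion.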

Please provide your thoughts on this general scheme, and any further data names that you think might be useful in this context.  If there are no objections, I will prepare formal definitions and advise this group when they are ready for inclusion.

best wishes,
James Hester.
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148