Discussion List Archives

Additional fast-track definitions for Core 2.4.2

  • To: comcifs@iucr.org
  • Subject: Additional fast-track definitions for Core 2.4.2
  • From: Brian McMahon <bm@iucr.org>
  • Date: Sun, 17 Apr 2011 17:52:00 +0100
Dear COMCIFS colleagues

If you are wondering about the delay in publishing the recent round of
updates to the Core dictionary, this is because, in implementing the
recently approved changes, I discovered a number of other "fast-track"
proposals that were overlooked, owing to an early misunderstanding
between myself and James Hester regarding the mechanism of the approval
process. I have subsequently asked the core dictionary management
group to review and accept these proposals, and I now ask voting
members of COMCIFS to complete the process by indicating their
approval - or, of course, reservations or other comments.

I shall be on vacation for the next week, but if the relevant approvals
are received during that time, I should be in a position to bundle
these changes along with the others and release version 2.4.2 of the
core within a few days of my return.

The relevant items are
  _database_code_COD
  _chemical_identifier_inchi
  _chemical_identifier_inchi_version
  _chemical_identifier_inchi_key
  _diffrn_radiation_wavelength_details

Full details for each proposal may be found at
  http://www.iucr.org/resources/cif/dictionaries/new-item/cif_core
but I'll repeat the salient points below.

I apologise again for this oversight. It is not a good advertisement
for the "fast-tracking" process, but the fault is entirely mine.

I vote in favour of all five of the proposed new definitions.

Best regards
Brian

Further details of the proposed definitions:

------------------------------------------------------------------------------

_database_code_COD

Definition:

Identification code assigned by Crystallography Open Database (COD).

Example:
	

Category: database

An identifier of the structure as assigned by the Crystallography Open
Database. This was requested by Saulius Grazulis by analogy with existing
database identifiers such as _database_code_CSD for the Cambridge
Structural Database.

------------------------------------------------------------------------------

_chemical_identifier_inchi

Definition:

The IUPAC International Chemical Identifier (InChI) is
a textual identifier for chemical substances, designed
to provide a standard and human-readable way to 
encode molecular information and to facilitate the
search for such information in databases and on the
web.

Example:
'InChI=1/C18H21NO3/c1-19-8-7-18-11-4-5-13(20)17(18)22-16-14(21-2)6-3-10(15(16)18)9-12(11)19/h3-6,11-13,17,20H,7-9H2,1-2H3/t11-,12+,13-,17-,18-/m0/s1'
codeine

Category: chemical
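
The layered structure of the codeine example above can be illustrated
with a minimal Python sketch. It assumes only the general InChI
convention that layers are separated by "/", with the first carrying the
version, the second the chemical formula, and each later layer a
one-letter prefix. The function name split_inchi is hypothetical, and
this is not a validating parser (it ignores, for example, identifiers in
which a prefix letter occurs more than once).

```python
def split_inchi(inchi: str) -> dict:
    """Split an InChI string into its version, formula and prefixed layers."""
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    parts = inchi[len(prefix):].split("/")
    if len(parts) < 2:
        raise ValueError("InChI has no formula layer")
    result = {"version": parts[0], "formula": parts[1], "layers": {}}
    for layer in parts[2:]:
        if layer:
            # Each remaining layer starts with a one-letter tag (c, h, t, m, s, ...).
            result["layers"][layer[0]] = layer[1:]
    return result


# The codeine example from the proposal, split over lines for readability.
codeine = ("InChI=1/C18H21NO3/c1-19-8-7-18-11-4-5-13(20)17(18)22-16-14(21-2)"
           "6-3-10(15(16)18)9-12(11)19/h3-6,11-13,17,20H,7-9H2,1-2H3"
           "/t11-,12+,13-,17-,18-/m0/s1")
info = split_inchi(codeine)
print(info["version"], info["formula"], sorted(info["layers"]))
```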

------------------------------------------------------------------------------

_chemical_identifier_inchi_version

Definition:

Version of the InChI standard.

Example:
1.02 	

Category: chemical

InChI is a changing standard. It is necessary to record the version used
to generate an identifier in order to parse that identifier correctly.

------------------------------------------------------------------------------

_chemical_identifier_inchi_key

Definition:

The 25-character hashed version of the full InChI (IUPAC
International Chemical Identifier), designed to allow for
easy web searches of chemical compounds.

Example:
InChIKey=OROGSEYTTFOCAN-DNJOTXNNBG 	codeine

Category: chemical

The InChIKey is a 25-character hashed version of the full InChI, designed
to allow for easy web searches of chemical compounds. InChIKeys consist of
14 characters resulting from a hash of the connectivity information of the
InChI, followed by a hyphen, followed by 8 characters resulting from a hash
of the remaining layers of the InChI, followed by a single character
indicating the version of InChI used, followed by a single checksum
character. There is a finite, but very small, probability of finding two
structures with the same InChIKey. However, the probability of duplication
of only the first block of 14 characters has been estimated as one
duplication in 75 databases each containing one billion unique structures;
such duplication therefore appears unlikely at present. Further information
on the InChIKey is available at http://www.iupac.org/inchi/release102.html.
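
The 14 + hyphen + 8 + 1 + 1 layout described above can be checked
mechanically. The following Python sketch is only an illustration: it
assumes that every character is an upper-case letter A-Z (as in the
codeine example), the name split_inchikey is hypothetical, and the
checksum character is extracted but not verified.

```python
import re

# 14 letters, a hyphen, 8 letters, one version character, one checksum character.
# Assumption: all characters are upper-case A-Z, as in the codeine example.
_KEY_RE = re.compile(r"([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])")


def split_inchikey(text: str) -> dict:
    """Split a 25-character InChIKey into its four components."""
    key = text.strip()
    if key.startswith("InChIKey="):
        key = key[len("InChIKey="):]
    match = _KEY_RE.fullmatch(key)
    if match is None:
        raise ValueError("not a 25-character InChIKey: %r" % key)
    connectivity, remainder, version, check = match.groups()
    return {
        "connectivity_hash": connectivity,  # hash of the connectivity information
        "remainder_hash": remainder,        # hash of the remaining layers
        "version_flag": version,            # InChI version indicator
        "check_character": check,           # checksum (not verified here)
    }


parts = split_inchikey("InChIKey=OROGSEYTTFOCAN-DNJOTXNNBG")
print(parts)
```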

------------------------------------------------------------------------------

_diffrn_radiation_wavelength_details

Definition:

Information about the determination of the radiation wavelength that is not
conveyed completely by an enumerated value of
_diffrn_radiation_wavelength_determination.

Example:
	

Category: diffrn_radiation_wavelength



Any other attributes suggested by the submitter are listed here:

   _type                        char
   _list                        both
   _list_reference              '_diffrn_radiation_wavelength_id'


When adding the recently approved _diffrn_radiation_wavelength_determination
to the core dictionary, it was noticed that the suggested definition for
this new data name was:

The method of determination of incident wavelength. 
Further information may be provided in _diffrn_radiation_special_details

However, _diffrn_radiation_special_details does not exist. This data name
is suggested in order to fulfil that role; note that as a member of this
category it would necessarily be looped if there are multiple wavelengths
listed. An alternative (or perhaps complementary) approach would be to
create a "scalar" _diffrn_radiation_details item in the parent category.
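
To make the looping concrete, the Python sketch below prints what such a
loop might look like when two wavelengths are listed. The helper names
cif_value and cif_loop are hypothetical, the wavelength values and
details text are invented placeholders rather than data from any
proposal, and the quoting is deliberately simplistic (it handles only
single-line values without embedded quotes).

```python
def cif_value(value: str) -> str:
    """Quote a single-line value for a CIF loop if it contains whitespace."""
    if value == "":
        return "?"  # CIF placeholder for an unknown value
    if "\n" in value or "'" in value:
        raise ValueError("this sketch handles only simple single-line values")
    if any(ch.isspace() for ch in value):
        return "'" + value + "'"
    return value


def cif_loop(tags, rows) -> str:
    """Format a loop_ block with one line per row."""
    lines = ["loop_"] + list(tags)
    for row in rows:
        if len(row) != len(tags):
            raise ValueError("each row needs one value per tag")
        lines.append(" ".join(cif_value(v) for v in row))
    return "\n".join(lines)


tags = [
    "_diffrn_radiation_wavelength_id",
    "_diffrn_radiation_wavelength",
    "_diffrn_radiation_wavelength_details",
]
# Placeholder rows for illustration only.
rows = [
    ["1", "1.5406", "hypothetical note for the first wavelength"],
    ["2", "0.7107", "hypothetical note for the second wavelength"],
]
text = cif_loop(tags, rows)
print(text)
```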

------------------------------------------------------------------------------
_________________________________________________________________________
Brian McMahon                                       tel: +44 1244 342878
Research and Development Officer                    fax: +44 1244 314888
International Union of Crystallography            e-mail:  bm@iucr.org

