Discussion List Archives

Re: Reactivating mmCIF DMG (was discussion of dictionary update procedure)

  • To: James Hester <jamesrhester@gmail.com>
  • Subject: Re: Reactivating mmCIF DMG (was discussion of dictionary update procedure)
  • From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
  • Date: Wed, 19 Aug 2009 22:30:52 -0400 (EDT)
  • Cc: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
  • In-Reply-To: <279aad2a0908191843i49f900c5ocd56725a290c4fba@mail.gmail.com>
  • References: <279aad2a0908191843i49f900c5ocd56725a290c4fba@mail.gmail.com>
Dear James,

   The statement that "the PDB additionally include many 'pdbx' items which 
have no meaning outside the PDB, but which enable them to freely convert 
between the database and a CIF file" is incorrect. There certainly are 
tags in pdbx that are needed only for internal purposes of the PDB, but 
the issue at hand is not that set of internal tags. It is the very large 
number of tags from the pdbx dictionary that are essential to the 
crystallographic description of the molecule.

   Consider, for example, the secondary structure tags.  The pdbx secondary 
structure tags are not just some augmentation for database management; 
they are a major recasting of the approach to secondary structure, from the 
rather elegant approach adopted as part of mmCIF to a compromise between 
the old PDB secondary structure description and the new mmCIF description.

   Consider also _atom_site.pdbx_PDB_model_num, which is essential to 
understanding multiple-model entries, especially because the PDB 
repeats atom serial numbers between models.  This is not a database 
management tag.
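
   To make the point about repeated serial numbers concrete, here is a 
minimal sketch. The rows are invented for illustration and are not taken 
from any PDB entry, and duplicate_keys is a hypothetical helper. It shows 
that the atom serial number alone stops being a unique key once a second 
model is present, while the pair (model number, serial number) remains 
unique.

```python
from collections import Counter

# Hypothetical (model_num, atom_serial, atom_name) rows, made up for this
# sketch; a real entry would carry these in the _atom_site category.
atom_rows = [
    (1, 1, "N"),
    (1, 2, "CA"),
    (2, 1, "N"),   # serial 1 repeats in model 2
    (2, 2, "CA"),  # serial 2 repeats in model 2
]


def duplicate_keys(rows, key):
    """Return the sorted key values that occur on more than one row."""
    counts = Counter(key(row) for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


# Keyed on the serial number alone, every atom collides with its
# counterpart in the other model.
serial_only = duplicate_keys(atom_rows, key=lambda r: r[1])

# Keyed on (model number, serial number), no collisions remain.
model_and_serial = duplicate_keys(atom_rows, key=lambda r: (r[0], r[1]))

print(serial_only)
print(model_and_serial)
```

   Without the model number, software has no reliable way to tell which 
model a given atom row belongs to.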

   There are many more such tags.  If they were purely for internal 
database management, they would not have to be part of every so-called 
mmCIF entry released by the PDB.  These are, quite literally, de facto 
standards for crystallographic macromolecular data, and should be 
carefully considered by the crystallographic community in that context.
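
   As a rough way to see how many such tags appear in a released file, 
the sketch below splits the data names in a CIF fragment into pdbx and 
non-pdbx names. The fragment is a shortened copy of the 4ins excerpt 
quoted further down this message. The function split_pdbx and its rule 
(a category starting with "_pdbx_" or an item starting with "pdbx_") are 
assumptions made for illustration: this is a naming heuristic, not a 
lookup in either dictionary, and it is not a full CIF parser.

```python
import re

# Shortened copy of the 4ins excerpt quoted later in this thread.
cif_fragment = """\
_audit_conform.dict_name       mmcif_pdbx.dic
_audit_conform.dict_version    1.0670
_pdbx_database_PDB_obs_spr.id               SPRSDE
_pdbx_database_status.status_code    REL
loop_
_audit_author.name
_audit_author.pdbx_ordinal
'Dodson, G.G.'  1
"""

# A data name at the start of a line: _category.item
DATA_NAME = re.compile(r"^(_[^\s.]+)\.(\S+)")


def split_pdbx(text):
    """Split data names into (pdbx, other) using the naming heuristic."""
    pdbx, other = [], []
    for line in text.splitlines():
        match = DATA_NAME.match(line)
        if match is None:
            continue  # loop_ keywords, data values, comments
        category, item = match.groups()
        name = f"{category}.{item}"
        # Assumed convention: pdbx additions are flagged by prefix.
        if category.startswith("_pdbx_") or item.startswith("pdbx_"):
            pdbx.append(name)
        else:
            other.append(name)
    return pdbx, other


pdbx_names, other_names = split_pdbx(cif_fragment)
print(len(pdbx_names), "pdbx data names:", pdbx_names)
print(len(other_names), "other data names:", other_names)
```

   Note that _audit_author.pdbx_ordinal is counted as a pdbx name even 
though its category, _audit_author, is not a pdbx category.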

   I urge everyone to read 
http://mmcif.pdb.org/dictionaries/ascii/mmcif_pdbx.dic

   Regards,
     Herbert
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Thu, 20 Aug 2009, James Hester wrote:

> To first address Herbert's comments below:  A PDB mmCIF file does
> contain a lot of mmCIF data items, so I would argue that calling it an
> 'mmCIF' file is reasonable. The PDB additionally include many 'pdbx'
> items which have no meaning outside the PDB, but which enable them to
> freely convert between the database and a CIF file.  I might add in
> passing that this relational database <-> CIF interconvertibility is a
> rather remarkable attribute of the CIF standard, and we should
> recognise the PDB for the work that they have put into realising this.
>
> It is not correct to state that the 'pdbx' tags are being proposed as
> de facto standards.  If I include 'anbf' tags in a powder diffraction
> CIF file and call the result a 'pdCIF' file, am I proposing these as
> de facto pdCIF standards?  I think not.  I am simply stating that
> software that works with pdCIF (e.g. CMPR) will be able to process this
> file.
>
> On the other hand, it is clear that there are plenty of pdbx data items
> of relevance to the macromolecular community, and work on bringing
> these into the mmCIF dictionary would be welcome.  I would certainly
> support reactivation of the mmCIF DMG and tasking it with updating
> mmCIF.  What do other members think?
>
> Best wishes,
> James.
>
> On Thu, Aug 20, 2009 at 10:53 AM, Herbert J.
> Bernstein<yaya@bernstein-plus-sons.com> wrote:
>> Dear Colleagues,
>>
>> James has written:
>>
>>> My understanding is that imgCIF and mmCIF are within the purview of
>>> COMCIFS, but we have no responsibility for pdbx and so this procedure
>>> would not apply to it.
>>
>>  I am unable to see any justification for exclusion of pdbx, when that,
>> rather than mmCIF, is what the PDB uses for its crystallographic
>> macromolecular file releases, and even calls those pdbx files mmCIF files.
>>
>>  For example, when I display the "mmCIF" file for 4ins, I get a file
>> that contains the following pdbx items:
>>
>> _audit_conform.dict_name       mmcif_pdbx.dic
>> _audit_conform.dict_version    1.0670
>> _audit_conform.dict_location
>> http://mmcif.pdb.org/dictionaries/ascii/mmcif_pdbx.dic
>>
>> #
>> _pdbx_database_PDB_obs_spr.id               SPRSDE
>> _pdbx_database_PDB_obs_spr.date             1990-04-15
>> _pdbx_database_PDB_obs_spr.pdb_id           4INS
>> _pdbx_database_PDB_obs_spr.replace_pdb_id   1INS
>> #
>> _pdbx_database_status.status_code    REL
>> _pdbx_database_status.entry_id       4INS
>> _pdbx_database_status.deposit_site   ?
>> _pdbx_database_status.process_site   ?
>> _pdbx_database_status.SG_entry       .
>> #
>>
>> loop_
>> _audit_author.name
>> _audit_author.pdbx_ordinal
>> 'Dodson, G.G.'  1
>> 'Dodson, E.J.'  2
>> 'Hodgkin, D.C.' 3
>> 'Isaacs, N.W.'  4
>> 'Vijayan, M.'   5
>>
>>
>> ....
>>
>> and many, many more
>>
>> It is a serious abdication of COMCIFS responsibility to the crystallographic
>> community for COMCIFS to fail to consider each of the pdbx tags that are
>> implicitly being proposed as de facto revisions to the crystallographic
>> mmCIF dictionary.
>>
>> I propose that a DMG be reactivated for mmCIF and that it be asked by
>> COMCIFS to make a proposal to COMCIFS on updating the mmCIF dictionary so
>> that it can actually be used for crystallographic macromolecular structures.
>>
>> Regards,
>>  Herbert
>>
>> P.S.  An alternative would simply be to discard the mmCIF dictionary,
>> inasmuch as it is not being used.
>>
>> =====================================================
>>  Herbert J. Bernstein, Professor of Computer Science
>>   Dowling College, Kramer Science Center, KSC 121
>>        Idle Hour Blvd, Oakdale, NY, 11769
>>
>>                 +1-631-244-3035
>>                 yaya@dowling.edu
>> =====================================================
>
>
> -- 
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
>
