Discussion List Archives


Re: Absence of _item.category_id or _item_linked.parent_name in some mmcif definitions

Dear Colleagues,

   This is not a matter of something being missing from the mmCIF
dictionary.  John W. tried explaining this a few years ago.  I'll give
it a shot this time, but in view of the continuing problems in reading
the DDL2 dictionaries, I will also suggest a change.  In this case, even
though as a technical matter it ain't broke, as a practical human
engineering matter, something is broke, so we _should_ fix it.

   In the mmcif dictionary, the definitions of individual tags are _not_
in general completely contained in a single save frame.  For example,
as James notes, the category and parent name for _atom_site_anisotrop.id
are explicitly given in loops in the _atom_site.id save frame:

     loop_
     _item.name
     _item.category_id
     _item.mandatory_code
                '_atom_site.id'                 atom_site            yes
                '_atom_site_anisotrop.id'       atom_site_anisotrop  yes
                '_geom_angle.atom_site_id_1'    geom_angle           yes
...

     loop_
     _item_linked.child_name
     _item_linked.parent_name
                '_atom_site_anisotrop.id'       '_atom_site.id'
                '_geom_angle.atom_site_id_1'    '_atom_site.id'
...

The two alternatives to this approach would be either to move the
information from the parent save frames to the children, or to duplicate
the information.  Duplication may help in readability, but it is a
maintenance headache.  The DDLm solution of only requiring/allowing
specification of parents, rather than children, would be similar to
moving this linking information from loops in the parents into the
children in DDL2.
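
As a rough illustration of what "moving" the information amounts to,
here is a minimal Python sketch.  It assumes the two loops from the
_atom_site.id save frame have already been parsed into lists of rows;
that in-memory layout and the function name per_item_records are
hypothetical, not part of any existing CIF library.

```python
# Hypothetical in-memory form of the two loops in the _atom_site.id save
# frame, one dict per loop row. This layout is assumed for illustration.
ITEM_LOOP = [
    {"_item.name": "_atom_site.id",
     "_item.category_id": "atom_site",
     "_item.mandatory_code": "yes"},
    {"_item.name": "_atom_site_anisotrop.id",
     "_item.category_id": "atom_site_anisotrop",
     "_item.mandatory_code": "yes"},
    {"_item.name": "_geom_angle.atom_site_id_1",
     "_item.category_id": "geom_angle",
     "_item.mandatory_code": "yes"},
]
LINKED_LOOP = [
    {"_item_linked.child_name": "_atom_site_anisotrop.id",
     "_item_linked.parent_name": "_atom_site.id"},
    {"_item_linked.child_name": "_geom_angle.atom_site_id_1",
     "_item_linked.parent_name": "_atom_site.id"},
]


def per_item_records(item_loop, linked_loop):
    """Build one record per item, carrying its category and its parent."""
    # child name -> parent name, taken from the _item_linked loop
    parent_of = {
        row["_item_linked.child_name"]: row["_item_linked.parent_name"]
        for row in linked_loop
    }
    records = {}
    for row in item_loop:
        name = row["_item.name"]
        records[name] = {
            "category_id": row["_item.category_id"],
            "mandatory_code": row["_item.mandatory_code"],
            # None when the item is not listed as a child in this loop
            "parent_name": parent_of.get(name),
        }
    return records


records = per_item_records(ITEM_LOOP, LINKED_LOOP)
print(records["_atom_site_anisotrop.id"])
```

Each resulting record holds what a child save frame would need to state
for itself if the linking information lived with the children.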

   My suggestion would be to move to the DDLm approach in the DDL2
dictionaries: put the parent information with the children, rather
than the child information with the parents, in the formal dictionary,
without duplication, to minimize maintenance problems.  Brian's nice
formatting program would then be modified to gather the child
information so that it is printed with the parent information as well
as with each child, to help people in reading these dictionaries.

   I would suggest doing the same for the DDLm dictionaries:  provide
print formatting tools to gather child information relating to a parent
to put with the parent as a useful index.
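
The gathering step such a formatting tool would need is small.  The
following Python sketch assumes the child -> parent links have already
been collected from the child definitions; the dictionary contents and
the function names are hypothetical, and this is not a description of
how any existing formatting program works.

```python
from collections import defaultdict

# Hypothetical child -> parent links, as they would be stored with each
# child definition under the proposed layout.
parent_of = {
    "_atom_site_anisotrop.id": "_atom_site.id",
    "_geom_angle.atom_site_id_1": "_atom_site.id",
}


def children_by_parent(parent_of):
    """Invert child -> parent links into a parent -> sorted children index."""
    index = defaultdict(list)
    for child, parent in parent_of.items():
        index[parent].append(child)
    return {parent: sorted(children) for parent, children in index.items()}


def format_parent_index(parent, children):
    """Render the children of one parent as a short plain-text listing."""
    lines = [f"{parent} is the parent of:"]
    lines.extend(f"    {child}" for child in children)
    return "\n".join(lines)


index = children_by_parent(parent_of)
print(format_parent_index("_atom_site.id", index["_atom_site.id"]))
```

Because the index is derived at print time, the dictionary itself still
records each link only once.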

   Regards,
     Herbert



On 12/21/12 1:26 AM, James Hester wrote:
> I will have a stab at this, although somebody with more experience of 
> the mmCIF development process may wish to comment further.
>
> As far as I can tell, programmatically the only way to fix the 
> 'missing parent' problem you identify is indeed to go through the 
> entire dictionary processing '_item_linked.parent_name' and 
> '_item_linked.child_name' loops, which are usually found at the top of 
> the pointer tree (in this case in the _atom_site save frame).  This 
> same save frame also contains a list of category ids for each of the 
> 'id' values.  My approach in PyCIFRW is to repopulate the individual 
> definitions when ingesting the dictionary, to save time later.  The 
> PyCIFRW code and comments for this can be found at 
> https://bitbucket.org/jamesrhester/pycifrw/src/78576030f75bb4f8cb52d84a60e603815ad38afb/pycifrw/CifFile.nw?at=stable
> starting at line 839, with lines 854-862 describing and discussing 
> your issue.  Note also subsequent lines discussing PDBX.
>
> There is a school of thought that the category name is 'implicit' in a 
> DDL2 dataname or save frame name, however IT Vol G states that this is 
> conventional rather than required so I prefer (like you it seems) 
> never to assume this unless given no alternative.
>
> An mmCIF/PDB person may wish to comment on the philosophical reasons 
> behind these decisions, which I gather have something to do with 
> taking a relational database view of a CIF file.
>
> all the best,
> James.
>
> On Thu, Dec 20, 2012 at 1:32 PM, Richard Gildea <rgildea@gmail.com> wrote:
>
>     Dear All,
>
>     Certain definitions in the mmcif dictionary (e.g.
>     _atom_site_anisotrop.id) do not contain the items
>     _item.category_id or _item_linked.parent_name.  Without these data
>     items, how is it possible to identify programmatically that
>     _atom_site_anisotrop.id belongs to the _atom_site_anisotrop
>     category and that it is a pointer to _atom_site.id (without
>     examining every save frame)?
>
>     For quick reference here is the definition in question:
>
>     save__atom_site_anisotrop.id
>          _item_description.description
>     ;              This data item is a pointer to _atom_site.id in the ATOM_SITE
>                     category.
>     ;
>          _item.name                    '_atom_site_anisotrop.id'
>          _item.mandatory_code          yes
>          _item_aliases.alias_name      '_atom_site_aniso_label'
>          _item_aliases.dictionary      cif_core.dic
>          _item_aliases.version         2.0.1
>           save_
>          
>
>     Cheers,
>
>     Richard
>
>     _______________________________________________
>     cif-developers mailing list
>     cif-developers@iucr.org <mailto:cif-developers@iucr.org>
>     http://mailman.iucr.org/mailman/listinfo/cif-developers
>
