Re: magCIF - policy advice requested
- To: James Hester <jamesrhester@gmail.com>, "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: magCIF - policy advice requested
- From: John Westbrook <jwest@rcsb.rutgers.edu>
- Date: Fri, 30 May 2014 09:52:58 -0400
- In-Reply-To: <CAM+dB2dUqSX5pKMv2P-jVuQ9KRcPWDv5eQwVewC92T+SS+0N2A@mail.gmail.com>
- References: <s8eu8i4xc88jniwfani58dgc.1401194205346@email.android.com> <CAM+dB2dFMAgTXFnnU7tfj-14NotE=LACZANMjzcwOhS8B8=-_w@mail.gmail.com> <CABcsX27=HsJWoTB-1d_yKOTXyr7zbJyoK6RXx+H29iReAnNgEw@mail.gmail.com> <CAM+dB2dk4s8jqfKf9wBzcUhia6_dCSoV=8TycyTjt5e8dgvdug@mail.gmail.com> <5385C7A7.1030003@rcsb.rutgers.edu><CAM+dB2dUqSX5pKMv2P-jVuQ9KRcPWDv5eQwVewC92T+SS+0N2A@mail.gmail.com>
Hi James,

My issue is only with getting the dictionary definitions into a single DDL. We routinely use extension dictionaries which share a common data model.

Regards,

John

On 5/30/14, 3:53 AM, James Hester wrote:
> Hi John W,
>
> I agree that all the new definitions should be collected together in a single file with a single dictionary dialect. I do not see that repeating definitions from core CIF/ms_CIF/symCIF is necessary, given that we do not do that for other extensions to core CIF. Our point of contention then becomes whether or not, if the dictionary dialect chosen is DDL2, we have to include periods in the datanames. I say it is not necessary. If you think that it would be necessary, can you give me some idea of why? Note that I am not at all against a particular dictionary (e.g. mmCIF, pdbx) adopting 'dot' notation as a rigorous convention.
>
> thanks,
> James.
>
> On Wed, May 28, 2014 at 9:25 PM, John Westbrook <jwest@rcsb.rutgers.edu> wrote:
>
>> Hi James,
>>
>> I completely agree with Herbert in this case. Having all of the data names in a single dictionary with a common dialect makes the content much more accessible for users as well as existing software tools. mmCIF incorporated the core definitions with DDL2 naming conventions and aliases. The core naming conventions were preserved to the extent possible. Splitting the definitions across DDL dialects is going to be a significant obstacle to many users at this time.
>>
>> Regards,
>>
>> John
>>
>> On 5/28/14, 2:29 AM, James Hester wrote:
>>
>>> Hi Herbert,
>>>
>>> I believe you are advocating duplicating part or all of the modulated structures dictionary (~100 datanames) within the magnetic structures dictionary, with aliases as necessary. As far as I can see, this buys us no more than 'a loop /can/ be written so that all datanames have dots in them'. I do not even say 'must be written', because the aliases mean that you could continue to use the old-style datanames.
>>>
>>> Regarding confusion, and the lack of it in the case of mmCIF/core CIF: the presence of two datanames (the mmCIF version and the core CIF version) for each concept in core CIF has not caused confusion and extra work for the simple reason that there are clear workflow and software demarcations when doing macromolecular work and chemical crystallography. Programmers, being aware of this divide, work with the appropriate datanames. This demarcation is not a result of anything that COMCIFS have done and therefore the lack of confusion is not something that can be taken for granted when moving to a different community.
>>>
>>> In contrast to the macromolecular/small molecule case, the modulated structures community and the magnetic structures community are closely intertwined to the extent that the same programs are used (e.g. JANA). Unlike the macromolecular/core CIF case, the program user does not in general know whether they are reading/writing a CIF intended for a modulated structures or a magnetic structures or plain core CIF consumer. Therefore, if ms_cif is rewritten in DDL2, all programs must now be rewritten/recompiled/redistributed to read and write both styles of datanames. And what about the fact that many programs that ingest these magCIFs will be ordinary non-magnetic-aware programs expecting core CIF DDL1-style datanames for e.g. the atom positions?
>>>
>>> A first cost benefit analysis then looks like this:
>>> Costs: rewriting 100 definitions and any software that inputs/outputs those datanames and core CIF datanames
>>> Benefits: all datanames in a loop can have dots in them
>>>
>>> On the face of it, these costs outweigh the benefit by several orders of magnitude.
>>>
>>> As a postscript, I don't know if we quite appreciate the fact that once we have defined a dataname, it is almost impossible to winkle it out of software. Changing a dictionary from DDL1 style to dotted datanames has never been done before (I would assert that mmCIF started with a clean slate as their community path was PDB -> mmCIF, not core CIF -> mmCIF. And it has only taken 15 years to get /that/ to start to happen.) The best I think we can do is to provide a solid and widely-adopted CIF API that can apply aliases behind the scenes, in which case we can have a little more confidence in adoption of replacement datanames.
>>>
>>> all the best,
>>> James.
>>>
>>> On Wed, May 28, 2014 at 1:43 PM, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
>>>
>>>> Dear James,
>>>>
>>>> It need not cause any confusion. The core names already in the mmCIF dictionary have not. Small molecule people use the undotted names. Macromolecular people use the dotted names. If we simply added aliases for the modulated structures to the mmCIF dictionary (which probably should be done anyway) we end up with nice clean magCIF loops and little or no confusion for modulated structure cifs.
>>>>
>>>> Regards,
>>>> Herbert
>>>>
>>>> On Tuesday, May 27, 2014, James Hester <jamesrhester@gmail.com> wrote:
>>>>
>>>>> I expect that the magCIF writers would write their datanames to match that part of mmCIF that reproduces core CIF. The only issue then becomes the (DDL1) modulated structures dictionary. As you suggest, the modulated structures dictionary could be rewritten with DDL2-style names, but I don't believe that this additional work is necessary. It would also create unwelcome confusion in the community as to which modulated structure datanames should be used.
>>>>>
>>>>> On Tue, May 27, 2014 at 10:36 PM, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
>>>>>
>>>>>> My own inclination would be to follow the approach followed by mmcif which provides a rather complete dotted notation mapping of the core so you end up with much cleaner looking loop headers.
>>>>>>
>>>>>> Regards,
>>>>>> Herbert
>>>>>>
>>>>>> Sent from my Xperia™ smartphone
>>>>>>
>>>>>> James Hester <jamesrhester@gmail.com> wrote:
>>>>>>
>>>>>>> Dear COMCIFS members and advisers:
>>>>>>>
>>>>>>> I am pleased to advise that a CIF dictionary for description of magnetic structures (magCIF) is currently in preparation and it is expected that a final draft could be ready before the IUCr meeting. This has raised a policy issue for COMCIFS that we need to deal with in a timely way.
>>>>>>>
>>>>>>> By its nature, the magCIF dictionary builds on the definitions in the core CIF dictionary, modulated structures CIF dictionary, and symmetry CIF dictionary (including extending looped categories). At the same time, the authors wish it to be a single, coherent document. Core CIF and the modulated structures dictionary use DDL1 naming conventions, whereas symCIF is a DDL2 dictionary with DDL2 naming conventions. For coherence and convenience, the authors of magCIF should clearly use a single DDL and naming convention.
>>>>>>>
>>>>>>> My inclination is to recommend writing magCIF using DDL2. Semantically, this will mean that certain DDL2 concepts (e.g. 'key') will be implicitly imposed on DDL1 datanames. This mapping is however straightforward and implied by the presence of 'aliases' in mmCIF and other DDL2 dictionaries.
>>>>>>>
>>>>>>> More trivially, this approach will result in some loops that have names not containing a period mixed with names that do contain a period, and non-looped datanames in the CIF data file will also contain mixtures of such names. I note that the use of a period to separate category and item is purely conventional and is not syntactically or semantically required by the DDL that the dictionary is written in, so I do not consider this to be a problem.
>>>>>>>
>>>>>>> A further advantage of DDL2-style names is that when magCIF is translated into DDLm at some not-too-distant point, the same names can be used (as DDLm naming conventions are the same as DDL2 naming conventions) and software written with the DDL2 magCIF dictionary in mind will not require updating to handle files written against the 'new' DDLm magCIF.
>>>>>>>
>>>>>>> Does anybody see any issues with this recommendation?
>>>>>>>
>>>>>>> James.
>>>>>>>
>>>>>>> --
>>>>>>> T +61 (02) 9717 9907
>>>>>>> F +61 (02) 9717 3145
>>>>>>> M +61 (04) 0249 4148

-- 
John Westbrook, Ph.D.
RCSB, Protein Data Bank
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087
e-mail: jwest@rcsb.rutgers.edu
Ph: (848) 445-4290 Fax: (732) 445-4320
_______________________________________________
comcifs mailing list
comcifs@iucr.org
http://mailman.iucr.org/mailman/listinfo/comcifs