Re: [Cif2-encoding] Splitting of imgCIF and other sub-topics
- To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@xxxxxxxx>
- Subject: Re: [Cif2-encoding] Splitting of imgCIF and other sub-topics
- From: "Herbert J. Bernstein" <yaya@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 24 Aug 2010 09:31:51 -0400 (EDT)
- In-Reply-To: <AANLkTi=+qZQrWJ3duOzWyPq5H=w1GOVbeKRfFLTR8u5a@mail.gmail.com>
- References: <AANLkTilyJE2mCxprlBYaSkysu1OBjY7otWrXDWm3oOT9@mail.gmail.com><614241.93385.qm@web87016.mail.ird.yahoo.com><alpine.BSF.2.00.1006251827270.70846@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166122952D@SJMEMXMBS11.stjude.sjcrh.local><33483.93964.qm@web87012.mail.ird.yahoo.com><8F77913624F7524AACD2A92EAF3BFA541661229533@SJMEMXMBS11.stjude.sjcrh.local><AANLkTilqKa_vZJEmfjEtd_MzKhH1CijEIglJzWpFQrrC@mail.gmail.com><8F77913624F7524AACD2A92EAF3BFA541661229542@SJMEMXMBS11.stjude.sjcrh.local><AANLkTikTee4PicHKjnnbAdipegyELQ6UWLXz9Zm08aVL@mail.gmail.com><8F77913624F7524AACD2A92EAF3BFA541661229552@SJMEMXMBS11.stjude.sjcrh.local><AANLkTinZ4KNsnREOOU6sVFdGYR_aQHcjdWr_ko648NGm@mail.gmail.com><8F77913624F7524AACD2A92EAF3BFA5416659DED8C@SJMEMXMBS11.stjude.sjcrh.local><AANLkTintziXhwVCEFD0yUtTDo9KG8ut=oL4OgmkjmEBe@mail.gmail.com><alpine.BSF.2.00.1008240629120.23114@epsilon.pair.com><AANLkTi=+qZQrWJ3duOzWyPq5H=w1GOVbeKRfFLTR8u5a@mail.gmail.com>
Dear James,

I have not been at all reticent -- imgCIF will be very poorly supported
by CIF2 as currently proposed. Of necessity, imgCIF changes encodings
internally -- that is why it uses MIME -- same problem as email with
images, same solution. Any purely text version has at least a 7%
overhead as compared to pure binary. Restricting to UTF-8 increases
the overhead to at least 50%. We may get away with the 7% (UTF-16).
The 50% version (UTF-8) will be ignored by the community as unworkable.
The version most likely to be used will be the current DDL2-based
version with embedded compressed binaries that I am augmenting with
DDLm-like features and merging in with HDF5.

As I noted many months ago, the unfortunate reality is that the current
CIF2 effort will not merge well with imgCIF. If avoiding a split is
important -- we need a meeting. I would suggest involving Bob Sweet and
holding it at BNL in conjunction with something relevant to NSLS-II.

Regards,
Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================

On Tue, 24 Aug 2010, James Hester wrote:

> Hi Herbert: regarding imgCIF, I agree that splitting it off is not a
> desirable outcome. I would like to get an idea of how well imgCIF can
> be accommodated under the various encoding proposals currently
> floating around, as you have been rather reticent to bring it up. My
> naive take on things is that a UTF8-only encoding scheme for CIF2
> would not pose significant issues for imgCIF, and a decorated UTF16
> encoding in the style of Scheme B would be even better, and quite
> adequate, so imgCIF is not actually presenting any problems and so was
> a red herring.
>
> I'm not sure that face-to-face or Skype discussions are necessarily
> going to be more productive. Writing things down, while slower,
> allows me at least to collect my thoughts and those of other
> participants, and hopefully make a reasoned contribution (my apologies
> if I am too long-winded) and as an added bonus those thoughts are
> recorded for later reference. For example, where would I now find the
> background on why a container format for imgCIF is such a bad idea?
> Presumably that was all thrashed out in face to face discussions, and
> no record now remains.
>
> On Tue, Aug 24, 2010 at 8:56 PM, Herbert J. Bernstein
> <yaya@bernstein-plus-sons.com> wrote:
>> Dear Colleagues,
>>
>> James' and John's last interchange is so voluminous, I doubt any of
>> us has been able to fully appreciate the rich complexity of ideas
>> contained therein. For example, one of the suggestions far down in
>> the text is:
>>
>> (James now) Indeed. My intent with this specification was to ensure
>> that third parties would be able to recover the encoding. If imgCIF is
>> going to cause us to make such an open-ended specification, it is
>> probably a sign that imgCIF needs to be addressed separately. For
>> example, should we think about redefining it as a container format,
>> with a CIF header and UTF16 body (but still part of the
>> "Crystallographic Information Framework")?
>>
>> The idea of an imgCIF "header" in CIF format and an image in another is an
>> old, well-established, thoroughly discussed, and mistaken idea, rejected
>> in 1998. The handling of multiple images in a single file (e.g.
>> a jpeg thumbnail and crystal image and a full-size diffraction image)
>> requires the ability to switch among encodings within the file --
>> something handled by the current DDL2 and MIME-based imgCIF format and
>> which would be a serious problem in CIF2 as currently proposed,
>> increasing the chances that we will have to move imgCIF entirely into
>> HDF5 and abandon the CIF representation entirely, sharing only
>> the dictionary and not the framework.
>>
>> If you look carefully, you will see a similar trend with mmCIF, in which
>> an XML representation sharing the dictionary plays a much more
>> important role than the CIF format.
>>
>> Is it really desirable to make the new CIF format so rigid and
>> unadaptable that major portions of macromolecular crystallography
>> end up migrating to very different formats, as they already are
>> doing? Yes, there is great value in having a common dictionary,
>> but would there not be additional value in having a sufficiently
>> flexible common format to allow for more software sharing than
>> we now have? Is it really desirable for us to continue in the
>> direction of a single macromolecular experiment having to
>> deal with HDF5 and CIF/DDL2/MIME representations of the image data
>> during collection, CCP4-style CIF representations during processing
>> and deposition and legacy PDB and PDBML representations in subsequent
>> community use? If we could be a little bit more flexible, we might be
>> able to reduce the data interchange software burdens a little.
>> Right now, this discussion seems headed in the direction of simply
>> adding yet another data representation (DDLm/CIF2) to the mix,
>> increasing the chances of mistranslation and confusion, rather
>> than reducing them.
>>
>> Please, step back a bit from the detailed discussion of UTF8 and
>> look at the work-flow of doing and publishing crystallographic
>> experiments and let us try to make a contribution that simplifies
>> it, not one that makes it more complex than it needs to be.
>>
>> I suggest we need to meet and talk, either face-to-face, or by skype.
>>
>> Regards,
>> Herbert
>>
>> =====================================================
>> Herbert J. Bernstein, Professor of Computer Science
>> Dowling College, Kramer Science Center, KSC 121
>> Idle Hour Blvd, Oakdale, NY, 11769
>>
>> +1-631-244-3035
>> yaya@dowling.edu
>> =====================================================
>>
>> _______________________________________________
>> cif2-encoding mailing list
>> cif2-encoding@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/cif2-encoding
>>
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> _______________________________________________
> cif2-encoding mailing list
> cif2-encoding@iucr.org
> http://scripts.iucr.org/mailman/listinfo/cif2-encoding
>
_______________________________________________
cif2-encoding mailing list
cif2-encoding@iucr.org
http://scripts.iucr.org/mailman/listinfo/cif2-encoding