Discussion List Archives


Re: [dddwg] Initiation of formal proposal resulting from discussions at the DDDWG satellite meeting of the ECM Croatia


Dear All,

HKL processes around 280 frame formats. Among them are data that were converted to more popular formats in order to 'be processed'. The conversion usually leads to suboptimal data. For some experiments (ordinary small molecules and protein MR) this usually does not affect results significantly. However, for SAD and absolute configuration it may lead to serious problems. Processing that looks OK does not necessarily produce correct data - see several paper withdrawals from Nature and Science.

The header proliferation can easily be limited by adopting a short header that is the same for all formats and detectors and that allows the data to be processed.
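
One way to picture such a short common header is sketched below. The field names and units are hypothetical and chosen only for illustration; they are not taken from HKL, imgCIF or NeXus.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ShortHeader:
    """Hypothetical minimal header; all field names are illustrative assumptions."""
    detector_id: str
    wavelength_A: float
    distance_mm: float
    beam_center_px: tuple  # (x, y) in pixels
    pixel_size_mm: float
    osc_start_deg: float
    osc_range_deg: float
    exposure_s: float


def missing_fields(native_header: dict) -> list:
    """Return the common-header fields that a native header does not supply."""
    return [f.name for f in fields(ShortHeader) if f.name not in native_header]


# A made-up native header that supplies only part of the information.
native = {"detector_id": "example-01", "wavelength_A": 1.0, "distance_mm": 200.0}
gaps = missing_fields(native)
print(gaps)
```

A check of this kind would flag, before conversion, which items a given native format cannot provide.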

Images are more difficult due to various compression schemes and peculiarities.

Best regards

Wladek


On 12/5/2015 5:10 PM, Herbert J. Bernstein wrote:
Dear Colleagues,

  Having a "hub" format is certainly a useful idea for conversions, but to be useful for the full range of formats it would be a good idea for it to be able to faithfully preserve all data and metadata likely to appear in any of the images we need to deal with, and for it to deal with modern multi-image files.  Also, it would be a good idea for such a hub to support some range of appropriate compressions, so that whether you are dealing with, say, a few 1-megapixel images, or with, say, a run of 3600 18-megapixel 32-bit-pixel images collected with an Eiger 16M in less than a minute, the conversions could be managed with reasonable network and storage requirements.  At the same time it would be desirable for such a hub format to be acceptable for use both with home sources and at beamlines.
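
  To put rough numbers on the two cases mentioned above, here is a minimal sketch of the uncompressed data volume. It assumes that "a few" means 10 images, that 32 bits are stored for every pixel, and that "less than a minute" is taken as 60 seconds at most; the function name is illustrative only.

```python
def raw_volume_bytes(n_images: int, pixels_per_image: int, bits_per_pixel: int) -> int:
    """Uncompressed size of a run, ignoring headers and any compression."""
    return n_images * pixels_per_image * bits_per_pixel // 8


# "A few" 1-megapixel images (10 is an assumption, not a figure from the message).
small_run_bytes = raw_volume_bytes(10, 1_000_000, 32)

# 3600 images of 18 megapixels at 32 bits per pixel, as quoted in the message.
large_run_bytes = raw_volume_bytes(3600, 18_000_000, 32)

# Collected in at most 60 s, so this is a lower bound on the sustained rate.
min_rate_gb_per_s = large_run_bytes / 60 / 1e9

print(f"small run: {small_run_bytes / 1e6:.0f} MB")
print(f"large run: {large_run_bytes / 1e9:.1f} GB")
print(f"minimum sustained rate: {min_rate_gb_per_s:.2f} GB/s")
```

  The large run comes to roughly 259 GB before compression, which is why compression support matters for such a hub.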

  I am not certain that there is any one format that can satisfy all these requirements for all applications, but at the moment, I believe that the combination of imgCIF/CBF and NeXus/HDF5 comes fairly close, which is why the IUCr Committee on the Maintenance of the CIF Standard and the NeXus International Advisory Committee have been working for the past few years at making those two formats fully interoperable, using the Dectris Eiger as used in MX data collection as a test case.  There is much work still to be done, but the results so far look promising.

  I would suggest carefully considering the results of that effort in designing any archiving strategy.

  Regards,
    Herbert


On Sat, Dec 5, 2015 at 3:55 PM, Kamil Dziubek <rumianek@amu.edu.pl> wrote:

Dear All,

Indeed Esperanto is merely one of very many diffraction image formats, and therefore perhaps not worthy of particular attention. The interesting feature is however not the format itself, but its use as a vehicle to transform other common area-detector data formats via a translator software. With that advantage, CrysAlisPro is one of few commercial diffraction data software packages (provided with lab setups) capable of importing and processing not only native, but also foreign image formats. This conversion tool is described in the paper I referred to in my previous email and seems to work pretty well.

With the observation in mind, if it would not be feasible to quickly "convince all developers to rewrite their firmware to output a common image format" (as Mike said in his outline) an external interconversion tool could be actually helpful. For example, the software called Open Babel (http://openbabel.org) is known to convert over 110 file formats and data in the fields of molecular modeling, computational chemistry and cheminformatics.

Yours,
Kamil

On 2015-12-04 20:10, Herbert J. Bernstein wrote:

Dear Colleagues,

  CrysAlis Esperanto is one of the very large number of good image formats for diffraction images.  There are more than 200 of them.  If we get into the habit of archiving all images in their native formats, then we had better also archive all the necessary software to read those images; otherwise, when the time comes to read images back from that archive a few years later, we may find it very difficult without that software and some way to run it on then-current systems.

  Regards,
    Herbert

On Fri, Dec 4, 2015 at 10:58 AM, Kamil Dziubek <rumianek@amu.edu.pl> wrote:

Dear All,

Thank you Mike for your brief recapitulation about reaching a consensus on a common raw data storage format. As John and Brian have noted, MX and 'small molecule' single crystal diffraction studies (and not only molecular crystals, also minerals, inorganics, etc.) are at antipodes concerning the commonly used data collection setups. I hope that as soon as such a generic image format is generally accepted, the authors of the software provided with home lab diffractometers can include conversion tools in the updated versions of their programs.

I would also like to draw your attention to the fact that one of the companies providing instruments for single crystal diffraction experiments, namely Rigaku Oxford Diffraction, introduced a generic data image format called 'Esperanto', and included it in the commercial data processing software package CrysAlisPro. This format is an efficient tool for converting the most common area-detector data formats (Dectris, Rigaku d*trek, BrukerAXS saxi, Mar/Rayonix, Stoe IPDS) so that they can be imported into the CrysAlisPro software. It has proved useful in a number of cases, including high pressure single crystal diffraction experiments (I have already used this method to process the data collected at two beamlines at the ESRF and one at SOLEIL).

The details of the method are given in the following paper:

http://scripts.iucr.org/cgi-bin/paper?S0909049513018621

Best wishes,

Kamil

On 2015-12-03 18:00, John Helliwell wrote:

Dear Mike,
Many thanks for bringing your proposal about area detector raw data image formats, that you aired in Rovinj, forward.
You mention imgcif and HDF5/NeXus explicitly and so we invite Herbert Bernstein, as chair of that work, to respond directly to your proposal and its possible practical implementation.
Thank you,
John and Brian
PS Just one, admittedly very specific, detail: whilst the bulk of MX data is collected at the synchrotron (estimated at around 90%), we believe that about 95% of 'small molecule' single crystal area detector data is measured on home lab set ups.

Emeritus Prof of Chemistry John R Helliwell DSc_Physics
Perspectives in Crystallography
 
 
From: dddwg [dddwg-bounces@iucr.org] on behalf of Michael Probert [Michael.Probert@newcastle.ac.uk]
Sent: 03 December 2015 15:05
To: dddwg@iucr.org
Subject: Re: [dddwg] Initiation of formal proposal resulting from discussions at the DDDWG satellite meeting of the ECM Croatia
 

Dear All,

 

following a lively and entertaining discussion at this year's DDDWG satellite meeting in Croatia, I feel that we should attempt to formalise some of the thoughts discussed. Therefore I enclose a starting point for discussion in a proposal at the bottom of this email. I feel very strongly about the need for advancement in this area and that the time is absolutely right to initiate this. It has recently been pointed out that some institutions are already archiving raw data, and defining sound protocols for this seems eminently sensible, if not an absolute necessity, for the longevity of such projects.

 

Please feel free to comment on the outline below - I would hope that we could come to some agreed position that could then be taken forward by the group leaders as representative of our collective feelings on the issue of data storage, usefulness and to a certain extent future proofing.

 

I hope that I have managed to convey my ideas clearly and that the proposal makes sense. I am certain that there are aspects that need clarification and am equally certain that a large degree of finessing may be required before this can be taken to the next step. However we must start somewhere and condensing ideas from the meeting seems a good place to start.

 

Many thanks for your time, bye for now

 

Mike

 

The need for fully archived data is becoming more apparent and the
volume of said data is becoming ever greater. One of the larger
hurdles to this process is that for the data archived to be useful it
must be stored in a format that allows other users the ability to
interact with it. Some years ago the idea of imgCIF was created, but
for various reasons instrument manufacturers were reluctant to adapt
to this format. Since then with the advent of newer detector
technologies there has been a small explosion in the number and
variety of frame formats that are currently in use. It now seems a
daunting uphill task to convince all developers to rewrite their
firmware to output a common image format, therefore an alternative
must be found. As a community we currently archive data (positions and
structure factors) in a common format - CIF. There is no reason why
this philosophy would not work for the raw data as well. Users
currently convert all of their processed data into CIF format for
publication, therefore I put it to the DDDWG that one sensible way
forward would be to have users archive their raw data in a common format
(be that imgCIF or HDF5/NeXus) at the point of submission. There are
currently image conversion utilities available for some image formats
and it would not take a large investment of time to generate these for
all users; indeed, I am sure nearly all of these are written in various
places around the world. If the conversion is lossless and all
information on the experimental setup is maintained then there is no
reason for any degradation of data, but there is the huge advantage
that this information would then be of use to everyone for
reinvestigation or authentication protocols. I believe this results in
one moderately sized problem in deciding which format is the best to
use for archiving. This problem can be approached in different ways
although there is, I believe, a simple and pragmatic answer; the
majority of raw data is now produced at synchrotrons due to the
technologies employed - therefore we should take the direction from
them as they are mostly working towards something common in format.
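
The lossless requirement in the outline can be tested by a round trip: convert, read back, and compare digests of pixels and metadata. The sketch below is a minimal illustration under stated assumptions. The container written by to_hub is a made-up stand-in (JSON header plus zlib-compressed 32-bit pixels), not imgCIF or HDF5/NeXus, and the metadata keys are hypothetical.

```python
import hashlib
import json
import struct
import zlib


def to_hub(pixels, metadata):
    """Pack pixels and metadata into a hypothetical stand-in container."""
    payload = zlib.compress(struct.pack(f"<{len(pixels)}i", *pixels))
    header = json.dumps(metadata, sort_keys=True).encode("utf-8")
    # Layout: 4-byte header length, JSON header, compressed pixel payload.
    return struct.pack("<I", len(header)) + header + payload


def from_hub(blob):
    """Recover pixels and metadata from the stand-in container."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    metadata = json.loads(blob[4:4 + header_len].decode("utf-8"))
    raw = zlib.decompress(blob[4 + header_len:])
    pixels = list(struct.unpack(f"<{len(raw) // 4}i", raw))
    return pixels, metadata


def digest(pixels, metadata):
    """SHA-256 over canonical metadata followed by the raw pixel bytes."""
    h = hashlib.sha256()
    h.update(json.dumps(metadata, sort_keys=True).encode("utf-8"))
    h.update(struct.pack(f"<{len(pixels)}i", *pixels))
    return h.hexdigest()


# Made-up frame: includes a negative flag value and a large count.
native_pixels = [0, 17, 65535, -1, 2_000_000_000]
native_meta = {"wavelength_A": 0.9795, "distance_mm": 250.0, "osc_range_deg": 0.1}

restored_pixels, restored_meta = from_hub(to_hub(native_pixels, native_meta))
lossless = digest(native_pixels, native_meta) == digest(restored_pixels, restored_meta)
print("lossless round trip:", lossless)
```

Storing such a digest alongside the archived file would let a later reader confirm that nothing was altered in conversion.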


 

 

 

Dr Michael R. Probert
Head of Crystallography
Lecturer in Inorganic Chemistry
School of Chemistry
Newcastle University
Bedson Building
Newcastle upon Tyne
NE1 7RU

tel: +44(0) 191 208 6641
fax: +44(0) 191 208 6929
_______________________________________________
dddwg mailing list
dddwg@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/dddwg

 

 


-- 
Dr. Wladek Minor
Professor of Molecular Physiology and Biological Physics
Phone: 434-243-6865
Fax: 434-982-1616
http://krzys.med.virginia.edu/CrystUVa/wladek.htm


US-mail address:
Department of Molecular Physiology and Biological Physics
University of Virginia
PO Box 800736, Charlottesville, VA 22908-0736

Fed-Ex address:
Department of Molecular Physiology and Biological Physics
1340 Jefferson Park Avenue
University of Virginia
Charlottesville, VA 22908
