Discussion List Archives


Re: [dddwg] To contribute to Open Science using Zenodo I found quite straightforward

Dear Wladek,

the separation of data and metadata is not imposed by HDF5 but part of the HDF5 implementation by EIGER.  I agree that it's not the most elegant and certainly not the best solution, but when representatives from the MX community agreed on a new standard for the future EIGER detector in 2013, this is how it was described.

The master.h5 file contains all relevant metadata for the experiment (though we're not there yet), and the data.h5 file or files contain the data.  The data are linked to from within master.h5/entry/data/.  There aren't many reasons the data shouldn't physically be in master.h5/entry/data.  If this were the case (and it probably is for most other data recorded in HDF5), you'd have to deal with one file only, with all the relevant metadata inseparably (unless you try hard) associated with the data.  You'd also have to deal with multi-gigabyte files, which might overwhelm the FAT32 disks many people still use to transport their data (FAT32 cannot hold a single file of 4 GiB or more).  Some Linux admins also limit the size of files users can create (and thus save).  This is the ulimit -f setting.
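To make the two size constraints concrete, here is a minimal Python sketch (Unix only, since it uses the standard resource module). The function names are made up for illustration. It checks a planned file size against the FAT32 per-file ceiling of 4 GiB minus one byte and against the per-process limit that ulimit -f controls. Note that the shell reports this limit in blocks, whereas getrlimit returns bytes.

```python
import resource  # Unix only

# Largest single file FAT32 can hold: 4 GiB minus one byte.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def fits_on_fat32(size_bytes):
    """True if a file of this size can be stored on a FAT32 volume."""
    return size_bytes <= FAT32_MAX_FILE_BYTES


def user_file_size_limit():
    """Soft RLIMIT_FSIZE in bytes (what `ulimit -f` sets), or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    return None if soft == resource.RLIM_INFINITY else soft


def can_write_and_transport(size_bytes):
    """True if the file is within both the user's limit and the FAT32 ceiling."""
    limit = user_file_size_limit()
    within_ulimit = limit is None or size_bytes <= limit
    return within_ulimit and fits_on_fat32(size_bytes)
```

A single-file dataset that fails this check could still be split or moved on a different file system; the sketch only reports the constraint.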

You could, if you wanted to, move all data from the external data.h5 files into master.h5/entry/data yourself.  This would be a good idea for archival in my opinion.  It would also be absolutely transparent because all MX software that knows how to work with HDF5 reads master.h5 and just follows the links.  The programs would treat data exactly the same if they were saved in master.h5/entry/data (unless I'm much mistaken).
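A rough sketch of what such a consolidation might look like with h5py, under stated assumptions: the file names, the link name data_000001 and the internal path /entry/data/data are chosen for illustration only, and make_demo merely builds a tiny stand-in dataset, not real EIGER output. The idea is to copy the master file, then replace each external link under /entry/data with a real copy of the dataset it points to. This is not a tested archival tool.

```python
import os
import shutil

import h5py
import numpy as np


def make_demo(directory):
    """Build a tiny master file whose /entry/data holds one external link."""
    data_name = "demo_data_000001.h5"  # hypothetical file name
    master_path = os.path.join(directory, "demo_master.h5")
    with h5py.File(os.path.join(directory, data_name), "w") as f:
        frames = np.arange(24, dtype="uint32").reshape(2, 3, 4)
        f.create_dataset("entry/data/data", data=frames)
    with h5py.File(master_path, "w") as f:
        grp = f.create_group("entry/data")
        # The link stores a relative file name plus a path inside that file.
        grp["data_000001"] = h5py.ExternalLink(data_name, "/entry/data/data")
        f["entry"].attrs["note"] = "placeholder metadata"
    return master_path


def consolidate(master_path, out_path):
    """Write a copy of the master in which linked data are stored physically."""
    shutil.copyfile(master_path, out_path)  # keeps all metadata as-is
    with h5py.File(master_path, "r") as src, h5py.File(out_path, "r+") as dst:
        data_in = src["entry/data"]
        data_out = dst["entry/data"]
        for name in list(data_in):
            link = data_in.get(name, getlink=True)
            if isinstance(link, h5py.ExternalLink):
                del data_out[name]  # removes only the link, not the target
                # data_in[name] follows the link into the external file.
                src.copy(data_in[name], data_out, name=name)
```

After consolidate(master, out) the output file no longer depends on the external data files, so a reader that opens /entry/data/data_000001 should see the same array either way.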

I will try to create a single-file HDF5 MX dataset to verify this.

All best.


Andreas



On Wed, Sep 28, 2016 at 7:17 AM, Wladek Minor <wladek@iwonka.med.virginia.edu> wrote:
All,

There is one problem with the HDF5 format. HDF5 requires a master file that should contain a description of the experiment. For example, most EIGER data are produced without a header. The master file format is arbitrary at the moment.

Wladek




On 9/27/2016 10:42 AM, Andrew Goetz wrote:
Simon,

why not pack the files into an HDF5 container? That is what we are planning to do for archiving files.

Andy

On 27.09.2016 16:27, Coles S.J. wrote:
John,

This is very good to hear!

I personally have an example where I wish to make the images available
for a data collection associated with a remarkable structure we are
going to submit to an IUCr journal quite soon. The primary reason I
want to make the raw data available is so that others can take them
and ‘have a go themselves’ to see if they can come up with a different
/ better model.

All fine – but I note you uploaded individual images? Or did Zenodo
unpack a zipfile? I don’t want to go through the effort of uploading
300 individual files – and I’m absolutely sure nobody wants to go
through the effort of downloading 300 individual files to reprocess
and reuse the dataset! Is there a reason for this - why not bundle
them together in a package of some sort (Zip too big)? Or am I missing
something?

Thanks!
Simon.

Simon Coles FRSC, SFHEA.
Professor of Structural Chemistry.
Director, UK National Crystallography Service.
Chemistry, Faculty of Natural and Environmental Sciences,
University of Southampton.
Southampton, SO17 1BJ. UK.
+44(0)2380596721
Staff Page: http://www.soton.ac.uk/chemistry/about/staff/sjc5.page
NCS: http://www.ncs.ac.uk | Southampton Diffraction Centre:
http://www.soton.ac.uk/sdc
ResearcherID: http://www.researcherid.com/rid/A-1795-2009 | ORCID:
http://orcid.org/0000-0001-8414-9272


From: dddwg <dddwg-bounces@iucr.org> on behalf of
"john.helliwell@manchester.ac.uk" <john.helliwell@manchester.ac.uk>
Date: Tuesday, 27 September 2016 at 14:48
To: "dddwg@iucr.org" <dddwg@iucr.org>
Subject: [dddwg] To contribute to Open Science using Zenodo I found
quite straightforward

Dear Colleagues,
I imagine it will be of general interest that contributing to
"OpenScience" via Zenodo is quite straightforward, i.e. the preprint
and raw data were uploaded and assigned DOIs on Sept 24th 2016 and then,
for good measure, announced on my Twitter account @HelliwellJohn on Sept
25th 2016:

#OpenScience Our atomic resolution cisplatin lysozyme study; the dois
are here:- http://doi.org/10.5281/zenodo.155068
http://doi.org/10.5281/zenodo.154704

The PDB code (5LXW) is cited in the preprint at the first Zenodo doi
above. My instruction to PDB to release the entry 5LXW I made on Sept
25th 2016.

Greetings,
John

Emeritus Prof of Chemistry John R Helliwell DSc_Physics
Perspectives in Crystallography
<https://www.crcpress.com/Perspectives-in-Crystallography/Helliwell/9781498732109>

Skills for a Scientific Life
<https://www.crcpress.com/Skills-for-a-Scientific-Life/Helliwell/p/book/9781498768757>
_______________________________________________
dddwg mailing list
dddwg@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/dddwg

--
Dr. Wladek Minor
Harrison Distinguished Professor
Department of Molecular Physiology and Biological Physics
Phone: 434-243-6865
Fax: 434-982-1616
http://krzys.med.virginia.edu/CrystUVa/wladek.htm


US-mail address:
Department of Molecular Physiology and Biological Physics
University of Virginia
PO Box 800736, Charlottesville, VA 22908-0736

Fed-Ex address:
Department of Molecular Physiology and Biological Physics
1340 Jefferson Park Avenue
University of Virginia
Charlottesville, VA 22908





--
Andreas Förster, Ph.D.
MX Application Scientist, Scientific Sales
Phone: +41 56 500 2100 | Direct: +41 56 500 2176 | Email: andreas.foerster@dectris.com
DECTRIS Ltd. | Taefernweg 1 | 5405 Baden-Daettwil | Switzerland www.dectris.com








