Discussion List Archives


Re: [Imgcif-l] Adding references to external files to imgCIF

Dear Colleagues,
  Since 2012 NIAC and COMCIFS have worked cooperatively to make imgCIF/CBF and NeXus/HDF5 fully interoperable.  This is very far along, e.g. with NXtransformations having been added to NeXus/HDF5 to carry the same information as imgCIF/CBF AXIS. What James has suggested will allow imgCIF/CBF to carry the same dataset structure information as is conveyed in the external links of an Eiger dataset, which divides the collected data into a master file with the metadata and a set of data files.  This structural division may not be important for some smaller datasets with only a few hundred to a few thousand frames, but can be very important in handling the larger datasets that are encountered in serial crystallography.  Even for the smaller datasets this approach can help to solve a problem for archives and facilities that need to store metadata in a relational database while the data itself has been parked in raw file systems, non-relational databases, Zenodo, etc.  As with almost all of CIF, imgCIF/CBF metadata maps very easily and directly into relational tables, while putting NeXus/HDF5 metadata into a relational database first requires exactly the same sort of transformations as we have already designed to map NeXus/HDF5 metadata into imgCIF/CBF.
  To me it seems that James' suggestion is not a reinvention of this particular wheel, but may be an important step in avoiding reinvention of the wheel.  This may avoid a lot of unnecessary transformation of huge quantities of raw data in serial crystallography while making the metadata more accessible.
  I would suggest giving James' suggestion serious consideration.
  Regards,
    Herbert
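
The remark above that imgCIF/CBF metadata maps directly into relational tables can be illustrated with a minimal Python sketch. It is not part of the original message. The table name `array_data`, the column layout, and the helper `load_array_data` are assumptions for illustration only; the column names simply mirror the `_array_data` datanames proposed in the quoted thread, and the rows reuse the ADSC example file names from that proposal.

```python
import sqlite3

def load_array_data(rows):
    """Store _array_data-style loop rows in an in-memory SQLite table.

    The table layout is an illustrative assumption, not part of any
    imgCIF dictionary or archive schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE array_data ("
        "array_id TEXT, binary_id TEXT, external_format TEXT, "
        "location_uri TEXT, internal_path TEXT, "
        "PRIMARY KEY (array_id, binary_id))"
    )
    # Each CIF loop row becomes one relational row, column for column.
    conn.executemany("INSERT INTO array_data VALUES (?, ?, ?, ?, ?)", rows)
    return conn

# Rows taken from the single-frame ADSC example in the quoted proposal;
# internal_path is left NULL because that example does not use it.
rows = [
    ("1", "1", "ADSC", "./tartaric.001", None),
    ("1", "2", "ADSC", "./tartaric.002", None),
    ("1", "3", "ADSC", "./tartaric.003", None),
]
conn = load_array_data(rows)
adsc_count = conn.execute(
    "SELECT COUNT(*) FROM array_data WHERE external_format = 'ADSC'"
).fetchone()[0]
print(adsc_count)
```

In this sketch the raw frames stay wherever they are; only the references to them are stored and queried.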
On Wed, Feb 13, 2019 at 4:02 AM James Hester <jamesrhester@gmail.com> wrote:
> Dear Graeme,
>
> The context of this is the idea that a single imgCIF file could be
> generated from a collection of raw image files (in whatever format, whether
> HDF5, or ADSC, or Bruker, or Rigaku, etc.) which would contain the metadata
> pertaining to that collection. In such a situation, some way of referring
> to the raw frames from within the imgCIF file is required.
>
> I agree that a perfectly reasonable approach is not to generate any new
> file at all, and simply to access the metadata directly in whatever format
> happens to be there. This was my initial impulse as well and it took me a
> while to understand that the actual proposal was to create an imgCIF file,
> rather than just use imgCIF datanames for specification purposes.  From a
> semantic point of view both amount to the same thing so my only real
> motivation here is to add an image linking facility to imgCIF so that the
> "generate a summary metadata file" approach is possible.
>
> Could we just copy the HDF5 way of referring to objects in other HDF5 files
> as a quick solution?
>
> all the best,
> James.
>
> On Wed, 13 Feb 2019 at 19:03, Graeme.Winter@Diamond.ac.uk
> <Graeme.Winter@diamond.ac.uk> wrote:
>
> > Dear James,
> >
> > On the face of it, this looks a lot to me like a reinvention of HDF5 -
> > perhaps with specific semantics - and there is already a (complete?)
> > mapping from imgCIF to HDF5 / NeXus
> >
> > Have I missed something? No offence meant, trying to understand the shape
> > of the problem you are trying to solve
> >
> > Thanks & best wishes Graeme
> >
> > > On 13 Feb 2019, at 05:15, James Hester <jamesrhester@gmail.com> wrote:
> > >
> > > Dear All,
> > >
> > > Recent Commdat discussion revealed a desire to reference external images
> > > from within an imgCIF file. This would allow the metadata for a dataset
> > > to be held within a single imgCIF file, while the frames themselves
> > > remain separate. This avoids the impracticality of navigating through an
> > > enormous multi-frame imgCIF file in order to extract a relatively compact
> > > amount of information.
> > >
> > > As a starting proposal, I suggest we extend the _array_data category with
> > > the following three datanames:
> > >
> > > (1) _array_data.external_format    A value drawn from an enumerated list
> > > of formats (e.g. "SMV","HDF5","Bruker"). The definition for each
> > > enumerated value would explain how to interpret
> > > _array_data.internal_path
> > > (2) _array_data.location_uri       A URI for the file containing the
> > > image. A relative URI is relative to the location of the imgCIF file
> > > (3) _array_data.internal_path      A format-specific string describing
> > > the location of the frame within the file identified by
> > > _array_data.location_uri, interpreted according to the value given in
> > > _array_data.external_format
> > >
> > > So for a multi-frame HDF5 file buried in a subdirectory of the location
> > > referenced with a DOI, with appropriate definitions of the path notation:
> > >
> > > loop_
> > > _array_data.array_id
> > > _array_data.binary_id
> > > _array_data.external_format
> > > _array_data.location_uri
> > > _array_data.internal_path
> > > 1 1 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[0]
> > > 1 2 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[1]
> > > ...
> > >
> > > Or for a bunch of single-frame files generated by an ADSC detector in the
> > > same directory as the imgCIF file
> > >
> > > loop_
> > > _array_data.array_id
> > > _array_data.binary_id
> > > _array_data.external_format
> > > _array_data.location_uri
> > > 1 1 ADSC ./tartaric.001
> > > 1 2 ADSC ./tartaric.002
> > > 1 3 ADSC ./tartaric.003
> > > ...
> > >
> > > The imgCIF data items describing the structure of the data array would
> > > refer to the data after it has been provided by the format. The form in
> > > which it is provided should be specified in the definition of each value
> > > of "_array_data.external_format".  So, for example, the various
> > > compression methods in HDF5 would be invisible if the data as returned
> > > are specified to be an array of Reals.
> > >
> > > From the point of view of initial data validation, it would be sufficient
> > > to check that all referenced files are accessible, and that the provided
> > > locations exist.
> > >
> > > Thoughts?
> > > James.
> > >
> > > --
> > > T +61 (02) 9717 9907
> > > F +61 (02) 9717 3145
> > > M +61 (04) 0249 4148
> > > _______________________________________________
> > > imgcif-l mailing list
> > > imgcif-l@iucr.org
> > > http://mailman.iucr.org/cgi-bin/mailman/listinfo/imgcif-l
> >
> > --
> > This e-mail and any attachments may contain confidential, copyright and or
> > privileged material, and are for the use of the intended addressee only. If
> > you are not the intended addressee or an authorised recipient of the
> > addressee please notify us of receipt by returning the e-mail and do not
> > use, copy, retain, distribute or disclose the information in or attached to
> > the e-mail.
> > Any opinions expressed within this e-mail are those of the individual and
> > not necessarily of Diamond Light Source Ltd.
> > Diamond Light Source Ltd. cannot guarantee that this e-mail or any
> > attachments are free from viruses and we cannot accept liability for any
> > damage which you may sustain as a result of software viruses which may be
> > transmitted in or with the message.
> > Diamond Light Source Limited (company no. 4375679). Registered in England
> > and Wales with its registered office at Diamond House, Harwell Science and
> > Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom

International Union of Crystallography