Re: [Imgcif-l] Adding references to external files to imgCIF
- To: "Herbert J. Bernstein" <yayahjb@gmail.com>
- Subject: Re: [Imgcif-l] Adding references to external files to imgCIF
- From: James Hester <jamesrhester@gmail.com>
- Date: Thu, 14 Feb 2019 14:39:03 +1100
- Cc: The Crystallographic Binary File and its imgCIF application to image data <imgcif-l@iucr.org>
- In-Reply-To: <CABcsX27+RTDq9HKsVBKRqn7Xs9_sV=o7xDkd0K_YAuxPRWTLPw@mail.gmail.com>
- References: <CAM+dB2dGcbLy3NuMy1g=QvWP3Mhj09F1WksKRXJ1BHeZ9_fXyw@mail.gmail.com><FDBF95B6-0C0A-48A1-92B4-9B567AD5C9E5@diamond.ac.uk><CAM+dB2c9qOZg8D151WwoJYkM_YtR-+kKcFvNaLNk4cM=3vEoQQ@mail.gmail.com><CABcsX27+RTDq9HKsVBKRqn7Xs9_sV=o7xDkd0K_YAuxPRWTLPw@mail.gmail.com>
Thanks for the support, Herbert.

Does anybody have any concerns about, or improvements to, the data names that I sent originally? If not, I guess I will write up some formal dictionary definitions for your consideration.

James.

On Wed, 13 Feb 2019 at 21:39, Herbert J. Bernstein <yayahjb@gmail.com> wrote:

> Dear Colleagues,
>
> Since 2012 NIAC and COMCIFS have worked cooperatively to make imgCIF/CBF and NeXus/HDF5 fully interoperable. This is very far along, e.g. with NXtransformations having been added to NeXus/HDF5 to carry the same information as imgCIF/CBF AXIS.
>
> What James has suggested will allow imgCIF/CBF to carry the same dataset structure information as is conveyed in the external links of an Eiger dataset, which divides the collected data into a master file with the metadata and a set of data files. This structural division may not be important for some smaller datasets with only a few hundred to a few thousand frames, but can be very important in handling the larger datasets that are encountered in serial crystallography. Even for the smaller datasets this approach can help to solve a problem for archives and facilities that need to store metadata in a relational database while the data itself has been parked in raw file systems, non-relational databases, Zenodo, etc.
>
> As with almost all of CIF, imgCIF/CBF metadata maps very easily and directly into relational tables, while putting NeXus/HDF5 metadata into a relational database first requires exactly the same sort of transformations as we have already designed to map NeXus/HDF5 metadata into imgCIF/CBF.
>
> To me it seems that James' suggestion is not a reinvention of this particular wheel, but may be an important step in avoiding reinvention of the wheel. This may avoid a lot of unnecessary transformation of huge quantities of raw data in serial crystallography while making the metadata more accessible.
>
> I would suggest giving James' suggestion serious consideration.
>
> Regards,
> Herbert
>
> On Wed, Feb 13, 2019 at 4:02 AM James Hester <jamesrhester@gmail.com> wrote:
>
>> Dear Graeme,
>>
>> The context of this is the idea that a single imgCIF file, containing the metadata pertaining to a collection, could be generated from that collection of raw image files (in whatever format, whether HDF5, ADSC, Bruker, Rigaku, etc.). In such a situation, some way of referring to the raw frames from within the imgCIF file is required.
>>
>> I agree that a perfectly reasonable approach is not to generate any new file at all, and simply to access the metadata directly in whatever format happens to be there. This was my initial impulse as well, and it took me a while to understand that the actual proposal was to create an imgCIF file, rather than just use imgCIF datanames for specification purposes. From a semantic point of view both amount to the same thing, so my only real motivation here is to add an image linking facility to imgCIF so that the "generate a summary metadata file" approach is possible.
>>
>> Could we just copy the HDF5 way of referring to objects in other HDF5 files as a quick solution?
>>
>> all the best,
>> James.
>>
>> On Wed, 13 Feb 2019 at 19:03, Graeme.Winter@Diamond.ac.uk <Graeme.Winter@diamond.ac.uk> wrote:
>>
>> > Dear James,
>> >
>> > On the face of it, this looks a lot to me like a reinvention of HDF5 - perhaps with specific semantics - and there is already a (complete?) mapping from imgCIF to HDF5 / NeXus.
>> >
>> > Have I missed something?
>> > No offence meant, just trying to understand the shape of the problem you are trying to solve.
>> >
>> > Thanks & best wishes
>> > Graeme
>> >
>> > > On 13 Feb 2019, at 05:15, James Hester <jamesrhester@gmail.com> wrote:
>> > >
>> > > Dear All,
>> > >
>> > > Recent Commdat discussion revealed a desire to reference external images from within an imgCIF file. This would allow the metadata for a dataset to be held within a single imgCIF file, while the frames themselves remain separate. This avoids the impracticality of navigating through an enormous multi-frame imgCIF file in order to extract a relatively compact amount of information.
>> > >
>> > > As a starting proposal, I suggest we extend the _array_data category with the following three datanames:
>> > >
>> > > (1) _array_data.external_format  A value drawn from an enumerated list of formats (e.g. "SMV", "HDF5", "Bruker"). The definition for each enumerated value would explain how to interpret _array_data.internal_path.
>> > > (2) _array_data.location_uri  A URI for the file containing the image.
>> > >     A relative URL is resolved relative to the location of the imgCIF file.
>> > > (3) _array_data.internal_path  A format-specific string describing the location of the frame within the file identified by _array_data.location_uri, interpreted according to the value given in _array_data.external_format.
>> > >
>> > > So for a multi-frame HDF5 file buried in a subdirectory of the location referenced with a DOI, with appropriate definitions of the path notation:
>> > >
>> > > loop_
>> > > _array_data.array_id
>> > > _array_data.binary_id
>> > > _array_data.external_format
>> > > _array_data.location_uri
>> > > _array_data.internal_path
>> > > 1 1 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[0]
>> > > 1 2 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[1]
>> > > ...
>> > >
>> > > Or for a bunch of single-frame files generated by an ADSC detector in the same directory as the imgCIF file:
>> > >
>> > > loop_
>> > > _array_data.array_id
>> > > _array_data.binary_id
>> > > _array_data.external_format
>> > > _array_data.location_uri
>> > > 1 1 ADSC ./tartaric.001
>> > > 1 2 ADSC ./tartaric.002
>> > > 1 3 ADSC ./tartaric.003
>> > > ...
>> > >
>> > > The imgCIF data items describing the structure of the data array would refer to the data after it has been provided by the format. The form in which it is provided should be specified in the definition of each value of "_array_data.external_format". So, for example, the various compression methods in HDF5 would be invisible if the data as returned are specified to be an array of Reals.
>> > >
>> > > From the point of view of initial data validation, it would be sufficient to check that all referenced files are accessible, and that the provided locations exist.
>> > >
>> > > Thoughts?
>> > > James.

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
imgcif-l mailing list
imgcif-l@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/imgcif-l
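
A minimal sketch, in Python, of the initial validation step mentioned in the original proposal (checking that all referenced files are accessible). It assumes the loop rows have already been read into tuples by some CIF parser. The function names `resolve_location` and `check_rows` are hypothetical, and non-local URIs such as `doi:` are only flagged, not resolved. Checking `_array_data.internal_path` is left out because the path notation has not been defined yet.

```python
from pathlib import Path
from urllib.parse import unquote, urlparse


def resolve_location(location_uri, imgcif_path):
    """Map a relative or file: URI to a local Path; return None for anything else.

    Assumption: relative references are resolved against the directory that
    holds the imgCIF file, as suggested in the proposal.
    """
    parsed = urlparse(location_uri)
    if parsed.scheme in ("", "file"):
        return (Path(imgcif_path).parent / unquote(parsed.path)).resolve()
    # doi:, http:, https: and so on would need a resolver outside this sketch.
    return None


def check_rows(rows, imgcif_path):
    """Check each (array_id, binary_id, external_format, location_uri) row.

    Returns a list of (binary_id, status) pairs, where status is
    "ok", "missing", or "unchecked" (non-local URI).
    """
    results = []
    for _array_id, binary_id, _external_format, location_uri in rows:
        local = resolve_location(location_uri, imgcif_path)
        if local is None:
            status = "unchecked"
        elif local.is_file():
            status = "ok"
        else:
            status = "missing"
        results.append((binary_id, status))
    return results
```

For the ADSC example, the rows would be passed as `("1", "1", "ADSC", "./tartaric.001")` and so on, together with the path of the imgCIF file itself, so that `./tartaric.001` is looked up next to that file.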