Re: [Imgcif-l] Adding references to external files to imgCIF

Dear James,

On the face of it, this looks a lot to me like a reinvention of HDF5 -
perhaps with specific semantics - and there is already a (complete?)
mapping from imgCIF to HDF5 / NeXus.

Have I missed something? No offence meant, I am just trying to understand
the shape of the problem you are trying to solve.

Thanks & best wishes, Graeme

> On 13 Feb 2019, at 05:15, James Hester <jamesrhester@gmail.com> wrote:
>
> Dear All,
>
> Recent Commdat discussion revealed a desire to reference external images
> from within an imgCIF file. This would allow the metadata for a dataset to
> be held within a single imgCIF file, while the frames themselves remain
> separate. This avoids the impracticality of navigating through an enormous
> multi-frame imgCIF file in order to extract a relatively compact amount of
> information.
>
> As a starting proposal, I suggest we extend the _array_data category with
> the following three datanames:
>
> (1) _array_data.external_format    A value drawn from an enumerated list of
> formats (e.g. "SMV", "HDF5", "Bruker"). The definition for each enumerated
> value would explain how to interpret _array_data.internal_path
> (2) _array_data.location_uri           A URI for the file containing the
> image. A relative URI is resolved relative to the location of the imgCIF file
> (3) _array_data.internal_path        A format-specific string describing
> the location of the frame within the file identified by
> _array_data.location_uri, interpreted according to the value given in
> _array_data.external_format
>
> So for a multi-frame HDF5 file buried in a subdirectory of the location
> referenced with a DOI, with appropriate definitions of the path notation:
>
> loop_
> _array_data.array_id
> _array_data.binary_id
> _array_data.external_format
> _array_data.location_uri
> _array_data.internal_path
> 1 1 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[0]
> 1 2 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[1]
> ...
>
> Or for a bunch of single-frame files generated by an ADSC detector in the
> same directory as the imgCIF file:
>
> loop_
> _array_data.array_id
> _array_data.binary_id
> _array_data.external_format
> _array_data.location_uri
> 1 1 ADSC ./tartaric.001
> 1 2 ADSC ./tartaric.002
> 1 3 ADSC ./tartaric.003
> ...
>
> The imgCIF data items describing the structure of the data array would
> refer to the data after it has been provided by the format. The form in
> which it is provided should be specified in the definition of each value of
> "_array_data.external_format".  So, for example, the various compression
> methods in HDF5 would be invisible if the data as returned are specified to
> be an array of Reals.
> 
> From the point of view of initial data validation, it would be sufficient
> to check that all referenced files are accessible, and that the provided
> locations exist.
> 
> Thoughts?
> James.
> 
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> _______________________________________________
> imgcif-l mailing list
> imgcif-l@iucr.org
> http://mailman.iucr.org/cgi-bin/mailman/listinfo/imgcif-l
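
For illustration only, the initial validation step described in the quoted proposal (checking that referenced files are accessible) could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not part of the proposal: the names ExternalFrame, parse_rows, resolve_location and missing_local_files are hypothetical, the loop rows are split on whitespace rather than read with a real CIF parser, and only locations that resolve to local files are checked (a doi: location is passed through untouched).

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional
from urllib.parse import urljoin, urlparse
from urllib.request import url2pathname


@dataclass
class ExternalFrame:
    """One row of the proposed _array_data loop (hypothetical container)."""
    array_id: str
    binary_id: str
    external_format: str
    location_uri: str
    internal_path: Optional[str] = None  # absent for single-frame files


def parse_rows(rows: str) -> List[ExternalFrame]:
    """Split whitespace-separated loop rows into frames.

    Assumption: no value contains whitespace; a real reader would use
    a CIF parser instead.
    """
    return [ExternalFrame(*line.split()) for line in rows.strip().splitlines()]


def resolve_location(imgcif_path: Path, location_uri: str) -> str:
    """Resolve a relative location against the imgCIF file's own location.

    URIs that carry their own scheme (for example doi:) are returned as given.
    """
    base = imgcif_path.resolve().as_uri()
    return urljoin(base, location_uri)


def missing_local_files(imgcif_path: Path, frames: List[ExternalFrame]) -> List[str]:
    """Return the location_uri values that resolve to absent local files."""
    missing = []
    for frame in frames:
        resolved = urlparse(resolve_location(imgcif_path, frame.location_uri))
        if resolved.scheme != "file":
            continue  # remote locations are not checked in this sketch
        if not Path(url2pathname(resolved.path)).is_file():
            missing.append(frame.location_uri)
    return missing
```

Checking that the location named by _array_data.internal_path exists inside each file would depend on the definition attached to each _array_data.external_format value, so it is left out here.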
