Re: [Imgcif-l] Adding references to external files to imgCIF

Dear James,
On the face of it, this looks to me a lot like a reinvention of HDF5, perhaps with specific semantics, and there is already a (complete?) mapping from imgCIF to HDF5 / NeXus.
Have I missed something? No offence meant; I am just trying to understand the shape of the problem you are trying to solve.
Thanks & best wishes,
Graeme
> On 13 Feb 2019, at 05:15, James Hester <jamesrhester@gmail.com> wrote:
>
> Dear All,
>
> Recent Commdat discussion revealed a desire to reference external images
> from within an imgCIF file. This would allow the metadata for a dataset to
> be held within a single imgCIF file, while the frames themselves remain
> separate. This avoids the impracticality of navigating through an enormous
> multi-frame imgCIF file in order to extract a relatively compact amount of
> information.
>
> As a starting proposal, I suggest we extend the _array_data category with
> the following three datanames:
>
> (1) _array_data.external_format
>     A value drawn from an enumerated list of formats (e.g. "SMV", "HDF5",
>     "Bruker"). The definition for each enumerated value would explain how
>     to interpret _array_data.internal_path
> (2) _array_data.location_uri
>     A URI for the file containing the image. A relative URI is relative to
>     the location of the imgCIF file
> (3) _array_data.internal_path
>     A format-specific string describing the location of the frame within
>     the file identified by _array_data.location_uri, interpreted according
>     to the value given in _array_data.external_format
>
> So for a multi-frame HDF5 file buried in a subdirectory of the location
> referenced with a DOI, with appropriate definitions of the path notation:
>
> loop_
> _array_data.array_id
> _array_data.binary_id
> _array_data.external_format
> _array_data.location_uri
> _array_data.internal_path
> 1 1 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[0]
> 1 2 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[1]
> ...
>
> Or for a bunch of single-frame files generated by an ADSC detector in the
> same directory as the imgCIF file:
>
> _array_data.array_id
> _array_data.binary_id
> _array_data.external_format
> _array_data.location_uri
> 1 1 ADSC ./tartaric.001
> 1 2 ADSC ./tartaric.002
> 1 3 ADSC ./tartaric.003
> ...
>
> The imgCIF data items describing the structure of the data array would
> refer to the data after it has been provided by the format. The form in
> which it is provided should be specified in the definition of each value of
> "_array_data.external_format". So, for example, the various compression
> methods in HDF5 would be invisible if the data as returned are specified to
> be an array of Reals.
>
> From the point of view of initial data validation, it would be sufficient
> to check that all referenced files are accessible, and that the provided
> locations exist.
>
> Thoughts?
> James.
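
For illustration only, the initial validation step described in the quoted proposal (checking that referenced files are accessible) might be sketched in Python as follows. Nothing here is an agreed definition: the internal_path notation "<file>:<dataset>[<frame>]" is an assumption inferred from the NXMX example rows, and ExternalFrame, split_internal_path, resolve_local and missing_files are hypothetical names. Only relative location_uri values are checked on disk; anything with a scheme, such as doi:, is skipped because resolving it would need a network lookup.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional, Tuple
from urllib.parse import urlparse


@dataclass
class ExternalFrame:
    """One row of the proposed _array_data loop (hypothetical container)."""
    array_id: str
    binary_id: str
    external_format: str
    location_uri: str
    internal_path: Optional[str] = None


# Assumed notation, inferred from the example rows: "<file>:<dataset>[<frame>]".
_INTERNAL = re.compile(
    r"^(?P<file>[^:]+):(?P<dataset>/[^\[\]]+)\[(?P<frame>\d+)\]$"
)


def split_internal_path(internal_path: str) -> Tuple[str, str, int]:
    """Split an internal_path into (file, dataset path, frame index)."""
    match = _INTERNAL.match(internal_path)
    if match is None:
        raise ValueError(f"unrecognised internal_path: {internal_path!r}")
    return match["file"], match["dataset"], int(match["frame"])


def resolve_local(frame: ExternalFrame, imgcif_path: str) -> Optional[Path]:
    """Return a local path for a relative URI, or None if a scheme is present."""
    if urlparse(frame.location_uri).scheme == "":
        # Relative references are taken relative to the imgCIF file itself.
        return Path(imgcif_path).parent / frame.location_uri
    return None


def missing_files(frames: List[ExternalFrame], imgcif_path: str) -> List[str]:
    """List the relative location_uri values that do not exist on disk."""
    missing = []
    for frame in frames:
        local = resolve_local(frame, imgcif_path)
        if local is not None and not local.is_file():
            missing.append(frame.location_uri)
    return missing
```

For the ADSC example, missing_files(frames, "/data/run1/meta.cif") would look for tartaric.001 and so on in /data/run1 (a made-up directory) and return the references it could not find.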

