Re: [Imgcif-l] Adding references to external files to imgCIF
- To: James Hester <james.r.hester@gmail.com>
- Subject: Re: [Imgcif-l] Adding references to external files to imgCIF
- From: "Herbert J. Bernstein via imgcif-l" <imgcif-l@iucr.org>
- Date: Mon, 9 May 2022 05:24:23 -0400
- Cc: "Herbert J. Bernstein" <yayahjb@gmail.com>, Graeme Winter <graeme.winter@gmail.com>, The Crystallographic Binary File and its imgCIF application to image data <imgcif-l@iucr.org>, Aaron Brewster <asbrewster@lbl.gov>, Billy Poon <BKPoon@lbl.gov>
- In-Reply-To: <CAM+dB2eK1P1s98REHjOJW2zB--UOHznawM6qhx1ZdOSeF7wGUg@mail.gmail.com>
- References: <CAM+dB2dGcbLy3NuMy1g=QvWP3Mhj09F1WksKRXJ1BHeZ9_fXyw@mail.gmail.com><FDBF95B6-0C0A-48A1-92B4-9B567AD5C9E5@diamond.ac.uk><CAM+dB2c9qOZg8D151WwoJYkM_YtR-+kKcFvNaLNk4cM=3vEoQQ@mail.gmail.com><CABcsX27+RTDq9HKsVBKRqn7Xs9_sV=o7xDkd0K_YAuxPRWTLPw@mail.gmail.com><CAM+dB2d=bmnEP9f5d99xG+U-FQ26+mezOhprH=J+5AdzLgSGbQ@mail.gmail.com><CAM+dB2fbEnrZy46W-8AM+g-Rnp+s11diPztbm6tNOJ6FTBC5vA@mail.gmail.com><CAM+dB2eXcLOP2z_9mitnXACH0+KZf7tcFZ4jOLL0SNub1DKcqA@mail.gmail.com><CAM+dB2cZeMENx5hrRXYFXdJJk+AjtkOB2nqOgGW1zF0=mzYd3g@mail.gmail.com><CABcsX24kz0-VeXA7P_kYT=135QrO1k0pLwHco8dBo=-djL-8=w@mail.gmail.com><CAM+dB2eY9DbdgOs--CrmsoyRQ4F0SuD37J4Owg7bDutWKasGHA@mail.gmail.com><CABcsX27tzfC0VAckEAqWY8Mb3MKDQAc4ZpfwooabZu_=p5QdfA@mail.gmail.com><CAM+dB2eK1P1s98REHjOJW2zB--UOHznawM6qhx1ZdOSeF7wGUg@mail.gmail.com>
Dear James,

Yes, a specific pull request would be very helpful, since I am not sure I really understand what is needed here. The .html is the primary change needed, but if you could provide the .dic as you did before, I can work from that.

Regards,
Herbert

On Sun, May 8, 2022, 8:43 PM James H <jamesrhester@gmail.com> wrote:

> Thanks Herbert for the clarification. Regarding array sections I think we might be talking about different things but I'll park that for the moment as it is not urgent.
>
> What is urgent is that two of the new external data tags have been left out of the update. Please see the issue at https://github.com/yayahjb/cbflib/issues/46 which I'm drawing to your attention here as I'm not sure if anybody looks at issues posted on Github. I'd be happy to create a pull request if that makes life easier.
>
> all the best,
> James.
>
> On Sat, 7 May 2022 at 00:18, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
>>
>> Dear James,
>> The normalization was discussed ages ago and is consistent with DDL2 conventions, which are more normalized than core cif. The array sections were introduced when we had to start dealing with the Eiger. It is routine in an Eiger 16M data collection to revert to a 4M ROI (built into the hardware) when more speed is required. Such descriptions have to be somewhere. As speeds increase further, we will soon need to make more use of module-by-module ROIs, and we definitely will have to pull them in both individually and in groups instead of trying to only move full images. What approach do you suggest for such cases?
>> Regards,
>> Herbert
>>
>> On Thu, May 5, 2022 at 11:15 PM James H <jamesrhester@gmail.com> wrote:
>>>
>>> Thanks Herbert for making these updates. My apologies for taking so long to come back to them.
>>>
>>> I notice that the new 1.8.5 has moved the _array_data.external_data_* tags into a separate array_data_external_data loop. While I appreciate that a separate loop is as good a place as any, I would also have appreciated some discussion of this - perhaps I missed it? Anyway, I do not propose to dispute it now.
>>>
>>> More importantly, _array_data_external_data.frame seems to have acquired a format ARRAYID(start1:end1:stride1,start2:end2:stride2, ...) which I don't recall discussing, and there are now references in the definition to ARRAY_STRUCTURE_LIST which I believe miss the point that the ARRAY_STRUCTURE_LIST items are used to characterise the array after it has been obtained from the external data source, and are definitely *not* supposed to describe the layout of the data within the external data source. Likewise, ARRAY_ID refers to the layout of the data after they have been delivered, and so has no direct relevance to how the data are stored. I appreciate that C and Fortran layout should be considered by the author of the imgCIF file when describing what will be returned from the external source, but I'm not sure that this warning is particularly necessary here as the author will in any case be forced to consider the details of the format-specific behaviour when constructing the external data pointer.
>>>
>>> thanks,
>>> James.
>>>
>>> On Wed, 6 Apr 2022 at 23:39, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
>>>>
>>>> Dear Colleagues,
>>>>
>>>> I propose the following plan of action to get James' changes into the cif_img dictionary
>>>>
>>>> 0. In both the yayahjb cbflib active branches: main and CBFlib-0.9.7-devel, bring the currently posted cif_img_1.8.5 dictionary up to an agreed level (which will be called 1.8.6 if there are any changes) and make one last CBFlib 0.9.6 release with that as the default dictionary
>>>> 1. Merge the current CBFlib_0.9.7-devel branch into main
>>>> 2. Make that the default release in yayahjb
>>>>
>>>> If nobody objects, I plan to post the necessary pull requests and releases this weekend.
>>>>
>>>> On Wed, Apr 6, 2022 at 5:10 AM James H <jamesrhester@gmail.com> wrote:
>>>>>
>>>>> Dear All,
>>>>>
>>>>> Just a quick note: a further year later and the external data pointers work has not yet been merged, and neither has a further proposed data name [1]. On the bright side an implementation using these pointers has been published as a test of practicality [2]. It would of course be most welcome if imgCIF deliberative processes could get themselves to the point that these new data names are merged into the official version of the main dictionary, given that no issues have been identified.
>>>>>
>>>>> Meanwhile, in order to facilitate use of automated DDLm checking tools on data files using imgCIF data names, I have now generated (1) a direct translation of current version 1.8.4 into DDLm (2) a direct translation with added external data pointers to DDLm in a separate "journals-extension" branch.
>>>>> Both of these currently exist as pull requests on the https://github.com/COMCIFS/imgCIF repository, which is intended to hold the DDLm version of the imgCIF dictionary. Anyone is most welcome to comment on these pull requests of course, but I emphasise that they simply use a different dictionary language for defining the same data names, and therefore should have no implications for current imgCIF/CBF usage.
>>>>>
>>>>> best wishes,
>>>>> James.
>>>>>
>>>>> [1] pull request at https://github.com/yayahjb/cbflib/pull/39
>>>>> [2] https://github.com/jamesrhester/ImgCIFHandler.jl
>>>>>
>>>>> On Mon, 12 Apr 2021 at 16:38, James H <jamesrhester@gmail.com> wrote:
>>>>>>
>>>>>> Dear All,
>>>>>>
>>>>>> Over a year later I have now written up definitions in DDL2 for inclusion in imgCIF. The full definitions are at the Github issue (https://github.com/COMCIFS/imgCIF/issues/7). Please have a look and provide feedback here or there. Note that I have added datanames for specifying that the images are contained within compressed archives. I've checked a few known sources of images (proteindiffraction.org, zenodo, a uni repository) and this scheme seems to cover those bases. If you have time, please have a look at your favourite open archive of raw data to see if this scheme is sufficient for you to specify a particular image in that archive. I've reproduced the examples from the definitions below.
>>>>>>
>>>>>> Of course, in a perfect world we would just give a DOI but those days are not yet upon us due to landing pages. Happy to be corrected on that.
>>>>>>
>>>>>> best wishes,
>>>>>> James.
>>>>>>
>>>>>> Examples
>>>>>> ========
>>>>>> # The frames are contained in a single HDF5-format file accessible
>>>>>> # at https://zenodo.org/record/12345/files/tartaric.h5. An array of 2D
>>>>>> # images is found at HDF5 location /entry1/detector1/data
>>>>>>
>>>>>> loop_
>>>>>> _array_data.array_id
>>>>>> _array_data.binary_id
>>>>>> _array_data.external_format
>>>>>> _array_data.location_uri
>>>>>> _array_data.external_path
>>>>>> _array_data.external_frame
>>>>>> 1 1 HDF5 https://zenodo.org/record/12345/files/tartaric.h5 /entry1/detector1/data 1
>>>>>> 1 2 HDF5 https://zenodo.org/record/12345/files/tartaric.h5 /entry1/detector1/data 2
>>>>>> ...
>>>>>>
>>>>>> # Frames are contained in individual Smart6000 Bruker-format files
>>>>>> # accessible using https://uni_repo.edu/5341 in subdirectory run1.
>>>>>>
>>>>>> loop_
>>>>>> _array_data.array_id
>>>>>> _array_data.binary_id
>>>>>> _array_data.external_format
>>>>>> _array_data.external_version
>>>>>> _array_data.location_uri
>>>>>> 1 1 Bruker Smart6000 https://uni_repo.edu/5341/run1/tartaric.001
>>>>>> 1 2 Bruker Smart6000 https://uni_repo.edu/5341/run1/tartaric.002
>>>>>> ...
>>>>>>
>>>>>> # Frames with SMV format are contained at data.proteindiffraction.org in a tarred
>>>>>> # archive compressed with bzip2.
>>>>>>
>>>>>> loop_
>>>>>> _array_data.array_id
>>>>>> _array_data.binary_id
>>>>>> _array_data.external_format
>>>>>> _array_data.location_uri
>>>>>> _array_data.external_archive_format
>>>>>> _array_data.external_archive_path
>>>>>> 1 1 SMV
>>>>>> https://data.proteindiffraction.org/ssgcid/MyulA_01062_a_B12-sddc0001574_7k69.tar.bz2
>>>>>> TBZ
>>>>>> MyulA_01062_a_B12-sddc0001574_7k69/data/317895h4_y_0001.img
>>>>>> 1 2 SMV
>>>>>> https://data.proteindiffraction.org/ssgcid/MyulA_01062_a_B12-sddc0001574_7k69.tar.bz2
>>>>>> TBZ
>>>>>> MyulA_01062_a_B12-sddc0001574_7k69/data/317895h4_y_0002.img
>>>>>>
>>>>>> On Tue, 5 Mar 2019 at 16:37, James Hester <jamesrhester@gmail.com> wrote:
>>>>>>>
>>>>>>> OK, I've drafted up some definitions (just the human-readable part for now) for you all to peruse. Please look at https://github.com/COMCIFS/imgCIF/issues/7 and provide feedback here or there.
>>>>>>>
>>>>>>> all the best,
>>>>>>> James.
>>>>>>>
>>>>>>> On Thu, 14 Feb 2019 at 14:39, James Hester <jamesrhester@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Thanks for the support Herbert. Does anybody have any concerns or improvements to the data names that I sent originally? If not, I guess I will write up some formal dictionary definitions for your consideration.
>>>>>>>>
>>>>>>>> James.
>>>>>>>>
>>>>>>>> On Wed, 13 Feb 2019 at 21:39, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Dear Colleagues,
>>>>>>>>>
>>>>>>>>> Since 2012 NIAC and COMCIFS have worked cooperatively to make imgCIF/CBF and NeXus/HDF5 fully interoperable. This is very far along, e.g. with NeXus/HDF5 NXtransformations having been added to NeXus/HDF5 to carry the same information as imgCIF/CBF AXIS. What James has suggested will allow imgCIF/CBF to carry the same dataset structure information as is conveyed in the external links of an Eiger dataset, which divides the collected data into a master file with the metadata and a set of datafiles. This structural division may not be important for some smaller datasets with only a few hundred to a few thousand frames, but can be very important in handling the larger datasets encountered in serial crystallography. Even for the smaller datasets this approach can help to solve a problem for archives and facilities that need to store metadata in a relational database while the data itself has been parked in raw file systems, non-relational databases, zenodo, etc. As with almost all of CIF, imgCIF/CBF metadata maps very easily and directly into relational tables, while putting NeXus/HDF5 metadata into a relational database first requires exactly the same sort of transformations as we have already designed to map NeXus/HDF5 metadata into imgCIF/CBF. To me it seems that James' suggestion is not a reinvention of this particular wheel, but may be an important step in avoiding reinvention of the wheel.
>>>>>>>>> This may avoid a lot of unnecessary transformation of huge quantities of raw data in serial crystallography while making the metadata more accessible.
>>>>>>>>>
>>>>>>>>> I would suggest giving James' suggestion serious consideration.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Herbert
>>>>>>>>>
>>>>>>>>> On Wed, Feb 13, 2019 at 4:02 AM James Hester <jamesrhester@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Dear Graeme,
>>>>>>>>>>
>>>>>>>>>> The context of this is the idea that a single imgCIF file could be generated from a collection of raw image files (in whatever format, whether HDF5, or ADSC, or Bruker, or Rigaku, etc.) which would contain the metadata pertaining to that collection. In such a situation, some way of referring to the raw frames from within the imgCIF file is required.
>>>>>>>>>>
>>>>>>>>>> I agree that a perfectly reasonable approach is not to generate any new file at all, and simply to access the metadata directly in whatever format happens to be there. This was my initial impulse as well and it took me a while to understand that the actual proposal was to create an imgCIF file, rather than just use imgCIF datanames for specification purposes.
>>>>>>>>>> From a semantic point of view both amount to the same thing so my only real motivation here is to add an image linking facility to imgCIF so that the "generate a summary metadata file" approach is possible.
>>>>>>>>>>
>>>>>>>>>> Could we just copy the HDF5 way of referring to objects in other HDF5 files as a quick solution?
>>>>>>>>>>
>>>>>>>>>> all the best,
>>>>>>>>>> James.
>>>>>>>>>>
>>>>>>>>>> On Wed, 13 Feb 2019 at 19:03, Graeme.Winter@Diamond.ac.uk <Graeme.Winter@diamond.ac.uk> wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear James,
>>>>>>>>>>>
>>>>>>>>>>> On the face of it, this looks a lot to me like a reinvention of HDF5 - perhaps with specific semantics - and there is already a (complete?) mapping from imgCIF to HDF5 / NeXus
>>>>>>>>>>>
>>>>>>>>>>> Have I missed something?
>>>>>>>>>>> No offence meant, trying to understand the shape of the problem you are trying to solve
>>>>>>>>>>>
>>>>>>>>>>> Thanks & best wishes Graeme
>>>>>>>>>>>
>>>>>>>>>>> On 13 Feb 2019, at 05:15, James Hester <jamesrhester@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Dear All,
>>>>>>>>>>>>
>>>>>>>>>>>> Recent Commdat discussion revealed a desire to reference external images from within an imgCIF file. This would allow the metadata for a dataset to be held within a single imgCIF file, while the frames themselves remain separate. This avoids the impracticality of navigating through an enormous multi-frame imgCIF file in order to extract a relatively compact amount of information.
>>>>>>>>>>>>
>>>>>>>>>>>> As a starting proposal, I suggest we extend the _array_data category with the following three datanames:
>>>>>>>>>>>>
>>>>>>>>>>>> (1) _array_data.external_format A value drawn from an enumerated list of formats (e.g. "SMV","HDF5","Bruker").
>>>>>>>>>>>> The definition for each enumerated value would explain how to interpret _array_data.internal_path
>>>>>>>>>>>> (2) _array_data.location_uri A URI for the file containing the image. A relative URI is relative to the location of the imgCIF file
>>>>>>>>>>>> (3) _array_data.internal_path A format-specific string describing the location of the frame within the file identified by _array_data.location_uri, interpreted according to the value given in _array_data.external_format
>>>>>>>>>>>>
>>>>>>>>>>>> So for a multi-frame HDF5 file buried in a subdirectory of the location referenced with a DOI, with appropriate definitions of the path notation:
>>>>>>>>>>>>
>>>>>>>>>>>> loop_
>>>>>>>>>>>> _array_data.array_id
>>>>>>>>>>>> _array_data.binary_id
>>>>>>>>>>>> _array_data.external_format
>>>>>>>>>>>> _array_data.location_uri
>>>>>>>>>>>> _array_data.internal_path
>>>>>>>>>>>> 1 1 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[0]
>>>>>>>>>>>> 1 2 NXMX doi:x.y.z directory/run/masterfilename:/entry1/detector/data[1]
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> Or for a bunch of single-frame files generated by an ADSC detector in the same directory as the imgCIF file
>>>>>>>>>>>>
>>>>>>>>>>>> loop_
>>>>>>>>>>>> _array_data.array_id
>>>>>>>>>>>> _array_data.binary_id
>>>>>>>>>>>> _array_data.external_format
>>>>>>>>>>>> _array_data.location_uri
>>>>>>>>>>>> 1 1 ADSC ./tartaric.001
>>>>>>>>>>>> 1 2 ADSC ./tartaric.002
>>>>>>>>>>>> 1 3 ADSC ./tartaric.003
>>>>>>>>>>>> ...
>>>>>>>>>>>>
>>>>>>>>>>>> The imgCIF data items describing the structure of the data array would refer to the data after it has been provided by the format. The form in which it is provided should be specified in the definition of each value of "_array_data.external_format". So, for example, the various compression methods in HDF5 would be invisible if the data as returned are specified to be an array of Reals.
>>>>>>>>>>>>
>>>>>>>>>>>> From the point of view of initial data validation, it would be sufficient to check that all referenced files are accessible, and that the provided locations exist.
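
As an illustration of the validation step suggested just above, here is a minimal Python sketch; it is not part of the proposal itself. It assumes the imgCIF file's own location is known as a URI, resolves each _array_data.location_uri value against it (so that relative values such as ./tartaric.001 behave as described), and checks existence only for file: URIs. The function names resolve_location and check_local are hypothetical, and probing remote schemes such as https or doi is deliberately left out.

```python
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import url2pathname


def resolve_location(imgcif_uri, location_uri):
    """Resolve a location_uri value against the URI of the imgCIF file.

    Absolute URIs are returned unchanged; relative ones are taken
    relative to the imgCIF file, as the proposal describes.
    """
    return urljoin(imgcif_uri, location_uri)


def check_local(resolved_uri):
    """Return True/False for file: URIs, None for schemes not probed here."""
    parts = urlparse(resolved_uri)
    if parts.scheme != "file":
        return None  # assumption: remote checks are handled elsewhere
    return Path(url2pathname(parts.path)).is_file()


if __name__ == "__main__":
    base = "file:///data/run1/summary.cif"  # hypothetical imgCIF location
    for uri in ["./tartaric.001",
                "https://zenodo.org/record/12345/files/tartaric.h5"]:
        resolved = resolve_location(base, uri)
        print(resolved, check_local(resolved))
```

A fuller checker would also need format-specific logic to confirm that the location inside each file exists, which depends on the definition attached to each _array_data.external_format value.
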
>>>>>>>>>>>>
>>>>>>>>>>>> Thoughts?
>>>>>>>>>>>> James.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> T +61 (02) 9717 9907
>>>>>>>>>>>> F +61 (02) 9717 3145
>>>>>>>>>>>> M +61 (04) 0249 4148
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> imgcif-l mailing list
>>>>>>>>>>>> imgcif-l@iucr.org
>>>>>>>>>>>> http://mailman.iucr.org/cgi-bin/mailman/listinfo/imgcif-l
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> This e-mail and any attachments may contain confidential, copyright and or privileged material, and are for the use of the intended addressee only. If you are not the intended addressee or an authorised recipient of the addressee please notify us of receipt by returning the e-mail and do not use, copy, retain, distribute or disclose the information in or attached to the e-mail.
>>>>>>>>>>> Any opinions expressed within this e-mail are those of the individual and not necessarily of Diamond Light Source Ltd.
>>>>>>>>>>> Diamond Light Source Ltd. cannot guarantee that this e-mail or any attachments are free from viruses and we cannot accept liability for any damage which you may sustain as a result of software viruses which may be transmitted in or with the message.
>>>>>>>>>>> Diamond Light Source Limited (company no. 4375679). Registered in England and Wales with its registered office at Diamond House, Harwell Science and Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United Kingdom
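
The array-section notation quoted in James's message of 5 May 2022, ARRAYID(start1:end1:stride1,start2:end2:stride2, ...), can be illustrated with a small parser. This is only a sketch under stated assumptions: that every dimension carries all three integer fields, and that the array identifier contains no parentheses or whitespace. The dictionary definition is not reproduced in this thread, so the real grammar may differ; ArraySection and parse_array_section are hypothetical names.

```python
import re
from typing import List, NamedTuple, Tuple


class ArraySection(NamedTuple):
    """Hypothetical container for one parsed array-section string."""
    array_id: str
    slices: List[Tuple[int, int, int]]  # (start, end, stride) per dimension


# Assumed shape: an identifier followed by a parenthesised, comma-separated list.
_SECTION = re.compile(r"^\s*([^\s()]+)\s*\((.*)\)\s*$")


def parse_array_section(text: str) -> ArraySection:
    """Split 'ARRAYID(s1:e1:k1,s2:e2:k2,...)' into its identifier and triples."""
    match = _SECTION.match(text)
    if match is None:
        raise ValueError(f"not an array section: {text!r}")
    array_id, body = match.groups()
    slices = []
    for part in body.split(","):
        fields = [field.strip() for field in part.split(":")]
        if len(fields) != 3:
            raise ValueError(f"expected start:end:stride, got {part!r}")
        start, end, stride = (int(field) for field in fields)
        slices.append((start, end, stride))
    return ArraySection(array_id, slices)


if __name__ == "__main__":
    # 'det1' and the index values are made up for illustration.
    print(parse_array_section("det1(1:1028:1,1:1062:2)"))
```

The parser does not interpret the numbers (for example whether ends are inclusive, or which index varies fastest); as the thread notes, those questions belong to the description of the delivered array rather than to the external storage layout.
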