Re: Global section in CIF headers
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: Global section in CIF headers
- From: James Hester <jamesrhester@gmail.com>
- Date: Wed, 9 Sep 2009 12:49:18 +1000
- In-Reply-To: <F429002A-90A6-4429-A9D7-51016C5180E7@ANL.gov>
- References: <4AA67814.4000007@niehs.nih.gov><F429002A-90A6-4429-A9D7-51016C5180E7@ANL.gov>
Let's explore these ideas. First, I think it is worth clarifying the syntax situation. Currently there are three CIF syntaxes: the old 1.0 syntax, where brackets were allowed to begin non-delimited character strings; the current 1.1 syntax, where this behaviour was disallowed and maximum line lengths were increased; and the upcoming 1.2 syntax, which has bracketed lists and is still being finalised. The next layer, semantics, is provided by the dictionaries to which a given file conforms. DDL1, DDL2 and DDLm are relevant only to the dictionaries, not directly to the data files. Note that the "dot" notation introduced in DDL2/mmCIF datanames has no computer-readable meaning, but is purely a convenience for the human reader.

While I wholeheartedly agree with the sentiment of removing all data from comments, in the one particular case of distinguishing between different syntaxes it is convenient to have a syntax indicator in the first few characters of a file. A simple examination of the first line of the file is sufficient to decide which parser to execute, after which no syntax issues remain. If, however, the syntax is specified in a global_ block, some sort of CIF parser has to be run just to find the global_ block and discover the precise syntax, after which, presumably, the parser has to reconfigure itself and continue. This seems a complex procedure compared with examining the first few characters. Note also that under this proposal the global_ block would be forced to occur at the beginning of the file, although the original specification indicated that it could appear anywhere in the file.

Moving on to Brian's suggestion, I think we are overdue for finding a way to describe links between data blocks. However, using a global_ block to do this restricts the description to a single data file. As an alternative proposal, how about defining a CIF semantic dictionary, with the following two datanames in it?

_semantic.block_id
_semantic.block_relationship

These datanames could be looped inside any datablock in order to convey a series of relationships to other datablocks, including those not in the same file, e.g.

loop_
_semantic.block_id
_semantic.block_relationship
|Sydney|090909|JRH 'wavelength determination'
|Sydney|080808|VKP 'identical batch of sample'
|Sydney|070707|XYZ 'raw powder data'
|Sydney|060606|ABC 'Lebail refined structure'

In the example I have used a pd_block.id type construction to uniquely identify a datablock. Also, the nature of the relationship could be formalised into an enumerated list to help machine-readability.

James.

On Wed, Sep 9, 2009 at 2:44 AM, Brian H. Toby <Brian.Toby@anl.gov> wrote:
> A global section could also be used to describe the relationships between
> data blocks in a single CIF. To date, (outside of pdCIF) the CIF model has
> assumed that each block is fully defined internally and thus can be used
> independently. This defeats the point of having multi-block file structures.
> Brian
> On Sep 8, 2009, at 10:28 AM, Joe Krahn wrote:
>
> I have been thinking that it makes sense to allow a global_ block as
> part of a CIF file. Globals have been excluded because they don't fit
> very well into the data model, but it might be useful to allow them to
> provide general format hints to the parser.
> My idea is that a common low-level parser could be used for mmCIF, CIF,
> and possibly other STAR variants. The global_ header would define
> parsing rules for the file, including possible future revisions of the
> same format, but not be considered part of the actual mmCIF data. For
> example:
> global_
> _format mmCIF
> _version 1.2
> In a way, this just replaces the initial comment-embedded CIF
> identifier, but I have always disliked the idea of a comment containing
> data. This approach could be more detailed, depending on how much the
> CIF/mmCIF format changes over time. Will it ever include STAR 2.0
> bracketed lists? Will they ever directly include Unicode text?
> Joe Krahn
> _______________________________________________
> comcifs mailing list
> comcifs@iucr.org
> http://scripts.iucr.org/mailman/listinfo/comcifs
>
> ********************************************************************
> Brian H. Toby, Ph.D. office: 630-252-5488
> Senior Physicist/Materials Characterization Group Leader
> Advanced Photon Source
> 9700 S. Cass Ave, Bldg. 433/D003 work cell: 630-327-8426
> Argonne National Laboratory secretary (Marija): 630-252-5453
> Argonne, IL 60439-4856 e-mail: brian dot toby at anl dot gov
> ********************************************************************
> "We will restore science to its rightful place, and wield technology's
> wonders... We will harness the sun and the winds and the soil to fuel our
> cars and run our factories... All this we can do. All this we will do."
>

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
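
The two mechanisms discussed in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: the syntax indicator is taken to be a leading magic comment of the form #\#CIF_1.1, a file without one is treated as 1.0, and the _semantic.* loop rows are assumed to have been extracted already as plain text lines. The function names sniff_cif_version and parse_relationships are hypothetical, and the row splitting is not a real CIF tokenizer.

```python
import re

# Assumption: the syntax indicator is a magic comment such as "#\#CIF_1.1"
# at the very start of the file; anything else is treated as legacy 1.0.
_MAGIC = re.compile(r"#\\#CIF_(\d+\.\d+)")


def sniff_cif_version(first_line, default="1.0"):
    """Return the version named in the leading magic comment, else `default`."""
    match = _MAGIC.match(first_line)
    return match.group(1) if match else default


def parse_relationships(rows, allowed=None):
    """Split pre-extracted loop rows into (block_id, relationship) pairs.

    Hypothetical helper: assumes each row is "<block_id> '<relationship>'"
    with no whitespace inside the block id. Not a general CIF tokenizer.
    If `allowed` is given, a relationship outside that set raises ValueError,
    which mimics checking against an enumerated list.
    """
    pairs = []
    for row in rows:
        row = row.strip()
        if not row:
            continue
        parts = row.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            raise ValueError(f"expected two values per row, got: {row!r}")
        block_id = parts[0]
        relationship = parts[1].strip().strip("'\"")  # drop CIF quote marks
        if allowed is not None and relationship not in allowed:
            raise ValueError(f"unrecognised relationship: {relationship!r}")
        pairs.append((block_id, relationship))
    return pairs


if __name__ == "__main__":
    print(sniff_cif_version("#\\#CIF_1.1"))
    example_rows = [
        "|Sydney|090909|JRH 'wavelength determination'",
        "|Sydney|080808|VKP 'identical batch of sample'",
    ]
    for block_id, relationship in parse_relationships(example_rows):
        print(block_id, "->", relationship)
```

The version check needs only the first line and no parser state, which is the point made above in favour of a leading indicator over a global_ block. Passing a set as allowed shows how an enumerated relationship list might be enforced.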
- Follow-Ups:
- Re: Global section in CIF headers (Brian McMahon)
- References:
- Global section in CIF headers (Joe Krahn)
- Re: Global section in CIF headers (Brian H. Toby)