Discussion List Archives


Re: [Imgcif-l] proposed change in first line of imgcif files

Hi

this is right, in essence. But there's more...

On 18 Sep 2008, at 05:41, James Hester wrote:
>

> Correct me if I'm wrong, but I
> believe it is because imgCIF files can be enormous and the overhead of
> reading through the entire file to determine missing tags is
> prohibitive.

What I would like to be able to do is determine a lot about the image
from the first line of the imgCIF/CBF; in this case, "a lot" means
things like the type of detector. There are several reasons for this
(I don't believe these will ever change, even if imgCIF becomes a
universal standard that everyone adopts and uses), including:

(1) I have to rely on our local analysis of what default values of  
things to use, rather than items in the imgCIF header - getting a  
detector/image type from the first line means not having to parse the  
entire header in at least some cases.

(2) In the particular case of the Pilatus images, I'm perfectly happy
to read in any of three forms: the original Pilatus "cbf", which had
essentially just the first line (saying it was a cbf) and the binary
section, and is particularly suited to those programs which don't use
the header information to any useful extent; the miniCBF, which has
what I describe in shorthand as "the useful information"; or the full
CBF, which is close to what people who have worked with CIF over the
last 15 - 20 years would recognize as a fully-formed CIF. The
difference comes in how quickly I can read these images in, what
parsing routines to use (e.g. home-written routines, ones someone else
may have donated, cbflib, or a combination of the three), and how much
work I need to do to interpret the header stuff.


>
>
> Would it be possible to fix this with a DDL attribute that dictionary
> writers could use to indicate that a data item should appear at the
> 'beginning' of a data block?  This attribute would work as follows: it
> could take values 'beginning', 'middle' and 'end', with 'middle' being
> the default.  An imgCIF dictionary would specify certain data values
> as occurring at the 'beginning' (ie in the header) and input programs
> such as MOSFLM would then need only to read until they found a data
> value that was not specified as belonging to the 'beginning' (or
> alternatively, until they found a data value belonging to the 'end'.)
> This attribute would actually be used quite rarely in the CIF world as
> I don't think there is a general need for this sort of control of
> order within a datablock.  Note that using multiple dictionaries (eg
> mmCIF and imgCIF) would not introduce ambiguity about positioning, as
> the order within the beginning/middle/end zones would still be
> arbitrary.
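
The reading strategy James describes (stop at the first data item not
marked 'beginning') could be sketched roughly as follows in Python. This
is only an illustration under stated assumptions: the proposed DDL
attribute does not exist, so the `position` mapping stands in for a
hypothetical dictionary lookup, and the tokenizing is deliberately naive
(single-line `_tag value` items only; no loops, no multi-line text
fields).

```python
from typing import Dict, Iterable


def read_header_items(lines: Iterable[str],
                      position: Dict[str, str]) -> Dict[str, str]:
    """Collect leading data items until one falls outside the 'beginning' zone.

    `position` maps a tag to 'beginning', 'middle' or 'end'. It is a
    hypothetical stand-in for the proposed DDL attribute; tags that are
    not listed get the default, 'middle'.
    """
    header: Dict[str, str] = {}
    for raw in lines:
        line = raw.strip()
        if not line.startswith("_"):
            continue  # skip blank lines, comments and the data_ heading
        parts = line.split(None, 1)
        tag = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if position.get(tag, "middle") != "beginning":
            break  # first item outside the header zone: stop reading
        header[tag] = value
    return header


# Illustrative input; which tags count as 'beginning' is assumed here.
example_lines = [
    "data_image_1",
    "_diffrn_detector.type  'PILATUS 6M'",
    "_array_data.data",
    "_diffrn_detector.detector X",
]
example_position = {"_diffrn_detector.type": "beginning"}
header = read_header_items(example_lines, example_position)
```

With a scheme like this, a reader never touches the binary section just
to learn that an item is absent from the header, which is the overhead
being discussed.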
>
> I'd be interested to hear whether or not such a scheme would remove
> the need for a special header comment (I'm looking particularly at
> Harry and Herbert here), and if the response is positive, I will take
> it to COMCIFS for discussion.
>
> James.
>
> On Wed, Aug 27, 2008 at 10:35 AM, Herbert J. Bernstein
> <yaya@bernstein-plus-sons.com> wrote:
>>
>> There was an informal meeting to discuss imgCIF at the IUCr
>> Congress in Osaka on 26 August 2008.  Details of the
>> discussion will follow in future messages.  This message
>> will summarize a proposal for a change in the first line
>> of all CBF/imgCIF files that are not fully populated
>> with all the imgCIF tags needed for processing by mosflm
>> and adxv.
>>
>> 1.  What problem is being solved?  As the use of imgCIF
>> has increased, two very distinct sets of files have appeared:
>> the "miniCBFs" used for the Pilatus 6m detector and
>> more fully populated imgCIF files, such as the ones
>> produced for ADSC detectors.  While the information
>> necessary for processing can be discovered from context
>> in handling a miniCBF, it may be necessary to read fairly
>> far into the file to discover that the file is indeed a
>> miniCBF, complicating the design of reading software.
>>
>> 2.  The proposed solution.  Currently CBF files begin
>> with a magic number comment line
>>           1         2         3         4         5
>>  12345678901234567890123456789012345678901234567890
>>  ###CBF: VERSION n.m
>>
>> We propose to extend the magic number comment line with
>> two optional fields to read
>>
>>           1         2         3         4         5
>>  12345678901234567890123456789012345678901234567890
>>  ###CBF: VERSION n.m     style     style_version
>>
>> where "style" is a unique CBF style identifier left
>> justified as a single word in columns 25-34 and
>> "style_version" is a left justified integer in
>> columns 35-44.
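
The column layout above can be read with fixed slices. The following
Python sketch shows one way a reader might pull the version, style and
style_version out of the first line. The function and type names are
illustrative, and the style name "minicbf" used in the example is a
made-up placeholder rather than a registered style.

```python
from typing import NamedTuple, Optional

MAGIC_PREFIX = "###CBF: VERSION "


class CBFMagic(NamedTuple):
    version: str
    style: Optional[str]           # columns 25-34, None if absent
    style_version: Optional[int]   # columns 35-44, None if absent


def parse_magic_line(line: str) -> Optional[CBFMagic]:
    """Parse the first line of a CBF file; return None if it is not one."""
    line = line.rstrip("\r\n")
    if not line.startswith(MAGIC_PREFIX):
        return None
    # The proposal counts columns from 1, so column 25 is index 24.
    version = line[16:24].strip()
    style = line[24:34].strip() or None
    sv_field = line[34:44].strip()
    style_version = int(sv_field) if sv_field.isdigit() else None
    return CBFMagic(version, style, style_version)


# A plain magic line, and one carrying the two optional fields.
plain = parse_magic_line("###CBF: VERSION 1.5")
extended = parse_magic_line(
    "###CBF: VERSION 1.5".ljust(24) + "minicbf".ljust(10) + "1"
)
```

A reading program could then choose its parsing routine from
`extended.style` without going any further into the header.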
>>
>> Each style will be registered in a central repository
>> along with information on the tags that will be
>> carried for that style and a template of the tags
>> that would be needed to fully populate the file.
>>
>> More details will follow on this list and on the
>> CBFlib wiki after the Osaka meeting is over.
>> =====================================================
>> Herbert J. Bernstein, Professor of Computer Science
>>   Dowling College, Kramer Science Center, KSC 121
>>        Idle Hour Blvd, Oakdale, NY, 11769
>>
>>                 +1-631-244-3035
>>                 yaya@dowling.edu
>> =====================================================
>> _______________________________________________
>> imgcif-l mailing list
>> imgcif-l@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/imgcif-l
>
>
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> _______________________________________________
> imgcif-l mailing list
> imgcif-l@iucr.org
> http://scripts.iucr.org/mailman/listinfo/imgcif-l

Harry
-- 
Dr Harry Powell, MRC Laboratory of Molecular Biology, MRC Centre,  
Hills Road, Cambridge, CB2 0QH



