Discussion List Archives


Re: [Imgcif-l] Reading CBF file headers from Python with PyCifRW

Hi Graeme,

See the code below for a slightly hacky fix for CBF files.  As PyCIFRW reads
in the entire file text before commencing parsing, this code simply chops
out all characters between instances of '-BINARY-FORMAT-SECTION-' before
parsing what is left.  This is hacky because, if someone has put this
character string in a comment or in a datavalue other than
_array_data.data, this approach won't work.  I don't think this will be a
problem in practice.

You should use the keyword argument 'CBF=True' when reading the file, i.e.

cf = CifFile("mycbfdata.cbf",CBF=True)

I'm not sure what version is distributed with cctbx, so here is a
long-winded description of the edits you need to make to what you have.
Feel free to improvise at will.

1) in file StarFile.py go to the ReadStar function definition and change it
to the following (I've simply added the 'CBF' keyword argument):

def ReadStar(filename,maxlength=2048,dest=StarFile(),scantype='standard',grammar='1.1',CBF=False):

2) Further on in the function, add the lines marked with a '+' sign below:
    if not text:      # empty file, return empty block
        dest.set_uri(my_uri)
        return dest
+   # filter out non-ASCII characters in CBF files if required.  We assume
+   # that the binary is enclosed in a fixed string that occurs
+   # nowhere else.
+   if CBF:
+      text_bits  = text.split("-BINARY-FORMAT-SECTION-")
+      text = text_bits[0]
+      for section in range(2,len(text_bits),2):
+          text = text+" (binary omitted)"+text_bits[section]
    # we recognise ctrl-Z as end of file
    endoffile = text.find('\x1a')
    if endoffile >= 0:

and you're done!  Let me know if there are any problems.
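
To try the filtering step without patching StarFile.py, here is a rough
standalone sketch of the same idea.  It is not part of PyCifRW: the names
strip_cbf_binary and read_cbf_text are made up for illustration, and
Python 3 is assumed.  It keeps the text outside each pair of marker strings
and puts a placeholder where the binary was:

```python
MARKER = "-BINARY-FORMAT-SECTION-"


def strip_cbf_binary(text, marker=MARKER, placeholder=" (binary omitted)"):
    """Drop everything between successive pairs of `marker` in `text`."""
    bits = text.split(marker)
    # Pieces 0, 2, 4, ... lie outside the binary sections; pieces 1, 3, ...
    # are the binary payloads (plus their MIME-style headers) and are dropped.
    return placeholder.join(bits[0::2])


def read_cbf_text(path):
    """Hypothetical helper: read a CBF file and return only its CIF text."""
    with open(path, "rb") as handle:
        # latin-1 maps every byte to a character, so decoding cannot fail
        # on the binary payload.
        return strip_cbf_binary(handle.read().decode("latin-1"))


# A tiny made-up stand-in for a CBF file, just to exercise the function.
sample = (
    "data_image_1\n"
    "_array_data.data\n"
    ";\n"
    "--CIF-BINARY-FORMAT-SECTION--\n"
    "Content-Type: application/octet-stream\n"
    "\n"
    "\x0c\x1a\x04\xd5 raw bytes here\n"
    "--CIF-BINARY-FORMAT-SECTION----\n"
    ";\n"
)
cleaned = strip_cbf_binary(sample)
print(cleaned)
```

As with the patch above, this assumes the marker string occurs nowhere
except around the binary sections.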

On Thu, Jul 15, 2010 at 12:08 AM, <Graeme.Winter@diamond.ac.uk> wrote:

> Hi James,
>
> I would think gracefully ignoring all non-cif file contents would work
> wonderfully... I look forward to a fix!
>
> Best wishes,
>
> Graeme
>
> -----Original Message-----
> From: imgcif-l-bounces@iucr.org [mailto:imgcif-l-bounces@iucr.org] On
> Behalf Of James Hester
> Sent: 14 July 2010 14:10
> To: The Crystallographic Binary File and its imgCIF application to image
> data
> Subject: Re: [Imgcif-l] Reading CBF file headers from Python with
> PyCifRW
>
> I'm glad you like PyCifRW...but at the moment it won't play well with
> cbf, as it is expecting pure ASCII for all tokens.  I'll look into
> graceful handling of syntax errors, so you at least get back anything
> that has been successfully parsed.
>
> I once thought it might be nice if PyCIFRW could deal with imgCIF and
> CBF files, but quickly realised that I would need to be using a C
> library for speed and would thus have issues distributing it in a
> uniform cross-platform manner.
>
> On Wed, Jul 14, 2010 at 5:49 PM, <Graeme.Winter@diamond.ac.uk> wrote:
>
> > Hi Folks,
> >
> > Has anyone tried reading cbf files with PyCifRW? It appears to cope
> > fine with the "real" cif at the top but then chokes when it hits the
> > binary data. I'd like to be able to just read the cif describing the
> > experiment, so in theory there should be a way to do this. If I copy
> > out just the cif into another file it works fine.
> >
> > Thing is, I don't want to copy the cif out to read it when I want to
> > parse every header in a data set :o) - has anyone a workaround for
> > this?
> >
> >
> > I like PyCifRW as it's included in cctbx and works just fine. The
> > pycbf isn't built by default :o(
> >
> > Thanks,
> >
> > Graeme
> >
> >
> >
> > Dr. Graeme Winter
> > Software and MX Support Scientist
> > Diamond Light Source
> >
> > +44 1235 778091 (work)
> > +44 7786 662784 (work mobile)
> >
> >
> > --
> >
> > This e-mail and any attachments may contain confidential, copyright
> > and or privileged material, and are for the use of the intended
> > addressee only. If you are not the intended addressee or an authorised
> > recipient of the addressee please notify us of receipt by returning
> > the e-mail and do not use, copy, retain, distribute or disclose the
> > information in or attached to the e-mail.
> >
> > Any opinions expressed within this e-mail are those of the individual
> > and not necessarily of Diamond Light Source Ltd.
> >
> > Diamond Light Source Ltd. cannot guarantee that this e-mail or any
> > attachments are free from viruses and we cannot accept liability for
> > any damage which you may sustain as a result of software viruses which
> > may be transmitted in or with the message.
> >
> > Diamond Light Source Limited (company no. 4375679). Registered in
> > England and Wales with its registered office at Diamond House, Harwell
> > Science and Innovation Campus, Didcot, Oxfordshire, OX11 0DE, United
> > Kingdom
> >
> > _______________________________________________
> > imgcif-l mailing list
> > imgcif-l@iucr.org
> > http://scripts.iucr.org/mailman/listinfo/imgcif-l
> >
>
>
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148



-- 
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
imgcif-l mailing list
imgcif-l@iucr.org
http://scripts.iucr.org/mailman/listinfo/imgcif-l

International Union of Crystallography