status of v1 of the imgcif library
- To: imgCIF listserver <imgcif-l@bnl.gov>
- Subject: status of v1 of the imgcif library
- From: Paul Ellis <ellis@ssrl-real.slac.stanford.edu>
- Date: Fri, 13 Mar 1998 09:55:55 -0800
Hi everyone,

This is to let you all know the current status of the first version of the imgcif library and to get comments on the way I have implemented a few of the points you have been discussing.

1. The compression/decompression and CIF/CBF file parsing code is running. The full library should be finished next week or the week after. When it is complete and tested, I'll let you know.

2. The compression I have implemented is lossless: pixel-to-pixel differences followed by modified Huffman encoding. Using Andy and Bob's terminology, this corresponds to the simplest type of "Predictor Huffman" algorithm. This scheme produces a bitstream (which is then encoded in 8-bit bytes), so the little-endian/big-endian difference disappears. It also has the advantage that the compressed image depends only on the pixel values, and not on the number of bytes occupied by a pixel on the computer doing the compression or on whether the original pixels were signed or unsigned.

Compression and decompression are fairly fast and the compression ratio is good. To compress or decompress a 2000*2000 pixel 18-bit image typically takes less than 2 seconds on a 300 MHz Pentium II or an R10000 SGI, and less than 1.5 seconds on a 500 MHz Alpha. With typical images from SSRL, each pixel yields around 5-6 bits in the compressed image. This corresponds to a compression ratio of around 3:1 compared to the 16-bit-per-pixel scheme with an overflow table used by MAR for uncompressed images.

I don't think there should be any copyright/patent problems, as the modifications to the basic Huffman algorithm and all the code are mine.

3. The type of encoding is stored within the binary section (as well as in the CIF header), so additional compression schemes can be added in the future.

4. The binary sections are stored as ';'-delimited strings in an otherwise pure CIF file.
eg:

#
# Array data
#

loop_
_array_data.array_id
_array_data.data

image_1
;
START OF BINARY
(binary data)
END OF BINARY
;

The start and end of the binary section are structured in a way similar to that described in section 6 (Binary sections) of the OVERVIEW OF THE FORMAT in the draft proposal. The markers ###_START_OF_HEADER, ###_END_OF_HEADER, ###_START_OF_BIN, ###_END_OF_BINARY and ###_END_OF_CBF are no longer necessary.

5. Because the binary sections are encoded simply as an extra data type, a file can contain any number of binary sections appearing in any order. There is no restriction to a single binary section. This works even with very large files containing multiple binary sections, because a binary section is read into memory only when the calling program requests that data, not when the imgcif file is first parsed. This lets a program access all of the pure-CIF data very rapidly and then access the binary data as needed.

6. The library is pure ANSI C and should run on any computer with an ANSI C compiler without any system-specific definitions. The only assumption made in the code is that an int is at least 32 bits. This assumption may disappear by the final version. If someone needs the library to work on a system with 16-bit ints, please let me know.

Does anyone have any comments?

Paul Ellis
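
To illustrate point 2 above, here is a minimal sketch in C of the two ideas it relies on: replacing each pixel by its difference from the previous one, and writing codes into a bitstream that is stored byte by byte. It is not the library's code. The names delta_encode, delta_decode, bit_writer and put_bits are hypothetical, the first pixel is assumed to be differenced against 0, and the Huffman step itself is left out because its modifications are not described in the message. Pixels are held in long, which ANSI C guarantees to be at least 32 bits.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration; not part of the imgcif library. */

/* Replace each pixel by its difference from the previous pixel.
   Assumption: the first pixel is differenced against 0. */
void delta_encode(const long *pixels, long *diffs, size_t n)
{
    long prev = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        diffs[i] = pixels[i] - prev;
        prev = pixels[i];
    }
}

/* Inverse of delta_encode: a running sum restores the pixel values. */
void delta_decode(const long *diffs, long *pixels, size_t n)
{
    long prev = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        prev += diffs[i];
        pixels[i] = prev;
    }
}

/* Bit writer over a caller-supplied, zero-initialised byte buffer. */
typedef struct {
    unsigned char *buf;
    size_t bitpos;      /* number of bits written so far */
} bit_writer;

/* Append the low nbits of value, most significant bit first.
   Bits are placed one at a time into single bytes, so the output
   does not depend on the byte order of the host machine. */
void put_bits(bit_writer *w, unsigned long value, int nbits)
{
    int i;
    for (i = nbits - 1; i >= 0; i--) {
        if ((value >> i) & 1UL)
            w->buf[w->bitpos >> 3] |= (unsigned char)(0x80u >> (w->bitpos & 7));
        w->bitpos++;
    }
}
```

Because the differences are computed on pixel values rather than on their storage, the same input values give the same output whether the original array held 16-bit or 32-bit, signed or unsigned pixels. In a real coder the codes passed to put_bits would come from the Huffman table rather than being fixed-width.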
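
The deferred reading described in point 5 could be sketched as follows. This is only one possible arrangement, assuming the parser records where each binary section starts and how long it is, and that the file was opened in binary mode so offsets are meaningful. The binary_section structure and the section_data function are hypothetical names, not the library's interface.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record kept for each binary section found while parsing. */
typedef struct {
    long offset;          /* file position of the first binary byte */
    size_t length;        /* number of bytes in the section */
    unsigned char *data;  /* NULL until the caller asks for the data */
} binary_section;

/* Return the section's bytes, reading them from the file on first use
   and reusing the cached copy afterwards.
   Returns NULL on an I/O or allocation failure. */
const unsigned char *section_data(FILE *fp, binary_section *s)
{
    if (s->data == NULL) {
        /* Allocate at least one byte so a zero-length section is valid. */
        unsigned char *p = malloc(s->length ? s->length : 1);
        if (p == NULL)
            return NULL;
        if (fseek(fp, s->offset, SEEK_SET) != 0 ||
            fread(p, 1, s->length, fp) != s->length) {
            free(p);
            return NULL;
        }
        s->data = p;
    }
    return s->data;
}
```

With this layout the initial parse only has to skip over each binary section and note its position, so the pure-CIF items are available without any image data having been loaded.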