Crystallographic Information Framework

Image CIF dictionary (imgCIF) version 3.00.04

_array_data.data

Name:
_array_data.data

Definition:

   
    The value of _array_data.data contains the array data
    encapsulated in a STAR string.

    The representation used is a variant on the
    Multipurpose Internet Mail Extensions (MIME) specified
    in RFC 2045-2049 by N. Freed et al.  The boundary
    delimiter used in writing an imgCIF or CBF is
    '\n--CIF-BINARY-FORMAT-SECTION--' (including the
    required initial '\n--').

    The Content-Type may be any of the discrete types permitted
    in RFC 2045; 'application/octet-stream' is recommended
    for diffraction images in the ARRAY_DATA category.
    Note:  When appropriate in other categories, e.g. for
    photographs of crystals, more precise types, such as
    'image/jpeg', 'image/tiff', 'image/png', etc. should be used.

    If an octet stream was compressed, the compression should
    be specified by the parameter
      'conversions="X-CBF_PACKED"'
    or the parameter
      'conversions="X-CBF_CANONICAL"'
    or the parameter
      'conversions="X-CBF_BYTE_OFFSET"'
    or the parameter
      'conversions="X-CBF_BACKGROUND_OFFSET_DELTA"'

    If the parameter
      'conversions="X-CBF_PACKED"'
    is given it may be further modified with the parameters
      '"uncorrelated_sections"'
    or
      '"flat"'

    If the '"uncorrelated_sections"' parameter is
    given, each section will be compressed without using
    the prior section for averaging.

    If the '"flat"' parameter is given, the
    image will be treated as one long row.

    Note that X-CBF_CANONICAL and X-CBF_PACKED are
    slower but more efficient compressions than the others.
    The X-CBF_BYTE_OFFSET compression is a good compromise
    between speed and efficiency for ordinary diffraction
    images.  The X-CBF_BACKGROUND_OFFSET_DELTA compression
    is oriented towards sparse data, such as masks and
    tables of replacement pixel values for images with
    overloaded spots.

    The Content-Transfer-Encoding may be 'BASE64',
    'Quoted-Printable', 'X-BASE8', 'X-BASE10',
    'X-BASE16' or 'X-BASE32K' for an imgCIF, or 'BINARY'
    for a CBF.  The octal, decimal and hexadecimal transfer
    encodings are provided for convenience in debugging and
    are not recommended for archiving and data interchange.

    In a CIF, one of the parameters 'charset=us-ascii',
    'charset=utf-8' or 'charset=utf-16' may be used on the
    Content-Transfer-Encoding to specify the character set
    used for the external presentation of the encoded data.
    If no charset parameter is given, the character set of
    the enclosing CIF is assumed.  In any case, if a BOM
    flag (FE FF for big-endian UTF-16, FF FE for
    little-endian UTF-16 or EF BB BF for UTF-8) is detected,
    the indicated charset will be assumed until the end of the
    encoded data or the detection of a different BOM.  The
    charset of the Content-Transfer-Encoding is not the character
    set of the encoded data, only the character set of the
    presentation of the encoded data and should be respecified
    for each distinct STAR string.
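
    As an illustration of the BOM rule above, the following is a
    minimal Python sketch, not part of the dictionary.  The function
    name charset_from_bom and the charset labels it returns are
    assumptions made for this example; it only inspects the start
    of the data and does not track a later change of BOM.

```python
import codecs

# BOM octets named in the text, mapped to the charset each one signals.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
]


def charset_from_bom(octets: bytes, default: str):
    """Return (charset, payload) for octets that may start with a BOM.

    `default` stands for the charset of the enclosing CIF, which is
    assumed when no BOM is present.
    """
    for bom, charset in BOMS:
        if octets.startswith(bom):
            return charset, octets[len(bom):]
    return default, octets
```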

    In an imgCIF file, the encoded binary data begins after
    the empty line terminating the header.  In an imgCIF file,
    the encoded binary data ends with the terminating boundary
    delimiter '\n--CIF-BINARY-FORMAT-SECTION----'
    in the currently effective charset or with the '\n; '
    that terminates the STAR string.

    In a CBF, the raw binary data begins after an empty line
    terminating the header and after the sequence:

    Octet   Hex   Decimal  Purpose
      0     0C       12    (ctrl-L) Page break
      1     1A       26    (ctrl-Z) Stop listings in MS-DOS
      2     04        4    (ctrl-D) Stop listings in UNIX
      3     D5      213    Binary section begins

    None of these octets are included in the calculation of
    the message size or in the calculation of the
    message digest.

    The X-Binary-Size header specifies the size of the
    equivalent binary data in octets.  If compression was
    used, this size is the size after compression, including
    any book-keeping fields.  An adjustment is made for
    the deprecated binary formats in which eight bytes of binary
    header are used for the compression type.  In this case,
    the eight bytes used for the compression type are subtracted
    from the size, so that the same size will be reported
    if the compression type is supplied in the MIME header.
    Use of the MIME header is the recommended way to
    supply the compression type.  In general, no portion of
    the  binary header is included in the calculation of the size.

    The X-Binary-Element-Type header specifies the type of
    binary data in the octets, using the same descriptive
    phrases as in _array_structure.encoding_type.  The default
    value is 'unsigned 32-bit integer'.

    An MD5 message digest may, optionally, be used. The 'RSA Data
    Security, Inc. MD5 Message-Digest Algorithm' should be used.
    No portion of the header is included in the calculation of the
    message digest.
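
    A minimal Python sketch of the digest calculation follows.  The
    function name binary_digest is hypothetical, and presenting the
    digest in base64 is an assumption borrowed from the MIME
    Content-MD5 convention; the text above only requires that the MD5
    algorithm be applied to the binary data and not to the header.

```python
import base64
import hashlib


def binary_digest(data: bytes) -> str:
    """MD5 digest of the binary octets only, presented in base64.

    `data` must exclude the MIME header and the four-octet start
    sequence.  The base64 presentation is an assumption of this sketch.
    """
    digest = hashlib.md5(data).digest()
    return base64.b64encode(digest).decode("ascii")
```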

    If the Transfer Encoding is 'X-BASE8', 'X-BASE10' or
    'X-BASE16', the data are presented as octal, decimal or
    hexadecimal data organized into lines or words.  Each word
    is created by composing octets of data in fixed groups of
    2, 3, 4, 6 or 8 octets, either in the order ...4321 ('big-
    endian') or 1234... ('little-endian').  If there are fewer
    than the specified number of octets to fill the last word,
    then the missing octets are presented as '==' for each
    missing octet.  Exactly two equal signs are used for each
    missing octet even for octal and decimal encoding.
    The format of lines is:

    rnd xxxxxx xxxxxx xxxxxx

    where r is 'H', 'O' or 'D' for hexadecimal, octal or
    decimal, n is the number of octets per word and d is '<'
    or '>' for the '...4321' and '1234...' octet orderings,
    respectively.  The '==' padding for the last word should
    be on the appropriate side to correspond to the missing
    octets, e.g.

    H4< FFFFFFFF FFFFFFFF 07FFFFFF ====0000

    or

    H3> FF0700 00====

    For these hexadecimal, octal and decimal formats only,
    comments beginning with '#' are permitted to improve
    readability.
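
    The line format can be illustrated with a short Python sketch
    for the hexadecimal case only.  The function name xbase16_line
    is hypothetical, and placing all words on a single line is a
    simplification; the sketch reproduces the two example lines
    given above.

```python
def xbase16_line(octets: bytes, word_size: int, order: str) -> str:
    """Format octets as one 'H' line of words.

    order is '>' for the 1234... octet ordering and '<' for ...4321.
    Missing octets in the last word are shown as '=='.
    """
    if word_size not in (2, 3, 4, 6, 8):
        raise ValueError("word size must be 2, 3, 4, 6 or 8 octets")
    if order not in ("<", ">"):
        raise ValueError("order must be '<' or '>'")
    words = []
    for i in range(0, len(octets), word_size):
        chunk = octets[i:i + word_size]
        cells = [f"{octet:02X}" for octet in chunk]
        cells += ["=="] * (word_size - len(chunk))  # pad missing octets
        if order == "<":
            cells.reverse()                         # present as ...4321
        words.append("".join(cells))
    return " ".join([f"H{word_size}{order}"] + words)
```

    For example, the 14 octets FF FF FF FF FF FF FF FF FF FF FF 07
    00 00 with a word size of 4 and order '<' give the first example
    line, and the 4 octets FF 07 00 00 with a word size of 3 and
    order '>' give the second.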

    BASE64 encoding follows MIME conventions.  Octets are
    in groups of three: c1, c2, c3.  The resulting 24 bits
    are broken into four six-bit quantities, starting with
    the high-order six bits (c1 >> 2) of the first octet, then
    the low-order two bits of the first octet followed by the
    high-order four bits of the second octet [(c1 & 3)<<4 | (c2>>4)],
    then the bottom four bits of the second octet followed by the
    high-order two bits of the last octet [(c2 & 15)<<2 | (c3>>6)],
    then the bottom six bits of the last octet (c3 & 63).  Each
    of these four quantities is translated into an ASCII character
    using the mapping:

          1         2         3         4         5         6
    0123456789012345678901234567890123456789012345678901234567890123
    |         |         |         |         |         |         |
    ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

    Short groups of octets are padded on the right with one '='
    if c3 is missing, and with '==' if both c2 and c3 are missing.
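
    The bit manipulation described above can be sketched in Python
    as follows.  The function name base64_encode is hypothetical and
    the sketch emits no line breaks; in practice the standard
    library function base64.b64encode gives the same characters and
    is used here only as a cross-check.

```python
import base64

ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")


def base64_encode(octets: bytes) -> str:
    """Encode octets in groups of three, as described in the text."""
    out = []
    for i in range(0, len(octets), 3):
        group = octets[i:i + 3]
        c1 = group[0]
        c2 = group[1] if len(group) > 1 else 0
        c3 = group[2] if len(group) > 2 else 0
        quantities = [
            c1 >> 2,                      # high-order six bits of c1
            ((c1 & 3) << 4) | (c2 >> 4),  # low two of c1, high four of c2
            ((c2 & 15) << 2) | (c3 >> 6), # low four of c2, high two of c3
            c3 & 63,                      # low six bits of c3
        ]
        chars = [ALPHABET[q] for q in quantities]
        if len(group) == 1:
            chars[2:] = ["=", "="]        # c2 and c3 missing
        elif len(group) == 2:
            chars[3:] = ["="]             # c3 missing
        out.append("".join(chars))
    return "".join(out)


# Cross-check against the standard library encoder.
assert base64_encode(b"Man") == base64.b64encode(b"Man").decode("ascii")
```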

    X-BASE32K encoding is similar to BASE64 encoding, except that
    sets of 15 octets are encoded as sets of eight 16-bit Unicode
    characters, by breaking the 120 bits into eight 15-bit quantities.
    256 is added to each 15-bit quantity to bring it into a
    printable Unicode range.  When encoding, zero padding is used
    to fill out the last 15-bit quantity.  If 8 or more bits of
    padding are used, a single equals sign (hexadecimal 003D) is
    appended.  Embedded whitespace and newlines are introduced
    to produce lines of no more than 80 characters each.  On
    decoding, all printable ASCII characters and ASCII whitespace
    characters are ignored except for any trailing equals signs.
    The number of trailing equals signs indicates the number of
    trailing octets to be trimmed from the end of the decoded data.
    (see Georgi Darakev, Vassil Litchev, Kostadin Z. Mitev, Herbert
    J. Bernstein, 'Efficient Support of Binary Data in the XML
    Implementation of the NeXus File Format', abstract W0165,
    ACA Summer Meeting, Honolulu, HI, July 2006).
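
    The following Python sketch shows one reading of the X-BASE32K
    rules.  The function names are hypothetical, and the sketch
    assumes that bits are taken high-order first, as in BASE64,
    which the text does not state explicitly.  It does not insert
    the whitespace needed to keep lines within 80 characters.

```python
def xbase32k_encode(octets: bytes) -> str:
    """Encode octets as 15-bit quantities offset by 256."""
    nbits = 8 * len(octets)
    pad = (-nbits) % 15                  # zero bits filling the last quantity
    value = int.from_bytes(octets, "big") << pad
    count = (nbits + pad) // 15
    chars = [chr(((value >> (15 * (count - 1 - i))) & 0x7FFF) + 256)
             for i in range(count)]
    if pad >= 8:
        chars.append("=")                # one whole padding octet to trim
    return "".join(chars)


def xbase32k_decode(text: str) -> bytes:
    """Decode text, ignoring ASCII characters other than '='."""
    body = "".join(ch for ch in text if ord(ch) >= 256 or ch == "=")
    trim = len(body) - len(body.rstrip("="))  # trailing octets to drop
    value = 0
    count = 0
    for ch in body:
        if ch != "=":
            value = (value << 15) | (ord(ch) - 256)
            count += 1
    total = 15 * count
    # Bits that do not fill a whole octet are padding and are discarded.
    data = (value >> (total % 8)).to_bytes(total // 8, "big")
    return data[:len(data) - trim]
```

    Under these assumptions, 15 octets encode to exactly eight
    characters with no equals sign, and a round trip through both
    functions returns the original octets.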

    QUOTED-PRINTABLE encoding also follows MIME conventions, copying
    octets without translation if their ASCII values are 32...38,
    42, 48...57, 59, 60, 62, 64...126 and the octet is not a ';'
    in column 1.  All other characters are translated to =nn, where
    nn is the hexadecimal encoding of the octet.  All lines are
    'wrapped' with a terminating '=' (i.e. the MIME conventions
    for an implicit line terminator are never used).
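
    A minimal Python sketch of these rules follows.  The function
    name qp_encode is hypothetical, and the line width of 76
    characters is an assumption taken from the MIME convention; the
    text above does not fix a width.  Every output line, including
    the last, ends with '='.

```python
# Octet values that are copied without translation, as listed in the text.
LITERAL = (set(range(32, 39)) | {42} | set(range(48, 58))
           | {59, 60, 62} | set(range(64, 127)))


def qp_encode(octets: bytes, width: int = 76) -> str:
    """Encode octets, ending every line with a terminating '='."""
    lines = []
    line = ""
    for octet in octets:
        token = chr(octet) if octet in LITERAL else f"={octet:02X}"
        if len(line) + len(token) > width - 1:  # keep room for final '='
            lines.append(line + "=")
            line = ""
        if token == ";" and not line:           # no ';' in column 1
            token = "=3B"
        line += token
    lines.append(line + "=")
    return "\n".join(lines)
```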

    The "X-Binary-Element-Byte-Order" can specify either
    '"BIG_ENDIAN"' or '"LITTLE_ENDIAN"' byte order of the image
    data.  Only LITTLE_ENDIAN is recommended.  Processors
    may treat BIG_ENDIAN as a warning of data that can
    only be processed by special software.
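
    For uncompressed data, the element type and byte order together
    determine how octets map to values.  The Python sketch below
    covers integer types only; the function name unpack_elements is
    hypothetical, the phrases other than 'unsigned 32-bit integer'
    are assumed to follow the same pattern in
    _array_structure.encoding_type, and defaulting to LITTLE_ENDIAN
    is a choice made for this example.

```python
import struct

# struct format codes for some element-type phrases (assumed subset).
FORMATS = {
    "unsigned 8-bit integer": "B",
    "signed 8-bit integer": "b",
    "unsigned 16-bit integer": "H",
    "signed 16-bit integer": "h",
    "unsigned 32-bit integer": "I",
    "signed 32-bit integer": "i",
}
ORDERS = {"LITTLE_ENDIAN": "<", "BIG_ENDIAN": ">"}


def unpack_elements(raw: bytes,
                    element_type: str = "unsigned 32-bit integer",
                    byte_order: str = "LITTLE_ENDIAN") -> tuple:
    """Interpret uncompressed octets as a tuple of integer elements."""
    prefix = ORDERS[byte_order]
    code = FORMATS[element_type]
    count, extra = divmod(len(raw), struct.calcsize(prefix + code))
    if extra:
        raise ValueError("octet count is not a whole number of elements")
    return struct.unpack(f"{prefix}{count}{code}", raw)
```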

    The "X-Binary-Number-of-Elements" specifies the number of
    elements (not the number of octets) in the decompressed, decoded
    image.

    The optional "X-Binary-Size-Fastest-Dimension" specifies the
    number of elements (not the number of octets) in one row of the
    fastest changing dimension of the binary data array. This
    information must be in the MIME header for proper operation of
    some of the decompression algorithms.

    The optional "X-Binary-Size-Second-Dimension" specifies the
    number of elements (not the number of octets) in one column of
    the second-fastest changing dimension of the binary data array.
    This information must be in the MIME header for proper operation
    of some of the decompression algorithms.

    The optional "X-Binary-Size-Third-Dimension" specifies the
    number of sections for the third-fastest changing dimension of
    the binary data array.

    The optional "X-Binary-Size-Padding" specifies the size in
    octets of an optional padding after the binary array data and
    before the closing flags for a binary section.
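
    The header fields described above could be read with a sketch
    such as the following.  The sample header and all of its values
    are invented for illustration, the function names are
    hypothetical, and the check that the product of the dimensions
    equals the number of elements is an assumption about consistent
    headers rather than a stated rule.

```python
# Illustrative header only; the values are invented for this example.
SAMPLE = """\
Content-Type: application/octet-stream;
     conversions="X-CBF_BYTE_OFFSET"
Content-Transfer-Encoding: BINARY
X-Binary-Size: 6
X-Binary-Element-Type: "signed 32-bit integer"
X-Binary-Element-Byte-Order: LITTLE_ENDIAN
X-Binary-Number-of-Elements: 6
X-Binary-Size-Fastest-Dimension: 3
X-Binary-Size-Second-Dimension: 2

"""


def parse_header(text: str) -> dict:
    """Collect header fields up to the empty line that ends the header."""
    fields = {}
    name = None
    for line in text.splitlines():
        if not line.strip():
            break                              # empty line ends the header
        if line[0] in " \t" and name is not None:
            fields[name] += " " + line.strip() # continuation line
        else:
            name, _, value = line.partition(":")
            name = name.strip().lower()
            fields[name] = value.strip()
    return fields


def dimensions_consistent(fields: dict) -> bool:
    """Compare the product of the given dimensions with the element count."""
    product = 1
    for key in ("x-binary-size-fastest-dimension",
                "x-binary-size-second-dimension",
                "x-binary-size-third-dimension"):
        product *= int(fields.get(key, 1))     # absent dimensions count as 1
    return product == int(fields["x-binary-number-of-elements"])
```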

Type: Binary

Category:
array_data