Differences between versions 2.0.x and 1.0 of the Core CIF Dictionary

Contents

Syntax
Dictionary formalism
Dictionary contents
Data files

Syntax

  • In version 2.0 and later, the 32-character limit on data names is removed, and data names have no formal length limit. However, the 80-character limit on overall line length restricts the useful length of data names to 76 characters, so that the string "data" can be prepended when constructing data block names within the dictionary.
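
The arithmetic behind the 76-character figure can be sketched in a few lines of Python. This is only an illustration: the constant and function names are hypothetical, and the check covers nothing beyond the leading underscore and the length rule stated above.

```python
MAX_LINE_LENGTH = 80   # overall line-length limit quoted above
BLOCK_PREFIX = "data"  # string prepended to a data name to form its block name


def max_data_name_length() -> int:
    """Longest data name whose dictionary block name still fits on one line."""
    return MAX_LINE_LENGTH - len(BLOCK_PREFIX)


def fits_dictionary_block_name(data_name: str) -> bool:
    """Return True if 'data' + data_name fits within the line-length limit.

    Hypothetical helper: it only checks the leading underscore and the length.
    """
    if not data_name.startswith("_"):
        return False
    return len(BLOCK_PREFIX + data_name) <= MAX_LINE_LENGTH
```

For example, _atom_site_fract_x yields the block name data_atom_site_fract_x, which is well inside the limit.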

Dictionary formalism

CIF dictionaries are themselves STAR data files, with data blocks or save frames containing the definition and properties of the data names described. These definitions and properties are recorded with the use of data names described in a Dictionary Definition Language (DDL). Version 2.0 of the Core CIF dictionary employs a DDL different from that used in release 1.0. Consequently the layout of definition entries in the two dictionaries differs. For more information on DDL, consult the DDL page on this server.

The major differences apparent to the average user are:

  • Formalisation of categories. In the original CIF dictionary, data names were constructed as a set of components separated by underscores, such as _atom_site_fract_x, which were chosen to reflect a hierarchical classification scheme. However, there was no formal mechanism for dividing a data name into its category and topic components. The new dictionary assigns each data name to a category, usually (but not always) corresponding in name to the first few components of the data name itself. Thus _atom_site_fract_x belongs to the atom_site category, _atom_type_description to the atom_type category, and _cell_length_a to the cell category.

    It is a requirement of CIFs conforming to DDL version 1.4 and above that a single loop_ construct may contain only data names of the same category.

    The data names in the dictionary are arranged alphabetically by category; in the formatted versions, the current category is identified in the running head to each page.

  • Descriptive sections. Within the dictionary, each category contains a data name designed to include information about that category, and not intended for use as an entry in a CIF data file. The data names fulfilling this function are all assigned to the category category_overview, have type 'null', and include square brackets, e.g.
    data_atom_type_[]
    _name '_atom_type_[]'
    _category category_overview
    _type null
    _definition
    ; Data items in the ATOM_TYPE category record details about
    properties of the atoms that occupy the atom sites, such as the
    atomic scattering factors.
    ;
    The square brackets may include a code for the relevant dictionary extension; e.g. a symmetry CIF dictionary might extend the contents of the atom_type category and have an annotation of the new features introduced to the category in a data block for _atom_type_[sym].

    A similar explanatory purpose was behind the _chemical_formula_appendix entry in the original Core dictionary, which has been deleted from version 2.0 and above.

  • Withdrawal of units extensions. The original Core dictionary permitted the derivation of new data names from existing ones by appending a suffix to indicate the use of different physical units (e.g. _cell_length_a_nm derived from _cell_length_a to yield a cell length in nanometres, instead of the default ångströms). This mechanism is no longer sanctioned, and all data names have a single associated physical unit (listed in the dictionary).

    For strict compatibility with old data files, a compatibility dictionary has been produced. This may be used by dictionary validation software to permit the recognition of the old data names, but it should not be used to generate such data names in new data files.

  • Dictionary identification strings. Previous dictionary files were referred to by an informal file name such as cifdic.C91. The new dictionary contains a header section including strings for the _dictionary_name and _dictionary_version, e.g.
    data_on_this_dictionary
    _dictionary_name cif_core.dic
    _dictionary_version 2.0.1
    _dictionary_update 1997-01-20
    _dictionary_history
    ;
    1991-05-27 Created from CIF Dictionary text. SRH
    ...
    1996-11-27 Release version 2.0. IUCr
    ...
    ;
    These strings should be used within a CIF to identify the dictionary version with which that CIF is compatible, e.g.
    data_cif_example
    _audit_conform_dict_name cif_core.dic
    _audit_conform_dict_version 2.0.1
    _audit_conform_dict_location ftp://ftp.iucr.ac.uk/pub/cif_core.2.0.1.dic
    The example shows a typical URL specifying the file name of the current Core dictionary. On the IUCr server, the local file name is composed of the dictionary name and version strings; but if the dictionary were downloaded to a DOS computer, it would need to be stored under a different file name. It is the responsibility of an installation accessing dictionary files to provide the mapping between local file names and the dictionary identifiers.
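
One way an installation might provide the mapping between dictionary identifiers and local file names is a simple lookup table keyed on the name and version strings. The sketch below is an assumption for illustration only: the registry, the function name and the DOS-style file name CORE201.DIC are all hypothetical and are not prescribed by the dictionary.

```python
from typing import Dict, Tuple

# Key: (_dictionary_name, _dictionary_version); value: local file name.
DictId = Tuple[str, str]

# Hypothetical registry for a system restricted to DOS-style 8.3 file names.
LOCAL_FILES: Dict[DictId, str] = {
    ("cif_core.dic", "2.0.1"): "CORE201.DIC",
}


def local_file_for(name: str, version: str,
                   registry: Dict[DictId, str] = LOCAL_FILES) -> str:
    """Return the local file name registered for a dictionary identifier."""
    try:
        return registry[(name, version)]
    except KeyError:
        raise LookupError(
            f"no local copy registered for {name} version {version}"
        ) from None
```

Validation software could call local_file_for() with the values of _audit_conform_dict_name and _audit_conform_dict_version read from a data block.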

Dictionary contents

Considerable effort has been expended in ensuring compatibility with data names in old data files. To this end, version 2.0 of the Core dictionary includes all data names in version 1.0, with the following minor exceptions (both discussed further in the Dictionary formalism section above):
  1. Names generated by the addition of suffixes listed as _units_extension in version 1.0. These may still be accessed through the compatibility CIF dictionary described above, but should not be used in new data files.
  2. _chemical_formula_appendix, which included a textual description of a group of data names and is replaced in version 2.0 by data names in the category_overview category.
In addition, the following data names, though still present in the version 2.0 dictionary, carry an indication that they should be replaced in future data files by alternative names. The purpose of this is to align data names with the name structure of their parent category (for example, it is potentially confusing that the data name _diffrn_radiation_detector should belong to the diffrn_detector category and not the diffrn_radiation one).
  1. _diffrn_radiation_detector should be replaced by _diffrn_detector
  2. _diffrn_radiation_detector_dtime should be replaced by _diffrn_detector_dtime
  3. _diffrn_radiation_source should be replaced by _diffrn_source
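
Software that writes new data files could apply these three replacements automatically. The following Python sketch assumes that a data block's non-looped items are held in a plain dict keyed by data name; that representation and the function name are illustrative assumptions, and names are matched exactly as written.

```python
# Old data names and their recommended replacements, as listed above.
REPLACEMENTS = {
    "_diffrn_radiation_detector": "_diffrn_detector",
    "_diffrn_radiation_detector_dtime": "_diffrn_detector_dtime",
    "_diffrn_radiation_source": "_diffrn_source",
}


def modernise_names(items: dict) -> dict:
    """Return a copy of items with superseded data names replaced.

    Raises ValueError if a block contains both the old and the new form
    of the same item, since the intended value is then ambiguous.
    """
    updated = {}
    for name, value in items.items():
        target = REPLACEMENTS.get(name, name)
        if target in updated:
            raise ValueError(f"both old and new forms of {target} are present")
        updated[target] = value
    return updated
```

Items whose names are not in the table pass through unchanged, so the function can be applied to a whole data block.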
The following 203 data names are new in version 2.0 of the Core dictionary. They represent 55 category descriptions and 148 'real' data names. (The total number of data names defined in version 2.0 is 624.)
  • _atom_site_[]
    • _atom_site_aniso_B_11
      _atom_site_aniso_B_12
      _atom_site_aniso_B_13
      _atom_site_aniso_B_22
      _atom_site_aniso_B_23
      _atom_site_aniso_B_33
    • _atom_site_aniso_ratio
    • _atom_site_B_equiv_geom_mean
    • _atom_site_B_iso_or_equiv
    • _atom_site_disorder_assembly
    • _atom_site_U_equiv_geom_mean
  • _atom_sites_[]
    • _atom_sites_Cartn_tran_vector_1
      _atom_sites_Cartn_tran_vector_2
      _atom_sites_Cartn_tran_vector_3
    • _atom_sites_fract_tran_matrix_11
      _atom_sites_fract_tran_matrix_12
      _atom_sites_fract_tran_matrix_13
      _atom_sites_fract_tran_matrix_21
      _atom_sites_fract_tran_matrix_22
      _atom_sites_fract_tran_matrix_23
      _atom_sites_fract_tran_matrix_31
      _atom_sites_fract_tran_matrix_32
      _atom_sites_fract_tran_matrix_33
    • _atom_sites_fract_tran_vector_1
      _atom_sites_fract_tran_vector_2
      _atom_sites_fract_tran_vector_3
  • _atom_type_[]
    • _atom_type_scat_length_neutron
  • _audit_[]
  • _audit_author_[]
    • _audit_author_address
    • _audit_author_name
    • _audit_block_code
  • _audit_conform_[]
    • _audit_conform_dict_location
    • _audit_conform_dict_name
    • _audit_conform_dict_version
  • _audit_contact_author_[]
    • _audit_contact_author_address
    • _audit_contact_author_email
    • _audit_contact_author_fax
    • _audit_contact_author_name
    • _audit_contact_author_phone
  • _audit_link_[]
    • _audit_link_block_code
    • _audit_link_block_description
  • _cell_[]
  • _cell_measurement_refln_[]
  • _chemical_[]
  • _chemical_conn_atom_[]
  • _chemical_conn_bond_[]
  • _chemical_formula_[]
    • _chemical_formula_iupac
  • _citation_[]
    • _citation_abstract
    • _citation_abstract_id_CAS
  • _citation_author_[]
    • _citation_author_citation_id
    • _citation_author_name
    • _citation_author_ordinal
    • _citation_book_id_ISBN
    • _citation_book_publisher
    • _citation_book_publisher_city
    • _citation_book_title
    • _citation_coordinate_linkage
    • _citation_country
    • _citation_database_id_Medline
  • _citation_editor_[]
    • _citation_editor_citation_id
    • _citation_editor_name
    • _citation_editor_ordinal
    • _citation_id
    • _citation_journal_abbrev
    • _citation_journal_full
    • _citation_journal_id_ASTM
    • _citation_journal_id_CSD
    • _citation_journal_id_ISSN
    • _citation_journal_issue
    • _citation_journal_volume
    • _citation_language
    • _citation_page_first
    • _citation_page_last
    • _citation_special_details
    • _citation_title
    • _citation_year
  • _computing_[]
  • _database_[]
  • _diffrn_[]
    • _diffrn_ambient_environment
  • _diffrn_attenuator_[]
    • _diffrn_crystal_treatment
    • _diffrn_detector
  • _diffrn_detector_[]
    • _diffrn_detector_details
    • _diffrn_detector_dtime
    • _diffrn_detector_type
  • _diffrn_measurement_[]
    • _diffrn_measurement_details
    • _diffrn_measurement_device_details
    • _diffrn_measurement_device_type
    • _diffrn_measurement_specimen_support
  • _diffrn_orient_matrix_[]
  • _diffrn_orient_refln_[]
    • _diffrn_orient_refln_angle_omega
    • _diffrn_orient_refln_angle_theta
  • _diffrn_radiation_[]
    • _diffrn_radiation_collimation
    • _diffrn_radiation_probe
  • _diffrn_radiation_wavelength_[]
    • _diffrn_radiation_xray_symbol
  • _diffrn_refln_[]
    • _diffrn_refln_scan_rate
    • _diffrn_refln_scan_time_backgd
  • _diffrn_reflns_[]
  • _diffrn_scale_group_[]
  • _diffrn_source_[]
    • _diffrn_source
    • _diffrn_source_current
    • _diffrn_source_details
    • _diffrn_source_power
    • _diffrn_source_size
    • _diffrn_source_target
    • _diffrn_source_type
    • _diffrn_source_voltage
  • _diffrn_standard_refln_[]
  • _diffrn_standards_[]
  • _exptl_[]
  • _exptl_crystal_[]
  • _exptl_crystal_face_[]
  • _geom_[]
  • _geom_angle_[]
  • _geom_bond_[]
  • _geom_contact_[]
  • _geom_hbond_[]
    • _geom_hbond_angle_DHA
    • _geom_hbond_atom_site_label_A
      _geom_hbond_atom_site_label_D
      _geom_hbond_atom_site_label_H
    • _geom_hbond_distance_DA
      _geom_hbond_distance_DH
      _geom_hbond_distance_HA
    • _geom_hbond_publ_flag
    • _geom_hbond_site_symmetry_A
      _geom_hbond_site_symmetry_D
      _geom_hbond_site_symmetry_H
  • _geom_torsion_[]
  • _journal_[]
    • _journal_data_validation_number
  • _journal_index_[]
    • _journal_index_subterm
    • _journal_index_term
    • _journal_index_type
    • _journal_language
    • _journal_paper_category
  • _publ_[]
  • _publ_author_[]
    • _publ_author_footnote
  • _publ_body_[]
    • _publ_body_contents
    • _publ_body_element
    • _publ_body_format
    • _publ_body_label
    • _publ_body_title
    • _publ_contact_author_address
    • _publ_contact_author_name
  • _publ_manuscript_incl_[]
    • _publ_requested_category
    • _publ_section_exptl_prep
    • _publ_section_exptl_refinement
    • _publ_section_exptl_solution
    • _publ_section_synopsis
    • _publ_section_title_footnote
  • _refine_[]
    • _refine_diff_density_rms
    • _refine_ls_R_Fsqd_factor
    • _refine_ls_R_I_factor
    • _refine_ls_d_res_high
    • _refine_ls_d_res_low
    • _refine_ls_weighting_details
  • _refln_[]
  • _reflns_[]
  • _reflns_scale_[]
    • _reflns_shell_Rmerge_F_all
      _reflns_shell_Rmerge_F_obs
    • _reflns_shell_Rmerge_I_all
      _reflns_shell_Rmerge_I_obs
  • _reflns_shell_[]
    • _reflns_shell_d_res_high
    • _reflns_shell_d_res_low
    • _reflns_shell_meanI_over_sigI_all
    • _reflns_shell_meanI_over_sigI_obs
    • _reflns_shell_number_measured_all
    • _reflns_shell_number_measured_obs
    • _reflns_shell_number_possible
    • _reflns_shell_number_unique_all
    • _reflns_shell_number_unique_obs
    • _reflns_shell_percent_possible_all
    • _reflns_shell_percent_possible_obs
  • _symmetry_[]
  • _symmetry_equiv_[]
    • _symmetry_equiv_pos_site_id

Data files

Every effort has been made to ensure that existing data files remain conformant to the new dictionary. Provided an existing CIF output application does not use derivative data names indicating the use of different physical units, as permitted in the original Core, the resultant files will fully conform to the Core CIF version 2.0 and upwards.

However, any data file that uses the new data names introduced in version 2.0 and later should include a record of the version of the dictionary against which the file was compiled. This is achieved by adding to each data block the pair of entries indicated below for the most recent version:

_audit_conform_dict_name cif_core.dic
_audit_conform_dict_version 2.0.1

This will become increasingly important as dictionary validation software is written that checks data types, value ranges and interdependencies.
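
A validation tool might begin by confirming that each data block carries this pair of entries. The sketch below is a minimal illustration, assuming a data block has already been parsed into a dict of data names and string values; the function name is hypothetical, and the CIF placeholder values '?' and '.' are treated as missing.

```python
REQUIRED = ("_audit_conform_dict_name", "_audit_conform_dict_version")

# Values that carry no usable information: absent, empty, '?' or '.'.
PLACEHOLDERS = (None, "", "?", ".")


def missing_conformance_items(block: dict) -> list:
    """Return the required conformance data names that the block lacks."""
    return [name for name in REQUIRED if block.get(name) in PLACEHOLDERS]


block = {
    "_audit_conform_dict_name": "cif_core.dic",
    "_audit_conform_dict_version": "2.0.1",
}
missing = missing_conformance_items(block)  # empty list: both entries present
```

An empty result indicates that the block identifies its dictionary; any names returned are the entries still to be added.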


Updated 29 January 1997