Re: Please advise regarding a design of CIF dictionaries for materialproperties
- To: Nick.Spadaccini@uwa.edu.au,"Discussion list of the IUCr Committee for the Maintenance of the CIFStandard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: Please advise regarding a design of CIF dictionaries for materialproperties
- From: Saulius Grazulis <grazulis@ibt.lt>
- Date: Sun, 02 Oct 2011 16:19:24 +0300
- In-Reply-To: <CAAE5B5E.15F19%nick@csse.uwa.edu.au>
- Organization: IBT
- References: <CAAE5B5E.15F19%nick@csse.uwa.edu.au>
Dear Nick,

many thanks for your detailed answer, and for your comprehensive example of a DDLm dictionary! Here, for brevity, I will only highlight the key questions, one question at a time, that might require a very definite and formal answer.

On 10/02/2011 01:18 PM, Nick Spadaccini wrote:
>> data_... block name in the dictionary no longer matches tag
>> name. I guess this should not be a problem... Is it?
>
> It is a convenience to have the data block name match the _name of
> the item, it is NOT a requirement of the DDLs (well certainly was not
> at its inception, but I am not sure if interpretations have since
> changed).

This is exactly the question which bothers me: is it a must that the data_.. block prefix in a DDL1 dictionary matches the declared data name, is it a formal recommendation, or is it just common practice and tradition? In other words, is it a MUST match, SHOULD match or MAY match according to RFC 2119 (http://www.ietf.org/rfc/rfc2119.txt)?

May I explain why I insist on that precise wording. When we write a CIF processing program, we want it to be correct, in the sense that it MUST process every correct CIF and produce defined results, and it MUST report an error for every incorrect CIF (provided the sets of correct and incorrect CIFs are computable, which I guess they are according to the current definitions).

Now, if the data block<->declared name correspondence is a MUST, then I infer that:

a) correct software MAY use data block names to search for name declarations (do we need this?);

b) correct software MUST report an error when the data block name is not a prefix of a declared data name;

c) if a dictionary where b) is the case is ever encountered, then the dictionary is incorrect and it is the responsibility of the dictionary maintainer to fix the error;

d) if a validator program validates a dictionary against DDL and does not report an error when the non-conforming dictionary is processed, then the validator is buggy and needs a fix.

If, however, the block<->declared name correspondence is only a MAY or a SHOULD, then:

a) correct software MUST NOT use data block names to search for name declarations (programmers, beware!);

b) correct software MAY/SHOULD report a (suppressible, non-fatal) warning when the data block name is not a prefix of a declared data name;

c) if dictionaries where b) is the case are encountered, and a program does not accept them, then the program is buggy and it is the responsibility of the program maintainer to fix it.

As you see, the course of events when a program accepts or rejects a dictionary differs radically depending on whether the data<->name correspondence is a MUST, SHOULD or MAY item.

From what you say in the quote above ("It is a convenience to have the data block name match the _name of the item"), the correspondence MAY be present (and it MAY be absent). According to what David Brown wrote (Wed, 28 Sep 2011 12:07:30 -0400, "It is not a problem in DDLm, I am not sure about DDL1, but it could be confusing. Best avoided."), I get the impression that it SHOULD. But according to what John Bollinger wrote (Wed, 28 Sep 2011 11:05:54 -0500, "It also specifies (ITG 2.5.5) that item names be used as definition datablock names."), it sounds more like a MUST.

So, for me to know how to write a correct CIF validator and a correct CIF dictionary, I need to know how to interpret the definition of a correct DDL1 dictionary -- whether:

a) "item names MUST be used as definition datablock names"

b) "item names SHOULD be used as definition datablock names"

c) "item names MAY be used as definition datablock names"

Which of the a)-c) situations is the actual case? Any choice among a)-c) is actually possible; I am sure that every developer has made this choice silently and maybe even implicitly, but it would probably be beneficial for CIF users to make the choice explicit, especially given the variety of possible interpretations.
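
The practical difference between the three readings can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Level` enum, the `check_block` function and the diagnostic wording are hypothetical and do not come from any existing CIF tool. Staying silent under MAY is just one of the permitted choices.

```python
from enum import Enum


class Level(Enum):
    """RFC 2119 requirement levels discussed in this message."""
    MUST = "MUST"
    SHOULD = "SHOULD"
    MAY = "MAY"


def check_block(block_name, declared_names, level):
    """Return (severity, message) for one DDL1 definition block.

    block_name     -- text after 'data_' in the block header (hypothetical input)
    declared_names -- list of values of the _name item declared in that block
    level          -- assumed interpretation of the naming rule
    """
    # Nothing to compare if either side is missing.
    if not block_name or not declared_names:
        return ("ok", "")
    # Prefix test: drop the leading '_' of the first declared name and
    # require that the remainder starts with the block name.
    if declared_names[0][1:].startswith(block_name):
        return ("ok", "")
    message = "data_%s is not a prefix of %s" % (block_name, declared_names[0])
    if level is Level.MUST:
        return ("error", message)      # dictionary is incorrect
    if level is Level.SHOULD:
        return ("warning", message)    # non-fatal, could be suppressed
    return ("ok", "")                  # MAY: assumed here to stay silent
```

With this sketch, the block `data_cell_length_*_nm` declaring `_cell_length_a_nm` is an error under MUST, a warning under SHOULD, and is accepted under MAY.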

BTW, I have scanned the existing (ftp://ftp.iucr.org/cifdics/) IUCr dictionaries for the correspondence. In the mmCIF dictionary, all save block names are prefixes of the corresponding declared tag names (data not shown ;); however there are 4 dictionaries that have several cases of data block names differing slightly (I attach a file with the non-matching tag list; the first line is a Perl command that produced it; warning -- long lines!).

Thus, picking a "MUST" clause (the case "a)" above) would probably be too restrictive and invalidate too many existing dictionaries...

Regards,
Saulius

-- 
Dr. Saulius Gražulis
Institute of Biotechnology, Graiciuno 8
LT-02241 Vilnius, Lietuva (Lithuania)
fax: (+370-5)-2602116 / phone (office): (+370-5)-2602556
mobile: (+370-684)-49802, (+370-614)-36366

+ perl -MCIFParser -le 'for $file(@ARGV) {$p = new CIFParser; $d = $p->Run($file); for(@{$d}) { printf "%-32s %-32s %s\n", $file, "_".$_->{name}, join(",", @{$_->{values}{"_name"}}) unless !$_->{values}{"_name"} || !$_->{name} || substr($_->{values}{"_name"}[0],1) =~ /^\Q$_->{name}\E/ }}' cif_compat_1.0.dic cif_compat.dic cif_core_2.0.1.dic cif_core_2.0.dic cif_core_2.1.dic cif_core_2.2.dic cif_core_2.3.1.dic cif_core_2.3.2.dic cif_core_2.3.dic cif_core_2.4.1.dic cif_core_2.4.2.dic cif_core_2.4.dic cif_core.dic cif_core_restraints_1.0.dic cif_core_restraints.dic cif_img_1.0.dic cif_img_1.3.1.dic cif_img_1.3.2.dic cif_img.dic cif_iucr_1.0.dic cif_iucr.dic cif_mm_1.0.00.dic cif_mm_1.0.dic cif_mm_2.0.03.dic cif_mm_2.0.09.dic cif_mm.dic cif_ms_1.0.1.dic cif_ms_1.0.dic cif_ms.dic cif_pd_1.0.1.dic cif_pd_1.0.dic cif_pd.dic cif_register_1.0.dic cif_register.dic cif_rho_1.0.1.dic cif_rho_1.0.dic cif_rho.dic cif_sym_1.0.1.dic cif_sym_1.0.dic cif_sym.dic ddl2_core_2.1.3.dic ddl2_core.dic ddl_core_1.4.1.dic ddl_core_2.1.3.dic ddl_core.dic draft_cif_core_2.4.dic mmcif_ddl_2.1.6.dic mmcif_ddl.dic mmcif_std_2.0.09.dic mmcif_std.dic
#
# Dictionary name                Data block name                  Declared tag names (comma-separated)
#
cif_compat_1.0.dic               _atom_site_aniso_B_*_nm          _atom_site_aniso_B_11_nm,_atom_site_aniso_B_12_nm,_atom_site_aniso_B_13_nm,_atom_site_aniso_B_22_nm,_atom_site_aniso_B_23_nm,_atom_site_aniso_B_33_nm
cif_compat_1.0.dic               _atom_site_aniso_B_*_pm          _atom_site_aniso_B_11_pm,_atom_site_aniso_B_12_pm,_atom_site_aniso_B_13_pm,_atom_site_aniso_B_22_pm,_atom_site_aniso_B_23_pm,_atom_site_aniso_B_33_pm
cif_compat_1.0.dic               _atom_site_aniso_U_*_nm          _atom_site_aniso_U_11_nm,_atom_site_aniso_U_12_nm,_atom_site_aniso_U_13_nm,_atom_site_aniso_U_22_nm,_atom_site_aniso_U_23_nm,_atom_site_aniso_U_33_nm
cif_compat_1.0.dic               _atom_site_aniso_U_*_pm          _atom_site_aniso_U_11_pm,_atom_site_aniso_U_12_pm,_atom_site_aniso_U_13_pm,_atom_site_aniso_U_22_pm,_atom_site_aniso_U_23_pm,_atom_site_aniso_U_33_pm
cif_compat_1.0.dic               _atom_site_Cartn_*_nm            _atom_site_Cartn_x_nm,_atom_site_Cartn_y_nm,_atom_site_Cartn_z_nm
cif_compat_1.0.dic               _atom_site_Cartn_*_pm            _atom_site_Cartn_x_pm,_atom_site_Cartn_y_pm,_atom_site_Cartn_z_pm
cif_compat_1.0.dic               _atom_type_radius_*_nm           _atom_type_radius_bond_nm,_atom_type_radius_contact_nm
cif_compat_1.0.dic               _atom_type_radius_*_pm           _atom_type_radius_bond_pm,_atom_type_radius_contact_pm
cif_compat_1.0.dic               _cell_length_*_nm                _cell_length_a_nm,_cell_length_b_nm,_cell_length_c_nm
cif_compat_1.0.dic               _cell_length_*_pm                _cell_length_a_pm,_cell_length_b_pm,_cell_length_c_pm
cif_compat_1.0.dic               _exptl_crystal_size_*_cm         _exptl_crystal_size_max_cm,_exptl_crystal_size_mid_cm,_exptl_crystal_size_min_cm,_exptl_crystal_size_rad_cm
cif_compat_1.0.dic               _refine_diff_density_*_nm        _refine_diff_density_max_nm,_refine_diff_density_min_nm,_refine_diff_density_rms_nm
cif_compat_1.0.dic               _refine_diff_density_*_pm        _refine_diff_density_max_pm,_refine_diff_density_min_pm,_refine_diff_density_rms_pm
cif_compat_1.0.dic               _reflns_d_resolution_*_nm        _reflns_d_resolution_high_nm,_reflns_d_resolution_low_nm
cif_compat_1.0.dic               _reflns_d_resolution_*_pm        _reflns_d_resolution_high_pm,_reflns_d_resolution_low_pm
cif_compat.dic                   _atom_site_aniso_B_*_nm          _atom_site_aniso_B_11_nm,_atom_site_aniso_B_12_nm,_atom_site_aniso_B_13_nm,_atom_site_aniso_B_22_nm,_atom_site_aniso_B_23_nm,_atom_site_aniso_B_33_nm
cif_compat.dic                   _atom_site_aniso_B_*_pm          _atom_site_aniso_B_11_pm,_atom_site_aniso_B_12_pm,_atom_site_aniso_B_13_pm,_atom_site_aniso_B_22_pm,_atom_site_aniso_B_23_pm,_atom_site_aniso_B_33_pm
cif_compat.dic                   _atom_site_aniso_U_*_nm          _atom_site_aniso_U_11_nm,_atom_site_aniso_U_12_nm,_atom_site_aniso_U_13_nm,_atom_site_aniso_U_22_nm,_atom_site_aniso_U_23_nm,_atom_site_aniso_U_33_nm
cif_compat.dic                   _atom_site_aniso_U_*_pm          _atom_site_aniso_U_11_pm,_atom_site_aniso_U_12_pm,_atom_site_aniso_U_13_pm,_atom_site_aniso_U_22_pm,_atom_site_aniso_U_23_pm,_atom_site_aniso_U_33_pm
cif_compat.dic                   _atom_site_Cartn_*_nm            _atom_site_Cartn_x_nm,_atom_site_Cartn_y_nm,_atom_site_Cartn_z_nm
cif_compat.dic                   _atom_site_Cartn_*_pm            _atom_site_Cartn_x_pm,_atom_site_Cartn_y_pm,_atom_site_Cartn_z_pm
cif_compat.dic                   _atom_type_radius_*_nm           _atom_type_radius_bond_nm,_atom_type_radius_contact_nm
cif_compat.dic                   _atom_type_radius_*_pm           _atom_type_radius_bond_pm,_atom_type_radius_contact_pm
cif_compat.dic                   _cell_length_*_nm                _cell_length_a_nm,_cell_length_b_nm,_cell_length_c_nm
cif_compat.dic                   _cell_length_*_pm                _cell_length_a_pm,_cell_length_b_pm,_cell_length_c_pm
cif_compat.dic                   _exptl_crystal_size_*_cm         _exptl_crystal_size_max_cm,_exptl_crystal_size_mid_cm,_exptl_crystal_size_min_cm,_exptl_crystal_size_rad_cm
cif_compat.dic                   _refine_diff_density_*_nm        _refine_diff_density_max_nm,_refine_diff_density_min_nm,_refine_diff_density_rms_nm
cif_compat.dic                   _refine_diff_density_*_pm        _refine_diff_density_max_pm,_refine_diff_density_min_pm,_refine_diff_density_rms_pm
cif_compat.dic                   _reflns_d_resolution_*_nm        _reflns_d_resolution_high_nm,_reflns_d_resolution_low_nm
cif_compat.dic                   _reflns_d_resolution_*_pm        _reflns_d_resolution_high_pm,_reflns_d_resolution_low_pm
cif_core_restraints_1.0.dic      _restr_equal_angle_details       _restr_equal_angle_detail
cif_core_restraints_1.0.dic      _restr_rigid_body_site_symmetry_ _restr_rigid_body_site_symmetry
cif_core_restraints.dic          _restr_equal_angle_details       _restr_equal_angle_detail
cif_core_restraints.dic          _restr_rigid_body_site_symmetry_ _restr_rigid_body_site_symmetry
cif_ms_1.0.1.dic                 _atom_site[ms]                   _atom_site_[ms]
cif_ms_1.0.1.dic                 _cell[ms]                        _cell_[ms]
cif_ms_1.0.1.dic                 _diffrn_refln[ms]                _diffrn_refln_[ms]
cif_ms_1.0.1.dic                 _diffrn_reflns[ms]               _diffrn_reflns_[ms]
cif_ms_1.0.1.dic                 _diffrn_standard_refln[ms]       _diffrn_standard_refln_[ms]
cif_ms_1.0.1.dic                 _exptl_crystal_face[ms]          _exptl_crystal_face_[ms]
cif_ms_1.0.1.dic                 _exptl_crystal[ms]               _exptl_crystal_[ms]
cif_ms_1.0.1.dic                 _geom_angle[ms]                  _geom_angle_[ms]
cif_ms_1.0.1.dic                 _geom_bond[ms]                   _geom_bond_[ms]
cif_ms_1.0.1.dic                 _geom_contact[ms]                _geom_contact_[ms]
cif_ms_1.0.1.dic                 _geom_torsion[ms]                _geom_torsion_[ms]
cif_ms_1.0.1.dic                 _refine[ms]                      _refine_[ms]
cif_ms_1.0.1.dic                 _refln[ms]                       _refln_[ms]
cif_ms_1.0.1.dic                 _reflns[ms]                      _reflns_[ms]
cif_ms.dic                       _atom_site[ms]                   _atom_site_[ms]
cif_ms.dic                       _cell[ms]                        _cell_[ms]
cif_ms.dic                       _diffrn_refln[ms]                _diffrn_refln_[ms]
cif_ms.dic                       _diffrn_reflns[ms]               _diffrn_reflns_[ms]
cif_ms.dic                       _diffrn_standard_refln[ms]       _diffrn_standard_refln_[ms]
cif_ms.dic                       _exptl_crystal_face[ms]          _exptl_crystal_face_[ms]
cif_ms.dic                       _exptl_crystal[ms]               _exptl_crystal_[ms]
cif_ms.dic                       _geom_angle[ms]                  _geom_angle_[ms]
cif_ms.dic                       _geom_bond[ms]                   _geom_bond_[ms]
cif_ms.dic                       _geom_contact[ms]                _geom_contact_[ms]
cif_ms.dic                       _geom_torsion[ms]                _geom_torsion_[ms]
cif_ms.dic                       _refine[ms]                      _refine_[ms]
cif_ms.dic                       _refln[ms]                       _refln_[ms]
cif_ms.dic                       _reflns[ms]                      _reflns_[ms]
cif_rho_1.0.1.dic                _atom_site_label_rho             _atom_site_label
cif_rho_1.0.1.dic                _atom_rho_multipole_kappa_       _atom_rho_multipole_kappa,_atom_rho_multipole_kappa_prime0,_atom_rho_multipole_kappa_prime1,_atom_rho_multipole_kappa_prime2,_atom_rho_multipole_kappa_prime3,_atom_rho_multipole_kappa_prime4
cif_rho_1.0.dic                  _atom_site_label_rho             _atom_site_label
cif_rho_1.0.dic                  _atom_rho_multipole_kappa_       _atom_rho_multipole_kappa,_atom_rho_multipole_kappa_prime0,_atom_rho_multipole_kappa_prime1,_atom_rho_multipole_kappa_prime2,_atom_rho_multipole_kappa_prime3,_atom_rho_multipole_kappa_prime4
cif_rho.dic                      _atom_site_label_rho             _atom_site_label
cif_rho.dic                      _atom_rho_multipole_kappa_       _atom_rho_multipole_kappa,_atom_rho_multipole_kappa_prime0,_atom_rho_multipole_kappa_prime1,_atom_rho_multipole_kappa_prime2,_atom_rho_multipole_kappa_prime3,_atom_rho_multipole_kappa_prime4
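
For readers who do not use Perl, the selection rule applied by the command at the top of this listing can be restated as a short sketch. It is not the CIFParser module and makes no claim about its API: the function name `mismatched_blocks` and the input layout (a list of dicts with a `name` key and a `values` mapping, modelled on what the one-liner dereferences) are assumptions made for illustration.

```python
def mismatched_blocks(file_name, blocks):
    """List definition blocks whose name is not a prefix of the first _name.

    blocks -- hypothetical parser output: one dict per data block, with
              'name' (text after 'data_') and 'values' (tag -> list of values)
    """
    rows = []
    for block in blocks:
        name = block.get("name")
        declared = block.get("values", {}).get("_name")
        # Skip blocks without a block name or without a _name declaration.
        if not name or not declared:
            continue
        # Drop the leading '_' of the first declared name, then test whether
        # the block name is a literal, case-sensitive prefix of the rest.
        if declared[0][1:].startswith(name):
            continue
        # Same three columns as the listing: file, block name, declared names.
        rows.append("%-32s %-32s %s" % (file_name, "_" + name, ",".join(declared)))
    return rows
```

Only the first declared name is compared, so a block that declares several names is judged by its first entry alone, as in the original command.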