Discussion List Archives


Re: [ddlm-group] Relationship among CIF2, STAR, CIF1 and Python. . . .

In response to John B's request I have copied below two dictionary items from my current working version of the DDLm core dictionary to show what a DDLm CIF dictionary looks like.

The first save frame gives, as requested, the entry for _exptl_crystal.density_diffrn, which includes a method for calculating this quantity.  This calls upon, inter alia, _cell.volume, whose definition is given in the second save frame.  Note the alias that allows _cell_volume defined in the DDL1 core dictionary, or _cell.volume defined in the DDL2 core dictionary, to be recognized by an input routine designed to read CIF1 or CIF2 datafiles (not to be confused with CIF1 or CIF2 syntax; CIF1 and CIF2 datafiles are both written using CIF1 syntax).  This input routine will accept the occasional appearance of () at the end of a dataname even though this is not allowed by CIF2 syntax.  The value found for _cell.volume is then stored as DDLm _cell.volume, where it can be used directly in the method for _exptl_crystal.density_diffrn without any further processing.

If the input asks that the value of _exptl_crystal_density_diffrn be calculated, the list of aliases would identify this as being the same as _exptl_crystal.density_diffrn (in this case, as in most others, the two names are interchangeable under DDLm, though not under DDL1 or DDL2).  By whatever means the density calculation method is invoked, the program uses only DDLm machinery and DDLm data values to calculate the density.

If you require CIF1 or CIF2 output (which is not implied by the 'promise' as I read it), this can be done by referring to the _aliases for _exptl_crystal.density_diffrn.  If it helps, it would be easy to add a flag to identify which alias is used in CIF1 and which in CIF2 datafiles, although this information is already implicit in the _alias loop.  The program would likely need a special CIF1 or CIF2 datafile output routine to match the corresponding input routine.  In this way the archived files are converted on reading to DDLm CIF and, if desired, can be output in any of the approved formats, as CIF1, CIF2 or CIFm datafiles, subject to the restriction that CIFm has a richer list of available dataitems, not all of which are available in CIF1 or CIF2.
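The alias handling described above could be sketched in Python roughly as follows. This is only an illustration, not code from any existing CIF library: the table layout and the function names (to_ddlm, read_items, write_items) are hypothetical, and the alias entries are just the two pairs listed in the save frames below. Data names are compared case-insensitively, as CIF data names are.

```python
# Hypothetical alias table, filled from the _alias loops of the two save
# frames quoted below: DDLm name -> {dictionary: name used in that dictionary}.
ALIASES = {
    "_exptl_crystal.density_diffrn": {
        "cifdic.C91": "_exptl_crystal_density_diffrn",
        "cif_mm_1.0.dic": "_exptl_crystal.density_diffrn",
    },
    "_cell.volume": {
        "cifdic.C91": "_cell_volume",
        "cif_mm_1.0.dic": "_cell.volume",
    },
}

# Reverse lookup used on input: any known alias (lower-cased) -> DDLm name.
TO_DDLM = {
    alias.lower(): ddlm_name
    for ddlm_name, by_dictionary in ALIASES.items()
    for alias in by_dictionary.values()
}


def to_ddlm(name):
    """Return the DDLm name for a data name read from a datafile.

    Names with no known alias are passed through (lower-cased) unchanged.
    """
    key = name.lower()
    return TO_DDLM.get(key, key)


def read_items(raw_items):
    """Input side: re-key the items of a datafile by their DDLm names."""
    return {to_ddlm(name): value for name, value in raw_items.items()}


def write_items(items, dictionary):
    """Output side: re-key DDLm items with the names used by `dictionary`.

    Items that have no alias in that dictionary keep their DDLm name.
    """
    return {
        ALIASES.get(name, {}).get(dictionary, name): value
        for name, value in items.items()
    }
```

With this arrangement only read_items and write_items need to know which dictionary a datafile was written against; anything operating on the stored items sees DDLm names only.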

I cannot see why this will not work.


P.S. Just to avoid further questions, _cell.volume can be calculated from _cell.vector, which is calculated by a method from the cell constants, and so could be calculated from basic information supplied in the CIF1 or CIF2 datafile.
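As a rough Python illustration of that chain (cell constants to cell vectors to volume), under assumptions of my own: the Cartesian convention chosen here (a along x, b in the xy plane) and the function names are not taken from the dictionary, and the dREL method for the vectors themselves is not shown in this message.

```python
import math


def cell_vectors(a, b, c, alpha, beta, gamma):
    """Cartesian cell vectors from cell lengths (angstroms) and angles (degrees).

    Assumed convention: a lies along x and b lies in the xy plane.
    """
    cos_alpha = math.cos(math.radians(alpha))
    cos_beta = math.cos(math.radians(beta))
    cos_gamma = math.cos(math.radians(gamma))
    sin_gamma = math.sin(math.radians(gamma))

    va = (a, 0.0, 0.0)
    vb = (b * cos_gamma, b * sin_gamma, 0.0)
    cx = c * cos_beta
    cy = c * (cos_alpha - cos_beta * cos_gamma) / sin_gamma
    cz = math.sqrt(c * c - cx * cx - cy * cy)
    return va, vb, (cx, cy, cz)


def cross(u, v):
    """Vector (cross) product of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    """Scalar (dot) product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def cell_volume(a, b, c, alpha, beta, gamma):
    """Cell volume in cubic angstroms as the scalar triple product a . (b x c)."""
    va, vb, vc = cell_vectors(a, b, c, alpha, beta, gamma)
    return dot(va, cross(vb, vc))
```

The last function has the same form as the volume method in the second save frame, where ^ appears to denote the vector product and * the scalar product.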

Two save frames extracted from the developing DDLm core dictionary

    _definition.id             '_exptl_crystal.density_diffrn'
    _definition.update           2008-02-20
     Crystal density calculated from crystal unit cell and atomic mass of the
    _description.common         'CrystalDensityDiffrn'
    _name.category_id            exptl_crystal
    _name.object_id              density_diffrn
    _type.purpose                Measured
    _type.container              Single
    _type.contents               Real
    _enumeration.range           0.0:
    _units.code                  megagrams_per_metre_cubed
    'calculation of the density from the cell volume and cell mass'
    _exptl_crystal.density_diffrn = 1.6605 * _cell.atomic_mass / _cell.volume
 '_exptl_crystal_density_diffrn'   cifdic.C91
 '_exptl_crystal.density_diffrn'   cif_mm_1.0.dic

    _definition.id             '_cell.volume'
    _definition.update           2008-02-13
     Volume of the crystal unit cell.
    _description.common         'CellVolume'
    _name.category_id            cell
    _name.object_id              volume
          '_cell_volume'   cifdic.C91
          '_cell.volume'   cif_mm_1.0.dic
    _type.purpose                Measured
    _type.container              Single
    _type.contents               Real
    _enumeration.range           0.0:
    _units.code                  angstroms_cubed
    'calculation of the cell volume from unit cell vectors'
      With v  as  cell_vector
      _cell.volume =  v.a * ( v.b ^ v.c )
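For readers without a dREL engine, the density method in the first save frame can be mirrored in a few lines of Python. This is a sketch only: the function names are hypothetical, and the helper that builds the cell mass from a formula weight and Z is an assumption of mine (it presumes the cell contents are exactly Z formula units), not something the save frame states. The factor 1.6605 is taken from the method as written; it converts daltons per cubic angstrom to megagrams per cubic metre.

```python
# Conversion factor used by the density method in the first save frame:
# 1 dalton per cubic angstrom is about 1.6605 Mg per cubic metre.
DALTON_PER_A3_TO_MG_PER_M3 = 1.6605


def density_diffrn(cell_atomic_mass, cell_volume):
    """Density in Mg m^-3 from cell mass (daltons) and volume (cubic angstroms)."""
    if cell_volume <= 0.0:
        raise ValueError("cell volume must be positive")
    return DALTON_PER_A3_TO_MG_PER_M3 * cell_atomic_mass / cell_volume


def cell_atomic_mass(formula_weight, formula_units_z):
    """Assumed helper: cell mass as Z formula units of the given formula weight."""
    return formula_weight * formula_units_z
```

The inputs here stand for values already stored under their DDLm names, so the calculation does not depend on which dictionary the input datafile was written against.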

Bollinger, John C wrote:
On Tuesday, January 18, 2011 7:20 AM, Herbert J. Bernstein wrote:
  Now I am very confused.  You say we have not broken the promise on the
IUCr web site, but at the same time we seem to be defining a CIF2 that
will not accept CIF1 documents.

  Please bear with me, and, even if you think it has already been
explained, please explain precisely how to use CIF1 documents in the
currently proposed CIF2 environment.

  If we have a sound way in which a CIF1 document has use of a DDLm
dictionary, then we do not need to bother most of the community with CIF2
for data files at this time.  All they need right now is what I called
DDLm-2011, a CIF2ish DDLm dictionary format.
I agree with that assessment of need, but I don't see what would be gained by limiting CIF2 release like that.  If CIF2 is not ready or appropriate for data files, then I think a CIF2-like DDLm-2011 language leads users and especially developers in the wrong direction.  If we wish to release DDLm without unleashing CIF2 on the world then let the initial DDLm and dictionary releases be crafted in an altogether different format, such as XML.  In the unlikely event that there were genuine interest in such a course, it would be worth mentioning that I have a suitable XML schema at hand, as well as supporting software that could easily be adapted to translating existing DDL and dictionary documents.

 If we don't have a sound way
in which a CIF1 document has use of a DDLm dictionary, then I think we are
breaking the promise on the IUCr web page.  Please recall that DDL2
dictionaries are not valid CIF1 documents -- they have save frames, so it
is not unprecedented to have a different spec for dictionaries as opposed
to data files.
I accept that, but it's a different matter for the data format to be a subset of the dictionary format than for the data format to be a related but subtly incompatible format.  We will have that anyway when DDLm dictionaries are used to validate CIF 1 files, but let's please not set it as the direction for the indefinite future.

 It makes a big difference to most of the user community if
we are simply telling them we have a new dictionary format rather than
telling them we are changing the data file format.
Agreed, in that much of the user community doesn't care about dictionaries.  On the other hand, members of the user community who care about some of the new CIF2 features -- Unicode support, as a prime example -- would not necessarily take the distinction as a positive or even a neutral proposition.

  On David's description, I think I really did explain why I think we will
have trouble populating missing values involving CIF1 tags that are not
valid CIF2 tags.  Doing that using the alias mechanism would seem to
require defining the CIF1 tag in the DDLm dictionary as a primary
definition and then aliasing a CIF2 tag to that primary CIF1 tag, so that
a method working with the CIF2 tag would effectively populate instances of
the CIF1 tag, but, and this is the part I can't seem to get past, defining
the CIF1 tag in a new CIF2-style DDLm dictionary would seem to require
that the CIF1 tag be a valid CIF2 tag.
I think we will not easily get past this dispute without an example.  For that purpose, then, perhaps James, David, or another participant with practical DDLm and dREL experience would be kind enough to present a solution to this exercise:
Provide DDLm definitions and a dREL method that support computing a missing value for the Core item _exptl_crystal_density_diffrn, based on Core items _chemical_formula_weight, _cell_formula_units_Z, and _cell_volume.  The definitions presented should use DDLm formalism for the defined data names, and should be compatible also with validating the corresponding mmCIF data names.

James's and David's comments have given me every reason to believe that this would be straightforward, though the definitions together with their required context might be bulky.  I am hoping that the requested definitions are in fact already written.

 I suspect we will get into trouble
in other areas of using existing CIF1 tags in CIF2 DDLm dictionaries.
One of the key promises of DDLm, as I see it, is that the distinctions between various syntax versions and between DDL1 and DDL2 formalisms are relevant to only two program activities:
1) On input, reading a file correctly and associating data items with the correct DDLm definitions.
2) On output, producing well-formed files for the target syntax version that are valid with respect to the DDL1 (or DDL2) dictionaries with which the DDLm dictionary provides compatibility.

As long as those two features work correctly, details of syntax version and original target dictionary can be completely abstracted away from validation and dREL operations, leaving no room for other areas of trouble.  Success in those areas will be a function of program, DDL, and dictionary details.  CIF2 syntax need only be sufficient to support the required DDLm features; it does not otherwise bear on the problem.

How important each of those troubles may be depends on our goals, so I
respectfully urge that we make certain that we are working from common
goals, so that we can then focus on whether we are meeting those goals,
rather than have debates that seem to be based on different goals for
different speakers.
That is a reasonable criticism of our process to date.  I am willing to participate in the proposed goal re-evaluation process, and I hope it will help resolve some of our current disputes.  Of late, however, we have also seen significant differences in technical analyses that should be independent of participants' goals.  Therefore, I do not anticipate that the goal re-evaluation exercise will provide clear resolutions to *all* our current disputes.



John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital

Email Disclaimer:  www.stjude.org/emaildisclaimer

ddlm-group mailing list

fn:I.David Brown
org:McMaster University;Brockhouse Institute for Materials Research
adr:;;King St. W;Hamilton;Ontario;L8S 4M1;Canada
title:Professor Emeritus
tel;work:+905 525 9140 x 24710
tel;fax:+905 521 2773

International Union of Crystallography
