Re: [ddlm-group] Relationship among CIF2, STAR, CIF1 and Python. . . .

Dear David,

   Thank you for the detailed explanation.  I suspect we can make the 
external program you are proposing work as you suggest, but wouldn't it be 
simpler and clearer if we simply allowed for the creation of DDLm 
dictionaries that were able to work directly in terms of the coreCIF and 
mmCIF tags as written? The only reason the tags themselves were restricted
in the recent discussions was to allow them to be used in methods directly 
with names that are acceptable to dREL as variable names.  If we bring 
what amounts to the aliasing into the dREL methods themselves, the DDLm 
dictionary becomes much cleaner and simpler, with the save frames written 
entirely in terms of the real tags the users are familiar with, except for
the few lines of code in the methods themselves. Certainly, in the case of 
save frames that do not have any methods, the logic of forcing a tag name 
change to be able to work with a method that does not exist seems 
difficult to support.
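
   As a rough illustration of the idea, here is a sketch in Python (not
dREL) of how a method could derive its own local names from the tags as
written. The function name method_aliases and the rule for what counts as
an acceptable variable name are assumptions made for this example; they
are not taken from the dREL or DDLm specifications, and the two tags are
only illustrative.

```python
import re

# Assumption: a name is usable in a method if it contains only
# letters, digits, '_' and '.'.  The real dREL rule may differ.
_UNSAFE = re.compile(r"[^A-Za-z0-9_.]")


def method_aliases(tags):
    """Map each tag, as written in the dictionary, to a method-local name."""
    aliases = {}
    used = set()
    for tag in tags:
        safe = _UNSAFE.sub("_", tag)
        candidate = safe
        n = 1
        # Keep local names unique if two tags collapse to the same string.
        while candidate in used:
            n += 1
            candidate = f"{safe}_{n}"
        used.add(candidate)
        aliases[tag] = candidate
    return aliases


aliases = method_aliases(["_cell_volume", "_atom_site.aniso_U[1][1]"])
for tag, local in aliases.items():
    print(tag, "->", local)
```

   With a mapping like this held inside each method, the save frame
itself could keep the tag exactly as users know it, and a save frame with
no method would need no mapping at all.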

   Thank you again.  This is definitely making more sense.


  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769


On Wed, 19 Jan 2011, David Brown wrote:

> Herbert,
> Herbert J. Bernstein wrote:
>         While a flag on which tags to use in DDL1 and DDL2 would
>       be a very helpful addition, we also need a mechanism to
>       ensure that that processing works with whichever tag
>       was actually used in the data file, especially when
>       populating missing values.
> This should be no problem.  When an alias is located for an input item, the
> value would be tagged within the program to indicate which standard the
> value originally appeared in.  This could then be checked at output time to
> ensure that the output was in the same format (if so desired).  Presumably
> the output items would all be written to the same standard, so one needs an
> 'output flag' within the program (not needed in the dictionary) which might
> be defaulted to the standard of the input file as revealed by the format of
> the datanames used (or the magic number in the case of CIFm files), or
> failing that defaulted to CIFm.  However the user of the program may wish to
> output in a different format and so could decide which of the three
> standards to use for output, regardless of the standard used in the input. 
> Useful in converting a CIF1 datafile to CIF2 for example, which was one of
> the problems DDLm was supposed to overcome.
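
The bookkeeping described in this paragraph might look roughly like the
following Python sketch. Everything named here (the ALIASES table, the
Item class, read_item, write_item and the "CIF1"/"CIF2"/"CIFm" labels) is
hypothetical and only meant to show the shape of the idea: values are
stored under their DDLm name, remember which standard they came from, and
are renamed on output according to an output flag that defaults to the
input standard.

```python
from dataclasses import dataclass

# Hypothetical alias table: DDLm name -> name used in each older standard.
ALIASES = {
    "_cell.volume": {"CIF1": "_cell_volume", "CIF2": "_cell.volume"},
    "_exptl_crystal.density_diffrn": {
        "CIF1": "_exptl_crystal_density_diffrn",
        "CIF2": "_exptl_crystal.density_diffrn",
    },
}


@dataclass
class Item:
    ddlm_name: str
    value: float
    origin: str  # standard the value originally appeared in


def read_item(tag, value, standard):
    """Store an input value under its DDLm name, tagged with its origin."""
    if standard == "CIFm":
        return Item(tag, value, standard)
    for ddlm_name, names in ALIASES.items():
        if names.get(standard) == tag:
            return Item(ddlm_name, value, standard)
    raise KeyError(f"{tag!r} has no alias for {standard}")


def write_item(item, output_standard=None):
    """Return (tag, value) for output; the flag defaults to the input standard."""
    standard = output_standard or item.origin
    if standard == "CIFm":
        return item.ddlm_name, item.value
    try:
        return ALIASES[item.ddlm_name][standard], item.value
    except KeyError:
        # Items that exist only in the DDLm dictionary cannot be written
        # to the older standards.
        raise KeyError(f"{item.ddlm_name!r} is not defined in {standard}") from None


item = read_item("_cell_volume", 1000.0, "CIF1")
print(write_item(item))          # same standard as the input
print(write_item(item, "CIF2"))  # user-selected output standard
```
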
> Using DDLm CIF-dictionaries to populate empty fields will likely generate
> values for items that are not defined in CIF1 or 2 dictionaries.  If these
> are desired the only possibility is to output in CIFm format.  The user can
> specify the name of items to be calculated and can use any of the aliases
> for this purpose, but it would be pointless trying to include an item only
> found in the DDLm dictionaries in a CIF1 file, since this might prevent the
> file being used with legacy software.  I am not sure whether there is any
> advantage to be gained from mixing items from different standards.  Clearly
> a CIFm datafile is not expected to be read by legacy software, but could, by
> this mechanism, be converted to a CIF1 or CIF2 datafile.
> I cannot see that there is any problem with the processing once the datafile
> has been read, since the datavalues of equivalent items in the different
> standards are always the same.  DDLm allows for vectors and matrices while
> DDL1 and 2 only allow the components to be stored, but the components are
> also defined (and aliased) in DDLm-based dictionaries and any method that
> calls on a matrix will be instructed by a method how to populate the matrix
> from available component information (assuming that it exists in the CIF1 or
> 2 datafile in the first place).
>         More importantly, it appears that you are trying
>       to ensure that your dictionary will work with CIF1
>       (DDL1 and DDL2) data files.  Why can we not agree
>       that such interoperability is, as promised on the
>       IUCr web site, a firm goal of this exercise.
> The way I read the promise made to our users, we agreed only to make sure
> that the DDLm based dictionaries could be programmed to read the CIF1 and
> CIF2 archive datafiles, but we did not promise they will be able to output
> files in the older standards.  However, it appears that outputting any of the
> CIF formats is no problem and I am happy to go along with the objective of
> ensuring that DDLm allows both the reading and writing of CIF1 and CIF2 as
> well as CIFm datafiles.
> David
>         Regards,
>           Herbert
>       =====================================================
>        Herbert J. Bernstein, Professor of Computer Science
>          Dowling College, Kramer Science Center, KSC 121
>               Idle Hour Blvd, Oakdale, NY, 11769
>                        +1-631-244-3035
>                        yaya@dowling.edu
>       =====================================================
>       On Tue, 18 Jan 2011, David Brown wrote:
>             In response to John B's request I have copied below
>             two dictionary items
>             from my current working version of the DDLm core
>             dictionary to show what a
>             DDLm CIF dictionary looks like.
>             The first save frame gives, as requested, the entry
>             for
>             _exptl_crystal.density_diffrn which includes a
>             method for calculating this
>             quantity.  This calls upon, inter alia, _cell.volume
>             whose definition is
>             given in the second save frame.  Note the alias that
>             allows _cell_volume
>             defined in the DDL1 core dictionary or _cell.volume
>             defined in the DDL2 core
>             dictionary, to be recognized by an input routine
>             designed to read CIF1 or
>             CIF2 datafiles (not to be confused with CIF1 or CIF2
>             syntax.  CIF1 and CIF2
>             datafiles are both written using CIF1 syntax).  This
>             input routine will
>             accept the occasional appearance of () at the end of
>             a dataname even though
>             this is not allowed by CIF2 syntax.  The value found
>             for _cell.volume is
>             then stored as DDLm _cell.volume where it can be
>             used directly in the method
>             for _exptl_crystal.density_diffrn without any
>             further processing.  If the
>             input asks that the value of
>             _exptl_crystal_density_diffrn be calculated,
>             the list of aliases would identify this as being the
>             same as
>             _exptl_crystal.density_diffrn (though in this case,
>             as in most others, the
>             two names are interchangeable under DDLm, though not
>             under DDL1 or DDL2). 
>             By whatever means the density calculation method is
>             invoked, the program
>             uses only DDLm machinery and DDLm data values to
>             calculate the density.  If
>             you require CIF1 or CIF2 output (which is not
>             implied by the 'promise' as I
>             read it), this can be done by referring to the
>             _aliases for
>             _exptl_crystal.density_diffrn.  If it helps, it
>             would be easy to add a flag
>             to identify which alias is used in CIF1 and which in
>             CIF2 datafiles,
>             although this information is already implicit in the
>             _alias loop.  The
>             program would likely need a special CIF1 or CIF2
>             datafile output routine to
>             match the corresponding input routine.  In this way
>             the archived files are
>             converted on reading to DDLm CIF and, if desired,
>             can be output in any of the
>             approved formats as CIF1, CIF2 or CIFm datafiles,
>             subject to the restriction
>             that CIFm has a richer list of available dataitems
>             not all of which are
>             available in CIF1 or CIF2.
>             I cannot see why this will not work.
>             David
>             P.S. Just to avoid further questions, _cell.volume
>             can be calculated from
>             _cell.vector which is calculated by a method from
>             the cell constants and so
>             could be calculated from basic information supplied
>             in the CIF1 or CIF2
>             datafile.
>             -------------------------------------------------------------------
>             Two save frames extracted from the developing DDLm core dictionary
>             -------------------------------------------------------------------
>             save_exptl_crystal.density_diffrn
>                 _definition.id               '_exptl_crystal.density_diffrn'
>                 _definition.update           2008-02-20
>                 _description.text
>             ;
>                  Crystal density calculated from crystal unit cell and atomic
>                  mass of the contents.
>             ;
>                 _description.common          'CrystalDensityDiffrn'
>                 _name.category_id            exptl_crystal
>                 _name.object_id              density_diffrn
>                 _type.purpose                Measured
>                 _type.container              Single
>                 _type.contents               Real
>                 _enumeration.range           0.0:
>                 _units.code                  megagrams_per_metre_cubed
>                  loop_
>                 _method.description
>                 _method.purpose
>                 _method.expression
>                 'calculation of the density from the cell volume and cell mass'
>                  Evaluation
>             ;
>                 _exptl_crystal.density_diffrn = 1.6605 * _cell.atomic_mass / _cell.volume
>             ;
>                 loop_
>                    _alias.definition_id
>                    _alias.dictionary_uri
>              '_exptl_crystal_density_diffrn'   cifdic.C91
>              '_exptl_crystal.density_diffrn'   cif_mm_1.0.dic
>                  save_
>             save_cell.volume
>                 _definition.id             '_cell.volume'
>                 _definition.update           2008-02-13
>                 _description.text
>             ;
>                  Volume of the crystal unit cell.
>             ;
>                 _description.common         'CellVolume'
>                 _name.category_id            cell
>                 _name.object_id              volume
>                 loop_
>                    _alias.definition_id
>                    _alias.dictionary_uri
>                       '_cell_volume'   cifdic.C91
>                       '_cell.volume'   cif_mm_1.0.dic
>                 _type.purpose                Measured
>                 _type.container              Single
>                 _type.contents               Real
>                 _enumeration.range           0.0:
>                 _units.code                  angstroms_cubed
>                  loop_
>                 _method.description
>                 _method.purpose  
>                 _method.expression
>                 'calculation of the cell volume from unit cell vectors'
>                  Evaluation
>             ;
>                   With v  as  cell_vector
>                   _cell.volume =  v.a * ( v.b ^ v.c )
>             ;
>                  save_
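
For readers who want to check the arithmetic in the two methods above,
the following Python sketch mirrors them. It reads "^" as the vector
cross product and "*" as the dot product in the _cell.volume method,
which is an assumption about the dREL operators; the function names and
the sample cell (a 10 angstrom cube holding 600 atomic mass units) are
made up for the example.

```python
import math


def cross(b, c):
    """Cross product of two 3-vectors."""
    return (
        b[1] * c[2] - b[2] * c[1],
        b[2] * c[0] - b[0] * c[2],
        b[0] * c[1] - b[1] * c[0],
    )


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cell_volume(a, b, c):
    """Mirror of: _cell.volume = v.a * ( v.b ^ v.c ), in cubic angstroms."""
    return dot(a, cross(b, c))


def density_diffrn(atomic_mass, volume):
    """Mirror of the density method: 1.6605 * mass / volume, in Mg per cubic metre."""
    return 1.6605 * atomic_mass / volume


# Hypothetical cubic cell, edges of 10 angstroms along the axes.
volume = cell_volume((10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0))
density = density_diffrn(600.0, volume)
print(volume, density)
```

The factor 1.6605 converts atomic mass units per cubic angstrom to
megagrams per cubic metre, matching the _units.code values in the two
save frames.
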
>             Bollinger, John C wrote:
>             On Tuesday, January 18, 2011 7:20 AM, Herbert J.
>             Bernstein wrote:
>               Now I am very confused.  You say we have not
>             broken the promise on the
>             IUCr web site, but at the same time we seem to be
>             defining a CIF2 that
>             will not accept CIF1 documents.
>               Please bear with me, and, even if you think it has
>             already been
>             explained, please explain precisely how to use CIF1
>             documents in the
>             currently proposed CIF2 environment.
>               If we have a sound way in which a CIF1 document
>             has use of a DDLm
>             dictionary, then we do not need to bother most of
>             the community with CIF2
>             for data files at this time.  All they need right
>             now is what I called
>             DDLm-2011, a CIF2ish DDLm dictionary format.
>             I agree with that assessment of need, but I don't
>             see what would be gained by limiting CIF2 release
>             like that.  If CIF2 is not ready or appropriate
>             for data files, then I think a CIF2-like DDLm-2011
>             language leads users and especially developers in
>             the wrong direction.  If we wish to release DDLm
>             without unleashing CIF2 on the world then let the
>             initial DDLm and dictionary releases be crafted in
>             an altogether different format, such as XML.  In
>             the unlikely event that there were genuine
>             interest in such a course, it would be worth
>             mentioning that I have a suitable XML schema at
>             hand, as well as supporting software that could
>             easily be adapted to translating existing DDL and
>             dictionary documents.
>              If we don't have a sound way
>             in which a CIF1 document has use of a DDLm
>             dictionary, then I think we are
>             breaking the promise on the IUCr web page.  Please
>             recall that DDL2
>             dictionaries are not valid CIF1 documents -- they
>             have save frames, so it
>             is not unprecedented to have a different spec for
>             dictionaries as opposed
>             to data files.
>             I accept that, but it's a different matter for the
>             data format to be a subset of the dictionary
>             format than for the data format to be a related
>             but subtly incompatible format.  We will have that
>             anyway when DDLm dictionaries are used to validate
>             CIF1 files, but let's please not set it as the
>             direction for the indefinite future.
>              It makes a big difference to most of the user
>             community if
>             we are simply telling them we have a new dictionary
>             format rather than
>             telling them we are changing the data file format.
>             Agreed, in that much of the user community doesn't
>             care about dictionaries.  On the other hand,
>             members of the user community who care about some
>             of the new CIF2 features -- Unicode support, as a
>             prime example -- would not necessarily take the
>             distinction as a positive or even a neutral
>             proposition.
>               On David's description, I think I really did
>             explain why I think we will
>             have trouble populating missing values involving
>             CIF1 tags that are not
>             valid CIF2 tags.  Doing that using the alias
>             mechanism would seem to
>             require defining the CIF1 tag in the DDLm dictionary
>             as a primary
>             definition and then aliasing a CIF2 tag to that
>             primary CIF1 tag, so that
>             a method working with the CIF2 tag would effectively
>             populate instances of
>             the CIF1 tag, but, and this is the part I can't seem
>             to get past, defining
>             the CIF1 tag in a new CIF2-style DDLm dictionary
>             would seem to require
>             that the CIF1 tag be a valid CIF2 tag.
>             I think we will not easily get past this dispute
>             without an example.  For that purpose, then,
>             perhaps James, David, or another participant with
>             practical DDLm and dREL experience would be kind
>             enough to present a solution to this exercise:
>             Provide DDLm definitions and a dREL method that
>             support computing a missing value for the Core item
>             _exptl_crystal_density_diffrn, based on Core items
>             _chemical_formula_weight, _cell_formula_units_Z, and
>             _cell_volume.  The definitions presented should use
>             DDLm formalism for the defined data names, and
>             should be compatible also with validating the
>             corresponding mmCIF data names.
>             James's and David's comments have given me every
>             reason to believe that this would be
>             straightforward, though the definitions together
>             with their required context might be bulky.  I am
>             hoping that the requested definitions are in fact
>             already written.
>              I suspect we will get into trouble
>             in other areas of using existing CIF1 tags in CIF2
>             DDLm dictionaries.
>             One of the key promises of DDLm, as I see it, is
>             that the distinctions between various syntax
>             versions and between DDL1 and DDL2 formalisms are
>             relevant to only two program activities:
>             1) On input, reading a file correctly and
>             associating data items with the correct DDLm
>             definitions.
>             2) On output, producing well-formed files for the
>             target syntax version that are valid with respect
>             to the DDL1 (or DDL2) dictionaries with which the
>             DDLm dictionary provides compatibility.
>             As long as those two features work correctly,
>             details of syntax version and original target
>             dictionary can be completely abstracted away from
>             validation and dREL operations, leaving no room
>             for other areas of trouble.  Success in those
>             areas will be a function of program, DDL, and
>             dictionary details.  CIF2 syntax need only be
>             sufficient to support the required DDLm features;
>             it does not otherwise bear on the problem.
>             How important each of those troubles may be depends
>             on our goals, so I
>             respectfully urge that we make certain that we are
>             working from common
>             goals, so that we can then focus on whether we are
>             meeting those goals,
>             rather than have debates that seem to be based on
>             different goals for
>             different speakers.
>             That is a reasonable criticism of our process to
>             date.  I am willing to participate in the proposed
>             goal re-evaluation process, and I hope it will
>             help resolve some of our current disputes.  Of
>             late, however, we have also seen significant
>             differences in technical analyses that should be
>             independent of participants' goals.  Therefore, I
>             do not anticipate that the goal re-evaluation
>             exercise will provide clear resolutions to *all*
>             our current disputes.
>             Regards,
>             John
>             --
>             John C. Bollinger, Ph.D.
>             Department of Structural Biology
>             St. Jude Children's Research Hospital
>             Email Disclaimer:  www.stjude.org/emaildisclaimer
>             _______________________________________________
>             ddlm-group mailing list
>             ddlm-group@iucr.org
>             http://scripts.iucr.org/mailman/listinfo/ddlm-group