Discussion List Archives


Re: [ddlm-group] Summary of proposed CIF syntax changes

When I said that the 'dictionaries will need to be re-written' I meant that
if anyone wanted to start using e.g. the list structures, they would most likely
not be able to within the confines of the dictionaries that they currently use.
For example, if I wanted to write a dictionary extension to any of the
DDL1 dictionaries currently used by Acta C, I would not be able to write it in CIF2
because the DDL1 dictionaries violate CIF2?
That is, as far as I can see, we cannot make any use of the new useful features of
CIF2 when working with CIF1-based CIFs and dictionaries until there are CIF2
versions of those dictionaries?

Cheers

Simon


From: David Brown <idbrown@mcmaster.ca>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Monday, 7 December, 2009 19:06:03
Subject: Re: [ddlm-group] Summary of proposed CIF syntax changes



Joe Krahn wrote:
Aren't the following in the core dictionary, and need changing?

_symmetry_space_group_name_H-M
_refine_ls_shift/esd_max
_refine_ls_class_[]

As I suggested below, CIF2 code should allow CIF1 names, possibly with
warnings, and just exclude them from dREL, unless they can be mapped
to dictionary aliases.
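
A rough sketch of the behaviour Joe suggests, in Python. The permitted character set used here (ASCII letters, digits, underscore and full stop) is an assumption made only for illustration; the real restriction is whatever the CIF2 proposal specifies. The function name `classify_name` and the `aliases` mapping are likewise hypothetical.

```python
import re
import warnings

# ASSUMPTION: a stand-in for the CIF2 data-name rule, not the actual rule.
CIF2_NAME = re.compile(r"_[A-Za-z0-9_.]+\Z")

def classify_name(name, aliases):
    """Return (name_to_use, usable_in_drel) for one data name.

    `aliases` is a hypothetical mapping from CIF1 names to conforming names.
    """
    if CIF2_NAME.match(name):
        return name, True           # already conforms, nothing to do
    if name in aliases:
        return aliases[name], True  # mapped through a dictionary alias
    # Accept the CIF1 name, warn, and keep it away from dREL.
    warnings.warn(f"CIF1-style data name excluded from dREL: {name}")
    return name, False
```

Under that assumed rule, all three names quoted above would be flagged unless an alias is supplied for them.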

It is true that these names, particularly the last one, contain characters that are not permitted in CIF2.  However, a CIF1 data file will always require a CIF1 lexer, for which these data names present no problem.  If it is to be backward compatible, any application designed to make use of DDLm dictionaries will require a CIF1 lexer as well as a CIF2 lexer.  Which lexer is used depends on the presence or absence of the magic code at the beginning of the file.  The lexer then passes the names and values (as undifferentiated strings) to the DDLm dictionary, where the first task on receiving any file from the CIF1 lexer is to check the data names (which can be recognized by their leading underscore) against the DDLm data names and their aliases.  As soon as a match is found, the official DDLm name is substituted for the original CIF1 data name (in the few cases where this is necessary), and from then on the data file is a conforming CIF2 data file to which methods can be applied without any concern for its original format.  Note that DDL1 and DDL2 dictionaries are never used or consulted in this process.  The only dictionary used is one written in DDLm, so no changes are needed to the DDL1 and DDL2 dictionaries.
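
The flow described above can be sketched roughly as follows. This is a minimal illustration, not code from any DDLm implementation: the magic string, the function names and the two-entry alias table (including the `_demo.*` target names) are assumptions made for the example.

```python
CIF2_MAGIC = "#\\#CIF_2.0"  # ASSUMPTION: the magic code that opens a CIF2 file

# HYPOTHETICAL alias table: CIF1 name (lower-cased) -> official DDLm name.
ALIASES = {
    "_symmetry_space_group_name_h-m": "_demo.space_group_name_hm",
    "_refine_ls_shift/esd_max": "_demo.shift_over_esd_max",
}

def choose_lexer(text):
    """Pick the lexer from the presence or absence of the magic code."""
    return "CIF2" if text.startswith(CIF2_MAGIC) else "CIF1"

def resolve_names(items):
    """Substitute DDLm names for aliased CIF1 names.

    `items` maps data names to values, both plain strings as delivered by
    the lexer. Names without an alias are passed through unchanged.
    """
    resolved = {}
    for name, value in items.items():
        # CIF data names are case-insensitive, so look up in lower case.
        resolved[ALIASES.get(name.lower(), name)] = value
    return resolved
```

After `resolve_names` has run, downstream code sees only DDLm names and need not know which lexer produced the input.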

Of course, if someone decides to use a hybrid CIF1.5 format, a whole new set of problems arises.  I suggest that we first of all agree on CIF2 and then, if necessary, consider what CIF1.5 might look like.  My view is that CIF1.5, if used at all, should be considered a temporary non-conforming standard that should not be used for archival purposes.  Its main use, as I gather from Herbert, is to allow hand-entering of vectors and arrays, a use that is probably restricted to one or two specialized situations.  In most cases this information will be generated by computer, and DDLm dictionaries already have methods to convert the array elements defined in DDL1 and DDL2 dictionaries into the arrays that would normally be used by DDLm.

David


Joe


David Brown wrote:
Simon,

I am not sure what changes are needed in CIF1 dictionaries. I would be
interested to know since any changes have to be passed through the
coreCIF Dictionary Maintenance Group that I chair. It is my
understanding that no changes are needed, and if they are they must be
changes that do not invalidate the reading of any of the archive.

David

SIMON WESTRIP wrote:
I understand the name alias approach - what I was trying to highlight is
the fact that current dictionaries will need to be re-written and this
in itself might be more of an issue when selling CIF2 than the fact
that commas
as list separators could be on the table.

Cheers

Simon

------------------------------------------------------------------------
*From:* Joe Krahn <krahn@niehs.nih.gov>
*To:* Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
*Sent:* Friday, 4 December, 2009 20:23:09
*Subject:* Re: [ddlm-group] Summary of proposed CIF syntax changes

SIMON WESTRIP wrote:
I agree that a "rationale for all of the quotation rule
changes" might be welcome - I can imagine that at first glance many
people
will wonder what the """ and ''' are for.

I'm not sure that hinting that comma-separated lists
are also a possibility is going to help matters?
My willingness to support commas is partly because Herbert finds it
useful, and has already implemented it. Maybe the comma-delimited
variant can be useful as a CIF 1.5 transitional form?

After all, when it comes down to it, until there are
dictionaries that comply with CIF2, many disciplines
that already make use of CIF will find it difficult to
adopt CIF2 because their current dictionaries will be invalidated by
the restrictions on the dataname character set?
Name changes are not uncommon, at least for mmCIF. Hopefully, dictionary
aliases will ease the conversion. It would also help if early CIF2
software allowed CIF1 names within the CIF2 syntax, with
warnings, and just excluded them from dREL.

Joe
Cheers

Simon



------------------------------------------------------------------------
*From:* Joe Krahn <krahn@niehs.nih.gov>
*To:* Group finalising DDLm and associated dictionaries
<ddlm-group@iucr.org>
*Sent:* Friday, 4 December, 2009 17:49:01
*Subject:* Re: [ddlm-group] Summary of proposed CIF syntax changes

The summary did not include a rationale for all of the quotation rule
changes, which is the area that makes the least sense to me.

The section defining the rationale for not allowing lexical characters
outside the 7-bit range (the first Reasoning paragraph) might mention
that it affords faster parsing by deferring any UTF-8 conversions.
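
To illustrate the point about deferred conversion: in UTF-8, every byte of a multi-byte character has its high bit set, so a lexer whose significant characters are all 7-bit ASCII can scan the raw bytes and decode only the tokens it actually needs. A minimal sketch follows; whitespace-only tokenisation stands in for a real CIF lexer, and the function names and sample data are made up for the example.

```python
def tokenize_bytes(raw):
    """Split raw bytes on ASCII whitespace without decoding anything."""
    # bytes.split() with no argument splits on ASCII whitespace only, and
    # bytes >= 0x80 are never treated as delimiters.
    return raw.split()

def decode_token(token):
    """Decode a single token on demand."""
    return token.decode("utf-8")

# Hypothetical input: an ASCII data name followed by a non-ASCII value.
raw = "_demo_unit \u00c5ngstr\u00f6m\n".encode("utf-8")
tokens = tokenize_bytes(raw)
value = decode_token(tokens[1])  # only this token is ever decoded
```

Tokens that are never inspected as text, such as numeric values or skipped items, would not need decoding at all in this scheme.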

I see that the commas were left out of the list syntax. It may be good
to put a short paragraph about the alternative comma-delimited syntax,
so that other people reviewing the proposal have a chance to comment.

Thanks,
Joe Krahn
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group

