Discussion List Archives

Re: [ddlm-group] Technical issues with Proposal P. .

In the early days of CIF2 (whenever that may be) I would not expect to see much use of triple quotes in
CIF files. However, this situation could change overnight if, for example, SHELX were to start writing its CIFs
using triple quotes. Certainly, I see no reason to suppose that triple quotes will be viewed as reserved for
CIF dictionaries, and over time I envisage increased use of triple quotes as a convenient 'catch-all' means
of delimiting data (for one thing, you don't have to worry about including an 'invisible' newline in the delimiter).

I too don't foresee users abandoning the current practice of working with raw CIFs.

So I am in total agreement with David - "I can see serious problems arising with proposal P..."

Cheers

Simon


From: David Brown <idbrown@mcmaster.ca>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Thursday, 24 February, 2011 15:24:33
Subject: Re: [ddlm-group] Technical issues with Proposal P. .

I have been following this discussion with interest and have learned much about things I never knew existed, such as cooked and raw strings.  Since the relative merits of these are beyond my experience I have stayed out of the discussion.

However, it seems that the ddlm-group consists primarily (or only?) of software developers, and a comment from a software non-developer might be in order.  What gets lost in this discussion is the distinction between CIFs and CIF dictionaries.  If triple-quoted strings are designed primarily for use in dictionaries, then the constituency of users is limited to those writing software and those writing dictionaries, the first being a small group and the second consisting of no more than one can count on the fingers of one hand.  In this case the complexities of proposal P would be manageable.

If the triple-quote delimiter is intended to be used in the CIFs themselves, the situation is very different.  There are hundreds of users, most quite innocent of Python and its subtleties (and sometimes innocent of crystallography as well, but that is another matter).  I assume that the additional functionality of triple quotes may be needed with TeX and Unicode text formats, though I have no experience with either.  If CIFs containing triple quotes are to be written and read entirely by software and these files are invisible to the user (as, e.g., are the files written and read by word processors) then P should present no problem.  However, current practice often involves visually inspecting the CIF and editing it manually, for example to shorten lines of more than 80 characters as required for submission to Acta Cryst.  Since CIFs are currently easy to read, many people inspect the CIF directly to check the information in it, without the filtering that inevitably occurs when it is viewed with the aid of a CIF editor.  I can see serious problems arising with proposal P if the use of triple quotes becomes widely adopted in the CIFs themselves, unless we radically change the way in which CIFs are currently used.  Such a change would definitely need to be discussed widely by COMCIFS.

David

Bollinger, John C wrote:
On Thursday, February 24, 2011 7:51 AM, Herbert J. Bernstein wrote:

  The Python cooked strings are something many people are familiar with.
That is indisputable, but not directly relevant.  What matters here is how familiar CIF stakeholders are with Python cooked strings.  I daresay that developers as a group are far more familiar with them than general users, but I have no basis for judging what proportion of either subgroup has any familiarity whatever.  However, over the past decade or so I have discovered that, personally, I tend to *over*estimate crystallographers' technical proficiency.  Certainly it is not on average what it once was.
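
For readers who have not met the terminology, the difference between a Python cooked and a raw triple-quoted string can be sketched roughly as follows. The sample text is made up for illustration; it is not taken from any CIF, and the sketch shows plain Python behaviour only, not the exact rules of proposal P.

```python
# Illustrative only: the sample text is an assumption, not from a real CIF.
cooked = """\theta = 90"""   # cooked: the backslash-t pair is read as a tab
raw = r"""\theta = 90"""     # raw: the backslash is kept literally

print(repr(cooked))
print(repr(raw))
print(len(cooked), len(raw))
```

The cooked value comes out one character shorter than the raw one because the two-character escape collapses into a single tab, which is the kind of silent change someone editing TeX-like text by hand could easily miss.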

Any use of the treble quote is something new to CIF, with implications for both users and developers.
Yes.

 Use of the straight Python versions should reduce the learning curve for both communities and the costs of converting CIF 1.1 data to CIF2.
I see little basis for that evaluation.  Use of the straight Python version would reduce the learning curve for those developers and users who are already proficient (not merely familiar) with Python, but it would increase the curve for everyone else.  As long as we're pulling estimates out of the air, I say that on average, proposal P will increase the learning curve significantly.

Proposal P does not change the difficulty of data conversion in the general sense. ALL existing well-formed CIF 1.1 delimited data values can be converted to CIF 2.0 by expressing them in semicolon-delimited form. Existing multi-line values must already be in that form, and require no changes. To the extent that it is desired specifically to convert CIF 1.1 values to triple-quoted CIF 2.0 form, proposal P will require more changes to lexical values than any other proposal on the table. It is thus extremely generous even to attribute to it parity with the other alternatives in this regard.
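
The semicolon-delimited route mentioned above might look something like the following minimal sketch. The helper name `to_semicolon_item` and the data name `_example_text` are hypothetical, and the sketch deliberately ignores line-length limits, character-encoding differences between CIF 1.1 and CIF 2.0, and the file header.

```python
def to_semicolon_item(name: str, value: str) -> str:
    """Write a single-line value as a semicolon-delimited text field.

    Hypothetical helper for illustration; not part of any CIF library.
    """
    if "\n" in value or "\r" in value:
        # Multi-line CIF 1.1 values are already semicolon-delimited.
        raise ValueError("expected a single-line value")
    # The opening semicolon must be the first character of a line; the
    # newline before the closing semicolon is not part of the value.
    return f"{name}\n;{value}\n;\n"


print(to_semicolon_item("_example_text", "it's a \"mixed\" value"))
```

Because a single-line value cannot contain a newline followed by a semicolon, the value passes through without any change to its characters, whatever quotes it contains.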

 I don't deny that there can be better ways to do the same thing.  This reminds me of when IBM came up with a better keyboard for computers, shifting a few keys.  It drove everybody nuts, not because there was anything wrong with it; it was just sufficiently different to slow down typing and increase the error rate.  Somebody totally new to typing on a computer keyboard had no problem, but it certainly was not worth the costs involved for people who had established habits.
I accept that adopting Python triple-quote syntax wholesale would be of some benefit to some stakeholders.  It would be an obstacle to many others, however, and a substantial one to developers in particular.  It is and always has been a judgment call, and my judgment remains that the benefits of proposal P would not come close to balancing its costs.


John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital



Email Disclaimer: www.stjude.org/emaildisclaimer _______________________________________________ ddlm-group mailing list ddlm-group@iucr.org http://scripts.iucr.org/mailman/listinfo/ddlm-group


