Discussion List Archives


Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .. .... .. .. .

Then the solution is obvious -- have a CIF standard with some optional
features that others of us will use, and have the IUCr instruct authors
depositing manuscripts that it does not care about and will not use or
check those features, just as the IUCr does not accept illustrations
as imgcif binaries.


At 1:23 PM -0700 6/25/10, SIMON WESTRIP wrote:
>>I don't understand.  How is it worse to provide authors an
>>opportunity to specify the encoding they have used, even though they
>>may specify wrongly, than it is to deny them an opportunity to
>>specify the encoding at all?
>
>I don't think it is worse to provide them with an opportunity to
>specify their encoding - I just don't think they should need to.
>
>>How is it a worse or more impactful mistake for an author to 
>>include an incorrect encoding tag than it is for them to use an 
>>encoding different from some small set that you are prepared to 
>>accept?
>
>I am not saying that it is a worse or more impactful mistake - 
>rather, if these signatures are to be part of the standard, then I 
>can foresee errors being raised by an incorrect flag even when the 
>rest of the CIF is encoded according to the specification. In my 
>experience, authors already find CIF slightly annoying in that they 
>have to adhere to seemingly pedantic rules (e.g. 'Monoclinic' should 
>be 'monoclinic' because the dictionary enumeration is case 
>sensitive, or <0.001 is not a number type). Requiring manually 
>edited encoding signatures which will have to be checked is of no 
>real help to anyone (no more than a 'hint')? Again, I feel that we
>have to respect that in the world of CIF, users have been required 
>to edit raw CIF - this is rarely the case with xml, where end users 
>are rightly unaware of the encoding they are using as they 
>invariably work with tools that shield them from the raw xml. In the 
>short/medium term at least, I do not see this situation changing.
>
>The reason I am prepared to accept 'some small set' is that I would 
>like that set to be unambiguously identifiable, so that authors do 
>not have to worry about such things, and in the hope that 
>non-CIF-aware software might still do a good job of decoding the 
>text, without employing heuristics, thereby minimizing the impact on 
>current practice of specifying an encoding at all in the new spec.
>
>You might note that I often refer to CIF users as authors - this is 
>my experience I'm afraid. It would be nice if the IUCr could exert 
>as much first-hand control over CIF content as say the PDB, whose 
>online data collection tools are used to populate mmCIFs, and whose 
>users seem quite happy for them to do that. So I stress, my views on 
>this are only based on experience with CIFs submitted to IUCr 
>journals by authors.
>
>>>We're also further restricting the number of non-CIF-aware 
>>>programs that can be used to read the text.
>
>>Can you expand on that?  I don't follow you.
>
>I was referring to the practice of editing CIFs with any available 
>text editor - however I concede that having an encoding flag makes 
>no difference to non-CIF-aware programs - they will simply save the 
>CIF in whatever is their default encoding if that is how they work.
>
>Cheers
>
>Simon
>
>
>
>From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
>To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>Sent: Friday, 25 June, 2010 19:59:56
>Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. .. 
>.. .. .. .. .. .. .
>
>On Friday, June 25, 2010 12:41 PM, SIMON WESTRIP wrote:
>>It's using a field for specifying the encoding that worries me.
>>Who is to make such a declaration in the CIF - an author who may be 
>>blissfully unaware of the encoding they're using?
>>Or an author who is preparing a new CIF by editing an old one, 
>>again unaware that the text editor they are using is about to save
>>the CIF in some other encoding? At least with UTF BOMs we have a
>>fighting chance - I'd rather only accept these.
>
>I don't understand.  How is it worse to provide authors an 
>opportunity to specify the encoding they have used, even though they 
>may specify wrongly, than it is to deny them an opportunity to 
>specify the encoding at all?
>
>How is it a worse or more impactful mistake for an author to include 
>an incorrect encoding tag than it is for them to use an encoding 
>different from some small set that you are prepared to accept?
>
>>We're also further restricting the number of non-CIF-aware programs 
>>that can be used to read the text.
>
>Can you expand on that?  I don't follow you.
>
>>You've also mentioned that we should learn from HTML - just because 
>>HTML has an encoding declaration does not mean it is correct,
>>which is why browsers seem to apply their own heuristics to
>>determine the encoding.
>
>I see no way to write the specification that can eliminate all 
>possibility of encoding-related errors.  None.  All we can do is 
>choose which errors are possible.  In so doing, there are a lot of 
>competing factors to consider, such as likelihood of various errors to
>be committed, coverage and robustness of the resulting spec, implied 
>responsibilities of various parties, user convenience, and cultural 
>sensitivity.  I think when James's summary is ready it will help us 
>sort through all that.
>
>
>Regards,
>
>John
>--
>John C. Bollinger, Ph.D.
>Department of Structural Biology
>St. Jude Children's Research Hospital
>
>Email Disclaimer: www.stjude.org/emaildisclaimer
>_______________________________________________
>ddlm-group mailing list
>ddlm-group@iucr.org
>http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
>
>_______________________________________________
>ddlm-group mailing list
>ddlm-group@iucr.org
>http://scripts.iucr.org/mailman/listinfo/ddlm-group
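
For reference, the "only accept UTF BOMs" position raised in the quoted
thread could be sketched roughly as below. This is an illustration only,
not anything taken from a CIF specification: the function names
sniff_bom and decode_cif_text are hypothetical, and falling back to
strict ASCII when no BOM is present is an assumption made just for this
example. The point is that a BOM can be identified by a plain prefix
test, with no heuristics and no author-supplied encoding tag.

```python
import codecs

# Candidate signatures. Order matters: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE), so UTF-32 must be tested first.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding_name, bom_length), or (None, 0) if no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_cif_text(data: bytes, default: str = "ascii") -> str:
    """Decode using the BOM if present; otherwise use `default`.

    The strict-ASCII default is an assumption for this sketch: a file
    with no BOM and non-ASCII bytes raises UnicodeDecodeError instead of
    being guessed at.
    """
    name, skip = sniff_bom(data)
    return data[skip:].decode(name or default)
```

A file saved by an editor that silently switched to, say, a legacy
8-bit code page would carry no BOM and would fail to decode here rather
than be misread, which is the error-versus-guess trade-off being argued
over in the thread.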


-- 
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================
