Discussion List Archives

Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .. .... .. .. .

>  We don't need everybody to be doing the same thing.  We need everybody
to be able to send everybody else their information in a form in which
other people can correctly understand what they have been sent.

I totally agree with this - which is why I have advocated that the standard should be totally unambiguous and
at the same time be as accessible as possible. I believe that I have expressed before an acceptance that we
may have to adopt a certain degree of heuristic encoding determination in order to accommodate user practice;
I do not shy away from this. I am, however, seeking a way to avoid, if possible, the ambiguity that code-page based
encodings present.

Cheers

Simon

PS. My comments about my 'garden shed' were meant to be 'light hearted' - it's been a long day!
Please forgive me if this was inappropriate. I ought also to stress that in all this I do not speak for the IUCr, officially.
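
As an illustration of the distinction drawn above between unambiguously identifiable encodings and heuristic determination, here is a minimal Python sketch of BOM-based detection. It is not part of the thread or of any CIF2 draft: the function names sniff_bom and decode_cif_text are hypothetical, and falling back to strict UTF-8 when no BOM is present is an assumption made only for the example.

```python
import codecs

# BOM signatures, UTF-32 first: the UTF-32 LE BOM begins with the same
# two bytes as the UTF-16 LE BOM, so the longer signature must be tried first.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None


def decode_cif_text(data: bytes) -> str:
    """Decode by BOM when one is present.

    Assumption for this sketch only: with no BOM, the bytes are treated as
    strict UTF-8 (which covers plain ASCII), and anything else raises
    UnicodeDecodeError instead of being guessed at.
    """
    hit = sniff_bom(data)
    if hit is not None:
        name, bom_len = hit
        return data[bom_len:].decode(name)
    return data.decode("utf-8")
```

By contrast, a single byte such as 0xE9 is valid in many code pages (é in ISO-8859-1, й in Windows-1251), so without a signature only a heuristic or an external declaration can choose between them.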




From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Friday, 25 June, 2010 23:38:10
Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .. .. .. .. .. .

Dear Simon,

  The IUCr never has and probably never will make use of every feature in
core CIF and mmCIF, much less what is allowed in all of CIF as it now
exists.  That's reasonable.  You are publishing journals.  Other people
are maintaining archives, or producing experimental data sets, or
processing data to refine structures.

  To now limit CIF to just the features that are needed for the publication process and can be managed by one bloke is neither in the interests of the IUCr nor of the broader user community.  What we should be pursuing is a reasonable degree of commonality and interchange capability, not an inadequate lowest common denominator.  That would result in a standard that is simply ignored, a "standard" with many non-interoperable dialects -- which seems to be where we are headed.  I would find that to be regrettable.

  We don't need everybody to be doing the same thing.  We need everybody
to be able to send everybody else their information in a form in which
other people can correctly understand what they have been sent.

  Regards,
    Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                +1-631-244-3035
                yaya@dowling.edu
=====================================================

On Fri, 25 Jun 2010, SIMON WESTRIP wrote:

> I really do not think the IUCr will adopt a policy of rejecting something
> that conforms to the CIF2 standard,
> whatever it may turn out to be. Indeed, I suspect they will be amongst the
> first to support the standard with
> as many tools as they can provide. I know very little about the imgCIF
> issue, but I suspect this is more of a case of
> not having the resources rather than any other motive. Brian has often asked
> me whether I could make use of imgCIF
> in publCIF, but I'm afraid this has not been a priority. In an ideal world I
> would like to produce tools that were all things to all people,
> and a CIF that encapsulates everything needed for publication as well as
> everything needed to review or validate the
> structures described therein, but I'm one bloke working at home in his
> garden shed :-)
>
> Cheers
>
> Simon
>
> ____________________________________________________________________________
> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> Sent: Friday, 25 June, 2010 21:30:10
> Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .. ..
> .. .. .. .
>
> Then the solution is obvious -- have a CIF standard with some optional
> features that others of us will use, and have the IUCr instruct authors
> depositing manuscripts that it does not care about and will not use or
> check those features, just as the IUCr does not accept illustrations
> as imgcif binaries.
>
>
> At 1:23 PM -0700 6/25/10, SIMON WESTRIP wrote:
> >  >I don't understand.  How is it worse to provide authors an
> >opportunity to specify the encoding they have used, even though they
> >may specify wrongly, than it is to deny them an opportunity to
> >specify the encoding at all?
> >
> >I don't think it is worse to provide them with an opportunity to
> >specify their encoding - I just don't think they should need to.
> >
> >>How is it a worse or more impactful mistake for an author to
> >>include an incorrect encoding tag than it is for them to use an
> >>encoding different from some small set that you are prepared to
> >>accept?
> >
> >I am not saying that it is a worse or more impactful mistake -
> >rather, if these signatures are to be part of the standard, then I
> >can foresee errors being raised by an incorrect flag even when the
> >rest of the CIF is encoded according to the specification. In my
> >experience, authors already find CIF slightly annoying in that they
> >have to adhere to seemingly pedantic rules (e.g. 'Monoclinic' should
> >be 'monoclinic' because the dictionary enumeration is case
> >sensitive, or <0.001 is not a number type). Requiring manually
> >edited encoding signatures which will have to be checked is of no
> >real help to anyone (no more than a 'hint')? Again, I feel that we
> >have to respect that in the world of CIF, users have been required
> >to edit raw CIF - this is rarely the case with xml, where end users
> >are rightly unaware of the encoding they are using as they
> >invariably work with tools that shield them from the raw xml. In the
> >short/medium term at least, I do not see this situation changing.
> >
> >The reason I am prepared to accept 'some small set' is that I would
> >like that set to be unambiguously identifiable, so that authors do
> >not have to worry about such things, and in the hope that
> >non-CIF-aware software might still do a good job of decoding the
> >text, without employing heuristics, thereby minimizing the impact on
> >current practice of specifying an encoding at all in the new spec.
> >
> >You might note that I often refer to CIF users as authors - this is
> >my experience I'm afraid. It would be nice if the IUCr could exert
> >as much first-hand control over CIF content as say the PDB, whose
> >online data collection tools are used to populate mmCIFs, and whose
> >users seem quite happy for them to do that. So I stress, my views on
> >this are only based on experience with CIFs submitted to IUCr
> >journals by authors.
> >
> >>>We're also further restricting the number of non-CIF-aware
> >>>programs that can be used to read the text.
> >
> >>Can you expand on that?  I don't follow you.
> >
> >I was referring to the practice of editing CIFs with any available
> >text editor - however I concede that having an encoding flag makes
> >no difference to non-CIF-aware programs - they will simply save the
> >CIF in whatever is their default encoding if that is how they work.
> >
> >Cheers
> >
> >Simon
> >
> >
> >
> >From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
> >To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> >Sent: Friday, 25 June, 2010 19:59:56
> >Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. ..
> >.. .. .. .. .. .. .
> >
> >On Friday, June 25, 2010 12:41 PM, SIMON WESTRIP wrote:
> >>It's using a field for specifying the encoding that worries me.
> >>Who is to make such a declaration in the CIF - an author who may be
> >>blissfully unaware of the encoding they're using?
> >>Or an author who is preparing a new CIF by editing an old one,
> >>again unaware that the text editor they are using is about to save
> >>the CIF in some other encoding? At least with UTF BOMs we have a
> >>fighting chance - I'd rather only accept these.
> >
> >I don't understand.  How is it worse to provide authors an
> >opportunity to specify the encoding they have used, even though they
> >may specify wrongly, than it is to deny them an opportunity to
> >specify the encoding at all?
> >
> >How is it a worse or more impactful mistake for an author to include
> >an incorrect encoding tag than it is for them to use an encoding
> >different from some small set that you are prepared to accept?
> >
> >>We're also further restricting the number of non-CIF-aware programs
> >>that can be used to read the text.
> >
> >Can you expand on that?  I don't follow you.
> >
> >>You've also mentioned that we should learn from HTML - just because
> >>HTML has an encoding declaration does not mean it is correct,
> >>which is why browsers seem to apply their own heuristics to
> >>determine the encoding.
> >
> >I see no way to write the specification that can eliminate all
> >possibility of encoding-related errors.  None.  All we can do is
> >choose which errors are possible.  In so doing, there are a lot of
> >competing factors to consider, such as likelihood of various errors to
> >be committed, coverage and robustness of the resulting spec, implied
> >responsibilities of various parties, user convenience, and cultural
> >sensitivity.  I think when James's summary is ready it will help us
> >sort through all that.
> >
> >
> >Regards,
> >
> >John
> >--
> >John C. Bollinger, Ph.D.
> >Department of Structural Biology
> >St. Jude Children's Research Hospital
> >
> >Email Disclaimer:
> >www.stjude.org/emaildisclaimer
> >_______________________________________________
> >ddlm-group mailing list
> >ddlm-group@iucr.org
> >http://scripts.iucr.org/mailman/listinfo/ddlm-group
> >
>
>
> --
> =====================================================
>   Herbert J. Bernstein, Professor of Computer Science
>     Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>
>                   +1-631-244-3035
>                   yaya@dowling.edu
> =====================================================
>
>
