Discussion List Archives


Re: [ddlm-group] Trailing whitespace in CIF2.0 text fields:statement from draft syntax chapter of Vol G

Hi Herbert and group,

When producing a printed volume we don't have the luxury of changing anything later on, so I think we should just resolve this now. You are concerned about different dialects, but text fields in CIF2 are already different from CIF1.1 in that (1) line folding *must* be recognised and (2) the prefix mechanism *must* be recognised.

The alternative is to stipulate that trailing white space in text fields is not significant in both CIF2.0 and CIF1.1, and that in triple-quote-delimited fields (which exist only in CIF2.0) it is significant (or is likewise not significant). I view this as inferior because, while it would keep consistency with CIF1.1 (which is in any case illusory, see above), it is unnecessary and unintuitive for those coming to CIF2 with fresh eyes. We have been happy to break with CIF1 in the name of simplification when doing CIF2, and I don't see why we can't continue this approach.

As a potential red herring, but one that also demonstrates the unintuitiveness, can I point out that there are plenty of "whitespace" characters in Unicode that are not considered whitespace for CIF2 parsing purposes? These should presumably always be included, even in trailing sections of text field lines.
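
To illustrate the Unicode point, here is a minimal Python sketch. It assumes that the only in-line whitespace characters CIF2 recognises are space (U+0020) and tab (U+0009); the constant and function names are hypothetical and do not come from any CIF library. It contrasts a CIF-aware strip with Python's default `str.rstrip()`, which also removes characters such as NO-BREAK SPACE and EM SPACE.

```python
# Assumption: CIF2 in-line whitespace is only space (U+0020) and tab (U+0009).
CIF_INLINE_WS = " \t"


def strip_cif_trailing_ws(line: str) -> str:
    """Remove only trailing characters that CIF treats as whitespace.

    Hypothetical helper: other Unicode "whitespace" is ordinary content
    as far as a CIF parser is concerned, so it is left in place.
    """
    return line.rstrip(CIF_INLINE_WS)


# A text field line ending in NO-BREAK SPACE, EM SPACE, space and tab.
line = "value\u00a0\u2003 \t"

cif_aware = strip_cif_trailing_ws(line)  # keeps U+00A0 and U+2003
naive = line.rstrip()                    # drops all Unicode whitespace

print(repr(cif_aware))
print(repr(naive))
```

A processor that relied on a generic "trim whitespace" routine would therefore discard characters that are part of the data value under the assumption above.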

James.

On Wed, 5 May 2021 at 12:30, Herbert J. Bernstein <yayahjb@gmail.com> wrote:
Dear James,
  While I think this wording will create confusion and divergence of dialects, and that it would be best to say nothing
in Volume G until there is a clear use case, if something does have to be said, I would suggest changing the wording
to just end with "line-folding protocol." on line 5.
  Regards,
      Herbert

On Tue, May 4, 2021 at 9:18 PM James H <jamesrhester@gmail.com> wrote:

Dear DDLm group, 

In your capacity as technical advisory committee for CIF, please note the following excerpt from the final draft of the new Volume G syntax chapter. I'm bringing this to your attention as it contains a statement about significant trailing whitespace that was not explicit in the CIF2.0 paper, so out of an abundance of caution I thought it best to make sure there were no objections.  If you do have substantial objections, please let the group know.

thanks,

James.

===

Additionally, in view of the record-oriented text file formats that are conventional in some environments, and because of some programming languages' record-oriented text input / output interfaces and fixed-length character data types, some CIF 1.1 processors may be unable easily to recognize significant in-line whitespace at the ends of lines of a text field, and some CIF 1.1 writers may pad lines, including lines of text fields, with insignificant whitespace characters. Therefore, trailing whitespace in CIF 1.1 text field lines is considered insignificant. It should be elided where feasible, and if not elided, it should be ignored. Significant trailing whitespace can be marked and protected by use of the line-folding protocol. On the other hand, because CIF 2.0 is a binary byte stream format with explicit line termination sequences, no such considerations apply to multiline data values (text fields and triple-quoted strings) expressed in that syntax, and all trailing whitespace is significant. In view of CIF 1.1 behaviour, however, it is recommended that significant trailing whitespace be avoided in CIF 2.0 text fields. As with CIF 1.1, this can be achieved by applying the line-folding protocol if necessary.

===
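
The rule in the excerpt above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the text field is assumed to be already split into lines with delimiters and line terminators removed, and in-line whitespace is assumed to mean space and tab only.

```python
from typing import List

# Assumption: in-line whitespace means space and tab only.
INLINE_WS = " \t"


def normalise_text_field(lines: List[str], cif_version: str) -> List[str]:
    """Apply the draft's trailing-whitespace rule to text field lines.

    CIF 1.1: trailing whitespace is insignificant, so it is elided.
    CIF 2.0: trailing whitespace is significant, so lines are unchanged.
    """
    if cif_version == "1.1":
        return [ln.rstrip(INLINE_WS) for ln in lines]
    if cif_version == "2.0":
        return list(lines)
    raise ValueError(f"unsupported CIF version: {cif_version!r}")


def lines_with_trailing_ws(lines: List[str]) -> List[int]:
    """Return indices of lines ending in space or tab.

    A CIF 2.0 writer could use this to flag lines where the draft
    recommends avoiding significant trailing whitespace.
    """
    return [i for i, ln in enumerate(lines) if ln != ln.rstrip(INLINE_WS)]


field = ["abc  ", "def\t", "ghi"]
print(normalise_text_field(field, "1.1"))
print(normalise_text_field(field, "2.0"))
print(lines_with_trailing_ws(field))
```

The sketch does not implement the line-folding protocol itself; it only shows where the two versions diverge and how a writer might detect lines that would need folding.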
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/ddlm-group


