Discussion List Archives

Re: [ddlm-group] Straw poll results

In terms of strict specification:

1.2 (2.a)
2.3

[However, I look forward to the outcome of any discussions you have on the 'deprecation' approach,
and as Nick suggests, I suspect (hope) many parsers/applications will adopt 'the third way' in practice - i.e. making it
as painless as possible to move to the new spec.]

and for UTF-8:

'binary' (1.a)

[i.e. beyond the ASCII byte! Basically, I think we should take this opportunity to introduce this change. In the short term, I don't foresee CIFs being riddled with the stuff - after all, I'm not sure current dictionary definitions (DDL2 at least) of types would permit it - but that's off the top of my head...]

Cheers

Simon


From: James Hester <jamesrhester@gmail.com>
To: ddlm-group@iucr.org
Sent: Tuesday, 13 October, 2009 16:27:20
Subject: [ddlm-group] Straw poll results

Here are the results of the straw poll.  See the end of the email for
detailed vote counts, and note the request for a further vote on
certain issues.

CONCLUSIONS
===========

1. UTF8 will be supported. Not clear on asciified version or
binary. Therefore, please comment and vote on the following, given
that UTF8 will be included in the new standard:

(a) UTF8 should be supported in standard form only (i.e. 'binary'
characters with values above 127 will appear in CIF files)

(b) An asciified version only should be supported.  An example would
be the syntax \uxxxx, where xxxx refers to the Unicode code point of
the character in hexadecimal notation.  NB this is not strictly UTF8,
but simply a Unicode representation.
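
As a rough illustration of the difference between options (a) and (b), the Python sketch below writes the same value both ways. The function names asciify and deasciify are hypothetical, and the four-digit \uxxxx form is taken literally from the example above, so code points beyond U+FFFF are rejected here rather than guessed at.

```python
import re


def asciify(text: str) -> str:
    """Replace each non-ASCII character with a backslash-u escape (option b)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04x}")
        else:
            # Assumption: four hex digits only, as in the example syntax.
            raise ValueError(f"U+{cp:X} does not fit in four hex digits")
    return "".join(out)


def deasciify(text: str) -> str:
    """Turn backslash-u escapes back into the characters they name."""
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


value = "1.54 \u00c5"                # U+00C5, the angstrom sign's base letter

binary_form = value.encode("utf-8")  # option (a): bytes above 127 in the file
ascii_form = asciify(value)          # option (b): pure ASCII in the file

print(binary_form)                   # b'1.54 \xc3\x85'
print(ascii_form)                    # 1.54 \u00c5
```

Under option (a) the file contains the two bytes 0xC3 0x85; under option (b) it contains six ASCII characters that a reader must translate back.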

My vote: 1.a

2. Termination of quoted strings on first occurrence of quote delimiter
and restriction of character set for non-delimited strings: Approved,
but not clear whether to deprecate first or move immediately to
requirement.  Upon long consideration of Brian's email and Herbert's
reservations, and two cups of tea, and some chocolate, I am happy to
change my votes to 1.2 and 2.3 (and perhaps call the new CIF syntax
2.0 rather than 1.2), therefore I declare these proposals approved as
a requirement in the new standard.  I'll write a separate email on
this.

However: Brian and James want to require whitespace between tokens
outside compound expressions regardless of it now becoming strictly
unnecessary in several cases.  Given that the above proposals have
been passed, please vote again on the following options:

(a) Whitespace is not required between tokens unless tokens could not
otherwise be separated; writers are encouraged to pad between tokens
(b) Whitespace must always appear between tokens outside compound expressions
(c) Whitespace must always appear between tokens both in and outside
compound expressions
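
To make options (a) and (b) concrete, here is a minimal Python sketch. It assumes a much simplified, hypothetical token grammar (a quoted string that ends at its first matching quote, or a run of characters containing no whitespace or quotes) and does not model compound expressions, so option (c) is not covered.

```python
import re

# Hypothetical, simplified token grammar, not the real CIF grammar:
# a quoted string ending at the first matching quote, or a bare run of
# characters that are neither whitespace nor quote characters.
TOKEN = re.compile(r"""'[^']*'|"[^"]*"|[^\s'"]+""")


def tokenize(line: str, require_whitespace: bool) -> list:
    """Split a line into tokens.

    require_whitespace=False models option (a): adjacent tokens are accepted.
    require_whitespace=True models option (b): adjacent tokens are an error.
    """
    tokens = []
    pos = 0
    prev_end = None  # index just past the previous token, if any
    while pos < len(line):
        if line[pos].isspace():
            pos += 1
            continue
        match = TOKEN.match(line, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at column {pos}")
        if require_whitespace and prev_end == pos:
            raise ValueError(f"whitespace required before column {pos}")
        tokens.append(match.group())
        prev_end = pos = match.end()
    return tokens


print(tokenize("_name'a''b'", require_whitespace=False))
print(tokenize("_name 'a' 'b'", require_whitespace=True))
```

Both calls yield the same three tokens; calling tokenize("_name'a''b'", require_whitespace=True) raises ValueError instead.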

My vote: 2.b

Detailed vote summary
=====================

Issue1:  Removing the requirement for a trailing whitespace after
quoted strings outside of bracketed constructs.
  Options:  1.1. Preserve the current convention as is
            1.2. Terminate all quoted strings on the occurrence of the
trailing quoted delimiter without consideration of the next character
            1.3  Deprecate rather than require 1.2
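
A small Python sketch of the difference between options 1.1 and 1.2 follows. It assumes that the current convention ends a quoted string only at a delimiter followed by whitespace or end of line; read_quoted and its strict flag are hypothetical names used for illustration.

```python
from typing import Tuple


def read_quoted(line: str, strict: bool) -> Tuple[str, str]:
    """Read a quoted string that starts at line[0].

    strict=False models option 1.1: the closing quote must be followed by
    whitespace or the end of the line.
    strict=True models option 1.2: the first matching quote closes the string.
    Returns (value, remainder of the line).
    """
    if not line or line[0] not in "'\"":
        raise ValueError("not a quoted string")
    quote = line[0]
    for i in range(1, len(line)):
        if line[i] != quote:
            continue
        at_end = i + 1 == len(line)
        if strict or at_end or line[i + 1].isspace():
            return line[1:i], line[i + 1:]
    raise ValueError("unterminated quoted string")


text = "'it's fine' next"
print(read_quoted(text, strict=False))  # ("it's fine", ' next')
print(read_quoted(text, strict=True))   # ('it', "s fine' next")
```

The same input yields a different value under each option, which is why files relying on embedded delimiters need a fallback under 1.2.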

1.1: Nobody (Herbert prefers if 1.3 not an option)
1.2: Brian (but whitespace required between tokens), Nick, Simon
1.3: Herbert, James (but whitespace required between tokens)

Difficult to determine any clear preference from John W., but he seems
happy to go along with the changes we are discussing so long as there
is a clear fallback position.

  Issue2:  Restriction of the character set for non-delimited strings
outside of bracketed constructs
  Options  2.1.  Preserve the current convention as is
          2.2.  Modify the current convention to deprecate use of
                any characters other than a strictly limited set
                of characters, adding a warning on reads and
                defaulting to add quote marks on write
          2.3.  Modify the current convention to forbid the use of
                any characters other than a strictly limited set
                of characters, making it an error to read a non-delimited
                string that does not comply even if the intention
                can be inferred from context
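
The contrast between options 2.2 and 2.3 can be sketched in Python as below. The permitted character set is not spelled out in this email, so ALLOWED is a placeholder assumption, and complies, read_unquoted and write_value are hypothetical names.

```python
import string
import warnings

# Assumption: placeholder for the "strictly limited set"; the real set
# is not given in this thread.
ALLOWED = set(string.ascii_letters + string.digits + "+-._")


def complies(value: str) -> bool:
    """True if a non-delimited string uses only the permitted characters."""
    return value != "" and all(ch in ALLOWED for ch in value)


def read_unquoted(value: str, option: str) -> str:
    """Accept, warn about, or reject a non-delimited string on read."""
    if complies(value):
        return value
    if option == "2.2":   # deprecate: warn but still accept
        warnings.warn(f"deprecated characters in {value!r}", DeprecationWarning)
        return value
    if option == "2.3":   # forbid: an error even if the intent is clear
        raise ValueError(f"forbidden characters in {value!r}")
    raise ValueError(f"unknown option {option!r}")


def write_value(value: str) -> str:
    """Add quote marks on write when the value does not comply."""
    if complies(value):
        return value
    for quote in ("'", '"'):
        if quote not in value:
            return quote + value + quote
    raise ValueError("value contains both quote characters")


print(write_value("C12H22O11"))  # C12H22O11
print(write_value("a[b]"))       # 'a[b]'
```

Reading "a[b]" with option "2.2" returns the value with a warning, while option "2.3" raises ValueError.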

2.1: Nobody
2.2: Herbert, James
2.3: Nick, Simon, (John)

UTF8:

Do not use: Nobody
Use: Simon, Brian, John
Use, binary: Herbert, James
Use, asciified: Nick

A clear preference for binary or asciified can't be gleaned from Brian's,
Simon's and John's emails, so I've left them as simply 'Use'.


--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
