
[ddlm-group] Searching for a compromise on eliding

Dear DDLm-group,

I think we have all had a decent chance to argue our case for
Proposals P, F and F'.  I have also been in small side discussions
with Ralf and John W.  Their points of view can be summarised as
follows:
(i) The behaviour of triple-quoted strings will be too confusing unless
Python behaviour is followed (Ralf).
(ii) There is considerable criticism of CIF in the macromolecular
community because of idiosyncratic behaviour, particularly concerning
quoting.  We should therefore stick to accepted standards as much as
possible (John W).

For John W and Ralf these points outweigh any of the disadvantages of
Proposal P, and so Proposal P remains their first choice.  Proposal P
is therefore the first choice of 3 out of 5 COMCIFS voters, and the
last choice of the other two (I would rank it worse than doing
nothing, actually).  I note that non-voting members are uniformly
opposed to Proposal P.

I therefore want to look for some common middle ground, in the hope
of finding a proposal that would be at least as acceptable as
Proposal P to Ralf and/or Herbert and/or John W.

Consider the following four new proposals - P-prime, Q, G and null:

* Proposal P-prime: triple-quoted strings are treated as in Python
2.7.  No Unicode or raw strings are defined (i.e. no strings starting
u""" or r""").

I interpret John W and Ralf's position to be that they would be able
to support this proposal as the preferred choice, as our syntax would
still be entirely consistent with Python.  This proposal is a
considerable improvement on Proposal P, because the dangers of raw
strings are taken out of the equation, and the Unicode database is no
longer a dependency.  We are still left with a whole bunch of (frankly
pointless) elides, leading to Proposal Q:

* Proposal Q: As for Proposal P-prime, with the following changes:
(1) Only <backslash><delimiter>, and <backslash><backslash> when it
precedes <backslash><delimiter>, are recognised escape sequences at the
syntactic level.
(2) A DDLm string type, e.g. "CText", is defined in com_val.dic for
which the remaining escape sequences have the meaning assigned to them
by the Python 2.7 standard.  mmCIF and related domains can standardise
their definitions on this string type and derivatives, making the
above division between syntax and dictionary invisible to users and
programmers in their domain.
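
To make point (1) of Proposal Q concrete, here is a minimal Python sketch
of how a parser might elide the body of a triple-quoted string after the
closing delimiter has already been located.  It is only one reading of
the rule, not a definitive implementation: the function name unescape_q
is hypothetical, and it assumes that <delimiter> means the quote
character by default (the whole triple quote can be passed instead).
Every other backslash sequence, such as \n, is passed through untouched
and left to the dictionary level as in point (2).

```python
def unescape_q(body, delimiter='"'):
    """Apply only the two syntax-level elides of Proposal Q (assumed reading).

    body      -- string contents with the enclosing delimiters already removed
    delimiter -- what <delimiter> stands for; an assumption of this sketch
    """
    escaped_delim = "\\" + delimiter            # <backslash><delimiter>
    escaped_backslash = "\\\\" + escaped_delim  # <backslash><backslash> before it
    out = []
    i = 0
    while i < len(body):
        if body.startswith(escaped_backslash, i):
            # \\ immediately followed by \<delimiter>: emit one backslash and
            # consume only the pair; the \<delimiter> is handled on the next pass.
            out.append("\\")
            i += 2
        elif body.startswith(escaped_delim, i):
            # \<delimiter>: emit the bare delimiter.
            out.append(delimiter)
            i += len(escaped_delim)
        else:
            # Anything else, including \n or a lone \\, is kept verbatim.
            out.append(body[i])
            i += 1
    return "".join(out)
```

Under these assumptions, a\"b becomes a"b, while a\nb and C:\\dir come
back unchanged, since neither contains a recognised sequence.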

* Proposal G: Proposal F', but with a different delimiter

Ralf has indicated that he actually thinks Proposal F' is best, but
only if the delimiters are not going to be confused with Python
delimiters.  I interpret John W's position to be that he would not
support such a change in delimiters as that would simply make CIF even
more idiosyncratic.  Anyway, any such replacement delimiter would need
to be multi-character, easy to type and unlikely to occur as the first
characters in CIF1 data values.  We would also need to reduce the
character set of non-delimited CIF2 strings to exclude any such
delimiters.  Ideas?

* Null proposal: do nothing as we can't agree

I think I could support Proposal Q as an acceptable fallback from F',
and if somebody can find sensible delimiters for Proposal G, that works
for me as well.  The preferred treatment for backslash-rich text under
Proposals P, P' and Q will necessarily be semicolon-delimited strings.
