
Re: [ddlm-group] The Grazulis eliding proposal: how to incorporate into CIF?

Essentially it's the legacy issue, but I have to confess to a degree of unease about adopting a
change to the semicolon delimiters (which so far have largely escaped the designs on CIF2),
and to uncertainty about the place of e.g. the line-folding protocol, and about where this leaves us
with respect to the triple-quote delimiters and the desire of many to see escape mechanisms
in place generally.

I agree that the proposal addresses the issue of 'delimiting the delimiters', but so have other proposals.

Cheers

Simon




From: James Hester <jamesrhester@gmail.com>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Tuesday, 28 June, 2011 14:40:52
Subject: Re: [ddlm-group] The Grazulis eliding proposal: how to incorporate into CIF?

I believe that the point that David was making was that, while there are theoretical legacy issues, the chance of one actually being encountered is low.  I would guess (and this is provable) that you are more likely to find a syntax error in the current IUCr archive than you are to find a file that would be misinterpreted under the Grazulis proposal.  Are there any other reasons that lead you to reject this proposal?

On Tue, Jun 28, 2011 at 7:50 PM, SIMON WESTRIP <simonwestrip@btinternet.com> wrote:
In order to achieve its goal of tagging arbitrary content, this proposal would have to form part of
CIF syntax - i.e. option (1). In which case, it presents legacy issues as pointed out by David, and
I would reject it as it stands.

If it is to be slotted in according to the other options, then it becomes a semantic feature and should be
considered along with other possibilities for declaring the 'encoding' of the data value.

Cheers

Simon


From: James Hester <jamesrhester@gmail.com>
To: ddlm-group <ddlm-group@iucr.org>
Sent: Tuesday, 28 June, 2011 8:22:34
Subject: [ddlm-group] The Grazulis eliding proposal: how to incorporate into CIF?

Dear DDLm group,

As none of you have raised any substantial objections to the Grazulis eliding proposal, I think we can consider it accepted. The question now arises as to how it will fit into the CIF framework.  I see the following possibilities:

(1) As a required protocol for all CIF semicolon-delimited text strings (must be recognised by CIF readers)
(2) As an available protocol for all CIF semicolon-delimited text strings (may not be recognised by all CIF readers)
(3) As a string type defined in DDLm for use in domain dictionary definitions (only needs to be recognised by domain-specific software)

Under option (1), the "official" value of a given semicolon-delimited string would be unambiguously that which results from decoding the protocol.  Under option (2) there would be two "official" values: the undecoded value and the decoded value, either of which would be acceptable output for a conformant parser; under option (3) the dictionary determines how to process the string (identically to interpreting e.g. LaTeX strings today).  Under option (3) the "official" value from a CIF parser would be the undecoded value, and the "official" value after application of the dictionary definition would be the decoded value.
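
The difference between the three options is mainly a question of where decoding happens. A rough Python sketch may make this concrete. Everything in it is an assumption made for illustration: `decode` is a stand-in for the eliding protocol (its actual rules are not reproduced here), and `ParsedValue`, `parse_text_field`, `apply_dictionary` and the type name "ElidedText" are hypothetical names, not part of any CIF specification or existing library.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical: any function mapping undecoded text to decoded text.
Decoder = Callable[[str], str]

SEMICOLON = ";"
TRIPLE_QUOTE = '"""'


@dataclass
class ParsedValue:
    raw: str        # text exactly as found between the delimiters
    delimiter: str  # which delimiter enclosed it
    value: str      # the "official" value reported by the parser


def parse_text_field(raw: str, delimiter: str, option: int, decode: Decoder,
                     reader_supports_protocol: bool = True) -> ParsedValue:
    """Syntax layer: decide what the parser reports under each option."""
    is_semicolon = delimiter == SEMICOLON
    if option == 1 and is_semicolon:
        # Option (1): every reader must decode semicolon-delimited strings.
        value = decode(raw)
    elif option == 2 and is_semicolon and reader_supports_protocol:
        # Option (2): decoding is allowed but not guaranteed.
        value = decode(raw)
    else:
        # Option (3), a non-supporting reader under option (2), or any
        # other delimiter: the parser reports the undecoded text.
        value = raw
    return ParsedValue(raw=raw, delimiter=delimiter, value=value)


def apply_dictionary(parsed: ParsedValue, item_type: str, decode: Decoder) -> str:
    """Dictionary layer for option (3): decode only where the (assumed)
    dictionary type asks for it, and only for semicolon-delimited strings."""
    if item_type == "ElidedText" and parsed.delimiter == SEMICOLON:
        return decode(parsed.value)
    return parsed.value
```

With a stand-in decoder such as `str.upper`, option (1) always yields the decoded value for a semicolon-delimited string, option (2) yields either value depending on the reader, and option (3) yields the undecoded value from the parser and the decoded value only after `apply_dictionary` has been called. Note that `apply_dictionary` can only make its decision because `ParsedValue` keeps the delimiter.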

My comments:
Option (3) has the formal effect of requiring that either the type of string delimiter is carried forward to the dictionary layer, so that triple-quote delimited strings are not inadvertently "decoded", or else that the protocol is applied uniformly across all multi-line string constructs for that particular dictionary type.

Option (2), insofar as it involves optional behaviour, essentially sidelines the proposal, as CIF writers cannot count on it being understood at the reading end and so cannot use it to encode important information.

Option (1) imposes extra burdens on CIF parser writers, although as Saulius notes it is not particularly difficult to implement.

My preference is either (1) or (3), perhaps inclining towards (3) in order to shift complexity to the dictionary level.  If the protocol is seen to be generally useful, it would be reasonable to prefer (1).


--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148

_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group



