Re: [ddlm-group] Technical issues with Proposal P
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Technical issues with Proposal P
- From: SIMON WESTRIP <simonwestrip@btinternet.com>
- Date: Tue, 22 Feb 2011 10:59:59 +0000 (GMT)
- In-Reply-To: <AANLkTi=kadbHikjabDyioDOw=L_pthGORgi6w2b45yX6@mail.gmail.com>
- References: <AANLkTi=kadbHikjabDyioDOw=L_pthGORgi6w2b45yX6@mail.gmail.com>
I think the CIF application would be forced to accept these strings as read - i.e.
with the backslashes - even if the user did not intend this interpretation.
The Python spec for raw strings states that:
"String quotes can be escaped with a backslash, but the backslash remains in the string"
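To make the quoted rule concrete, here is a minimal sketch in Python 3 syntax (the IDLE session quoted further down uses the Python 2 print statement). The variable names are purely illustrative and are not part of any proposal text. It compares a raw literal with an ordinary one, for both single and triple double-quote delimiters:

```python
# In a raw literal the backslash stops the quote from ending the string,
# but the backslash itself stays in the resulting value.
raw = r"\""        # two characters: backslash, double quote
cooked = "\""      # one character: double quote

# The same holds for triple-quoted literals.
raw_triple = r"""ends with a quote\""""
cooked_triple = """ends with a quote\""""

print(len(raw), len(cooked))
print(repr(raw_triple[-2:]), repr(cooked_triple[-2:]))
```

So a raw-string reading hands the application the backslash together with the quote, whereas the ordinary reading hands over only the quote.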
By highlighting this ambiguity, I believe James has made the strongest argument yet
against proposal P (adopting a sophisticated programming syntax for a relatively simple data tagging task).
Cheers
Simon
From: James Hester <jamesrhester@gmail.com>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Tuesday, 22 February, 2011 3:53:55
Subject: [ddlm-group] Technical issues with Proposal P
I will focus this email on the technical issues and try to return to
the other issues at a later date (I've changed the subject
accordingly)
[edit]
My apologies for not being clear: my examples of embedded elides
already give the internal representation of the strings, deliberately
leaving out the particular delimiters that might have been used to
produce those strings. Herbert mistakenly thought I was giving
triple-double-quote delimited strings and asking what the internal
representation was. So, unfortunately, IDLE cannot help here, as the
internal representation is not in question.
My question therefore remains: how does the CIF application interpret
these strings? Is the <backslash><delimiter> in my examples simply an
elide that could not be removed from a raw string and therefore should
be ignored, or is it a character sequence intended for the application
(e.g. a LaTeX accent on the o or e)?
In your answer you may assume that the CIF application knows that the
string was a raw string delimited by triple double quotes (even though
requiring communication of such information would be a very
unfortunate loss of clean design).
Those strings again:
<start> I have no idea what the last characters of this string are\"<finish>
<start> Does this string have two\""" or three internal quotes?<finish>
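As a rough illustration of the question, the following Python 3 sketch lists the readings an application could give to each string as delivered by the parser. The function name candidate_readings and the decision to treat only <backslash><double quote> as a possible elide are assumptions made for this example, not anything defined by Proposal P:

```python
def candidate_readings(parsed, delimiter='"'):
    """Return the distinct values an application might take `parsed` to mean.

    Hypothetical helper: reading (b) keeps the text exactly as delivered,
    reading (a) treats backslash+delimiter as a leftover elide.
    """
    as_delivered = parsed
    elide_removed = parsed.replace("\\" + delimiter, delimiter)
    return {as_delivered, elide_removed}


# The two strings above, written as internal values (backslash included).
s1 = 'I have no idea what the last characters of this string are\\"'
s2 = 'Does this string have two\\""" or three internal quotes?'
plain = "No backslash-quote sequence here"

for s in (s1, s2, plain):
    readings = candidate_readings(s)
    status = "ambiguous" if len(readings) > 1 else "unambiguous"
    print(status, sorted(readings))
```

Both example strings come back with two distinct readings, and nothing in the string content itself tells the application which one the author meant.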
Herbert writes:
> Now for your two examples of embedded elides of quotes:
>
> <start> I have no idea what the last characters of this string are\"<finish>
>
> is, internally, as a C-string
>
> I have no idea what the last characters of this string are"\0
>
> <start> Does this string have two\""" or three internal quotes?<finish>
>
> is, internally as a C-string
>
> Does this string have two""" or three internal quotes?\0
>
> I settled that by simply cranking up IDLE and doing:
>
> >>> print """I have no idea what the last characters of this string are\""""
> I have no idea what the last characters of this string are"
> >>> print """Does this string have two\""" or three internal quotes?"""
> Does this string have two""" or three internal quotes?
>
> As you well know, having IDLE around is a big help.
>
> Thank you again for taking the time to clarify your position
> on Ralf's proposal. I think I now understand why you prefer Simon's
> proposal.
>
> Regards,
> Herbert
>
>
>
>
>
>>One technical issue with Proposal P that has not been resolved is how
>>a CIF application is supposed to interpret the sequence
>><backslash><double quote> when encountered in a string returned from
>>the parser. Is this sequence:
>>(a) a terminator elide sequence that was left in a raw string, so
>>corresponds to <double quote>?
>>(b) something with meaning for the application so should be
>><backslash><double quote>?
>>
>>Please therefore advise how a CIF application will disambiguate the
>>following string content from a Proposal P parser:
>>
>><start> I have no idea what the last characters of this string are\"<finish>
>><start> Does this string have two\""" or three internal quotes?<finish>
>>
>>James
>>
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group