Re: [ddlm-group] Searching for a compromise on eliding
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Searching for a compromise on eliding
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Sat, 26 Feb 2011 21:09:24 -0500
- In-Reply-To: <670779.27822.qm@web87009.mail.ird.yahoo.com>
- References: <AANLkTi=bEDjCpJgyuB07q1FBFZjA_jbG=4jgLsXEvw4g@mail.gmail.com><20110226125548.GA29624@emerald.iucr.org><670779.27822.qm@web87009.mail.ird.yahoo.com>
If at all possible, I would like to try at least a little more to
achieve a consensus. I am sorry that James' P-prime does not seem to
reach that goal. I just suggested that COMCIFS have a Skype meeting.
Perhaps this group should have one as well. Maybe we can find common
ground verbally and visually -- the latest version of Skype seems to be
capable of video conference calls.

Regards,
Herbert

At 9:52 PM +0000 2/26/11, SIMON WESTRIP wrote:
>For what it's worth, I summarise my position in the hope that this
>issue will shortly return to COMCIFS.
>
>1) I see no reason to abandon use of triple-quotes:
>
>Python may be alone in its use of triple quotes (?), but it also uses
>single quotes, and in a markedly different way to many other
>programming languages - so why can't CIF use triple-quotes in a
>different way to Python? Alternatives may be found, but I suspect it
>will prove difficult to agree on any of them (of suggestions so far,
>John B's is 'visually' confusing and Brian's is questionable - a
>double quote could just delimit a null or empty string?).
>
>2) For various reasons, I think the P-type proposals are inappropriate
>for CIF:
>
>I see little point in repeating my arguments and on the whole I
>support John B's and James's arguments.
>
>3) I favour James's F' proposal to my initial F version:
>
>My version was intended to be minimalist but was formulated in terms
>that might fit the Python model (common escape sequences etc.), but,
>among other considerations, James's version is all that is necessary.
>
>---
>
>In conclusion, I would like this group to return an F' proposal to
>COMCIFS for their consideration.
>
>Of the active participants in this group, Herbert seems to be in a
>minority in his wholehearted support for the adoption of Python, but
>as a voting member of COMCIFS, Herbert can obviously influence matters
>in that forum.
>
>Cheers
>
>Simon
>
>
>From: Brian McMahon <bm@iucr.org>
>To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>Sent: Saturday, 26 February, 2011 12:55:48
>Subject: Re: [ddlm-group] Searching for a compromise on eliding
>
>Dear Colleagues
>
>I have been out of the office all week and largely away from email. I
>apologise for not saying so when I last posted, but I had at the
>time anticipated being able to keep in touch with this conversation.
>
>On technical grounds, I favour
>
>F' - requires least handling of special escapes
>G - formally equivalent to F'
>
>You will recognise the first line as a verbatim extract from my
>posting of 18 January in response to the first call for a vote in this
>matter. I have not seen any new technical considerations to change
>my preference for the economy of such a specification. Proposal G has
>two possible disadvantages: the need to construct novel (but ideally
>"natural") new delimiters - would paired double quotes "" suffice?;
>and a break with the existing implementation of """ in Nick's initial
>implementation. I know that we have ascertained that we are not bound
>to retain specific novel syntactic features of that implementation,
>but I see no technical advantage in moving away from it.
>
>I disfavour Option Q because it introduces what I consider an
>unnecessary domain-specific interpretation of character strings.
>The "domains" involved are not mutually exclusive: in IUCr journals
>we would anticipate handling both core CIFs and mmCIFs, while
>applications such as SHELX work with both small and
>macromolecules. However, that's not strictly a technical
>consideration.
>
>I found Ralf's intervention of 10 January very persuasive:
>
>> In my observation any language that persisted long term has a feature to
>> escape the closing quote token. Therefore I conjecture it is a small but
>> vital feature.
>
>This prompted me to revisit the need for a delimiter escape mechanism
>that would then allow encapsulation of arbitrarily complex strings,
>and thus, for example, remove the need for a new string concatenation
>operator (a requested feature over which I was still rather unhappy).
>In reviewing the discussions, I did note that Ralf's point had already
>been made (by Nick, I think), but its importance was unfortunately
>not appreciated at that time.
>
>Proposal F'/G therefore address this technical "vital feature" to
>my satisfaction.
>
>===
>
>The rest of the discussions seem to pivot around psychology more than
>technical requirements. In my experience, analysis of psychology
>provides useful insight into understanding a historical sequence of
>events, but is rarely successful at predicting the future.
>
>I prefer, as I stated before, to consider policy based on such
>psychological or social imperatives in the COMCIFS forum, but if it
>helps to indicate here my opinions on the concerns raised, I would say
>the following:
>
>> (i) Behaviour of triple-quoted strings will be too confusing unless
>> Python behaviour is followed (Ralf)
>
>There is perhaps some opportunity for confusion; but in other areas
>there are similar opportunities: shell file globbing shares some
>syntactic features with regular expression processing. People who
>really work with such systems manage to overcome the confusion - more
>easily when there is a real difference in purpose (filename globbing
>is indeed distinct from regexp processing, just as string delimiting
>in a data file is different from string processing within an
>interpreted program). As John B has pointed out, adoption of a
>proposal P or close variants also has scope for confusion if a user is
>not completely familiar with the version of Python chosen as the
>underlying paradigm.
>
>> (ii) There is considerable criticism of CIF in the macromolecular
>> community because of idiosyncratic behaviour, particularly concerning
>> quoting. We should therefore stick to accepted standards as much as
>> possible (John W)
>
>John W (and Herbert) are undoubtedly correct in identifying a distaste
>within the macromolecular community for the idiosyncratic CIF formalism.
>But I believe the second sentence is a non sequitur: I am not convinced
>that adoption of a particular syntactic feature from Python is all
>that is needed to persuade that community to embrace CIF with open
>arms. As has been argued several times on this list, the technical
>requirements on a data input parser for CIF are not very great (and by
>opting for "economical" schemes such as James's proposal F' we would tend
>towards minimising them). If the programmers within the macromolecular
>community - many of whom I know to be extraordinarily competent and
>intelligent - do not build CIF applications, I am sure it is because
>they do not see sufficient scientific value in doing so, rather than
>that the complexity or awkwardness of the file format defeats them.
>Or at least I shall persist in believing that.
>
>Let us take this element of the discussion onto the COMCIFS list,
>preferably on the back of the revised proposal that I encourage James
>to present from the ddlm-group.
>
>===
>
>Back to the technical considerations which I believe this group
>should focus on. I consider the most desirable outcome to be a
>clear and clean specification. Proposals F'/G will achieve that
>elegantly. Proposal P has the potential to achieve that (though one
>does need to specify the version of Python and perhaps reconsider the
>handling of Unicode characters), although I still feel that as
>a specification it carries too high a burden for compliance from
>applications developers working outside of a Python framework.
>I would strongly discourage attempts at a compromise that seeks to
>provide a technical solution based on some minimisation of the
>root mean square unhappiness of the members of this group, but that
>ends up with an unstructured mish-mash of features from different
>proposals.
>
>Regards
>Brian
>
>On Fri, Feb 25, 2011 at 01:50:59PM +1100, James Hester wrote:
>> Dear DDLm-group,
>>
>> I think we have all had a decent chance to argue our case for
>> Proposals P, F and F'. I have also been in small side discussions
>> with Ralf and John W. Their points of view can be summarised as
>> follows:
>> (i) Behaviour of triple-quoted strings will be too confusing unless
>> Python behaviour is followed (Ralf)
>> (ii) There is considerable criticism of CIF in the macromolecular
>> community because of idiosyncratic behaviour, particularly concerning
>> quoting. We should therefore stick to accepted standards as much as
>> possible (John W)
>>
>> For John W and Ralf these points outweigh any of the disadvantages of
>> Proposal P, and so Proposal P remains their first choice. Proposal P
>> is therefore the first choice of 3 out of 5 COMCIFS voters, and the
>> last choice of the other two (I would rank it worse than doing
>> nothing, actually). I note that non-voting members are uniformly
>> opposed to Proposal P.
>>
>> I therefore want to try to seek some common middle ground in the hope
>> that I can find a proposal that could be at least as acceptable as
>> Proposal P to Ralf and/or Herbert and/or John W.
>>
>> Consider the following four new proposals - P-prime, Q, G and null:
>>
>> * Proposal P-prime: triple-quoted strings are treated as for Python
>> 2.7. No Unicode or raw strings are defined (ie no strings starting
>> u""" or r""").
>>
>> I interpret John W and Ralf's position to be that they would be able
>> to support this proposal as the preferred choice, as our syntax would
>> still be entirely consistent with Python.
>> This proposal is a considerable improvement on Proposal P, because
>> the dangers of raw strings are taken out of the equation, and the
>> Unicode database is no longer a dependency. We are still left with a
>> whole bunch of (frankly pointless) elides, leading to Proposal Q:
>>
>> * Proposal Q: As for Proposal P-prime, with the following changes:
>> (1) Only <backslash><delimiter> and <backslash><backslash> when it
>> precedes <backslash><delimiter> are recognised escape sequences at the
>> syntactical level
>> (2) A DDLm string type, e.g. "CText", is defined in com_val.dic for
>> which the remaining escape sequences have the meaning assigned to them
>> by the Python 2.7 standard. mmCIF and related domains can standardise
>> their definitions on this string type and derivatives, making the
>> above division between syntax and dictionary invisible to users and
>> programmers in their domain.
>>
>> * Proposal G: Proposal F', but with a different delimiter
>>
>> Ralf has indicated that he actually thinks Proposal F' is best, but
>> only if the delimiters are not going to be confused with Python
>> delimiters. I interpret John W's position to be that he would not
>> support such a change in delimiters as that would simply make CIF even
>> more idiosyncratic. Anyway, any such replacement delimiter would need
>> to be multi-character, easy to type and unlikely to occur as the first
>> characters in CIF1 datavalues. We would also need to reduce the
>> character set of non-delimited CIF2 strings to exclude any such
>> delimiters. Ideas?
>>
>> * Null proposal: do nothing as we can't agree
>>
>> I think I could support Proposal Q as an acceptable fallback from F',
>> and if somebody can find sensible delimiters for Proposal G that works
>> for me as well. The preferred treatment for backslash-rich text for
>> Proposals P, P' and Q will necessarily be semicolon-delimited strings.
>>
>> James.
>> --
>> T +61 (02) 9717 9907
>> F +61 (02) 9717 3145
>> M +61 (04) 0249 4148

--
=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
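
Rule (1) of Proposal Q, as quoted in James's message, could be read as a very small scanner over the body of a delimited string. The following is only a sketch of one possible reading, not an agreed specification: the function name `unescape_q` is hypothetical, and it assumes that <delimiter> means the full delimiter sequence (here `"""`) rather than a single quote character. It is written for Python 3; the last line compares against Python's own string-literal handling, where `\n` has the same meaning as in the Python 2.7 cited in the proposals.

```python
import ast


def unescape_q(body, delimiter='"""'):
    """Apply only the two syntax-level elides of Proposal Q, rule (1).

    Assumption (not stated in the thread): <delimiter> is the whole
    delimiter sequence, not a single quote character.
    """
    escaped_delim = "\\" + delimiter
    out = []
    i = 0
    while i < len(body):
        if body.startswith("\\\\" + escaped_delim, i):
            # <backslash><backslash> directly before <backslash><delimiter>:
            # emit one literal backslash; the escaped delimiter that
            # follows is handled on the next pass of the loop.
            out.append("\\")
            i += 2
        elif body.startswith(escaped_delim, i):
            # <backslash><delimiter>: emit the delimiter itself.
            out.append(delimiter)
            i += len(escaped_delim)
        else:
            # Everything else, including any other backslash sequence,
            # is passed through untouched at the syntax level.
            out.append(body[i])
            i += 1
    return "".join(out)


# An elided delimiter is recovered; a backslash-n pair stays as two characters.
print(unescape_q(r'a\"""b'))
print(unescape_q(r'a\nb'))

# Python's own literal rules, by contrast, turn backslash-n into a newline.
print(ast.literal_eval(r'"""a\nb"""') == "a\nb")
```

Under this reading, sequences such as backslash-n would only acquire their Python meaning through a dictionary-level type such as the "CText" type mentioned in rule (2).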
- References:
- [ddlm-group] Searching for a compromise on eliding (James Hester)
- Re: [ddlm-group] Searching for a compromise on eliding (Brian McMahon)
- Re: [ddlm-group] Searching for a compromise on eliding (SIMON WESTRIP)