Re: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
- Subject: Re: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains
- From: Nick Spadaccini <nick@csse.uwa.edu.au>
- Date: Thu, 10 Mar 2011 15:01:38 +0800
- Conversation: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains
From: Doug <doug.duboulay@gmail.com>

>> One CIF feature that no other software language supports natively
>> are measurement numbers, with their SUs. Maybe they should be encoded as
>> tuples for wider compatibility?

I am not sure this does represent wider compatibility in the reporting of
uncertainty. The not so arcane approach of number(su), with the scale of the
su implied by the precision of the number, is one of the four ISO accepted
reporting syntaxes for uncertainty. Enshrining them in a data structure like
a tuple is not one of the accepted syntaxes. What's more, we need to separate
the representation in software (which quite reasonably could be a tuple) from
the syntactic representation in a file.

>> If you were comfortable with that and python was all important, you
>> could define CIF2 to be a python data structure, suck it all in as a
>> single string and simply hit eval() (or some wrapped "safe" form of eval).
>> For that matter, JSON is very similar in structure and effectively
>> standardised.

Python is not more important than any other language. Most other languages
have an eval() or exec() path for evaluating strings. Contrary to popular
misconception, there is nothing in dREL that is Python. The syntax is generic
and the string encapsulation is pure STAR.

>> Other programming languages support various forms of string expansion.
>> For instance in bash/sh/csh/perl/php/tcl typically double quoted
>> strings expand with various forms of "$substitution ${of} $(variables)".

That is why the current string syntax of the STAR that supports DDLm and dREL
treats strings as raw, save for a single escape rule to protect (only) the
character that delimits the string. This is the way of incorporating a " into
a " delimited string, and similarly for the ' character. What to do with the
string is left to the dictionary to define, in particular by exploiting the
presence of dREL as an operational language.
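To make the number(su) point above concrete, here is a minimal Python sketch of how a reader could turn the file syntax into a software-side pair, which is exactly the separation argued for above. The function name parse_number_su is hypothetical, the sketch handles only plain decimal numbers (no exponent form), and it assumes the usual reading that the su digits apply to the last quoted digits of the number.

```python
import re
from decimal import Decimal

# Plain decimal number, optionally followed by su digits in parentheses.
# Exponent notation is deliberately not handled in this sketch.
_NUMBER_SU = re.compile(r"([+-]?\d+(?:\.\d+)?)(?:\((\d+)\))?")


def parse_number_su(text):
    """Return (value, su) as Decimals; su is None when no su is given.

    Hypothetical helper: the su digits are scaled to the position of the
    last digit of the number, so "1.234(5)" means 1.234 with su 0.005.
    """
    match = _NUMBER_SU.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not in number(su) form: {text!r}")
    number_text, su_digits = match.groups()
    value = Decimal(number_text)
    if su_digits is None:
        return value, None
    # Exponent of the last digit of the number, e.g. -3 for "1.234".
    exponent = value.as_tuple().exponent
    su = Decimal(f"{su_digits}E{exponent}")
    return value, su
```

For instance, parse_number_su("0.50(12)") gives the value 0.50 with an su of 0.12. The file keeps the compact ISO-style text, and the tuple exists only inside the program.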
>> In Python there is also string expansion from lists and dictionaries:
>>
>> """ %(substitution)s %(of)s %(variables)d""" % \
>> {'variables': _my_CIF_data_name's_value?,
>> 'of' : 'very silly',
>> 'substitution': 'this is'
>> }
>>
>> Are these strings likely to be a construct that could exist in a CIF, or
>> have a role in the post processing of parsed CIFs? I could see it as
>> useful to ensure that values referenced in prose stay in sync with
>> actual CIF data values.

Appropriately delimited, such a string could be supported. What to do with it
(e.g. push it through an eval() method) is a separate question. However, the
ability to have the above as two separate data items, one being a script and
the other being the data for the script while simultaneously being ordinary
data in a CIF, would be a little harder to achieve functionally. If of course
the script is written in dREL, that would be much easier to achieve, because
the dictionary engine is always evaluating dREL code.

>> Some CIF dictionaries contain regular expression definitions which
>> generally are easier to understand as python raw strings r"..."
>> That wouldn't have direct impact on CIF2 string handling, but if the
>> handler was already present for the dictionary, then it could presumably
>> be easily co-opted for the CIF, I suspect.

Again, a regex handler does not have to be Python specific. All that is
necessary is an agreed syntax for the regex.

>> If the primary CIF2 stakeholders were assumed to be the various databases,
>> then maybe all CIF string values should really be optimised for direct
>> injection via SQL (maybe its just convention but AFAIK, only single quotes
>> seem to be significant)?

A minor API issue, but you do highlight misconceptions about stakeholder
uptake. If minor issues from database systems actually force CIF2 to redefine
its syntax to meet such needs, then CIF2 will become highly restricted and
beholden to the vested interests of stakeholders.
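As an illustration of why the SQL quoting concern is an API matter rather than a syntax matter, here is a minimal sketch using Python's standard sqlite3 module with parameter binding. The table layout, the function name store_and_fetch and the data name _example_text are all hypothetical. The point is only that the database API carries a string containing single quotes unchanged, so the file syntax need not be shaped around SQL.

```python
import sqlite3


def store_and_fetch(value):
    """Round-trip a string value through an in-memory SQLite table.

    Hypothetical helper: the "?" placeholders let the database API do the
    quoting, so quote characters inside the value need no special treatment.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE item (name TEXT, value TEXT)")
        conn.execute(
            "INSERT INTO item (name, value) VALUES (?, ?)",
            ("_example_text", value),
        )
        row = conn.execute(
            "SELECT value FROM item WHERE name = ?",
            ("_example_text",),
        ).fetchone()
        return row[0]
    finally:
        conn.close()
```

A value such as "it's a 'quoted' value" comes back exactly as it went in.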
CIF2 should remain simple and expressive, and let the API handle everything
else.

>> As Peter indicated, there's a spectrum of compatibilities that
>> could be argued for or against, but where do you draw the line?

Exactly.

>> My personal preference would be for a lightweight spec that I could
>> easily implement myself, at a pinch, in my language of choice
>> (or better, that someone else had already implemented), or for a
>> more complicated spec when there were tools available that
>> automatically built the parser and handler.

That is how DDLm and dREL are implemented (as well you know, since you are a
programmer of it).

>> If I was writing Tcl, I wouldn't really want to wrap and include python in
>> order to handle a string correctly, if thats what the implications are.

Again, language choice is not part of the STAR at the syntactic level, and
need not be part of CIF2. It is not necessary, and it is far more general, not
to lock a programming language in at the syntax level. It is more expressive
to allow any sequence of characters within a string and then let the
dictionary indicate how to deal with the internals.

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
MBDP M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au