Re: [ddlm-group] Use of elides in strings
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Use of elides in strings
- From: Joe Krahn <krahn@niehs.nih.gov>
- Date: Mon, 23 Nov 2009 17:03:14 -0500
- In-Reply-To: <407817.81146.qm@web87008.mail.ird.yahoo.com>
- References: <C7306520.1258E%nick@csse.uwa.edu.au> <572182.92308.qm@web87003.mail.ird.yahoo.com> <279aad2a0911230240q278ab08fqc09349148202bed9@mail.gmail.com> <4B0ABDF3.4090108@niehs.nih.gov><407817.81146.qm@web87008.mail.ird.yahoo.com>
Simon,

In general, I agree with you completely. In fact, my main reason for
joining this list is to promote a CIF 2 format that avoids ambiguities.
However, some people have taken the approach that current CIF
implementations make the dictionary context mandatory.

I would also drop rule #4, but it is better to have a default than to
leave it completely ambiguous. Do the dictionary-based developers want
to allow for an alternate conversion, or is the conflict only about
where in the software the conversion is done?

Joe

SIMON WESTRIP wrote:
> Hi Joe
>
> In reply to this and your subsequent comments (the subject line of the
> emails seems to have overlapped):
>
> I'm not sure that rule 4) "An implementation may override the default
> conversions in #2, but
> should be avoided in most cases to maintain compatibility"
> is appropriate in defining a base syntax for CIF (in any flavour, CIF1
> or 2, or ....).
> As I understand it and encounter it, one of the purposes of CIF is as an
> archive format.
> As such, the way data are stored in the CIF should _not_ be
> implementation dependent,
> i.e. anything that suggests that a data item could be interpreted in
> more than one way
> depending on the particular software that the CIF is passed to is in
> danger of making
> the standard 'non-standard'. Though it can be argued that
> context-sensitivity could be extended to
> the dictionary (some sort of interpretation of whether an elide used in
> a string type was intended or
> not, or whether for a particular item it will always be interpreted in a
> certain way),
> the fact is there will be applications that are only interested in
> obtaining the value of a data item
> without any consideration of the CIF dictionary, and they will need to
> know the rules for identifying that value.
> For example, a molecular graphics program may just want the site data -
> it may encounter _site_label "A\"BC" in one loop, but (for whatever
> reason), the associated site data may be given as
> _site_label 'A"BC' in another loop. The application needs to know
> whether it is looking at the same key value,
> but cannot do this if the rules say that "A\"BC" might be A\"BC or it
> might be A"BC depending on
> the interpretation described in an associated dictionary or by a
> particular discipline or organization.
> So when it comes down to defining a base syntax that all CIFs should
> adhere to, I don't think there is any scope for
> offering various interpretations of what the value of a data item
> actually is.
>
> Forgive me if I've missed the point here or misunderstood your comments,
> but it seems to me that establishing strict rules
> about the use of elides is quite important, whether they produce A\"BC
> or A"BC by default (some time back I interpreted
> them as A"BC, but that was rejected, but subsequently A"BC is on the
> table!). So whichever way it goes, I look forward to
> the results of any straw vote on this.
>
> Cheers
>
> Simon
>
> ------------------------------------------------------------------------
> *From:* Joe Krahn <krahn@niehs.nih.gov>
> *To:* Group finalising DDLm and associated dictionaries
> <ddlm-group@iucr.org>
> *Sent:* Monday, 23 November, 2009 16:53:07
> *Subject:* Re: [ddlm-group] Use of elides in strings
>
> I think the solution is to define the CIF2 syntax in a way that allows
> more flexibility in the software implementation. IMHO, if you are going
> to leave the reverse-solidus intact, you should leave the quotes intact
> as well, because the elides are dependent on the quoting context.
> Obviously, RCSB software is designed in a way that they prefer all
> character conversions at the dictionary level.
> Other developers want the
> conversion done at the same time quotes are removed, so it can be done
> in the correct quoting context.
>
> It should be possible to allow both approaches, with syntax definitions
> something like this:
>
> Within quoted strings, the following rules apply:
>
> 1) all close-quote definitions include the look-behind assertion that
> they are not preceded by an odd number of ASCII reverse solidus characters.
>
> 2) By default, <REVERSE SOLIDUS><REVERSE SOLIDUS> represents <REVERSE
> SOLIDUS>, and <REVERSE SOLIDUS><CLOSE QUOTE> represents <CLOSE QUOTE>.
>
> 3) It is implementation dependent whether the conversions defined in #2
> are applied at the file I/O formatting level (i.e. parser on input).
>
> 4) An implementation may override the default conversions in #2, but
> should be avoided in most cases to maintain compatibility.
>
> Joe
>
> James Hester wrote:
> > The outstanding issue seems to be around where in the process these
> > elides get stripped; Herb and John argue that it should be possible to
> > do this in an optional way at the dictionary stage. As I've already
> > indicated, I don't think that it is that straightforward.
> >
> > On Mon, Nov 23, 2009 at 9:35 PM, SIMON WESTRIP
> > <simonwestrip@btinternet.com> wrote:
> >> So at the risk of repeating myself, at this stage there seems to be
> >> majority acceptance of
> >> what I've been referring to as context-sensitive treatment of elides:
> >>
> >> Using the trivial example of _label "A\"BC"
> >>
> >> James and Nick would return A"BC
> >>
> >> Herb and John would return A\"BC
> >>
> >> I would return A"BC
> >>
> >> I won't address Herb's examples as I performed a similar exercise back in
> >> THREAD3
> >> which was then received with a different opinion :-)
> >>
> >
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
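
For concreteness, the proposed rules 1 and 2 quoted above can be sketched in Python roughly as follows. This is only an illustration under assumptions, not an agreed CIF2 parser: the function names (`find_close_quote`, `unescape`, `read_quoted`) and the `convert` flag are hypothetical, and a backslash followed by any other character is left untouched because rule 2 defines only the two conversions shown.

```python
def find_close_quote(text, start, quote):
    """Rule 1 (sketch): return the index of the first `quote` at or after
    `start` that is not preceded by an odd number of backslashes, or -1."""
    i = start
    while i < len(text):
        if text[i] == "\\":
            i += 2  # a backslash elides the next character; skip both
            continue
        if text[i] == quote:
            return i
        i += 1
    return -1


def unescape(raw, quote):
    """Rule 2 (sketch): '\\\\' -> '\\' and '\\<quote>' -> '<quote>'.
    Any other backslash is kept as-is (an assumption, not in the rules)."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw) and raw[i + 1] in ("\\", quote):
            out.append(raw[i + 1])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


def read_quoted(text, convert=True):
    """Read one quoted string starting at text[0].
    `convert` is a hypothetical switch standing in for rule 3: whether the
    conversions are applied by the parser or left to a later stage."""
    quote = text[0]
    end = find_close_quote(text, 1, quote)
    if end < 0:
        raise ValueError("unterminated quoted string")
    raw = text[1:end]
    return unescape(raw, quote) if convert else raw


# Simon's _site_label example: the two spellings compare equal only
# when the default conversions are applied at parse time.
double_quoted = read_quoted('"A\\"BC"')   # file text: "A\"BC"
single_quoted = read_quoted("'A\"BC'")    # file text: 'A"BC'
print(double_quoted == single_quoted)
print(read_quoted('"A\\"BC"', convert=False))
```

With `convert=False` the first spelling comes back as A\"BC, so a program comparing key values without consulting a dictionary would see two different labels, which is the ambiguity raised in the thread.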