
Re: Accent escape sequences

To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
Subject: Re: Accent escape sequences
From: Brian McMahon <bm@iucr.org>
Date: Mon, 5 Mar 2007 16:00:44 +0000
In-Reply-To: <45EC3846.5070001@niehs.nih.gov>
References: <45E72969.1090100@niehs.nih.gov><20070302101147.GA26353@emerald.iucr.org><Pine.BSF.4.58.0703020830490.46806@epsilon.pair.com><45EA0C29.5060604@niehs.nih.gov><a06230900c20fde7910a9@[192.168.2.101]><45EC3846.5070001@niehs.nih.gov>

> The advantage of a simple escape mechanism, like the current scheme, is
> that it is fairly easy to read directly. The disadvantage is that it has
> limited abilities. With MIME, the multipart/alternative could be used,
> where simple ASCII escapes are combined with a more accurate version
> that is not directly readable. This gives the advantages of both forms.

In principle, this is a great idea. Consider the CIF dictionaries,
where the pure-text _definition field sometimes carries inventive
representations of maths (e.g.
http://www.iucr.org/iucr-top/cif/cifdic_html/1/cif_core.dic/Irefine_ls_restrained_S_gt.html )
that have to be reverse-engineered into something more useful (e.g. TeX)
when typesetting these for International Tables. It would make it
easier to keep these representations in sync if they were both
transported as multipart/alternative content in the same text field.

But ... this does come at the expense of significantly more
complexity in applications that need to do something with the
content of text fields. Most scientific CIF applications (the
ones that work on the data) won't be affected - they just skip
over text fields. The others will need to have the ability to
parse and extract MIME content (not too difficult), but also
to *write* proper multipart content, and that's not necessarily
so easy if you're to provide tools that ingest content from
different input streams (TeX-savvy editors, html editors,
clipboards...). In practice the Acta office doesn't see a
critical mass of content provision to justify this complexity
at this stage (it's still really only Acta C and E that use
CIF text fields extensively, and they're catered for through
publCIF). Having said which, there's no harm in working through
the details of how such a system could operate.
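
As a rough illustration of the read and write sides discussed above, here
is a minimal sketch using Python's standard email package. It assumes the
whole MIME entity is stored as the text-field value, and it assumes a
text/x-tex part as the "more accurate" alternative; neither choice is
part of any CIF specification. The function names build_text_field and
extract_alternatives, and the sample strings, are hypothetical.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def build_text_field(plain: str, tex: str) -> str:
    """Pack a readable ASCII form and a TeX form into one MIME entity."""
    msg = EmailMessage()
    msg.set_content(plain)                     # text/plain part
    msg.add_alternative(tex, subtype="x-tex")  # text/x-tex (assumed subtype)
    return msg.as_string()


def extract_alternatives(raw: str) -> dict:
    """Map each part's content type to its decoded text."""
    msg = message_from_string(raw, policy=policy.default)
    return {
        part.get_content_type(): part.get_content().rstrip("\n")
        for part in msg.iter_parts()
    }


# Hypothetical field content: the same expression in two representations.
raw = build_text_field("alpha^2", r"$\alpha^2$")
alternatives = extract_alternatives(raw)
```

An application that only wants readable text would take the text/plain
entry; a typesetting tool would prefer text/x-tex where present. The hard
part noted above, producing both parts consistently from different input
sources, is not addressed by this sketch.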

Going back to Joe's original wishes to rationalise and perhaps
extend the existing CIF markup, it's important also to remember
that some data items will also occasionally require markup for
simple string fields - e.g. how to mark up the "alpha" Wyckoff
position in the symmetry CIF dictionary? The use of
the '\a' digraph in
http://www.iucr.org/iucr-top/cif/cifdic_html/2/cif_sym.dic/Ispace_group_Wyckoff.letter.html
clearly derives from the "usual" CIF markup for alpha, but that is
nowhere made formally clear. It looks like we need unambiguous
markup rules in these cases too.
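
To make the point concrete, here is a small sketch of what an explicit
rule for such string fields might look like in code. It covers only
three of the Greek-letter digraphs from the usual CIF markup (\a, \b,
\g); the table, the function name decode_greek and the choice to leave
unknown digraphs untouched are assumptions for illustration, not an
agreed rule.

```python
import re

# Partial, illustrative table: backslash + letter -> Greek character.
GREEK = {
    "a": "\u03b1",  # alpha
    "b": "\u03b2",  # beta
    "g": "\u03b3",  # gamma
}


def decode_greek(value: str) -> str:
    """Replace known backslash digraphs; leave anything else unchanged."""
    return re.sub(
        r"\\([A-Za-z])",
        lambda m: GREEK.get(m.group(1), m.group(0)),
        value,
    )


wyckoff_letter = decode_greek("\\a")  # the "alpha" Wyckoff position
```

Without a formal statement that the digraph rule applies to a given data
item, a reader of the file cannot tell whether a stored '\a' means the
Greek letter or the two literal characters.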

(I'm hoping to see our publCIF developer later this week so that
we can discuss the specifics of the proposal Joe posted recently.)

Brian
