Re: Treatment of Greek characters in CIF2

Subject: Re: Treatment of Greek characters in CIF2
From: James Hester <jamesrhester@xxxxxxxxx>
Date: Thu, 20 Apr 2017 17:57:00 +1000

Dear Andrius,

This is interesting information, as it gives some insight into the commitments that large projects such as COD have made to particular CIF constructs. The issue for you with CIF2 is that your software can turn "\a" into "α", but it does not expect a literal "α" code point in the input text, and so would need to be updated to handle one. The intent of my suggestion is that you and other maintainers of CIF1-era code can put an input filter in front of your current codebase that transforms literal Greek letters such as "α" into their ASCII markup equivalents, after which all of your code will continue to work as before. In other words, COMCIFS is not going to turn around and do something that would make such a scheme fail.
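As a rough illustration of the kind of input filter meant here, the following Python sketch replaces literal Greek letters with backslash markup before the text reaches legacy code. The function name `greek_to_cif_markup` and the table names are hypothetical, and the letter table reflects my reading of the CIF 1.1 Greek markup conventions (only "\a" and "\U" are confirmed in this thread), so it should be checked against the specification before use.

```python
# Sketch of an input filter: literal Greek letters -> CIF1-style "\x" markup.
# ASSUMPTION: the Latin letter paired with each Greek letter follows the
# CIF 1.1 markup table as I recall it; verify against the specification.
_LATIN_FOR_GREEK = {
    "α": "a", "β": "b", "χ": "c", "δ": "d", "ε": "e", "φ": "f",
    "γ": "g", "η": "h", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "n", "ο": "o", "π": "p", "θ": "q", "ρ": "r", "σ": "s",
    "τ": "t", "υ": "u", "ω": "w", "ξ": "x", "ψ": "y", "ζ": "z",
}

# Build a translation table keyed by code point, covering both cases:
# lowercase Greek -> "\" + lowercase Latin, capital Greek -> "\" + capital Latin.
GREEK_TO_MARKUP = {}
for greek, latin in _LATIN_FOR_GREEK.items():
    GREEK_TO_MARKUP[ord(greek)] = "\\" + latin
    GREEK_TO_MARKUP[ord(greek.upper())] = "\\" + latin.upper()


def greek_to_cif_markup(text: str) -> str:
    """Return text with mapped Greek code points replaced by ASCII markup.

    Characters without an entry in the table are passed through unchanged.
    """
    return text.translate(GREEK_TO_MARKUP)


if __name__ == "__main__":
    print(greek_to_cif_markup("α-quartz, Cu Kα radiation"))
```

The filter only touches the code points in the table; any other non-ASCII character would still reach the legacy code untouched and would need its own handling.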

all the best,

James.

On 20 April 2017 at 16:32, Andrius Merkys <andrius.merkys@gmail.com> wrote:

Dear James,

On 20/04/17 08:38, James Hester wrote:
> I toyed with the idea of allowing '\Uxxxxxx' for arbitrary Unicode
> code points, but (i) this would clash with '\U' for capital upsilon
> and (ii) is not expected by legacy applications and so would therefore
> require that they be updated, in which case adapting them to just
> ingest Unicode would be more straightforward.

we at the COD have been using XML character entities for this purpose:
upon parsing/printing CIF 1.1 our software converts all non-ASCII values
to their named or numeric XML entities. This is easy to implement, as
most programming languages have encoders and decoders for XML
entities. Furthermore, named entities are easy for humans to understand.
However, such a method (i) may clash with XML entities already present in
the CIF and (ii) is not expected by other software. Therefore, it is not
much better.
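For readers unfamiliar with the approach, here is a minimal Python sketch of entity encoding and decoding. It is not the COD implementation: the function names are hypothetical, and it assumes that the HTML entity names in Python's standard `html` module are an acceptable stand-in for the named entities mentioned above.

```python
import html
from html.entities import codepoint2name


def to_xml_entities(text: str) -> str:
    """Replace each non-ASCII character with a named or numeric entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)                        # ASCII passes through
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity, e.g. &alpha;
        else:
            out.append(f"&#{cp};")                # numeric fallback
    return "".join(out)


def from_xml_entities(text: str) -> str:
    """Decode named and numeric entities back to characters."""
    return html.unescape(text)


if __name__ == "__main__":
    print(to_xml_entities("α-quartz"))
    # Clash (i): entity-like text already in the input is not protected,
    # so it is decoded on the way back and the round trip is lossy.
    print(from_xml_entities(to_xml_entities("&alpha;")))
```

The second call shows the clash described in point (i): a value that already contains the literal text "&alpha;" is left alone by the encoder but converted to "α" by the decoder.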

Best wishes,
Andrius

--
Andrius Merkys
PhD student at Vilnius University Institute of Biotechnology, Saulėtekio al. 7, V325
LT-10257 Vilnius, Lithuania
Lecturer at Vilnius University Faculty of Mathematics and Informatics, Naugarduko g. 24
LT-03225 Vilnius, Lithuania

_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers
