Discussion List Archives


Re: Treatment of Greek characters in CIF2

Dear Andrius,

This is interesting information as it gives some insight into the commitments that large projects such as COD have to particular CIF constructs. The issue for you with CIF2 would be that your software can turn "\a" into "α", but it will not expect an actual "alpha" code point in the input text and so would need to be updated to handle this. The intent of my suggestion is that you and other maintainers of CIF1-era code can write an input filter in front of your current codebase to transform actual "alpha" etc. letters to their ASCII markup equivalent, after which all of your code will continue to work as before. In other words, COMCIFS is not going to turn around and do something that would make such a scheme fail.
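As a rough illustration of such an input filter, here is a minimal Python sketch. It assumes the input has already been decoded to Unicode text, and the mapping table is deliberately partial: only "\a" for alpha and "\U" for capital upsilon are taken from this thread, the beta and gamma entries follow the same CIF1 markup convention, and the names GREEK_TO_MARKUP and greek_to_markup are hypothetical. A real filter would need the complete table from the CIF markup conventions.

```python
# Partial, illustrative table: Greek code point -> CIF1-style ASCII markup.
# Extend it from the CIF markup conventions before relying on it.
GREEK_TO_MARKUP = {
    "\u03b1": "\\a",  # GREEK SMALL LETTER ALPHA
    "\u03b2": "\\b",  # GREEK SMALL LETTER BETA
    "\u03b3": "\\g",  # GREEK SMALL LETTER GAMMA
    "\u03a5": "\\U",  # GREEK CAPITAL LETTER UPSILON
}

# str.translate accepts a table that maps one character to a longer string.
_TABLE = str.maketrans(GREEK_TO_MARKUP)


def greek_to_markup(text: str) -> str:
    """Replace the Greek letters listed above with their ASCII markup.

    All other characters, including any ASCII markup already present,
    pass through unchanged.
    """
    return text.translate(_TABLE)


if __name__ == "__main__":
    print(greek_to_markup("\u03b1-quartz and \u03b2-quartz"))
```

Run in front of an existing CIF1-era reader, a filter of this kind leaves text that is already pure ASCII untouched, so legacy files would be unaffected.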

all the best,
James.

On 20 April 2017 at 16:32, Andrius Merkys <andrius.merkys@gmail.com> wrote:
Dear James,

On 20/04/17 08:38, James Hester wrote:
> I toyed with the idea of allowing '\Uxxxxxx' for arbitrary Unicode
> code points, but (i) this would clash with '\U' for capital upsilon
> and (ii) is not expected by legacy applications and so would therefore
> require that they be updated, in which case adapting them to just
> ingest Unicode would be more straightforward.

we at the COD have been using XML character entities for this purpose:
upon parsing/printing CIF 1.1 our software converts all non-ASCII values
to their named or numeric XML entities. This is easy to implement, as
most programming languages have encoders and decoders for XML
entities. Furthermore, named entities are easy for humans to understand.
However, such a method (i) may clash with XML entities already present in
the CIF and (ii) is not expected by other software. Therefore, it's not much
better.
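The following Python sketch shows the general shape of such a conversion. It is not the COD implementation: it uses the HTML 4 named-entity table from the standard library as a stand-in for whatever entity set the COD software uses, and the function names encode_entities and decode_entities are hypothetical.

```python
import html
from html.entities import codepoint2name


def encode_entities(text: str) -> str:
    """Replace each non-ASCII character with a named or numeric entity."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            out.append(ch)  # ASCII passes through unchanged
        elif cp in codepoint2name:
            out.append(f"&{codepoint2name[cp]};")  # named entity if one exists
        else:
            out.append(f"&#{cp};")  # otherwise a decimal numeric reference
    return "".join(out)


def decode_entities(text: str) -> str:
    """Turn named and numeric entities back into characters."""
    return html.unescape(text)


if __name__ == "__main__":
    encoded = encode_entities("\u03b1-Fe")
    print(encoded)
    print(decode_entities(encoded))
```

The clash mentioned in (i) shows up directly in this sketch: a value that already contains the literal text "&amp;" comes back from decode_entities as "&", so the round trip is not lossless for such input.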

Best wishes,
Andrius

--
Andrius Merkys
PhD student at Vilnius University Institute of Biotechnology, Saulėtekio al. 7, V325
LT-10257 Vilnius, Lithuania
Lecturer at Vilnius University Faculty of Mathematics and Informatics, Naugarduko g. 24
LT-03225 Vilnius, Lithuania


_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers



--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
