Some thoughts on the first part of Herbert's proposals:

Herbert proposes:
  C1:  that the character set for a "new cif" be Unicode, and
  C2:  that the default encoding be UTF-8; and
  C3:  that other encodings be permitted as an optional
       system-dependent feature when an explicit encoding
       has been specified by
    C3.1:  a Unicode BOM (byte-order mark) (see
           http://en.wikipedia.org/wiki/Byte-order_mark) introduced
           into the character stream, or
    C3.2:  the first or second line being a comment of the form:
             # -*- coding: <encoding-name> -*-
           as recognized by GNU Emacs, or
    C3.3:  the first or second line being a comment of the form:
             # vim:fileencoding=<encoding-name>
           as recognized by Bram Moolenaar's VIM
           (see section 2.1.4 of
           http://docs.python.org/reference/lexical_analysis.html for a more

(James again:)
I agree with C1 and C2.  Regarding C3, I don't see the need for other
encodings at all.  Furthermore, I want to run screaming from the room
when I see the words 'system-dependent'.  CIF is a file transfer
standard, so we care most about the (possibly different) sending and
receiving systems agreeing on the contents, which makes
'system-dependent' completely unacceptable.  In contrast to CIF,
system-independence is a
lower priority for a programming language, as a programmer who does
not wish to distribute their program widely can usefully take
advantage of system-dependent features.
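
For concreteness, here is a rough sketch of the detection step that
C3.1 to C3.3 would ask of every reading program.  It is only an
illustration under my own assumptions, not anything Herbert has
specified: the function name detect_encoding is hypothetical, the
comment pattern is modelled on the one given in PEP 263 for Python
source files, and the order in which a BOM and a coding comment are
consulted is my own choice.

```python
import codecs
import re

# Matches both "# -*- coding: NAME -*-" and "# vim:fileencoding=NAME".
# Modelled on the PEP 263 pattern; an assumption, not part of any CIF draft.
CODING_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

# UTF-32 BOMs are listed before UTF-16 ones because the UTF-32-LE BOM
# begins with the same two bytes as the UTF-16-LE BOM.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_encoding(data: bytes, default: str = "utf-8") -> str:
    """Guess the declared encoding of a byte stream (hypothetical helper)."""
    # C3.1: a BOM at the very start of the stream.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # C3.2 / C3.3: a coding comment on the first or second line only.
    for line in data.splitlines()[:2]:
        match = CODING_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    # C2: no declaration found, so fall back to the default.
    return default
```

Note that the function only reports the name that was declared.
Whether the receiving system can actually decode that encoding is a
separate question, and that is exactly where the system dependence
comes in.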
