Re: [ddlm-group] Recommended character set and use restrictions. .
- To: Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] Recommended character set and use restrictions. .
- From: "Herbert J. Bernstein" <[email protected]>
- Date: Mon, 21 Jun 2010 20:26:37 -0400 (EDT)
- In-Reply-To: <[email protected]>
- References: <[email protected]><[email protected]><[email protected]><8F77913624F7524AACD2A92EAF3BFA54166122951A@SJMEMXMBS11.stjude.sjcrh.local><[email protected]><8F77913624F7524AACD2A92EAF3BFA54166122951B@SJMEMXMBS11.stjude.sjcrh.local><[email protected]><[email protected]>
The meaning of "non-printing" is tricky. The structure of joining and
non-joining characters in Unicode is intrinsic to its approach to
printing Arabic. It is similar in concept to dead-key accents --
it is the sequence of code points that determines what will be
printed.
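
As a rough illustration of why "non-printing" is hard to pin down, the
sketch below uses Python's standard unicodedata module to look at the
zero-width joiner and non-joiner. The helper name format_chars and the
sample word (a Persian word in Arabic script, chosen only as an example)
are assumptions for illustration and are not taken from any CIF draft.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def format_chars(text):
    """Return (code point, name) for each character in general category Cf.

    Hypothetical helper: Cf ("format") characters have no glyph of their
    own but can change how neighbouring characters are shaped.
    """
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c))
        for c in text
        if unicodedata.category(c) == "Cf"
    ]


# Sample word in Arabic script with a ZWNJ between prefix and stem
# (illustrative data only).
with_zwnj = "\u0645\u06cc" + ZWNJ + "\u062e\u0648\u0627\u0647\u0645"
without_zwnj = with_zwnj.replace(ZWNJ, "")

# Both joiners are classified as format characters, and Python's own
# notion of "printable" rejects them even though they affect rendering.
print(unicodedata.category(ZWNJ), unicodedata.category(ZWJ))
print(ZWNJ.isprintable(), ZWJ.isprintable())

# The two strings differ by one code point, so a rule that silently
# drops "non-printing" characters would change the sequence that
# determines what is printed.
print(format_chars(with_zwnj))
print(format_chars(without_zwnj))
print(with_zwnj == without_zwnj)
```

A blanket "exclude non-printing characters" rule based on a test like
str.isprintable() would reject both joiners, which is the case that
makes the definition tricky.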
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
[email protected]
=====================================================
On Tue, 22 Jun 2010, James Hester wrote:
> Yes, I think the correct approach for COMCIFS would be to restrict the
> character set of definitions in DDLm dictionaries, but not overly
> restrict the character set in the format specifications. This would
> allow third parties to use generic CIF parsers to construct data files
> and dictionaries in other languages.
>
> I'm beginning to think that the original proposal to simply exclude
> all non-printing characters from tags might be the simplest approach,
> and then specific characters could be included on a case-by-case
> basis.
>
> On Tue, Jun 22, 2010 at 6:26 AM, SIMON WESTRIP
> <[email protected]> wrote:
>> The draft I have has the following 'disclaimer', which basically allows
>> Unicode for data names unless they're defined in a DDLm dictionary?
>>
>> "Important restriction: In the case where the contents of a CIF2 data file
>> are defined in the
>> new DDLm dictionary there is an imposed restriction on the character set of
>> a data name..."
>>
>> See Brian's email of 1/3/2010
>>
>> As I read it, you can use Unicode in your data names, but you shouldn't
>> 'officially' be able to define those names in your own DDLm dictionary?
>>
>> Cheers
>>
>> Simon
>>
>>
>> ________________________________
>> From: "Bollinger, John C" <[email protected]>
>> To: Group finalising DDLm and associated dictionaries <[email protected]>
>> Sent: Monday, 21 June, 2010 16:58:07
>> Subject: Re: [ddlm-group] Recommended character set and use restrictions. .
>>
>>
>> On Monday, June 21, 2010 10:28 AM, David Brown wrote:
>>> I can see the advantages of using Unicode in data values where one may
>>> wish to render text in some non-ASCII format, but is there any reason
>>> why data names should not be restricted (at least for the foreseeable
>>> future) to ASCII characters? These names are assigned by COMCIFS and
>>> we are in no real danger of running out of ASCII data names. One day
>>> we may need to write our dictionaries in Arabic, but I doubt that any
>>> of us will be around when that happens. If we only allowed non-ASCII
>>> characters in delimited strings we would meet all the needs of the
>>> community for many years to come, and save ourselves a lot of grief
>>> trying to sort out which code points to allow.
>>
>> That's a fair point. I observe, though, that COMCIFS controls data names
>> only in the official dictionaries it maintains, not in local dictionaries or
>> other third-party dictionaries. It appears to be parties maintaining such
>> dictionaries that have the most potential benefit from an expanded character
>> repertoire for data names. Additionally, general users might receive a
>> small benefit from having a larger character repertoire available for use in
>> data block codes.
>>
>> Having come late to the party, I hadn't before considered whether there was
>> a real use case for general Unicode data names, etc. It was already in the
>> first spec draft I saw. If there is no persuasive use case for it then I
>> don't have any objection to restricting use of non-ASCII characters to
>> within the bounds of one of the multitude of quoted string syntaxes. That
>> would be the conservative choice, suitable to be relaxed later if need be.
>>
>>
>> John
>> --
>> John C. Bollinger, Ph.D.
>> Department of Structural Biology
>> St. Jude Children's Research Hospital
>>
>>
>> Email Disclaimer: www.stjude.org/emaildisclaimer
>>
>> _______________________________________________
>> ddlm-group mailing list
>> [email protected]
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>
>>
>>
>
>
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
>

