Re: [ddlm-group] Recommended character set and use restrictions. .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Recommended character set and use restrictions. .
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Mon, 21 Jun 2010 20:26:37 -0400 (EDT)
- In-Reply-To: <AANLkTikvzEdqMmQb9orHAU9myrUYYwY331YfXg76Hsue@mail.gmail.com>
- References: <AANLkTikPRP0zLmeWCde-UjR599qJBDP4ps8mpT2FB07E@mail.gmail.com><84803.69690.qm@web87001.mail.ird.yahoo.com><a0624080ac84197c8f154@192.168.2.104><8F77913624F7524AACD2A92EAF3BFA54166122951A@SJMEMXMBS11.stjude.sjcrh.local><4C1F84F2.3060608@mcmaster.ca><8F77913624F7524AACD2A92EAF3BFA54166122951B@SJMEMXMBS11.stjude.sjcrh.local><350805.14101.qm@web87013.mail.ird.yahoo.com><AANLkTikvzEdqMmQb9orHAU9myrUYYwY331YfXg76Hsue@mail.gmail.com>
The meaning of "non-printing" is tricky. The structure of joining and
non-joining characters in Unicode is intrinsic to its approach to
printing Arabic. It is similar in concept to dead-key accents: it is
the sequence of code points that determines what will be printed.

=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================

On Tue, 22 Jun 2010, James Hester wrote:

> Yes, I think the correct approach for COMCIFS would be to restrict the
> character set of definitions in DDLm dictionaries, but not overly
> restrict the character set in the format specifications. This would
> allow third parties to use generic CIF parsers to construct data files
> and dictionaries in other languages.
>
> I'm beginning to think that the original proposal to simply exclude
> all non-printing characters from tags might be the simplest approach,
> and then specific characters could be included on a case-by-case
> basis.
>
> On Tue, Jun 22, 2010 at 6:26 AM, SIMON WESTRIP
> <simonwestrip@btinternet.com> wrote:
>> The draft I have has the following 'disclaimer', which basically allows
>> Unicode for data names unless they're defined in a DDLm dictionary?
>>
>> "Important restriction: In the case where the contents of a CIF2 data
>> file are defined in the new DDLm dictionary there is an imposed
>> restriction on the character set of a data name..."
>>
>> See Brian's email of 1/3/2010
>>
>> As I read it, you can use Unicode in your datanames, but you shouldn't
>> 'officially' be able to define those names in your own DDLm dictionary?
>>
>> Cheers
>>
>> Simon
>>
>> ________________________________
>> From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
>> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>> Sent: Monday, 21 June, 2010 16:58:07
>> Subject: Re: [ddlm-group] Recommended character set and use restrictions. .
>>
>> On Monday, June 21, 2010 10:28 AM, David Brown wrote:
>>> I can see the advantages of using Unicode in data values where one may
>>> wish to render text in some non-ASCII format, but is there any reason
>>> why data names should not be restricted (at least for the foreseeable
>>> future) to ASCII characters? These names are assigned by COMCIFS and
>>> we are in no real danger of running out of ASCII data names. One day
>>> we may need to write our dictionaries in Arabic, but I doubt that any
>>> of us will be around when that happens. If we only allowed non-ASCII
>>> characters in delimited strings we would meet all the needs of the
>>> community for many years to come, and save ourselves a lot of grief
>>> trying to sort out which code points to allow.
>>
>> That's a fair point. I observe, though, that COMCIFS controls data names
>> only in the official dictionaries it maintains, not in local dictionaries
>> or other third-party dictionaries. It appears to be parties maintaining
>> such dictionaries that have the most potential benefit from an expanded
>> character repertoire for data names. Additionally, general users might
>> receive a small benefit from having a larger character repertoire
>> available for use in data block codes.
>>
>> Having come late to the party, I hadn't before considered whether there
>> was a real use case for general Unicode data names, etc. It was already
>> in the first spec draft I saw. If there is no persuasive use case for it
>> then I don't have any objection to restricting use of non-ASCII
>> characters to within the bounds of one of the multitude of quoted string
>> syntaxes. That would be the conservative choice, suitable to be relaxed
>> later if need be.
>>
>> John
>> --
>> John C. Bollinger, Ph.D.
>> Department of Structural Biology
>> St. Jude Children's Research Hospital
>>
>> Email Disclaimer: www.stjude.org/emaildisclaimer
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
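
A minimal Python sketch of why "non-printing" is tricky, using only the standard `unicodedata` module. The helper names (`is_control_or_format`, `naive_strip_nonprinting`) and the sample strings are hypothetical and chosen for illustration; nothing here is taken from the CIF2 or DDLm drafts. It assumes that "non-printing" would be read as Unicode general categories Cc (control) and Cf (format), which is only one possible reading.

```python
import unicodedata

# Zero-width non-joiner and zero-width joiner draw nothing themselves,
# but they change how the neighbouring letters are shaped when rendered.
ZWNJ = "\u200c"
ZWJ = "\u200d"


def is_control_or_format(ch: str) -> bool:
    """True if ch is in general category Cc (control) or Cf (format).

    Assumption: this is one possible definition of "non-printing".
    """
    return unicodedata.category(ch) in ("Cc", "Cf")


def naive_strip_nonprinting(name: str) -> str:
    """Hypothetical filter that drops every Cc/Cf code point from a name."""
    return "".join(ch for ch in name if not is_control_or_format(ch))


# Two Arabic letters BEH kept apart by a ZWNJ, versus the same two letters
# with nothing between them. A renderer may join the second pair but not
# the first, so the two sequences can print differently.
separated = "\u0628" + ZWNJ + "\u0628"
joined = "\u0628\u0628"

# The naive filter collapses the two distinct sequences into one.
collapsed = naive_strip_nonprinting(separated)

# The dead-key analogy: "e" followed by COMBINING ACUTE ACCENT is two code
# points that print as one accented letter; NFC composes them into U+00E9.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)

if __name__ == "__main__":
    for label, text in (("separated", separated), ("joined", joined),
                        ("collapsed", collapsed), ("decomposed", decomposed),
                        ("composed", composed)):
        cats = [unicodedata.category(ch) for ch in text]
        print(label, [f"U+{ord(ch):04X}" for ch in text], cats)
```

Under this reading, a blanket exclusion of Cc/Cf code points would treat ZWNJ and ZWJ the same as control characters, even though they affect what is printed. That is the kind of case a case-by-case inclusion list would need to settle.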
_______________________________________________ ddlm-group mailing list ddlm-group@iucr.org http://scripts.iucr.org/mailman/listinfo/ddlm-group