Re: [ddlm-group] Recommended character set and use restrictions. .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Recommended character set and use restrictions. .
- From: James Hester <jamesrhester@gmail.com>
- Date: Tue, 22 Jun 2010 10:18:19 +1000
- In-Reply-To: <350805.14101.qm@web87013.mail.ird.yahoo.com>
- References: <AANLkTikPRP0zLmeWCde-UjR599qJBDP4ps8mpT2FB07E@mail.gmail.com><84803.69690.qm@web87001.mail.ird.yahoo.com><a0624080ac84197c8f154@192.168.2.104><8F77913624F7524AACD2A92EAF3BFA54166122951A@SJMEMXMBS11.stjude.sjcrh.local><4C1F84F2.3060608@mcmaster.ca><8F77913624F7524AACD2A92EAF3BFA54166122951B@SJMEMXMBS11.stjude.sjcrh.local><350805.14101.qm@web87013.mail.ird.yahoo.com>
Yes, I think the correct approach for COMCIFS would be to restrict the
character set of definitions in DDLm dictionaries, but not overly
restrict the character set in the format specifications. This would
allow third parties to use generic CIF parsers to construct data files
and dictionaries in other languages. I'm beginning to think that the
original proposal to simply exclude all non-printing characters from
tags might be the simplest approach, and then specific characters could
be included on a case-by-case basis.

On Tue, Jun 22, 2010 at 6:26 AM, SIMON WESTRIP
<simonwestrip@btinternet.com> wrote:
> The draft I have has the following 'disclaimer', which basically allows
> Unicode for data names unless they're defined in a DDLm dictionary?
>
> "Important restriction: In the case where the contents of a CIF2 data file
> are defined in the new DDLm dictionary there is an imposed restriction on
> the character set of a data name..."
>
> See Brian's email of 1/3/2010
>
> As I read it, you can use Unicode in your datanames, but you shouldn't
> 'officially' be able to define those names in your own DDLm dictionary?
>
> Cheers
>
> Simon
>
>
> ________________________________
> From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> Sent: Monday, 21 June, 2010 16:58:07
> Subject: Re: [ddlm-group] Recommended character set and use restrictions. .
>
>
> On Monday, June 21, 2010 10:28 AM, David Brown wrote:
>>I can see the advantages of using Unicode in data values where one may
>>wish to render text in some non-ASCII format, but is there any reason
>>why data names should not be restricted (at least for the foreseeable
>>future) to ASCII characters? These names are assigned by COMCIFS and
>>we are in no real danger of running out of ASCII data names. One day
>>we may need to write our dictionaries in Arabic, but I doubt that any
>>of us will be around when that happens. If we only allowed non-ASCII
>>characters in delimited strings we would meet all the needs of the
>>community for many years to come, and save ourselves a lot of grief
>>trying to sort out which code points to allow.
>
> That's a fair point. I observe, though, that COMCIFS controls data names
> only in the official dictionaries it maintains, not in local dictionaries or
> other third-party dictionaries. It appears to be parties maintaining such
> dictionaries that have the most potential benefit from an expanded character
> repertoire for data names. Additionally, general users might receive a
> small benefit from having a larger character repertoire available for use in
> data block codes.
>
> Having come late to the party, I hadn't before considered whether there was
> a real use case for general Unicode data names, etc. It was already in the
> first spec draft I saw. If there is no persuasive use case for it then I
> don't have any objection to restricting use of non-ASCII characters to
> within the bounds of one of the multitude of quoted string syntaxes. That
> would be the conservative choice, suitable to be relaxed later if need be.
>
>
> John
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer: www.stjude.org/emaildisclaimer

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group