Re: [ddlm-group] UTF-8 BOM
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] UTF-8 BOM
- From: Brian McMahon <bm@iucr.org>
- Date: Tue, 15 Jun 2010 12:24:42 +0100
On Mon, Jun 14, 2010 at 03:09:56PM -0500, Bollinger, John C wrote:
>> Of course the world does
>> contain CIFs created other than by fully-conformant CIF writers. To
>> an extent the community should decide for itself how best to attempt
>> to handle deviations from full conformance. It would help, perhaps, if
>> those of us writing CIF readers would document specific practices that
>> the software takes to accommodate such deviations. Ideally, such
>> software should have a verbose logging mode that can be activated
>> whenever surprising behaviour in reading CIFs is encountered by
>> the user.
>
> I think it's exceedingly optimistic to expect "the community" to arrive
> at and abide by a single, consistent set of best practices. The best
> you can hope for is that a small number of organizations and / or
> programs will exert enough influence to establish their own de facto
> standards.

I'm an optimist :-)

> We can exert some influence there, however. Either the CIF spec or
> a companion spec could establish conformance requirements for CIF
> *processors*, including, for example, the ability to diagnose
> particular malformations. The XML spec does this, as do some
> programming language specs.
>
> Such a document could also establish, perhaps, that CIF processors
> must be able to accept the UTF-8 encoding, and maybe even that they
> must assume UTF-8 by default. That would establish the baseline and
> a guaranteed interoperability mode that we would otherwise lose by
> pushing character encoding outside the format specification.

Probably this is the route that I would prefer. Make the formal CIF
spec as clean as possible, even if it appears somewhat harsh, but
sanction particular processing protocols to accommodate well-defined
and somewhat frequent edge cases. We've grappled with this sort of
thing before, in the context of coercion rules for robust lexer/parsers.
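
To make the idea of documented coercion rules concrete, here is a minimal Python sketch of the input stage of a reader that assumes UTF-8 by default, as suggested above. The two coercions it applies (dropping a leading UTF-8 BOM, and substituting U+FFFD for undecodable bytes) are illustrative assumptions only, not rules taken from the draft specification or agreed by this group. The function name `decode_cif_bytes` and the logger name are likewise hypothetical.

```python
import codecs
import logging

# Hypothetical logger name; enable INFO level for the "verbose logging mode".
log = logging.getLogger("cif_reader_sketch")


def decode_cif_bytes(data: bytes, strict: bool = False) -> tuple[str, list[str]]:
    """Decode raw CIF bytes as UTF-8 and report any coercions applied.

    Returns the decoded text and a list of deviation notes. In strict
    mode any deviation raises ValueError instead of being coerced.
    """
    deviations: list[str] = []
    bom_len = 0

    # Assumed coercion 1: tolerate and drop a leading UTF-8 BOM.
    if data.startswith(codecs.BOM_UTF8):
        bom_len = len(codecs.BOM_UTF8)
        data = data[bom_len:]
        deviations.append("leading UTF-8 BOM removed")

    # Baseline: assume UTF-8 unless told otherwise.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        offset = bom_len + exc.start  # offset in the original byte stream
        if strict:
            raise ValueError(f"invalid UTF-8 at byte offset {offset}") from exc
        # Assumed coercion 2: replace undecodable bytes with U+FFFD.
        deviations.append(
            f"invalid UTF-8 at byte offset {offset}; replaced with U+FFFD"
        )
        text = data.decode("utf-8", errors="replace")

    if strict and deviations:
        raise ValueError("; ".join(deviations))

    for note in deviations:
        log.info("CIF input deviation: %s", note)
    return text, deviations
```

With `strict=True` the function behaves like a harsh, specification-only reader. With the default it applies the coercions and records each one, so a user who sees surprising behaviour can switch on logging and find out what was changed.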
Again my preference would be for the CIF spec to be strict, but the
coercion rules to be documented as a basis for building processing
hardware capable of handling certain well-characterised deviations
from the strict specification.

Having said that, I am not in favour of unpicking what we have already
effectively agreed by consensus. I'll be very happy to respond to
James's forthcoming call for a vote on the BOM issue and help if I can
with integrating the recent small refinements to the final draft
specification. It's more important to have a fixed spec that we can
work with, than to spend forever striving for a perfect solution.

Regards
Brian
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group