
[ddlm-group] Data-name character restrictions - one last time

To: [email protected]
Subject: [ddlm-group] Data-name character restrictions - one last time
From: Brian McMahon <[email protected]>
Date: Wed, 9 Dec 2009 14:40:35 +0000

I have one remaining niggle that I'd like to revisit before we put
this finally to bed. As has been mentioned a couple of times
recently, restricting the data-name character set makes many existing
CIF 1 files syntactically invalid (e.g. _refine_ls_shift/esd_max).
We have discussed strategies for handling this, and I think these
are workable strategies, but will involve investment and hence expense
in workflow management in CIF archives.

I understand the rationale behind this restriction is to simplify
future processing of data names in areas such as dREL
applications. The question really is whether we're choosing the right
trade-off in making things cleaner at that end of the processing
chain. I would suppose that a dREL or other application could ingest a
data name with dangerous characters, convert it internally into a
"safe" identifier that's used for all processing, and then restore the
original form upon output; but writing that intermediate layer of
processing is of course expensive (especially if there aren't readily
available libraries that will do this transparently).
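
To make the shape of that intermediate layer concrete, here is a minimal
sketch in Python. It is only an illustration under stated assumptions: the
class name NameMapper, its methods, and the choice of [A-Za-z0-9_] as the
"safe" character set are all hypothetical, and nothing here is taken from
dREL or from any existing CIF library.

```python
import re


class NameMapper:
    """Hypothetical two-way map between data names and safe identifiers."""

    def __init__(self):
        self._to_safe = {}   # original data name -> safe identifier
        self._to_orig = {}   # safe identifier -> original data name

    def to_safe(self, name):
        """Return a safe identifier for name, creating one on first use."""
        if name in self._to_safe:
            return self._to_safe[name]
        # Assumption: anything outside letters, digits and underscore
        # counts as a "dangerous" character and is replaced by "_".
        base = re.sub(r"[^A-Za-z0-9_]", "_", name)
        candidate = base
        n = 1
        # Two different names may collapse to the same identifier,
        # so add a numeric suffix until the identifier is unused.
        while candidate in self._to_orig:
            n += 1
            candidate = f"{base}_{n}"
        self._to_safe[name] = candidate
        self._to_orig[candidate] = name
        return candidate

    def to_original(self, safe):
        """Restore the original data name for output."""
        return self._to_orig[safe]


mapper = NameMapper()
safe = mapper.to_safe("_refine_ls_shift/esd_max")
restored = mapper.to_original(safe)
```

The lookup table, rather than a reversible escaping rule, is what allows
the original form to be restored exactly on output; the cost is that the
table has to be carried through the whole processing chain.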

I suspect that some of the original proposed syntactic changes also
had the effect (whether by design or collaterally) of simplifying i/o,
data structure management, symbol table processing etc., but those may
have suffered in the subsequent revision exercise we've just been
practising. Given the consensus we are now approaching, would the code
builders now be prepared to incur the additional expense of handling
"dangerous" data names?

I really don't want to spark off a long discussion on this - if a
quick round of response shows that there's no appetite to allow
the additional punctuation characters in data names, I'll accept that
gracefully.

***

One last comment while I have the floor, though it is related in part
to the above question. A concern raised in the editorial office was
that there would be circumstances where users didn't know if they were
dealing with a CIF 1 or 2 ("users" meaning authors, perhaps resorting
to the vi editor - and we're imagining most of them are dealing with
small-molecule/inorganic CIFs). My supposition is that the IUCr
editorial offices would only want to use CIF2 seriously in association
with DDLm dictionaries, and that we would expect the revised core
dictionaries to use the dot component in data names to signal this
further evolution. So even a superficial glimpse of the middle of a
CIF would make it clear whether it was CIF1 or CIF2.
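
As a rough illustration of that "superficial glimpse", the check could be
sketched in Python as below. This is a heuristic under loud assumptions,
not a parser: the function name classify_data_names is hypothetical, it
only looks at lines that begin with a data name, it ignores quoting and
multi-line text fields, and it presumes the small-molecule/inorganic case
described above, where undotted names indicate the existing core
dictionary.

```python
import re

# Assumption: a data name is a whitespace-free token starting with "_"
# at the beginning of a line (leading spaces allowed).
DATA_NAME = re.compile(r"^\s*(_\S+)")


def classify_data_names(text):
    """Report whether the data names in text carry the dot component."""
    names = []
    for line in text.splitlines():
        match = DATA_NAME.match(line)
        if match:
            names.append(match.group(1))
    if not names:
        return "unknown"
    dotted = sum(1 for name in names if "." in name)
    if dotted == len(names):
        return "dotted"
    if dotted == 0:
        return "undotted"
    return "mixed"


old_style = "data_x\n_cell_length_a 5.0\n_refine_ls_shift/esd_max 0.001\n"
new_style = "data_x\n_cell.length_a 5.0\n"
old_result = classify_data_names(old_style)
new_result = classify_data_names(new_style)
```

A "mixed" or "unknown" result would mean the glimpse is not enough and the
file needs a closer look.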

Does that fit in with how others see this progressing?

Cheers
Brian
