Discussion List Archives

Re: [ddlm-group] Use of elides in strings

Unlike others here, I feel that a proper text archive library should be 
able to take any string from the calling application, and return that 
exact same string when reading it back in. It is the job of the archive 
format to avoid delimiter problems. An application should be able to 
store and retrieve strings without such worries, and interface to an SQL 
database the same way it would interface to CIF. All commonly used 
database libraries work this way. Why should CIF continue to take an 
archaic approach?
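
As a small illustration of the database behaviour referred to above, 
here is a sketch using Python's standard sqlite3 module. The table name 
"t", the column name "v" and the helper round_trip are made up for the 
example; the point is only that the caller passes the string as a bound 
parameter and never escapes anything itself.

```python
import sqlite3


def round_trip(value: str) -> str:
    """Store a string in an in-memory SQLite table and read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (v TEXT)")
        # The value is bound as a parameter; the caller does no quoting.
        conn.execute("INSERT INTO t (v) VALUES (?)", (value,))
        (stored,) = conn.execute("SELECT v FROM t").fetchone()
        return stored
    finally:
        conn.close()


# Quotes, a backslash and a line starting with a semicolon all survive.
awkward = 'it\'s; a "quoted" \\ value\n;with a leading semicolon'
assert round_trip(awkward) == awkward
```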

I essentially agree with the design below, except that the library 
should handle insertion and removal of the reverse solidus for the 
limited cases where it is required.

If it is the client application's responsibility to deal with reverse 
solidus escape sequences, then the description below doesn't make sense. 
In that case, the reverse solidus never has any special meaning to CIF2. 
Instead, CIF2 simply disallows certain character sequences. A client 
application can use whatever it wants to encode/decode the disallowed 
character sequences.

The advantage of having well-defined escape sequences at the I/O library 
level is that updates to the format do not require updates to client 
applications. A CIF client application should be able to send a string 
to the CIF library without knowing in advance which CIF revision is in 
use, or whether the string will be semicolon block quoted or triple 
quoted. If the client is required to escape invalid sequences, it will 
have to escape strings differently depending on the quoting style: a 
triple quote is OK within semicolon quotes, and a leading semicolon is 
OK within triple quotes, but not the other way around.
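
To make the last point concrete, here is a rough sketch of a library 
writer that picks the quoting style itself, so the client never needs 
to know which one was used. It is a deliberately simplified model and 
not the CIF2 grammar: the names write_value and read_value are 
hypothetical, a semicolon block is modelled as newline-semicolon at 
both ends, and the checks are more conservative than a real parser 
would need.

```python
TRIPLE = '"""'


def write_value(value: str) -> str:
    """Return a delimited token, choosing a style the content cannot break."""
    # Triple quotes are safe if the content has no triple quote and does
    # not end in a quote character that would run into the closing one.
    if TRIPLE not in value and not value.endswith('"'):
        return TRIPLE + value + TRIPLE
    # A semicolon block is safe if no line of the content starts with ';'.
    if not any(line.startswith(";") for line in value.split("\n")):
        return "\n;" + value + "\n;"
    # Neither style fits: this is where an escape mechanism is needed.
    raise ValueError("value cannot be written without an escape sequence")


def read_value(token: str) -> str:
    """Strip whichever delimiters write_value applied."""
    if len(token) >= 6 and token.startswith(TRIPLE) and token.endswith(TRIPLE):
        return token[3:-3]
    if len(token) >= 4 and token.startswith("\n;") and token.endswith("\n;"):
        return token[2:-2]
    raise ValueError("unrecognised delimiters")


for text in [";starts with a semicolon", 'contains """ inside', "plain"]:
    assert read_value(write_value(text)) == text
```

The ValueError branch is the case under discussion: a string containing 
both a triple quote and a line with a leading semicolon fits neither 
style, so something has to escape it, and I would rather that be the 
library than every client.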

Joe Krahn


Nick Spadaccini wrote:
> 
> SUMMARISING.
> 
> (a) The contents of delimited strings are returned as raw, with the token
> delimiters removed.
> (b) Where a delimiter character is to be part of the string, that character
> must be preceded by a reverse solidus when written out to the file. When
> read, any reverse solidus preceding a terminating character is deleted.
> (c) It is the responsibility of the writing and reading application to
> insert and remove the reverse solidus preceding the terminating character.
> (d) Otherwise the presence of a reverse solidus in the string has no
> meaning.
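
For reference, rules (b) and (d) in the summary above can be sketched 
as a pair of functions. This is one literal reading of the summary for 
a single delimiter character, not code from any CIF library; the names 
escape and unescape are hypothetical. Whether these functions live in 
the client (rule (c)) or in the I/O library is exactly the point in 
dispute.

```python
BACKSLASH = "\\"  # the reverse solidus


def escape(value: str, delim: str = '"') -> str:
    """Rule (b), writing: put a reverse solidus before each delimiter."""
    return value.replace(delim, BACKSLASH + delim)


def unescape(raw: str, delim: str = '"') -> str:
    """Rule (b), reading: delete a reverse solidus preceding a delimiter."""
    return raw.replace(BACKSLASH + delim, delim)


original = 'say "hi" and keep C:\\temp as is'
written = escape(original)
# Rule (d): a reverse solidus not followed by the delimiter is untouched.
assert "C:\\temp" in written
assert unescape(written) == original
```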
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
