On 6/02/10 12:48 AM, "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com> wrote:

> The real issue here is the data model that we are supporting -- databases
> or something with tighter control over internals.  Especially after the
> discussions at the ESRF HDF5 hyperspectral data workshop last month, I am
> increasingly convinced that it is a serious mistake to move away from
> the database model.  While tighter control over internals is tempting, in
> the end, as we move more into multithreaded, multiprocessor multiwriter
> applications, the greater the performance penalty we will pay for
> over-specifying the internal representation of a CIF, especially in ways
> that deviate from the relational model.

I agree that in essence we are enshrining a database model in what we do,
though I am not keen for the relational model to be the only one we consider.
The current form of CIF maps easily onto the relational model, but it is not
required to. I agree, though, that having database model(s) in mind in what
we do is essential.

> Nothing is gained for users in making a mandatory distinction between
> single row loops and the same tags with individual values.  I propose
> that CIF2 adopt the DDL2 mmCIF approach of treating them as equivalent.
> Joe is right that having the distinction in the DDL then forces all
> parsers to refer to the dictionary to be able to make this pointless
> distinction.

But in most databases you define the schema (attribute ordering, types,
etc.) and then load multiple records. Yes, you can insert and update one
record at a time, which amounts to a flattening of the loop, but that is
more a matter of updating and changing existing data. I can see a lot of
application-level software handling individual rows in this way, but CIF is
meant to be an archiving formalism, and to that extent I think it is more
helpful (at least in archiving) to maintain a clear consistency between the
schema (DDL) and the data file.
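
To make the "flattening" concrete, here is a rough sketch of the two spellings
under discussion and of a reader that treats them as equivalent. It is only an
illustration: parse_toy_cif is a made-up helper, not part of any real CIF
library, it handles only whitespace-separated tokens (no quoting, text fields
or data_ blocks), and the cell values are invented.

```python
def parse_toy_cif(text):
    """Toy reader (hypothetical): maps each tag to a list of column values."""
    tokens = text.split()
    columns = {}
    i = 0
    while i < len(tokens):
        if tokens[i] == "loop_":
            i += 1
            # Collect the loop header: consecutive tags.
            tags = []
            while i < len(tokens) and tokens[i].startswith("_"):
                tags.append(tokens[i])
                i += 1
            # Collect the loop body: values up to the next tag or loop_.
            values = []
            while (i < len(tokens)
                   and not tokens[i].startswith("_")
                   and tokens[i] != "loop_"):
                values.append(tokens[i])
                i += 1
            # Values are stored row by row, so column k is every len(tags)-th value.
            for k, tag in enumerate(tags):
                columns[tag] = values[k::len(tags)]
        else:
            # A bare tag/value pair is stored as a one-row column.
            columns[tokens[i]] = [tokens[i + 1]]
            i += 2
    return columns


# The same (invented) data written as individual values ...
as_pairs = """
_cell.length_a 10.5
_cell.length_b 11.2
"""

# ... and as a single-row loop.
as_loop = """
loop_
_cell.length_a
_cell.length_b
10.5 11.2
"""

print(parse_toy_cif(as_pairs) == parse_toy_cif(as_loop))
```

A reader written this way never needs the dictionary to decide which spelling
is allowed. The point above is about the other direction: which spelling an
archive writes, and whether that should follow the DDL.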



