
Re: [ddlm-group] CIF2 semantics

To: Group finalising DDLm and associated dictionaries <[email protected]>
Subject: Re: [ddlm-group] CIF2 semantics
From: "Herbert J. Bernstein" <[email protected]>
Date: Tue, 9 Aug 2011 15:40:39 -0400 (EDT)
In-Reply-To: <[email protected]>
References: <CAM+dB2eL5jrEFBcmGpDe6RTvpv4qfmxXa722XXzaS_zgCjsxKw@mail.gmail.com><[email protected]><[email protected]><CAM+dB2eT83aTPYc_Dg2aQAsp9VoWTpBA79RPLne61LFWfcFEZQ@mail.gmail.com><[email protected]><CAM+dB2cskNxHZ3mDeJ0uFLG7KbHba7hj=+=mUiqczdj6ivVb7g@mail.gmail.com><[email protected]><CAM+dB2cQLX7OGoLkMAQm3iuamNYAp7WJazvftQAriT02Po_ybA@mail.gmail.com><CAM+dB2eG29P3UWmbfR2JxUTScB9uE=MN_baasJkzi3arnRodpg@mail.gmail.com><a06240800ca66b35a4c4a@[192.168.2.101]><[email protected]>

Normally, I read dictionaries before I read data, so whether a value
is declared as a number in a dictionary is already known while reading
the data.  You seem to be assuming that data will be read before the
dictionary.  That certainly is a possible approach, but it is really
an implementation choice, not something intrinsic to CIF.  Lazy
evaluation has both advantages and disadvantages.
   -- Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  [email protected]
=====================================================

On Tue, 9 Aug 2011, Bollinger, John C wrote:

>
> On Tuesday, August 09, 2011 5:08 AM, Herbert J. Bernstein wrote:
>
>>   The heart of the difference lies in "Therefore, datavalues that can
>> be interpreted as numbers (CIF1.1 'numb' type) must retain knowledge
>> of the source string so that the DDLs and dictionaries are free to
>> interpret them as 'char' type (see Vol G 2.2.7.4.7.1(17), reproduced
>> below)"  I do not see any requirement in Volume G that we must retain
>> the original source string for _any_ data value, just that we must
>> faithfully preserve the information in that data value.  In the case
>> of a number, that is the numeric value, not the particular choice of
>> character string used to represent that numeric value, so we may
>> freely change from 123 to 1.23e2 and back.
>
>
> James's is a practical argument, predicated on the idea of a hypothetical CIF parser, operating without dictionary knowledge but supporting dictionary-based applications.  The hypothetical processor's output is a complex object conforming to the abstract CIF data model that James is attempting to define.  He postulates that it is both necessary and sufficient for such an abstract data model to retain the original character sequence of every data value, including those having numeric lexical form.
>
> James's argument is supported by Vol G 2.2.5.2, wherein it is specified that given
>
> _unknown_data_name 1
>
> and a dictionary definition assigning type 'char' to that name, "the value should be stored as the literal character 1."  As a practical matter, then, a CIF 1.1 processor must retain the original character sequence at least until it is known whether there is a definition.  Therefore James's abstract data model, output by a processor ignorant of any dictionary yet supporting dictionary-based applications, indeed must retain the original character sequence.
>
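
A minimal sketch of the situation described above, assuming a parser that
runs without dictionary knowledge and defers typing until a dictionary
type (if any) becomes available.  The names RawValue, looks_numeric and
resolve are hypothetical, and the number pattern is a simplification for
illustration, not the full CIF grammar.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union

# Simplified pattern for an unquoted CIF-style number. A trailing "(n)"
# standard uncertainty is accepted but ignored. Assumption for illustration.
_NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?$")


@dataclass(frozen=True)
class RawValue:
    """A parsed data value that keeps its original character sequence."""

    text: str  # exactly the characters that appeared in the file

    def looks_numeric(self) -> bool:
        return _NUMBER.match(self.text) is not None

    def resolve(self, declared_type: Optional[str] = None) -> Union[str, float]:
        """Interpret the value once the dictionary type (if any) is known."""
        if declared_type == "char":
            # An explicit dictionary declaration overrides the numeric guess.
            return self.text
        if declared_type == "numb" and not self.looks_numeric():
            raise ValueError(f"{self.text!r} is not a valid number")
        if declared_type == "numb" or (declared_type is None and self.looks_numeric()):
            # Drop any "(n)" uncertainty suffix before converting.
            return float(self.text.split("(")[0])
        return self.text


value = RawValue("1")
as_declared_char = value.resolve("char")  # the literal character 1
as_default = value.resolve()              # treated as a number by default
```

Because only the text is stored, the choice between 'char' and 'numb'
can be made late, which is the lazy approach; a processor that reads the
dictionary first could call resolve immediately and discard the text.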
>
>>   Saying, as 2.2.7.4.7.1.(17), that "it may be assumed that a
>> character string interpretable as a number should be taken to
>> represent an item of type 'numb'" does _not_ say that we need
> >> to retain the original source string.
>
>
> No, but saying, as 2.2.7.4.7.1.(17)'s next sentence does, "However, an explicit dictionary declaration of type will override such an assumption," _does_ require the original source string to be retained in the event that a dictionary definition declares type 'char' for the value.  James's hypothetical processor must retain the original source string because it doesn't yet know whether there is such a definition.
>
> [...]
>
>
>>   If we wish to preserve a particular string, we should quote it, but
>> then it is type char, not type numb.
>
>
> That is unquestionably the most pragmatic approach for writing CIFs, but what does or should the CIF specifications require when that approach is not taken?  There seems to be more of a sore spot here than I appreciated before this discussion, and I now believe that CIF 1.1's approach for determining data types is flawed, and furthermore that it is inconsistently implemented in practice.  It is unsatisfactory that CIF 1.1 does not require dictionaries to be used, yet mandates different, incompatible data-typing analyses when they are used than when they are not.
>
> I think CIF2 can and should adopt a different position, wherein there are three base primitive data types for values (char, numb, and null), and values are assigned to one of these base types upon parsing, based on their lexical form.  If a dictionary definition requires a different base type than the one a value was expressed in, then the value must be coerced to the needed type after parsing, according to a standard set of platform-independent coercion rules.  That would achieve a separation between the (optional) dictionary layer and the underlying data model that CIF 1.1 lacks.  Without something like that one needs the kind of workaround that James describes, at least in principle.
>
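
The proposal above could be sketched roughly as follows.  This is only an
illustration under stated assumptions: the names Value, parse_value and
coerce are hypothetical, "." and "?" are assumed to be the null tokens,
the number pattern is simplified, and the numb-to-char formatting rule is
one arbitrary choice of canonical form, not anything taken from a CIF
specification.

```python
import re
from typing import NamedTuple, Union

# Simplified pattern for an unquoted number (no standard uncertainties).
_NUMBER = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?$")


class Value(NamedTuple):
    base_type: str                      # "char", "numb" or "null"
    payload: Union[str, float, None]


def parse_value(token: str, quoted: bool = False) -> Value:
    """Assign a base type from lexical form alone, with no dictionary."""
    if quoted:
        return Value("char", token)
    if token in (".", "?"):
        return Value("null", None)
    if _NUMBER.match(token):
        return Value("numb", float(token))
    return Value("char", token)


def coerce(value: Value, target: str) -> Value:
    """Convert a parsed value to the base type a dictionary requires."""
    if value.base_type == target or value.base_type == "null":
        return value
    if target == "char":
        # numb -> char: emit a canonical spelling (assumed rule).
        number = value.payload
        text = str(int(number)) if number.is_integer() else repr(number)
        return Value("char", text)
    if target == "numb":
        # char -> numb: only strings with numeric lexical form qualify.
        if _NUMBER.match(value.payload):
            return Value("numb", float(value.payload))
        raise ValueError(f"cannot coerce {value.payload!r} to numb")
    raise ValueError(f"unknown target type {target!r}")
```

Note that in this sketch coerce(parse_value("1.23e2"), "char") yields the
canonical text "123", not the characters originally written, so the
source string is deliberately not part of the data model; whether that is
acceptable is exactly the point under discussion.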
>
> John
>
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer:  www.stjude.org/emaildisclaimer
>
> _______________________________________________
> ddlm-group mailing list
> [email protected]
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
