Discussion List Archives

Re: [ddlm-group] CIF2 semantics

I take it from the comment below that Herbert agrees to continue with the IT Vol G descriptions of the meanings of . and ?. I am aware that one often finds a relatively harmless confusion between the two, most obviously when ? is used as a placeholder in a loop instead of the usually more appropriate <full point>. This confusion should encourage us to provide clarification in the formal specification.

Regarding numbers, could Herbert or others who wish 4.5 and "4.5" to have different abstract types, whereas kkkkk and "kkkkk" have the same abstract type, please explain why this behaviour is preferable and how it allows useful work to be done? Meanwhile, I'll prepare a post describing my reasoning for more uniform behaviour.

On Tue, Jul 26, 2011 at 10:13 PM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
On null values, I believe "." and "?" are different in meaning from
their unquoted versions, but that unquoted . and ? are both essentially
equivalent null values.
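
To make the distinction concrete, here is a minimal Python sketch of how a reader of a CIF file might classify a raw token, assuming the token still carries its delimiters. The names Kind, classify and is_null are hypothetical and are not taken from any CIF library. The sketch keeps unquoted . and ? as separate kinds (the "inapplicable" and "unknown" of the original post), while is_null treats them as a single null category, which is closer to the view stated above.

```python
from enum import Enum


class Kind(Enum):
    INAPPLICABLE = "inapplicable"  # unquoted full point
    UNKNOWN = "unknown"            # unquoted question mark
    STRING = "string"              # everything else, including "." and "?"


def classify(token: str) -> Kind:
    """Classify a raw token; quotes, if any, are still part of `token`."""
    if token == ".":
        return Kind.INAPPLICABLE
    if token == "?":
        return Kind.UNKNOWN
    # A delimited "." or "?" is longer than one character, so it lands here.
    return Kind.STRING


def is_null(token: str) -> bool:
    """Treat both unquoted . and ? as null, without distinguishing them."""
    return classify(token) is not Kind.STRING
```

Under this sketch, classify('"."') gives Kind.STRING while classify('.') gives Kind.INAPPLICABLE, so the quoted and unquoted forms are never confused.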

On numbers, past practice has been to treat 4.5 and "4.5" as very
different, the former being a type numb value and the latter being
a type char value.  This was an important and significant early
difference between CIF and STAR and has been used in the handling of
the number-like strings that arise in PDB bib entries, e.g.
1234-5678 is the number 1234e-5678, but "1234-5678" is a string
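
The two positions on numbers can be compared with a small Python sketch. Everything here is an illustrative assumption: the function names are hypothetical, and the regular expression is a deliberately simplified number pattern, not the CIF number grammar, so it does not settle edge cases such as the 1234-5678 example. Null values are also left out of scope.

```python
import re

# Simplified pattern for illustration only; NOT the full CIF number grammar.
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def strip_delimiters(token: str) -> tuple[str, bool]:
    """Return (text, was_quoted) for a raw token with optional ' or " quotes."""
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "'\"":
        return token[1:-1], True
    return token, False


def type_syntax_sensitive(token: str) -> tuple[str, str]:
    """Past practice as described above: quoting forces type char."""
    text, quoted = strip_delimiters(token)
    if not quoted and _NUMBER.fullmatch(text):
        return ("numb", text)
    return ("char", text)


def type_uniform(token: str) -> tuple[str, str]:
    """Proposed abstract model: every non-null value is just a string."""
    text, _ = strip_delimiters(token)
    return ("string", text)
```

With these definitions, type_syntax_sensitive gives different results for 4.5 and "4.5" but the same result for kkkkk and "kkkkk", whereas type_uniform gives the same result for both members of each pair.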

At 1:24 PM +1000 7/26/11, James Hester wrote:
>Dear DDLm group,
>In order to minimise the number of issues we have to discuss in
>Madrid to clean up CIF2, I would like to turn discussion to those
>semantic issues which are relevant to the syntax.  I believe that
>there are three possible types of datavalue: "inapplicable",
>"unknown" and "string", represented by <full point> (commonly called
>a "full stop" or "period"), <question mark> and everything else,
>respectively.
>Do we all agree with the following assertion regarding full point
>and question mark?
>(1) A full point/question mark inside string delimiters is *not*
>equivalent to an undelimited full point/question mark
>Numbers: I believe that strings that could be interpreted as numbers
>are nevertheless (in a formal sense) just strings in the context of
>the post-parse abstract data model.  Therefore, whether or not a
>numerical string is delimited does not change its value: 4.5 and
>"4.5" are identical values.
>Note that this latter assertion does *not* require that
>CIF-conformant software must always handle numbers as strings; I am
>making these statements in order to clarify the abstract data model
>on which the various DDLs and domain dictionaries operate, not to
>dictate software design.  If your software can manage any potential
>need to swap between string and number representation of your data
>value, then more power to you.
>Please state whether you agree or disagree with the above.
>T +61 (02) 9717 9907
>F +61 (02) 9717 3145
>M +61 (04) 0249 4148
>ddlm-group mailing list

 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769


International Union of Crystallography