Discussion List Archives


Re: [ddlm-group] CIF2 semantics

Dear Colleagues,

   I see no reason to change the current wording on . and ?, but if James
has specific wording to suggest, let's look at it.

   Changing the behavior of 4.5 versus "4.5" will corrupt the processing
of many existing PDB files that depend on the distinction between
1234-5678 and "1234-5678".  If we are just talking about a proposal
for CIF 2.0, and not a change in existing CIF practice, this is a
manageable problem, provided some mechanism exists to prevent
the parsing of "1234-5678" as a number, e.g. by making dictionary use
mandatory (it is not with CIF1.1), but I suspect this one change will
be sufficient to delay real use of CIF2 by a few years.  Therefore,
I think it would be a mistake.
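
   As an illustration only, the distinction at stake can be sketched in
Python.  This is not taken from any CIF library: the names classify and
NUMB_RE, the quoted_is_char flag, and the simplified number pattern are
all assumptions made for the example.  With the flag set, delimiters
force type char (past practice); with it cleared, delimiters no longer
change the result for number-like text (the uniform behaviour under
discussion).  Unquoted . and ? are treated as null values in both modes.

```python
import re

# Simplified pattern for number-like text: optional sign, digits, optional
# fraction, optional e/E exponent. An assumption for illustration; the real
# CIF numeric grammar also allows a standard uncertainty such as 4.5(3).
NUMB_RE = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def classify(token, quoted_is_char=True):
    """Return (kind, value) for one raw token as it appears in the file.

    quoted_is_char=True  -> delimiters force type char (past practice).
    quoted_is_char=False -> delimiters are ignored for number-like text.
    """
    quoted = len(token) >= 2 and token[0] == token[-1] and token[0] in "'\""
    text = token[1:-1] if quoted else token

    # Unquoted . and ? are null values; quoted, they are ordinary strings.
    if not quoted and text == ".":
        return ("inapplicable", None)
    if not quoted and text == "?":
        return ("unknown", None)

    # Number-like text becomes numb unless delimiters force char.
    if NUMB_RE.fullmatch(text) and not (quoted and quoted_is_char):
        return ("numb", float(text))
    return ("char", text)


print(classify('4.5'))                          # ('numb', 4.5)
print(classify('"4.5"'))                        # ('char', '4.5')
print(classify('"4.5"', quoted_is_char=False))  # ('numb', 4.5)
print(classify('"1234-5678"'))                  # ('char', '1234-5678')
```

   Note that the pattern above requires an explicit e or E in an
exponent, so it does not read unquoted 1234-5678 as a number; a parser
that accepts that form would need a looser pattern, and would then
depend on the delimiters to keep "1234-5678" a string.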

   XML tried to mandate the use of DTDs and schemas.  It did not go over
well there and is largely ignored.  Do we really want to try the same


At 11:24 PM +1000 7/26/11, James Hester wrote:
>I take it from the comment below that Herbert agrees to continue 
>with the IT Vol G descriptions of the meanings of . and ?.  I am 
>aware that one often finds a relatively harmless confusion between 
>the two, most obviously when ? is used as a placeholder in a loop 
>instead of the usually more appropriate <full point>.  This 
>confusion should encourage us to provide clarification in the formal 
>Regarding numbers, could Herbert or others who wish 4.5 and "4.5" to 
>have different abstract types, whereas kkkkk and "kkkkk" have the 
>same abstract type, please explain why this behaviour is preferable, 
>how it allows useful work to be done etc.   Meanwhile I'll prepare a 
>post describing my reasoning for more uniform behaviour.
>On Tue, Jul 26, 2011 at 10:13 PM, Herbert J. Bernstein 
>On null values, I believe "." and "?" are different in meaning from
>their unquoted versions, but that unquoted . and ? are both essentially
>equivalent null values.
>On numbers, past practice has been to treat 4.5 and "4.5" as very
>different, the former being a type numb value and the latter being
>a type char value.  This was an important and significant early
>difference between CIF and STAR and has been used in the handling of
>the number-like strings that arise in PDB bib entries, e.g.
>1234-5678 is the number 1234e-5678, but "1234-5678" is a string
>At 1:24 PM +1000 7/26/11, James Hester wrote:
>>Dear DDLm group,
>>In order to minimise the number of issues we have to discuss in
>>Madrid to clean up CIF2, I would like to turn discussion to those
>>semantic issues which are relevant to the syntax.  I believe that
>>there are three possible types of datavalue: "inapplicable",
>>"unknown" and "string", represented by <full point> (commonly called
>>a "full stop" or "period"), <question mark> and everything else,
>>Do we all agree with the following assertion regarding full point
>>and question mark?
>>(1) A full point/question mark inside string delimiters is *not*
>>equivalent to an undelimited full point/question mark
>>Numbers: I believe that strings that could be interpreted as numbers
>>are nevertheless (in a formal sense) just strings in the context of
>>the post-parse abstract data model.  Therefore, whether or not a
>>numerical string is delimited does not change its value: 4.5 and
>>"4.5" are identical values.
>>Note that this latter assertion does *not* require that
>>CIF-conformant software must always handle numbers as strings; I am
>>making these statements in order to clarify the abstract data model
>>on which the various DDLs and domain dictionaries operate, not to
>>dictate software design.  If your software can manage any potential
>>need to swap between string and number representation of your data
>>value, then more power to you.
>>
>>Please state whether you agree or disagree with the above.
>>T +61 (02) 9717 9907
>>F +61 (02) 9717 3145
>>M +61 (04) 0249 4148
>>_______________________________________________
>>ddlm-group mailing list
>  Herbert J. Bernstein, Professor of Computer Science
>    Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>                  +1-631-244-3035
>                  yaya@dowling.edu

  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

International Union of Crystallography