Discussion List Archives

Re: [ddlm-group] Is the string '2.0' an Integer?

  • To: "james.r.hester@gmail.com" <james.r.hester@gmail.com>, "Group finalisingDDLm and associated dictionaries" <ddlm-group@iucr.org>
  • Subject: Re: [ddlm-group] Is the string '2.0' an Integer?
  • From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
  • Date: Wed, 10 Nov 2021 17:07:14 +0000
  • In-Reply-To: <CAM+dB2fcY6ZvrH8nfbeYKJZbqRcmtPhai33Y5isQ7M2VKCVUtg@mail.gmail.com>
  • References: <CAM+dB2fcY6ZvrH8nfbeYKJZbqRcmtPhai33Y5isQ7M2VKCVUtg@mail.gmail.com>

CIF 2 does not have syntactic data typing other than for the special null-value placeholders (. and ?).  In fact, CIF 1.1 doesn’t either, even though its published formal grammar includes productions for different flavors of number.


It comes down, then, to a question of DDL[12m] and possibly of specific dictionaries.  It would not be an issue with a DDL2 dictionary, because those provide details of the syntactic form of each dictionary-recognized data type within the dictionary itself.  For example, inasmuch as this is inspired by concerns about the core CIF _diffrn_refln.counts_* items, it is perhaps relevant to note that the mmCIF definitions of these assign type ‘int’, which unambiguously disallows any decimal point or exponent field.


But what about DDLm?  There are several considerations, but as a threshold matter, is its Integer intended to be interchangeable with mmCIF’s int?  If so, then that settles it: decimal points and exponents cannot then be allowed in values of (DDLm) type Integer.


Suppose, however, that we did not insist on that level of compatibility.  Should '2.0' (or 2.0) be accepted as conforming to type Integer?  I am inclined to say not.  I would find that surprising from the perspective of a data consumer.  Expressing a number with those extra significant zeroes suggests that a (limited) precision is being conveyed, which is incorrect when the number is in fact an infinitely precise integer. I would also find that surprising from the perspective of a programmer, for none of the languages I know that distinguish between integers and floating-point numbers permit integers to be expressed with a decimal point.
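
To illustrate the programmer's perspective with one such language, here is a minimal Python sketch. The helper name parses_as_int is hypothetical, and Python merely stands in for the general pattern; this says nothing about how any particular CIF library behaves.

```python
def parses_as_int(text: str) -> bool:
    """Return True if Python's built-in int() accepts the string as an integer."""
    try:
        int(text)
        return True
    except ValueError:
        return False


# A plain digit string is an integer literal.
print(parses_as_int("2"))

# A decimal point or an exponent field is rejected by int(),
# even though the numeric value is integral.
print(parses_as_int("2.0"))
print(parses_as_int("2e0"))

# The same text is accepted as a floating-point number,
# and only then can one ask whether its value happens to be integral.
print(float("2.0").is_integer())
```

The distinction is between the lexical form and the value: '2.0' has an integral value, but it is not written in integer form.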


But also, I think the question is probably based on a faulty premise.  This originated from a concern about conveying standard uncertainties.  If a number is uncertain, so that it makes sense to compute or convey a standard uncertainty for it, then I don’t think it makes sense to insist on the base number being an integer.  Such a number is necessarily an estimate of some kind (else it would not be uncertain), and even if the value it estimates is constrained to be an integer, the best statistical estimate for that value does not have the same constraint.








John C. Bollinger, Ph.D., RHCSA

Computing and X-Ray Scientist

Department of Structural Biology

St. Jude Children's Research Hospital


(901) 595-3166 [office]




From: ddlm-group <ddlm-group-bounces@iucr.org> On Behalf Of James H
Sent: Tuesday, November 9, 2021 7:42 PM
To: ddlm-group <ddlm-group@iucr.org>
Subject: [ddlm-group] Is the string '2.0' an Integer?



Antanas has asked an interesting question [1]. In essence, if data name _xyz has DDLm type 'Integer', would '2.0' conform to this specification?


I think yes, for the same reasons that both delimited and undelimited strings can be interpreted as numbers. There is a single unambiguous interpretation as an integer.


The reason this has come up is that e.g. total counts might be restricted to an integral value, but the SU will most often be non-integer, and using the form '2' would not allow appending the SU, whereas 2.0(14) allows this.
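
The notation can be made concrete with a small Python sketch, assuming the usual CIF convention that the digits in parentheses apply to the last quoted digits of the value. The function name split_value_su and the regular expression are hypothetical, and the sketch deliberately ignores exponent forms.

```python
import re
from decimal import Decimal

# Hypothetical pattern: optional sign, digits, optional fractional part,
# then the SU digits in parentheses. Exponents are not handled.
_SU_PATTERN = re.compile(r"^([+-]?\d+(?:\.(\d+))?)\((\d+)\)$")


def split_value_su(text: str) -> tuple[Decimal, Decimal]:
    """Split a string such as '2.0(14)' into (value, standard uncertainty)."""
    match = _SU_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not in value(su) form: {text!r}")
    value = Decimal(match.group(1))
    # The SU digits are scaled to the last decimal place quoted in the value.
    n_decimals = len(match.group(2) or "")
    su = Decimal(match.group(3)).scaleb(-n_decimals)
    return value, su


print(split_value_su("2.0(14)"))  # value 2.0 with SU 1.4
print(split_value_su("2(1)"))     # value 2 with SU 1
```

Under this convention an SU of 1.4 cannot be attached to the form '2', because the SU digits can be no finer than the last digit of the value; the value has to be written with at least one decimal place.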


Any comments?



