Discussion List Archives


Re: [ddlm-group] Adding a DDLm attribute for uniqueness

  • To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
  • Subject: Re: [ddlm-group] Adding a DDLm attribute for uniqueness
  • From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
  • Date: Mon, 17 Feb 2020 18:19:16 +0000
  • In-Reply-To: <CAM+dB2ethtf2-+erxftd3mt651kdqBFJp+JxRz=LuWssRf9uyw@mail.gmail.com>
  • References: <CALHYoX6573gXqabRS0TwY5O0-wVtexjVrWs9KZi2jpH2u_Tm8A@mail.gmail.com><CAM+dB2crUCGAD+fUVgG38OFujxfNLc-Rc5r-6Cbhc6FijsJBuw@mail.gmail.com><CALHYoX6St7GzJociroTv=DMpsMovpuot5+Zkq6Zk5dh2Gva-LA@mail.gmail.com><CAM+dB2ethtf2-+erxftd3mt651kdqBFJp+JxRz=LuWssRf9uyw@mail.gmail.com>

Dear DDLm Group,


James’s comments initially confused me, because I took ...


> I think there may be a way to approach uniqueness that resolves enough issues to make it worthwhile. I suggest we view 'uniqueness' through a functional lens, so what we really mean when we say values of a dataname are unique is that there is a one-to-one mapping from that data name to the keys of the category (of course, there is already a one-to-one mapping from the keys to the dataname by definition of "key").


... to be a description of something new.  In fact, it seems to be meant as a definition of what we typically mean by "unique".  It’s a pretty natural one, for it is equivalent to identifying the property of being “unique” with what would be described in relational terms as being a candidate key.
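A minimal sketch of that reading of "unique", assuming a category is held as a list of rows (one dict per loop packet). The item names `label`, `serial` and `type_symbol` are purely illustrative and are not taken from any dictionary; `is_unique` is a hypothetical helper, not part of any DDLm tooling.

```python
def is_unique(rows, names):
    """Return True if no two rows share the same combination of values
    for the given data names, i.e. the names identify each row."""
    seen = set()
    for row in rows:
        value = tuple(row[n] for n in names)
        if value in seen:
            return False
        seen.add(value)
    return True


# Hypothetical category whose declared key is `label`.
rows = [
    {"label": "C1", "serial": 1, "type_symbol": "C"},
    {"label": "C2", "serial": 2, "type_symbol": "C"},
    {"label": "O1", "serial": 3, "type_symbol": "O"},
]

key_ok = is_unique(rows, ["label"])             # the declared key
serial_ok = is_unique(rows, ["serial"])         # behaves like a second candidate key
symbol_ok = is_unique(rows, ["type_symbol"])    # repeated values, so not unique
```

Because the key values are themselves unique, a data name that passes this check is in one-to-one correspondence with the key in this data set. The check only inspects one set of rows; whether uniqueness is required of every conforming data set is what a dictionary attribute would have to declare.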


I then take that to serve as context for a proposal to define a new and slightly different kind of definable relationship between data items, which I’ll refer to as Hester-uniqueness, given by:


>  So if we explicitly specify the datanames whose collective values are unique for a given value of our 'unique' data name, we have specified 'uniqueness' in a way that is immune to expansions changes in the category key. We are also not limited by relationships involving keys.


And that brings Herbert’s objection into focus for me: applying a Hester-uniqueness constraint to a category can denormalize that category in the relational sense.  If the set of data names that specify Hester-uniqueness for some attribute is permitted to be a proper subset of a candidate key of the category, then applying such a constraint can break second normal form.  If that set is not required to be a superkey (that is, a possibly trivial superset of a candidate key), then applying such a constraint can break third normal form.  Yet forbidding those possibilities leaves us effectively at the uniqueness definition we started with.
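The normalization point can be sketched as follows, under two assumptions that are not stated in the thread: that a Hester-uniqueness declaration is read as a functional dependency from the listed data names (the determinant) to the "unique" data name, and that the dependent item is not itself part of any candidate key. The names `determines`, `classify`, `a`, `b` and `c` are hypothetical.

```python
def determines(rows, determinant, dependent):
    """Return True if rows that agree on `determinant` always agree on
    `dependent`, i.e. the functional dependency holds in these rows."""
    seen = {}
    for row in rows:
        left = tuple(row[n] for n in determinant)
        right = tuple(row[n] for n in dependent)
        # setdefault stores the first right-hand value seen for each left value.
        if seen.setdefault(left, right) != right:
            return False
    return True


def classify(determinant, candidate_keys):
    """Place a determinant relative to the category's candidate keys."""
    det = set(determinant)
    if any(set(key) <= det for key in candidate_keys):
        return "superkey"
    if any(det < set(key) for key in candidate_keys):
        return "proper subset of a candidate key"
    return "neither"


# Hypothetical category with the composite key (a, b); c depends on a alone.
candidate_keys = [("a", "b")]
rows = [
    {"a": 1, "b": 1, "c": "x"},
    {"a": 1, "b": 2, "c": "x"},
    {"a": 2, "b": 1, "c": "y"},
]

holds = determines(rows, ["a"], ["c"])
partial = classify(["a"], candidate_keys)
full = classify(["a", "b"], candidate_keys)
other = classify(["c"], candidate_keys)
```

Under the textbook definitions, a dependency whose determinant is classified as a proper subset of a candidate key is the kind that conflicts with second normal form, and one whose determinant is not a superkey is the kind that conflicts with third normal form. Restricting determinants to superkeys avoids both, which amounts to the original notion of uniqueness.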


Whether such constraints are enforced in a given relational database representation is not really the issue.


With that said, and now having given the matter some thought, it does not seem unreasonable on its face to provide for defining more than the two(-ish) candidate keys that DDLm already affords.  This is, for me, a question of defining data more completely.  I don’t see robustness against key expansion to be a significant issue.  The matter of handling unknown values of items belonging to a candidate key does bear some thought, but there are several possible approaches, including "validator’s choice".  Inasmuch as DDLm does not already offer the feature, however, there does remain a cost / benefit question around adding attributes to DDLm for this purpose.  I’m generally inclined to prefer descriptive attributes to methods for defining data, but it is not clear to me how much benefit would be gained from new uniqueness / candidate key attributes.






