Re: _publ_author_name is not a good key for _publ_author

  • To: James Hester <james.r.hester@gmail.com>, Distribution list of the IUCr COMCIFS Core Dictionary Maintenance Group<coredmg@iucr.org>
  • Subject: Re: _publ_author_name is not a good key for _publ_author
  • From: "Herbert J. Bernstein via coreDMG" <coredmg@iucr.org>
  • Date: Tue, 23 Jun 2020 05:53:35 -0400
  • Cc: "Herbert J. Bernstein" <yayahjb@gmail.com>
Dear James,
  Having an unambiguous key is a fine idea, but shouldn't it just be the ORCID id, as in _publ_author_id_orcid?
  Regards,
    Herbert

On Tue, Jun 23, 2020 at 12:51 AM James Hester via coreDMG <coredmg@iucr.org> wrote:
Dear Core DMG,

Please see proposed new definitions for the publ_author category, as suggested in my previous email (below). If no objections are forthcoming, I will be updating the dictionary a week from today.  Note that this change will in turn pave the way for adding author roles to a CIF file.

best wishes,
James.
===========

save_PUBL_AUTHOR

_definition.id                          PUBL_AUTHOR
_definition.scope                       Category
_definition.class                       Loop
_definition.update                      2020-06-30
_description.text                      
;
     Category of data items recording the author information.
;
_name.category_id                       PUBL
_name.object_id                         PUBL_AUTHOR
_category.key_id                        '_publ_author.id'
loop_
  _category_key.name
         '_publ_author.id'

save_

save_publ_author.id
    _definition.id              '_publ_author.id'
    _definition.update          2020-06-30
    _description.text
;              Arbitrary identifier for this author
;
    _name.category_id                       publ_author
    _name.object_id                         id
    _type.purpose                           Key
    _type.source                            Assigned
    _type.container                         Single
    _type.contents                          Code
save_

On Tue, 26 Mar 2019 at 17:28, James Hester <jamesrhester@gmail.com> wrote:
Dear Core CIF group,

The publ_author category has _publ_author.name as the category key, meaning that _publ_author.name can be used to select a unique row of the loop. However, it has been pointed out that _publ_author.name is insufficient as a key for the _publ_author loop, as some authors have the same name (there are apparently 40 entries in the COD with this feature). One way to fix this is to add a further disambiguating data name to _publ_author. The suggestion is that something like "_publ_author.id" could be defined, which would contain an arbitrary code and which would be added to the category key.  The lack of this dataname in historical CIFs can be worked around by assuming it has a constant value, and CIF curators can auto-generate it when a situation is encountered with multiple identically-named authors.
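
A minimal sketch of this first approach, assuming author rows are held as plain dictionaries. The constant "1" and the helper name fill_author_ids are hypothetical, chosen only for illustration; nothing here is prescribed by the dictionary. Rows without an id receive the constant, and a repeated name receives a fresh value so that the (name, id) pair stays unique:

```python
from collections import Counter

# Hypothetical constant assumed when _publ_author.id is absent (historical CIFs).
DEFAULT_ID = "1"


def fill_author_ids(rows):
    """Return copies of rows, adding an 'id' where one is missing.

    Rows that already carry an 'id' are kept as-is; collisions with
    such pre-existing ids are not checked in this sketch.
    """
    seen = Counter()
    filled = []
    for row in rows:
        if "id" in row:
            filled.append(dict(row))
            continue
        seen[row["name"]] += 1
        n = seen[row["name"]]
        # First occurrence of a name keeps the constant; repeats get "2", "3", ...
        author_id = DEFAULT_ID if n == 1 else str(n)
        filled.append({**row, "id": author_id})
    return filled


authors = [
    {"name": "Smith, J."},
    {"name": "Jones, A."},
    {"name": "Smith, J."},
]
keyed = fill_author_ids(authors)
# The composite category key is the (name, id) pair.
keys = [(a["name"], a["id"]) for a in keyed]
```

With this scheme a file with no duplicated names needs no change at all, since every row implicitly carries the same constant id.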

An alternative approach would be to define "_publ_author.id" as the new, sole key of the category, with values that can be auto-generated from each packet. While in a formal sense this makes all current CIFs non-conformant, I can't imagine that it would affect most software, which will continue to work with _publ_author.name.
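
A minimal sketch of this alternative, assuming the id is generated simply by numbering packets in loop order. That numbering rule and the helper names are assumptions made for illustration, not something the proposal specifies. Lookup by name still works for existing software, but it may now return more than one packet:

```python
def index_by_generated_id(packets):
    """Map generated ids ("1", "2", ...) to packets, in loop order."""
    return {str(i): packet for i, packet in enumerate(packets, start=1)}


def find_by_name(index, name):
    """Return the ids of all packets whose author name matches."""
    return [pid for pid, packet in index.items() if packet["name"] == name]


packets = [
    {"name": "Smith, J."},
    {"name": "Jones, A."},
    {"name": "Smith, J."},
]
index = index_by_generated_id(packets)
same_name = find_by_name(index, "Smith, J.")
```

Because the id alone is the key, two identically named authors are distinguished without any reference to the name.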

Please indicate your preference or alternative solutions. Particularly important is the perspective of software authors who may be impacted.

thanks,
James.

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148

_______________________________________________
coreDMG mailing list
coreDMG@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/coredmg