
Re: Draft namespace recommendations

Can I get clarification on the interpretation of the namespace for the
case in your example:

_IUCR::atom_site.occupancy

Do you interpret that the above item name is a reference to a logically distinct
data category named 'atom_site' which exists in the IUCR namespace?  This case
would have formerly been represented as something like _iucr_atom_site.occupancy.
Or, do you interpret this as adding the data item _atom_site.occupancy from
the IUCR namespace to the standard data category, atom_site?
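
To make the two readings concrete, here is a minimal Python sketch. The
function names and return shapes are purely illustrative assumptions, not
part of either proposal or of any existing CIF library; the only thing taken
from the thread is the `_DISCIPLINE::category.item` shape of the example.

```python
def split_dataname(name):
    """Split '_IUCR::atom_site.occupancy' into (discipline, category, item).

    Hypothetical helper: discipline is None when no '::' prefix is present.
    """
    if not name.startswith("_") or " " in name:
        raise ValueError("dataname must start with '_' and contain no spaces")
    body = name[1:]
    discipline, sep, rest = body.partition("::")
    if not sep:
        # No discipline prefix: the whole body is category.item
        discipline, rest = None, body
    category, dot, item = rest.partition(".")
    if not dot:
        raise ValueError("expected dotted category.item form")
    return discipline, category, item


def as_distinct_category(name):
    """Reading 1: the discipline selects a logically separate category."""
    discipline, category, item = split_dataname(name)
    cat = f"{discipline.lower()}_{category}" if discipline else category
    return cat, item


def as_item_in_standard_category(name):
    """Reading 2: the discipline qualifies the item; the category is shared."""
    discipline, category, item = split_dataname(name)
    return category, (discipline, item)
```

Under the first reading `_IUCR::atom_site.occupancy` lands in a category
like `iucr_atom_site`; under the second it stays in `atom_site` and only the
item carries the discipline.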

John

On 7/17/13 6:44 AM, yayahjb wrote:
> The discipline proposal does not provide a convenient way to mix datanames from multiple
> disciplines in the same datablock.  XML DOM does it with a colon separator.   C++ does it
> with a "::" separator.  I suggest we adopt the C++ convention, treating a discipline
> similarly to a C++ namespace.  I would suggest adding the following:
>
>   "To facilitate use of datanames from multiple disciplines in a single CIF, the discipline
> from which a dataname is drawn may be specified by using C++ namespace-style
> conventions, i.e. by optionally prefixing the dataname with the discipline followed immediately
> by a double colon, as in
>
>    _IUCR::atom_site.occupancy
> as equivalent to
>    _atom_site.occupancy
> when the discipline is known to be IUCR
>
> Note that the leading underscore appears before the prefixed discipline and the
> double colon, and that no spaces are permitted in the construct."
>
> Neither the discipline proposal nor the statement of the existing prefix system
> talks about the interaction with dotted notation as it is used in DDL2. In DDL2,
> the prefix may be used on either dotted component:  the category or the
> column.  I suggest adding the following statements in the prefix section.
>
> "When an IUCr CIF dictionary uses formal dotted notation separating a dataname
> into category and column components (as in DDL2) the prefix may appear
> on either component."
>
> Finally, we should explicitly allow the combination of disciplines and prefixes
> as in
>
> _IUCR::audit_author.PDBX_ordinal
>
>
>
> On 7/16/13 9:46 PM, James Hester wrote:
>> Dear COMCIFS members,
>>
>> There has been little discussion on the two namespace proposals linked
>> in my original email in February (apologies for the delay), which
>> leads me to conclude that they are acceptable.  For archival purposes,
>> I have included the full text of the two proposals in this email, and
>> I now request the COMCIFS voting members to formally indicate their
>> agreement.  In the case of a disagreement, please note that
>> disagreement briefly in your reply and then follow up the issue in the
>> namespace forum.  In accordance with COMCIFS practice, if more than 6
>> weeks pass from today's date with no reply, agreement will be assumed,
>> although explicit and rapid assent is always preferable.
>>
>> Note that the second namespace proposal below differs slightly from
>> the one originally linked: I have added a sentence clarifying the
>> meaning of 'IUCr domain' in the preamble and clarified the meaning of
>> 'adopting' a third party dictionary.
>>
>> James.
>> =========================
>> http://forums.iucr.org/viewtopic.php?f=28&t=319
>>
>> Proposal for a new dataname to support a CIF namespace mechanism
>>
>> Background
>>
>> We wish to build some sort of namespace mechanism into CIF so that
>> other communities can use CIF with minimal, if any, coordination with
>> COMCIFS. The key requirement is that datanames and the corresponding
>> dictionary definitions must be unambiguously matchable. Currently,
>> COMCIFS guarantees the uniqueness and immutable nature of datanames,
>> so there is no need for any disambiguation mechanism. If CIF is to be
>> usable outside COMCIFS, there must be a mechanism so that the readers
>> and writers of CIF data files from a given community can agree on the
>> correct definition for a given dataname.
>>
>> Two partial solutions already exist:
>> (1) people and organisations register an opaque 'prefix' for a
>> dataname with the IUCr. This allows users to populate their own
>> namespaces safely and devolves management of dataname collisions to
>> the relevant community. From the point of view of the outside
>> discipline, there remains the annoyance that the datanames and
>> dictionaries are cluttered with a redundant prefix.
>> (2) The _audit tags in a datablock can specify which dictionary the
>> datanames come from. The problem then becomes one of encouraging
>> programs to read and write these _audit items, given that simply
>> finding a matching dataname in a datafile is already a pretty solid
>> guarantee that it means what the programmer thought, as COMCIFS has up
>> until now guaranteed the stability and uniqueness of datanames.
>>
>> Some discussion has taken place in the namespaces forum and members
>> are invited to read the comments there as well.
>>
>> Proposed solution
>>
>> We define an enumerated dataname, _audit.discipline, which takes
>> values assigned by COMCIFS and should never be redefined by any
>> CIF-using organisation - in effect it becomes part of the CIF
>> specification. We can formally define a 'discipline' here as a
>> collection of dictionaries which define datanames that are guaranteed
>> to always have a constant, unambiguous meaning. This guarantee would
>> presumably be provided by some organisation using policies chosen by
>> that organisation. A CIF datafile wishing to explicitly specify which
>> discipline its datanames are drawn from would set the value of
>> _audit.discipline inside its datablocks. Likewise, programmers who are
>> concerned about possible ambiguity in datanames can explicitly check
>> for the value of this dataname.
>>
>> Note the following:
>>
>> * The IUCr would maintain a registry of accepted disciplines. In
>> minimal form this could be the dictionary entries for
>> _audit.discipline and something like _audit.discipline_URI
>> * There is no requirement to use the _audit.discipline dataname, nor
>> to register disciplines. It is provided as a tool for those wishing to
>> avoid ambiguity
>> * Disciplines that do not wish to register their discipline name but
>> still wish to use _audit.discipline must never choose 'IUCr' (or
>> whatever it is we decide) for their discipline name
>> * Minimal checking is required compared to the current _audit
>> datanames, but similar guarantees of uniqueness and correctness are
>> obtainable
>> * The _audit.discipline dataname should never be looped. Datanames
>> drawn from multiple disciplines may not have overlapped when a
>> datafile was produced, but may overlap when it is read, as there is no
>> coordination between disciplines.
>>
>> The scope of the _audit.discipline dataname is the entire datablock
>> and all save frames within that data block, unless a save frame gives
>> a different value for _audit.discipline, in which case that new value
>> will apply to all nested save frames within that save frame.
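
The scoping rule just described can be sketched as follows. This is only an
illustration in Python: the nested-dict representation of a datablock and
its save frames, the function name effective_discipline, and the value
'EXAMPLE' used in the usage note are all assumptions made for this example
and not part of the proposal.

```python
def effective_discipline(frame_path):
    """Return the discipline in force for the innermost frame.

    frame_path: list of dicts ordered from the datablock (first) down to
    the innermost save frame (last). Each dict holds only the items set
    directly at that level. This layout is assumed for illustration.
    """
    discipline = None
    for level in frame_path:
        # A level that sets _audit.discipline overrides the inherited value
        # for itself and for everything nested inside it.
        discipline = level.get("_audit.discipline", discipline)
    return discipline
```

For example, with a datablock that sets 'IUCr' and a save frame that sets
'EXAMPLE', a frame nested inside that save frame resolves to 'EXAMPLE',
while a sibling save frame that sets nothing resolves to 'IUCr'.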
>>
>> ===================================================================
>> http://forums.iucr.org/viewtopic.php?f=28&t=315
>>
>> Draft COMCIFS dataname and dictionary policy within the IUCr domain
>>
>> COMCIFS must ensure the uniqueness of all extant datanames within the
>> IUCr domain. The following policy is designed to maximise the chances
>> that the status and meaning of any dataname encountered in the IUCr
>> domain is unambiguous. A dataname is considered to be within the IUCr
>> domain if the proposed _audit.discipline dataname has the value
>> 'IUCr'.
>>
>> (1) Datanames not explicitly approved by COMCIFS and appearing in CIF
>> datafiles should either contain the string '[local]' or commence with
>> a prefix handed out by COMCIFS
>> (2) COMCIFS makes no undertakings as to the uniqueness of datanames
>> containing the string '[local]'.
>> (3) In the register of approved prefixes, COMCIFS may provide
>> certification that datanames with a given prefix will be unique. In
>> order to obtain this certification, a prefix assignee should:
>>
>> (i) publish a publicly available dictionary defining all datanames
>> with that prefix
>> (ii) have an organisational structure judged capable of enforcing
>> dataname policy (a single person also satisfies this criterion)
>>
>> (4) Alternatively, if a prefix assignee provides to the IUCr a
>> dataname dictionary and advises that the prefix is no longer in use,
>> the IUCr will archive that dictionary and certify that the prefix is
>> unique. If later workers wish to re-use such a 'closed' prefix, they
>> must not define any items that appear in such archived dictionaries.
>> (5) The IUCr cannot provide any guarantees as to the correctness or
>> uniqueness of definitions in dictionaries published by third parties.
>> COMCIFS may choose, on request, to bring such third party dictionaries
>> into the IUCr domain, in which case datanames and details of
>> definitions may change.
>>
>>
>> --
>> T +61 (02) 9717 9907
>> F +61 (02) 9717 3145
>> M +61 (04) 0249 4148
>> _______________________________________________
>> comcifs mailing list
>> comcifs@iucr.org
>> http://mailman.iucr.org/mailman/listinfo/comcifs
>

-- 

John Westbrook, Ph.D.
RCSB, Protein Data Bank
Rutgers, The State University of New Jersey
Department of Chemistry and Chemical Biology
174 Frelinghuysen Rd
Piscataway, NJ 08854-8087
e-mail: jwest@rcsb.rutgers.edu
Ph: (848) 445-4290 Fax: (732) 445-4320