Discussion List Archives


Re: Draft namespace recommendations

Dear Colleagues,

   I think it is unrealistic to expect users not to mix multiple domains in a
single CIF.   One of the virtues of the current prefix mechanism is that it permits
mixing of multiple prefixes in the same CIF, and I believe a similar benefit
would come from allowing the mixing of namespaces.

   I don't understand John B.'s  comment that  "The CIF domain proposal, on
the other hand, is not intended to bind CIF instance documents or 
data names directly to particular dictionaries."  If there are not
going to be particular dictionaries involved, where will the 
tags be?   How will anybody know what a particular tag means if we are not
going to refer particular tags back to particular dictionaries?  I am 
missing something here.

   I disagree strongly with the assertion that "adopting the proposal should have
little or no practical effect on current CIF uses, dictionaries, or software."  As
soon as somebody does a mixed experiment drawing on information from
dictionaries in other disciplines we will have to deal with this.
It is best to plan ahead for the likely use cases.  What John B. seems to be
saying is that we intend to create a system of conflicting tag names and not
provide a way to disambiguate them at a fine-grained level.  That is a bit like
saying that we will permit English books and French books, but we will not permit
an English book that uses some French words or a French book that
uses some English words.   I don't see what is gained by this approach.
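
   To make fine-grained disambiguation concrete, here is a minimal Python sketch
of how a reader might split a discipline-qualified tag written with the
double-colon convention suggested in the earlier message quoted below.  The
function name split_discipline and the default discipline "IUCR" are assumptions
made for illustration only; nothing here is part of any CIF specification.

```python
from typing import Tuple


def split_discipline(tag: str, default: str = "IUCR") -> Tuple[str, str]:
    """Split '_DISC::name' into (discipline, '_name').

    Hypothetical helper: an unqualified tag is assigned the default discipline.
    """
    if not tag.startswith("_"):
        raise ValueError(f"not a dataname: {tag!r}")
    # The suggested convention permits no spaces anywhere in the construct.
    if any(ch.isspace() for ch in tag):
        raise ValueError(f"whitespace not permitted in dataname: {tag!r}")
    # The leading underscore comes before the discipline, so strip it first.
    discipline, sep, name = tag[1:].partition("::")
    if not sep:
        # No double colon: the tag is unqualified.
        return default, tag
    if not discipline or not name:
        raise ValueError(f"empty discipline or name in {tag!r}")
    return discipline, "_" + name


print(split_discipline("_IUCR::atom_site.occupancy"))
print(split_discipline("_atom_site.occupancy"))
```

Under these assumptions both calls give the same pair, so a qualified and an
unqualified tag can be compared after splitting.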

   Inasmuch as this is a voting issue, in light of John B.'s clarification, and the
clear intent that what is on the table not actually be used, I vote no.  Hopefully
we can discuss this further and come up with something that we all support and
that is useful.


On 7/17/13 3:45 PM, Bollinger, John C wrote:
> James and I in fact had a discussion on the forum that touches directly on some of the issues that Herbert raises: http://forums.iucr.org/viewtopic.php?f=28&t=101.  In particular, one upshot of the discussion was that it is NOT a goal of the proposal to permit items from different "domains" to be included in the same save frame, or to permit them to be mixed in the same data block except by using save frames as an encapsulation mechanism (with no semantics for such use being specified at this time).
> Also, as I understand it, this proposal is a separate initiative from the current practice of using IUCr-issued prefixes, not an evolution of it.  The simple fact that an organization obtains a sanctioned prefix from IUCr implies that, at least with respect to that prefix, its CIFs are targeted at the IUCr domain.
> Furthermore, the analogy between CIF domains (as proposed) and XML namespaces is rather weak.  As they have come to be used in XML, namespaces typically have a direct connection to an XML schema, and are directly and intimately involved in instance document validation.  The CIF domain proposal, on the other hand, is not intended to bind CIF instance documents or individual data names directly to particular dictionaries.
> This whole thing is simply about a framework for separate administrative domains at the topmost level.  Inasmuch as substantially all current CIF uses fall within the IUCr domain, adopting the proposal should have little or no practical effect on current CIF uses, dictionaries, or software.
> John
> -----Original Message-----
> From: comcifs-bounces@iucr.org [mailto:comcifs-bounces@iucr.org] On Behalf Of yayahjb
> Sent: Wednesday, July 17, 2013 5:45 AM
> To: Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)
> Subject: Re: Draft namespace recommendations
> The discipline proposal does not provide a convenient way to mix datanames from multiple disciplines in the same datablock.  XML DOM does it with a colon separator.  C++ does it with a "::" separator.  I suggest we adopt the C++ convention, treating a discipline similarly to a C++ namespace.  I would suggest adding the following:
>    " To facilitate use of datanames from multiple disciplines in a single CIF the discipline from which a dataname is drawn may be specified by using C++ namespace-style conventions, i.e. by optionally prefixing the dataname with the discipline followed immediately by a double colon, as in
>     _IUCR::atom_site.occupancy
> as equivalent to
>     _atom_site.occupancy
> when the discipline is known to be IUCR.
> Note that the leading underscore appears before the prefixed discipline and the double colon, and that no spaces are permitted in the construct."
> Neither the discipline proposal nor the statement of the existing prefix system talks about the interaction with dotted notation as it is used in DDL2.  In DDL2, the prefix may be used on either dotted component:  the category or the column.  I suggest adding the following statements in the prefix section.
> "When an IUCr CIF dictionary uses formal dotted notation separating a dataname into category and column components (as in DDL2) the prefix may appear on either component."
> Finally, we should explicitly allow the combination of disciplines and prefixes, as in
> _IUCR::audit_author.PDBX_ordinal
> On 7/16/13 9:46 PM, James Hester wrote:
>> Dear COMCIFS members,
>> There has been little discussion on the two namespace proposals linked
>> in my original email in February (apologies for the delay), which
>> leads me to conclude that they are acceptable.  For archival purposes,
>> I have included the full text of the two proposals in this email, and
>> I now request the COMCIFS voting members to formally indicate their
>> agreement.  In the case of a disagreement, please note that
>> disagreement briefly in your reply and then follow up the issue in the
>> namespace forum.  In accordance with COMCIFS practice, if more than 6
>> weeks pass from today's date with no reply, agreement will be assumed,
>> although explicit and rapid assent is always preferable.
>> Note that the second namespace proposal below differs slightly from
>> the one originally linked: I have added a sentence clarifying the
>> meaning of 'IUCr domain' in the preamble and clarified the meaning of
>> 'adopting' a third party dictionary.
>> James.
>> =========================
>> http://forums.iucr.org/viewtopic.php?f=28&t=319
>> Proposal for a new dataname to support a CIF namespace mechanism
>> Background
>> We wish to build some sort of namespace mechanism into CIF so that
>> other communities can use CIF with minimal, if any, coordination with
>> COMCIFS. The key requirement is that datanames and the corresponding
>> dictionary definitions must be unambiguously matchable. Currently,
>> COMCIFS guarantees the uniqueness and immutable nature of datanames,
>> so there is no need for any disambiguation mechanism. If CIF is to be
>> usable outside COMCIFS, there must be a mechanism so that the readers
>> and writers of CIF data files from a given community can agree on the
>> correct definition for a given dataname.
>> Two partial solutions already exist:
>> (1) people and organisations register an opaque 'prefix' for a
>> dataname with the IUCr. This allows users to populate their own
>> namespaces safely and devolves management of dataname collisions to
>> the relevant community. From the point of view of the outside
>> discipline, there remains the annoyance that the datanames and
>> dictionaries are cluttered with a redundant prefix.
>> (2) The _audit tags in a datablock can specify which dictionary the
>> datanames come from. The problem then becomes one of encouraging
>> programs to read and write these _audit items, given that simply
>> finding a matching dataname in a datafile is already a pretty solid
>> guarantee that it means what the programmer thought, as COMCIFS has up
>> until now guaranteed the stability and uniqueness of datanames.
>> Some discussion has taken place in the namespaces forum and members
>> are invited to read the comments there as well.
>> Proposed solution
>> We define an enumerated dataname, _audit.discipline, which takes
>> values assigned by COMCIFS and should never be redefined by any
>> CIF-using organisation - in effect it becomes part of the CIF
>> specification. We can formally define a 'discipline' here as a
>> collection of dictionaries which define datanames that are guaranteed
>> to always have a constant, unambiguous meaning. This guarantee would
>> presumably be provided by some organisation using policies chosen by
>> that organisation. A CIF datafile wishing to explicitly specify which
>> discipline its datanames are drawn from would set the value of
>> _audit.discipline inside its datablocks. Likewise, programmers who are
>> concerned about possible ambiguity in datanames can explicitly check
>> for the value of this dataname.
>> Note the following:
>> * The IUCr would maintain a registry of accepted disciplines. In
>> minimal form this could be the dictionary entries for
>> _audit.discipline and something like _audit.discipline_URI
>> * There is no requirement to use the _audit.discipline dataname, nor
>> to register disciplines. It is provided as a tool for those wishing to
>> avoid ambiguity
>> * Disciplines not wishing to register their discipline name but still
>> wishing to use _audit.discipline, must never choose 'IUCr' (or
>> whatever it is we decide) for their discipline name
>> * Minimal checking is required compared to the current _audit
>> datanames, but similar guarantees of uniqueness and correctness are
>> obtainable
>> * The _audit.discipline dataname should never be looped. Datanames
>> drawn from multiple disciplines may not have overlapped when a
>> datafile was produced, but may overlap when it is read, as there is no
>> coordination between disciplines.
>> The scope of the _audit.discipline dataname is the entire datablock
>> and all save frames within that data block, unless a save frame gives
>> a different value for _audit.discipline, in which case that new value
>> will apply to all nested save frames within that save frame.
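
The scoping rule in the paragraph above can be sketched in Python as follows.
This is only an illustration under stated assumptions: the Frame class, its
field names and the function effective_disciplines are hypothetical, a data
block is modelled as the outermost Frame, and a value of None stands for a
block or frame that does not set _audit.discipline at all.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Frame:
    """Hypothetical model of a data block or save frame."""
    name: str
    discipline: Optional[str] = None  # value of _audit.discipline, if given
    children: List["Frame"] = field(default_factory=list)


def effective_disciplines(frame: Frame,
                          inherited: Optional[str] = None,
                          path: str = "") -> Dict[str, Optional[str]]:
    """Map each block/frame path to the discipline that applies to it."""
    # A frame's own value overrides whatever it inherits from its container.
    current = frame.discipline if frame.discipline is not None else inherited
    here = f"{path}/{frame.name}" if path else frame.name
    result = {here: current}
    # The value in force here applies to all nested save frames.
    for child in frame.children:
        result.update(effective_disciplines(child, current, here))
    return result


block = Frame("data_x", "IUCr", [
    Frame("a"),
    Frame("b", "Other", [Frame("c")]),
])
for frame_path, discipline in effective_disciplines(block).items():
    print(frame_path, discipline)
```

In this sketch frame "a" inherits the data block's value, while frame "b" and
its nested frame "c" both take the value that "b" sets.
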
>> ===================================================================
>> http://forums.iucr.org/viewtopic.php?f=28&t=315
>> Draft COMCIFS dataname and dictionary policy within the IUCr domain
>> COMCIFS must ensure the uniqueness of all extant datanames within the
>> IUCr domain. The following policy is designed to maximise the chances
>> that the status and meaning of any dataname encountered in the IUCr
>> domain is unambiguous. A dataname is considered to be within the IUCr
>> domain if the proposed _audit.discipline dataname has the value
>> 'IUCr'.
>> (1) Datanames not explicitly approved by COMCIFS and appearing in CIF
>> datafiles should either contain the string '[local]' or commence with
>> a prefix handed out by COMCIFS
>> (2) COMCIFS makes no undertakings as to the uniqueness of datanames
>> containing the string '[local]'.
>> (3) In the register of approved prefixes, COMCIFS may provide
>> certification that datanames with a given prefix will be unique. In
>> order to obtain this certification, a prefix assignee should:
>> (i) publish a publicly available dictionary defining all datanames
>> with that prefix
>> (ii) have an organisational structure judged capable of enforcing
>> dataname policy (a single person also suits this criterion)
>> (4) Alternatively, if a prefix assignee provides to the IUCr a
>> dataname dictionary and advises that the prefix is no longer in use,
>> the IUCr will archive that dictionary and certify that the prefix is
>> unique. If later workers wish to re-use such a 'closed' prefix, they
>> must not define any items that appear in such archived dictionaries.
>> (5) The IUCr cannot provide any guarantees as to the correctness or
>> uniqueness of definitions in dictionaries published by third parties.
>> COMCIFS may choose, on request, to bring such third party dictionaries
>> into the IUCr domain, in which case datanames and details of
>> definitions may change.
>> --
>> T +61 (02) 9717 9907
>> F +61 (02) 9717 3145
>> M +61 (04) 0249 4148
>> _______________________________________________
>> comcifs mailing list
>> comcifs@iucr.org
>> http://mailman.iucr.org/mailman/listinfo/comcifs
