Discussion List Archives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Theorem proving / reasoning using CIF dictionaries

If Joe's interest is more towards logical proofs based on 'has a' 'is a' 'contains' etc., then I'm not sure we have much to offer.  These concepts are rather orthogonal to our approach, because we fundamentally assume the involvement of human programmers in order to make the initial assignments of data names based on human-readable definitions. Relationships like 'has a', 'is a' are embedded in the discipline knowledge of the programmer and are of little further practical use for typical scientific computational tasks and so we haven't provided support for them.

In case I have misunderstood Joe's interests, there are some interesting mathematical approaches related to DDLm and dREL, as follows. Assuming that any data file has a relational representation, there is a direct one-to-one correspondence between the relational schema describing the file and a category (in the sense of mathematical category theory). This one-to-one correspondence is shown and leveraged by Spivak in several publications (e.g. Spivak, D. I. "Functorial data migration", Information and Computation 217 (2012) 31-51). A given data file or collection of data files represent an instance of the category.   Functors between different ontologies (i.e. categories) can be set up to enable automatic data migration between files whose schema are not identical. I gave a talk on this at the recent AsCA meeting in New Zealand.

Because DDLm and dREL work within a relational model of data, DDLm dictionaries and dREL can be used to describe and manipulate all forms of data.  This seems not to be generally appreciated, perhaps because we do lack the tools and visibility. Note that I have asserted above that any data file has a relational representation, without providing a strict mathematical proof, although I think one would be easy starting from the point that a data file is a mapping from file position to byte. Meanwhile, I have demonstrated the universality of DDLm and dREL by using DDLm dictionaries and dREL to manipulate data expressed in HDF5 format (hierarchical, binary). The code is available from Zenodo, and can be found via the publication at https://datascience.codata.org/articles/10.5334/dsj-2016-012/.

Hope that helps in some way!

On Thu, 13 Dec 2018 at 20:22, Peter Murray-Rust <pm286@cam.ac.uk> wrote:
Many thanks Nick,
CIF and DRel are ahead of their time. I find this all the time with CML. noone gets the point.
It's so obvious to you and us and me that published knowledge is computable. But so few people get it. I didn't realise when started withe early dictionaries (what was the precursor of CIF? ) that this was anything unusual. What's amazing is how few other disciplines have adopted the semantic dictionary idea.
I am now starting this with Wikidata, which I think is the future of scientific metadata and can be computed (it's in RDF/SPARQL).
Is there any activity in getting CIF definitions into Wikidata? All the PDB codes are already in.

I think CIF.DRel is a good playground for developing theorem proving

On Thu, Dec 13, 2018 at 3:57 AM Nick Spadaccini <nick.spadaccini.uwa@gmail.com> wrote:
In our original design considerations of how to structure a dictionary and define dREL we explicitly discussed the concept of a knowledge engine trawling the dictionary to find hitherto undefined relations from the definitions - we were blue sky, spit balling in essence but we were confident eventually such a process could be manufactured. 

The realities of mapping ideas around dREL into concrete systems means you make compromises. While we tried to engineer dREL around "relations" it is quite imperative in nature. Not to sure what knowledge engines consume as information these days and whether it has moved on from is_a, has_a etc etc semantic networks or whether programming code is trawlable, but good luck - I would love to see DDL and dREL move to the next level.


Perth,  WA  6009 AUSTRALIA

e: Nick.Spadaccini@uwa.edu.au

On Fri, Dec 7, 2018 at 8:53 PM Peter Murray-Rust <pm286@cam.ac.uk> wrote:
Joe Cornell is a mathematician who is interested in theorem proving starting from semantic definitions. The CIF dictionaries could be an interesting starting point where definitions and executable code are coupled. Do you know of efforts to use CIF to calculate properties or relations that are not explicitly defined in the dictionaries - possible with constraints.

Hope this makes some sense!


Peter Murray-Rust
Reader Emeritus in Molecular Informatics
Unilever Centre, Dept. Of Chemistry
University of Cambridge

Peter Murray-Rust
Reader Emeritus in Molecular Informatics
Unilever Centre, Dept. Of Chemistry
University of Cambridge
comcifs mailing list

T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148

Reply to: [list | sender only]