[ddlm-group] Adding namespaces to dREL
- To: ddlm-group <ddlm-group@iucr.org>
- Subject: [ddlm-group] Adding namespaces to dREL
- From: James Hester <jamesrhester@gmail.com>
- Date: Fri, 27 Nov 2020 14:17:05 +1100
Dear DDLm group,
Please have a look at the following proposal for adding namespaces to dREL and provide any feedback you have. Note that this has no implications for how data names are written in data files or for DDLm.
I have implemented this syntax by machine-parsing the adjusted grammar to create executable code and am happy that it is workable and does not introduce any parsing ambiguity.
The marked-up version is here: https://github.com/COMCIFS/comcifs.github.io/blob/master/draft/drel_namespace_draft.md
and the plain text is below.
all the best,
James.
=============================================================================
# Proposal to add namespaces to dREL
## Summary
It is proposed that the new construction `<namespace> "|" <identifier>`
be added to dREL to allow disambiguation of `<identifier>` when
necessary. Note this is not relevant to the way in which data names are
written in data files.
## Introduction
dREL methods operate within a relational context. Tables
("categories") from this context are automatically bound to variables
within the dREL method. Occasionally, contexts may identify tables using a
two-part name: for example, if the table names are drawn from
different domains which happen to use the same table identifier, a
label associated with the domain itself could be used for
disambiguation.
dREL currently does not have syntax to allow for such two-part
names.
## New syntax
Currently
[the formal grammar for dREL](https://github.com/COMCIFS/dREL/blob/master/annotated-grammar.rst)
contains the following line defining an identifier:
```
ident = ID ;
```
where `ID` is defined by a suitable regular expression. The proposal is to change this to
the following:
```
  ident = ID ;
  nspace = ID BAR ;
  nident = [ nspace ] ident ;
```
and to replace `ident` with `nident` in those productions that may refer to categories.
In the following productions, every appearance of `nident` was previously `ident`:
```
call = nident  LEFTPAREN [expression_list] RIGHTPAREN ;
dotlist_assign = nident "("  dotlist  ")" ;
att_primary = nident | attributeref | subscription | call ;
loop_stmt =  LOOP ident AS nident [ COLON  ident  [restricted_comp_operator  ident]] suite ;
with_stmt = WITH  ident  AS  nident  suite ;
```
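As a rough illustration of the `nident` production, the following Python sketch splits an optionally namespaced identifier into its two parts. The `ID` pattern used here is an assumption made for the example (the real regular expression is defined in the dREL grammar), whitespace around the bar is not handled, and `parse_nident` is a hypothetical helper, not part of any dREL implementation.

```python
import re
from typing import Optional, Tuple

# Assumed stand-in for the grammar's ID token; the real pattern is
# defined in the dREL grammar and may differ.
ID = r"[A-Za-z_][A-Za-z0-9_$]*"

# nident = [ nspace ] ident ;   with   nspace = ID BAR ;
NIDENT = re.compile(rf"(?:(?P<nspace>{ID})\|)?(?P<ident>{ID})")


def parse_nident(text: str) -> Tuple[Optional[str], str]:
    """Return (namespace, identifier); namespace is None when absent."""
    match = NIDENT.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid nident: {text!r}")
    return match.group("nspace"), match.group("ident")


if __name__ == "__main__":
    print(parse_nident("ddl2|dictionary"))
    print(parse_nident("dictionary"))
```

Because the namespace part is optional, existing methods that use bare category names would pass through such a parser unchanged.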
## Example
The following code transforms dictionary information from DDL2 to DDLm:
```
loop dd as ddl2|dictionary {
  ddlm|dictionary(.title = dd.title,
          .version = dd.version
          )
}
```
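One way an implementation might bind such names to tables is sketched below in Python. Everything here is hypothetical: the `Context` mapping keyed by `(namespace, category)`, the `resolve` helper, the placeholder row values, and the rule that an unqualified name is accepted only when exactly one namespace defines it are assumptions for illustration, not behaviour specified by this proposal.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical relational context: (namespace, category) -> rows.
Row = Dict[str, str]
Context = Dict[Tuple[str, str], List[Row]]


def resolve(context: Context, category: str,
            namespace: Optional[str] = None) -> List[Row]:
    """Look up a category, using the namespace only when one is given."""
    if namespace is not None:
        return context[(namespace, category)]
    # Unqualified: accept only if exactly one namespace defines the category.
    matches = [key for key in context if key[1] == category]
    if len(matches) != 1:
        raise LookupError(
            f"{category!r} matches {len(matches)} namespaces; qualify it")
    return context[matches[0]]


# Placeholder data, invented for the example.
context: Context = {
    ("ddl2", "dictionary"): [{"title": "example_title", "version": "1.0"}],
    ("ddlm", "dictionary"): [],
    ("ddl2", "item"): [{"name": "example_item"}],
}

# Mirrors the dREL loop: read from ddl2|dictionary, add to ddlm|dictionary.
for dd in resolve(context, "dictionary", "ddl2"):
    resolve(context, "dictionary", "ddlm").append(
        {"title": dd["title"], "version": dd["version"]})
```

In this sketch the bare name `item` still resolves because only one namespace defines it, whereas a bare `dictionary` raises an error and must be qualified.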
## Comments
1. Use of a namespace for a table identifier is optional (`nident = [ nspace ] ident`).
2. Vertical bar `|` has been chosen as the separator as this leads to an unambiguous grammar.
The only appearance of vertical bar in the grammar is as `||` meaning "bitwise or". `:` (colon)
leads to ambiguity due to its widespread use in other places in the grammar.
3. Namespaces are made available for function calls as DDLm dictionaries can define functions,
which may therefore have namespaces associated with them.
4. DDLm dictionaries already include a `_dictionary.namespace` attribute.
5. This proposal has no implications for the structure or appearance of data names
in data files.
6. This facility is likely to be used only (i) when interoperating with outside
domains, over which we have no naming control, or (ii) when categories are named
identically, even if the data names as a whole are distinct.
7. The productions above have been tested using the [Lerche grammar parser](https://github.com/jamesrhester/Lerche.jl) to create Julia code that successfully transforms between DDL2
and DDLm dictionaries using dREL expressions, disambiguating the "dictionary" and
"category_key" categories that appear in both.
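Comment 2 can be illustrated with a small longest-match tokenizer in Python. This is a sketch under assumptions: the token names `DOUBLEBAR` and `BAR`, the `ID` pattern, and the `tokenize` helper are invented for the example and are not taken from the dREL grammar or from Lerche. Because `||` is tried before `|`, a doubled bar is never read as two namespace separators.

```python
import re
from typing import List, Tuple

# Order matters: "||" is listed before "|" so the longer operator wins.
TOKEN_SPEC = [
    ("DOUBLEBAR", r"\|\|"),
    ("BAR", r"\|"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_$]*"),  # assumed identifier pattern
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))


def tokenize(text: str) -> List[Tuple[str, str]]:
    """Split text into (token_name, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("ddl2|dictionary"))
    print(tokenize("a || b"))
```

With this ordering, `ddl2|dictionary` yields an `ID`, a single `BAR`, and another `ID`, while `a || b` yields one `DOUBLEBAR` token between the two identifiers.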
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148