Re: [ddlm-group] _enumerated_set.table_id
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] _enumerated_set.table_id
- From: James Hester <jamesrhester@gmail.com>
- Date: Wed, 22 Apr 2015 14:45:22 +1000
Hi James,
Yes, we agree that _enumeration_set.table_id can be dropped. I am not sure we agree on whether it should be replaced with something else.
I am prepared to accept these limitations on the data types that can be defined by a DDLm dictionary (including DDLm itself), if indeed DDLm itself and the other existing DDLm dictionaries can be expressed adequately under such constraints:
- The allowed types of values within a list cannot depend on their position in the list
- The allowed types of values within a table cannot depend on their associated keys
These assign primacy to categories / loops for defining complex, heterogeneous data, so that it is unnecessary (I think) to be able to define data types that use lists and / or tables analogously to C structs.
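To illustrate what the two limitations mean for a validator, here is a minimal Python sketch. It is not part of any existing DDLm software; the function names and the single `is_text` predicate are assumptions made for illustration only. The point is that a single content check is applied to every element of a list and to every value of a table, with no dependence on position or key.

```python
from typing import Any, Callable, Mapping, Sequence

# Hypothetical checker type: returns True if a value conforms to a content type.
TypeCheck = Callable[[Any], bool]


def is_text(value: Any) -> bool:
    """Stand-in for a 'Text' content check (assumption: Text maps to str)."""
    return isinstance(value, str)


def check_list(values: Sequence[Any], element_ok: TypeCheck) -> bool:
    """Every element is checked with the same predicate, whatever its position."""
    return all(element_ok(v) for v in values)


def check_table(table: Mapping[str, Any], value_ok: TypeCheck) -> bool:
    """Every value is checked with the same predicate, whatever its key."""
    return all(value_ok(v) for v in table.values())
```

Under these constraints a struct-like value, where the allowed type differs per key, cannot be described by one item; it would be modelled as a category / loop instead.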
I am inclined to think that one of the greater weaknesses of the 2012 version of the DDLm dictionary is its provisions for defining complex data types. They are somewhat inconsistent, and the provided definition text is unclear about exactly how one would go about defining complex data. Moreover, if _type.dimension is intended to be the primary vehicle for defining complex internal structure then it must bear the weight of an entire schema language. That seems to be exactly what it’s trying to do, but the details of that language are by no means adequately documented, and it seems an odd approach given that it’s hosted inside another language that itself can serve as a schema language.
This is what I think we should do:
1. Remove _enumeration_set.table_id. It doesn’t work well for its intended purpose.
2. Redefine _type.dimension so that it is used only to specify the dimension(s) of values of items having _type.container in { 'List', 'Array', 'Matrix' }. Relieve it of any responsibility for defining element types. Possibly remove the ability to define ragged multi-dimensional arrays (which conflict with the proposed limitation that allowed types of values within a list cannot depend on their position in the list).
3. Clarify that when _type.container has value 'Table', _type.contents defines the characteristics of the *values* in the table.
4. Add a replacement mechanism to define constraints on table keys. It might be sufficient, and consistent with the apparent intent of the current dictionary, to establish a parallel to the _enumeration_set category for constraining key values, maybe _key_enumeration_set. It would be a smaller change at the dictionary level, however, to add a mechanism by which constraints on key type could be defined by reference to the type of another item (see also next).
5. Add a mechanism to allow items' content type to be defined by reference to another item. This could be signaled by a new code for _type.contents, with a new attribute defining which other item’s type is to be used. I don’t think that the existing contents code 'Inherited' can serve this purpose, but perhaps I’m mistaken.
(1) the elements of the _import.get List are items of the same type as _import.get_contents_type
(2) _import.get_contents_type is a Table, so _type.contents for it is the type of values in the table i.e. Text
(3) The possible key values are given by the possible values taken by the _type.key_type_reference dataname
save_import.get_contents_type
# ...
_type.purpose 'Internal'
_type.container 'Table'
_type.contents 'Text'
loop_
_table_key_set.state
_table_key_set.detail
'file' 'filename/URI of source dictionary'
'save' 'save framecode of source definition'
'mode' 'mode for including save frames'
'dupl' 'option for duplicate entries'
'miss' 'option for missing duplicate entries'
save_
which results in new attributes _type.key_content_reference, _table_key_set.state and _table_key_set.detail with one internal attribute _import.get_contents_type, and also reduces the non-locality of the definition - that is, one less reference to track through the file. _import.get is admittedly an extreme example, because it is the only occurrence of a list of tables rather than just a table, which is what requires the creation of the 'internal' data attribute. It is, however, a nice demonstration of how the attributes might work for future dictionary writers. The new 'internal' dataname does have some meaning along the lines of 'a single import instruction' so a better dataname might be _import.single. Is there any reason that you introduced a reference in order to specify the table keys? And do you agree that the alternative I've proposed above would also be sufficient?
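As a rough illustration of the alternative above, here is a minimal Python sketch of the check a validator might apply to an _import.get value if the allowed keys were listed directly in the definition via _table_key_set. The function name, the representation of a List as a Python list and a Table as a dict, and the file name in the usage note are all assumptions for illustration, not an existing API.

```python
# Key states copied from the _table_key_set loop in the proposed definition.
ALLOWED_KEYS = {"file", "save", "mode", "dupl", "miss"}


def validate_import_get(value):
    """Return a list of problems found; an empty list means the value passes.

    Assumes a List is a Python list, a Table is a dict, and Text is str.
    """
    if not isinstance(value, list):
        return ["value is not a List"]
    problems = []
    for i, entry in enumerate(value):
        if not isinstance(entry, dict):
            problems.append(f"element {i} is not a Table")
            continue
        for key, val in entry.items():
            # Keys are constrained by the enumerated key set.
            if key not in ALLOWED_KEYS:
                problems.append(f"element {i}: key {key!r} not allowed")
            # Values are constrained only by _type.contents ('Text').
            if not isinstance(val, str):
                problems.append(f"element {i}: value for {key!r} is not Text")
    return problems
```

For example, `validate_import_get([{"file": "example.dic", "save": "some_frame"}])` returns an empty list, while a table with a misspelled key such as `"fle"` is reported.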
On a final note for _import.get, the dREL is broken: it assumes that there is only one value for each of the constituent _import datanames, which would make a list superfluous (only one element), whereas what it really wants to do is create a list from a loop of _import.file etc. values. To do this it needs a sequence number, which isn't defined. Once this *is* defined, we could alternatively present the import instructions as a loop over _import.sequence and _import.single, or else over _import.sequence, _import.file etc.
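The intended behaviour can be sketched in Python rather than dREL. This is only an illustration: it assumes a sequence number exists (it is not currently defined), that each loop row is available as a dict, and the function name and the key "sequence" are hypothetical.

```python
def build_import_list(rows):
    """Build the list of import tables from looped rows.

    Each row is assumed to be a dict holding a hypothetical 'sequence'
    number plus the per-import values ('file', 'save', ...). Rows are
    ordered by sequence and the sequence number itself is dropped.
    """
    ordered = sorted(rows, key=lambda row: row["sequence"])
    return [
        {key: val for key, val in row.items() if key != "sequence"}
        for row in ordered
    ]
```

Without the sequence number there is nothing to fix the order of the list elements, which is why a single-valued treatment can only ever yield a one-element list.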
Allowing types of keys / values to be defined by reference to the types of other items raises the possibility that dictionaries will occasionally want to define items solely for the purpose of defining their content type for reference by other definitions. I don’t think this is harmful, but it might be best supported by a new value for _type.purpose, as demonstrated below.
If all those changes were implemented then the DDLm definition for _import.get might be revised like so:
_type.purpose 'Import'
_type.container 'List'
_type.contents 'Text'
_type.keys 'ByReference'
_type.key_type_reference 'import.get_key_type'
That would require addition of a new attribute to category IMPORT, its definition containing the following (among other necessary attributes not shown):
save_import.get_key_type
# ...
_type.purpose 'Internal' # New value
_type.container 'Single'
_type.contents 'Code'
loop_
_enumeration_set.state
_enumeration_set.detail
'file' 'filename/URI of source dictionary'
'save' 'save framecode of source definition'
'mode' 'mode for including save frames'
'dupl' 'option for duplicate entries'
'miss' 'option for missing duplicate entries'
save_
Additional attributes needed in category TYPE would be _type.keys (accepting the same values as _type.contents where those values describe string data), _type.key_type_reference (containing the _definition.id of the referenced item), and _type.contents_type_reference (not demonstrated; analogous to _type.key_type_reference).
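A minimal Python sketch of how software might resolve the allowed table keys through _type.key_type_reference follows. The in-memory representation of definitions as nested dicts keyed by _definition.id, and the function name, are assumptions for illustration; the attribute names and values are the proposed ones from this message, not part of the current dictionary.

```python
# Hypothetical in-memory form of the two proposed definitions,
# keyed by _definition.id.
definitions = {
    "import.get": {
        "_type.purpose": "Import",
        "_type.container": "List",
        "_type.contents": "Text",
        "_type.keys": "ByReference",
        "_type.key_type_reference": "import.get_key_type",
    },
    "import.get_key_type": {
        "_type.purpose": "Internal",
        "_type.container": "Single",
        "_type.contents": "Code",
        "_enumeration_set.state": ["file", "save", "mode", "dupl", "miss"],
    },
}


def allowed_keys(definition_id, defs):
    """Return the set of permitted key values, or None if unconstrained.

    Follows _type.key_type_reference when _type.keys is 'ByReference' and
    reads the enumerated states of the referenced definition.
    """
    definition = defs[definition_id]
    if definition.get("_type.keys") != "ByReference":
        return None
    target = defs[definition["_type.key_type_reference"]]
    states = target.get("_enumeration_set.state")
    return set(states) if states is not None else None
```

Here `allowed_keys("import.get", definitions)` yields the five enumerated states, at the cost of one extra lookup through the referenced definition.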
John
--
F +61 (02) 9717 3145
M +61 (04) 0249 4148