
Re: [ddlm-group] Updated draft from subgroup discussing encodings

To me that looks very doable.  I would suggest we put the concatenation
into CIF2 and try to get this sort of on-the-fly evaluation into some
early update.  It looks like it would obviate a lot of method calls
and be very readable.

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Sat, 9 Oct 2010, SIMON WESTRIP wrote:

> Dear all
> 
> The following is something that ought to be left until CIF3, but
> perhaps there is no harm in introducing the concept here, in support of a
> string concatenation mechanism (and partly because any feedback would be
> useful to me in exploring ways to include 'richer' content in CIFs for
> publication purposes).
> 
> With the introduction of a string-concatenation mechanism, something I have
> been thinking about for some time has become a real possibility (to my mind
> at least).
> 
> Basically, I've been looking for a mechanism to reference another data value
> from within a data value without having to make the reference using some
> sort of application-specific escape sequence within the value.
> 
> With a concatenation mechanism, something like the following becomes
> possible:
> 
> 
> _publ_text
> ;
> ...Most of this text is written according to traditional cif text markup
> conventions...
> However, here is an equation written in TeX...
> 
> ; _ $_publ_object_value{"data_":"I" "_publ_object_id":"1"} _
> ;
> ...If it were included in TeX directly in this text field there would be no
> way of knowing that the syntax was intended to be interpreted and processed
> as TeX...
> ;
> 
> 
> In this example the publ_object is defined by a loop:
> 
> loop_
> _publ_object_id
> _publ_object_type
> _publ_object_value
> 1 'tex' 'W^{1} = \matrix{1 & 0 & 0 & 0 \cr 0 & 1 & 0 & 0 \cr 0 & 0 & 1 & 0 \cr 0 & 0 & 0 & 1 \cr}'
> ...
> 
> The structure of the 'reference' could be seen as a 'query language' in this
> context:
> 
> $_publ_object_value reads 'VALUE OF _publ_object_value'
> 
> {"data_":"I" "_publ_object_id":"1"} reads "WHERE the data block id is 'I'
> and _publ_object_id is '1'"
> 
> Such queries could be nested when e.g. loops are related by key values.
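A minimal sketch of how a reference of this shape might be resolved. This is only an illustration of the idea above, not part of any CIF2 or dREL specification. The function name `resolve`, the two regular expressions, and the in-memory layout (a dict mapping each data block id to a list of loop rows) are all assumptions made for the example. Nested queries are not handled, and the open question of what to return when several rows match is sidestepped by always returning a list.

```python
import re

# Hypothetical grammar for a reference: $<data name>{"key":"value" ...}
REFERENCE = re.compile(r'\$(_[^\s{]+)\{([^{}]*)\}')
PAIR = re.compile(r'"([^"]*)"\s*:\s*"([^"]*)"')


def resolve(reference, blocks):
    """Return every value of the referenced data name whose row satisfies
    all of the key/value conditions in the reference.

    `blocks` is an assumed layout: {block_id: [row_dict, ...]}.
    """
    m = REFERENCE.fullmatch(reference.strip())
    if m is None:
        raise ValueError(f"not a reference: {reference!r}")
    target, body = m.groups()
    conditions = dict(PAIR.findall(body))

    # The special key "data_" selects the data block; without it,
    # this sketch searches every block.
    block_id = conditions.pop("data_", None)
    if block_id is None:
        candidates = list(blocks.values())
    else:
        candidates = [blocks.get(block_id, [])]

    return [
        row[target]
        for rows in candidates
        for row in rows
        if target in row
        and all(row.get(key) == value for key, value in conditions.items())
    ]


blocks = {
    "I": [
        {"_publ_object_id": "1", "_publ_object_type": "tex",
         "_publ_object_value": r"W^{1} = \matrix{1 & 0 \cr 0 & 1 \cr}"},
        {"_publ_object_id": "2", "_publ_object_type": "tex",
         "_publ_object_value": r"x^{2}"},
    ]
}
found = resolve('$_publ_object_value{"data_":"I" "_publ_object_id":"1"}', blocks)
```

Here `found` holds the single TeX string from the row whose `_publ_object_id` is '1'; a reference to a block that does not exist yields an empty list rather than an error, which is one of several possible design choices.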
> 
> This concept is based on using the $ and assumes that $ has no other
> syntactic use in this context at this level (I suspect it may have a use in
> dREL but haven't found a reference to it in the docs I've seen, nor in the
> DDLm docs)?
> 
> In addition, although it utilizes syntax 'structures' that will already be
> in CIF2 and builds on the concepts of dREL/DDLm, it would still require its
> own 'chapter' in the specification and close scrutiny to ensure that it is a
> robust specification. There are a number of issues to address: for example,
> the value returned by the query must obviously be of the appropriate type
> for the item that invoked the query, or castable to it; and if more than one
> value matches the query, should it be returned using a list structure? And
> so on. Furthermore, it is not the sort of thing that you would expect to be
> implemented 'by hand'.
> 
> However, without a concatenation mechanism, there would be no point in me
> exploring this at all
> (part of my work for the IUCr involves such exploration).
> 
> Cheers
> 
> Simon
> 
> ____________________________________________________________________________
> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> Sent: Saturday, 9 October, 2010 12:24:27
> Subject: Re: [ddlm-group] Updated draft from subgroup discussing encodings
> 
> The change in the handling of an underscore in a tag name (requiring
> at least one more character) is a good idea in any case (whether
> or not the use for concatenation is adopted).  I suggest we put that
> change to a vote promptly and separately.
> 
> As a matter of clean style, the use of whitespace around the underscore
> is certainly a good idea for a compliant CIF2 writer.
> 
> 
> On Sat, 9 Oct 2010, SIMON WESTRIP wrote:
> 
> > I would prefer the first form only, i.e. treat the lonely underscore as a
> > 'keyword' and thus require separation by whitespace. But this preference
> > has more to do with fitting this operator element into the current
> > 'classes' of CIF elements.
> >
> > On a related point, the draft spec states:
> >
> > "A data name begins with an ASCII _ and may be followed by any number of
> > characters within the 2048 character restriction."
> >
> > I think this should read:
> >
> > "A data name begins with an ASCII _ and is followed by one or more
> > characters within the 2048 character restriction."
> >
> > Or words to that effect - especially if the underscore is adopted as an
> > operator.
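The difference between the two wordings can be sketched as a simple check. This is an illustration only: the name `is_data_name` is made up for the example, the character class (any non-whitespace character) is an assumption rather than the draft's actual character set, and the 2048 limit is assumed here to apply to the whole name including the leading underscore.

```python
import re

MAX_DATA_NAME_LENGTH = 2048  # limit quoted in the draft text above

# Revised wording: an underscore followed by ONE OR MORE characters.
# (The original wording would correspond to r"_\S*", which accepts a
# lone underscore.)
DATA_NAME = re.compile(r"_\S+")


def is_data_name(token):
    """Return True if `token` is a data name under the revised wording."""
    return (
        len(token) <= MAX_DATA_NAME_LENGTH
        and DATA_NAME.fullmatch(token) is not None
    )
```

With this rule a lone `_` is no longer a data name, which leaves it free to serve as an operator.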
> >
> > Cheers
> >
> > Simon
> >
> >
> > ____________________________________________________________________________
> > From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> > To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> > Sent: Friday, 8 October, 2010 23:38:10
> > Subject: Re: [ddlm-group] Updated draft from subgroup discussing encodings
> >
> > I think it is.
> >
> > The current form of the proposal, as per your suggestion, is:
> >
> >
> > "string1" _ "string2" or
> > "string1"_"string2"
> > etc.
> >
> > will represent the concatenation of string1 and string2
> > for any quoted strings, string1 and string2 using any
> > of the valid quote marks.
> >
> > The first form does not conflict with any valid cif2
> > or cif1 construct unless we accept underscore by itself as
> > a tag.  The second form does conflict with a cif1
> > quoted string and therefore should not be used if there
> > is any ambiguity as to whether the file in question
> > is a cif1 or a cif2 file.
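A rough sketch of how a reader might evaluate the first (whitespace-separated) form. It is not taken from the draft: the function name `concat_value` and both patterns are assumptions, only single-line strings in plain single or double quotes are handled, and the CIF1 rule that a quote character may appear inside a string when not followed by whitespace is deliberately ignored. The second form, with no whitespace around the underscore, is rejected here because that is the form that conflicts with CIF1.

```python
import re

# Simplified quoted string: no embedded quote characters, single line only.
STRING = re.compile(r'"([^"]*)"' r"|'([^']*)'")
# The operator: a lone underscore with whitespace on both sides.
OPERATOR = re.compile(r"\s+_\s+")


def concat_value(text):
    """Evaluate  "string1" _ "string2" _ ...  to the joined string."""
    text = text.strip()
    parts = []
    pos = 0
    while True:
        m = STRING.match(text, pos)
        if m is None:
            raise ValueError(f"expected a quoted string at offset {pos}")
        # Exactly one of the two groups is set, depending on the quote used.
        parts.append(m.group(1) if m.group(1) is not None else m.group(2))
        pos = m.end()
        if pos == len(text):
            return "".join(parts)
        op = OPERATOR.match(text, pos)
        if op is None:
            raise ValueError(f"expected ' _ ' at offset {pos}")
        pos = op.end()


joined = concat_value('"string1" _ "string2"')
```

Here `joined` is `string1string2`, while `concat_value('"string1"_"string2"')` raises `ValueError`.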
> >
> > On Fri, 8 Oct 2010, SIMON WESTRIP wrote:
> >
> > > Dear all
> > >
> > > "Once we resolve the string concatenation operator issue..."
> > >
> > > Is this issue still on the table?
> > >
> > > Cheers
> > >
> > > Simon
> > >
> > > ____________________________________________________________________________
> > > From: James Hester <jamesrhester@gmail.com>
> > > To: ddlm-group <ddlm-group@iucr.org>
> > > Sent: Tuesday, 5 October, 2010 23:52:08
> > > Subject: [ddlm-group] Updated draft from subgroup discussing encodings
> > >
> > > Dear DDLm group,
> > >
> > > The encoding group that was split off this group and tasked with
> > > developing a mutually satisfactory approach to encodings in CIF2 has
> > > now produced an updated draft of the CIF2 'changes' document.  Brian
> > > has posted this on the IUCr website at
> > > http://www.iucr.org/__data/assets/pdf_file/0016/41911/cif2_syntax_changes_jrh20101005.pdf
> > > The changes relative to the July draft are in section 2 describing the
> > > character set, and some additional text in section 1.
> > >
> > > Once we resolve the string concatenation operator issue, I think we
> > > are in good shape to take CIF2 to COMCIFS for approval.  I would once
> > > again urge anybody with any outstanding issues regarding DDLm or dREL
> > > to bring those issues up as soon as possible.
> > >
> > > James.
> > > --
> > > T +61 (02) 9717 9907
> > > F +61 (02) 9717 3145
> > > M +61 (04) 0249 4148
> > > _______________________________________________
> > > ddlm-group mailing list
> > > ddlm-group@iucr.org
> > > http://scripts.iucr.org/mailman/listinfo/ddlm-group
> > >
> > >
> >
> >
> 
>
