Discussion List Archives

Re: [ddlm-group] Updated draft from subgroup discussing encodings

To me that looks very doable.  I would suggest we put the concatenation
into CIF2 and try to get this sort of on-the-fly evaluation into some
early update.  It looks like it would obviate a lot of method calls
and be very readable.

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Sat, 9 Oct 2010, SIMON WESTRIP wrote:

> Dear all
> 
> The following is something that ought to be left until CIF3, but
> perhaps there is no harm in introducing the concept here, in support of a
> string concatenation mechanism (and partly because any feedback would be
> useful to me in exploring ways to include 'richer' content in CIFs for
> publication purposes).
> 
> With the introduction of a string-concatenation mechanism, something I have
> been thinking about for some time has become a real possibility (to my mind
> at least).
> 
> Basically, I've been looking for a mechanism to reference another data value
> in a data value without having to make the reference using some sort of
> application-specific escape sequence within the value.
> 
> With a concatenation mechanism, something like the following becomes
> possible:
> 
> 
> _publ_text
> ;
> ...Most of this text is written according to traditional cif text markup
> conventions...
> However, here is an equation written in TeX...
> 
> ; _ $_publ_object_value{"data_":"I" "_publ_object_id":"1"} _
> ;
> ...If it were included in TeX directly in this text field there would be no
> way of knowing that the syntax was intended to be interpreted and processed
> as TeX...
> ;
> 
> 
> In this example the publ_object is defined by a loop:
> 
> loop_
> _publ_object_id
> _publ_object_type
> _publ_object_value
> 1 'tex' 'W^{1} = \matrix{1 & 0 & 0 & 0 \cr 0 & 1 & 0 & 0 \cr 0 & 0 & 1 & 0 \cr 0 & 0 & 0 & 1 \cr}'
> ...
> 
> The structure of the 'reference' could be seen as a 'query language' in
> this context:
> 
> $_publ_object_value reads 'VALUE OF _publ_object_value'
> 
> {"data_":"I" "_publ_object_id":"1"} reads "WHERE the data block id is 'I'
> and _publ_object_id is '1'"
> 
> Such queries could be nested when e.g. loops are related by key values.
> 
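
A minimal sketch of how a reference of this shape might be resolved, assuming
a toy in-memory model in which each data block is a list of loop rows.  The
names CIF, parse_reference and resolve are hypothetical and not part of any
CIF, DDLm or dREL specification.  Matches are returned as a list because the
question of how to return multiple matches is still open.

```python
import re

# Hypothetical in-memory model: data block id -> list of loop rows.
CIF = {
    "I": [
        {
            "_publ_object_id": "1",
            "_publ_object_type": "tex",
            "_publ_object_value": r"W^{1} = \matrix{1 & 0 & 0 & 0 \cr 0 & 1 & 0 & 0 "
                                  r"\cr 0 & 0 & 1 & 0 \cr 0 & 0 & 0 & 1 \cr}",
        },
    ],
}

# $<data name>{ "key":"value" "key":"value" ... }
REFERENCE = re.compile(r'^\$(_\S+?)\{(.*)\}$')
PAIR = re.compile(r'"([^"]*)"\s*:\s*"([^"]*)"')


def parse_reference(ref):
    """Split a reference into its target data name and its WHERE conditions."""
    match = REFERENCE.match(ref.strip())
    if match is None:
        raise ValueError(f"not a reference: {ref!r}")
    target, body = match.groups()
    return target, dict(PAIR.findall(body))


def resolve(ref, cif):
    """Return every value of the target item whose row meets all conditions."""
    target, conditions = parse_reference(ref)
    block_id = conditions.pop("data_")  # "data_" selects the data block
    return [
        row[target]
        for row in cif[block_id]
        if all(row.get(name) == value for name, value in conditions.items())
    ]
```

With the loop above, resolve('$_publ_object_value{"data_":"I" "_publ_object_id":"1"}', CIF)
yields a one-element list holding the TeX string.
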
> This concept is based on using the $ and assumes that $ has no other
> syntactic use in this context at this level (I suspect it may have a use in
> dREL, but I haven't found a reference to it in the docs I've seen, nor in
> the DDLm docs)?
> 
> In addition, although it utilizes syntax 'structures' that will already be
> in CIF2 and builds on the concepts of dREL/DDLm,
> it would still require its own 'chapter' in the specification and close
> scrutiny to ensure that it is a robust specification
> (there are a number of issues to address, e.g. the value returned by the
> query must obviously be of the appropriate type
> for the item that invoked the query, or castable; ... if more than one value
> matches the query, should it be returned using a list structure...etc.).
> Furthermore, it is not the sort of thing that you would expect to be
> implemented 'by hand'.
> 
> However, without a concatenation mechanism, there would be no point in me
> exploring this at all (part of my work for the IUCr involves such
> exploration).
> 
> Cheers
> 
> Simon
> 
> ____________________________________________________________________________
> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> Sent: Saturday, 9 October, 2010 12:24:27
> Subject: Re: [ddlm-group] Updated draft from subgroup discussing encodings
> 
> The change in the handling of an underscore in a tag name (requiring
> at least one more character) is a good idea in any case (whether
> or not the use for concatenation is adopted).  I suggest we put that
> change to a vote promptly and separately.
> 
> As a matter of clean style, the use of whitespace around the underscore
> is certainly a good idea for a compliant CIF2 writer.
> 
> =====================================================
> Herbert J. Bernstein, Professor of Computer Science
>   Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
> 
>                 +1-631-244-3035
>                 yaya@dowling.edu
> =====================================================
> 
> On Sat, 9 Oct 2010, SIMON WESTRIP wrote:
> 
> > I would prefer the first form only - i.e. treat the lonely underscore as a
> > 'keyword' and thus require separation by whitespace. But this preference
> > has more to do with fitting this operator element in the current 'classes'
> > of cif elements.
> >
> > On a related point, the draft spec states:
> >
> > "A data name begins with an ASCII _ and may be followed by any number of
> > characters within the 2048 character restriction."
> >
> > I think this should read:
> >
> > "A data name begins with an ASCII _ and is followed by one or more
> > characters within the 2048 character restriction."
> >
> > Or words to that effect - especially if the underscore is adopted as an
> > operator.
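
A minimal sketch of the stricter rule, assuming the 2048-character limit
counts the leading underscore (the quoted wording leaves that open) and that
a data name contains no whitespace.  The names MAX_NAME_LENGTH and
is_data_name are hypothetical.

```python
MAX_NAME_LENGTH = 2048  # assumption: the limit includes the leading underscore


def is_data_name(token: str) -> bool:
    """True if token is '_' followed by at least one non-whitespace character."""
    return (
        token.startswith("_")
        and 2 <= len(token) <= MAX_NAME_LENGTH
        and not any(ch.isspace() for ch in token)
    )
```

Under this rule a lone underscore is never a data name, which leaves it free
to act as an operator.
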
> >
> > Cheers
> >
> > Simon
> >
> >
> > ____________________________________________________________________________
> > From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> > To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
> > Sent: Friday, 8 October, 2010 23:38:10
> > Subject: Re: [ddlm-group] Updated draft from subgroup discussing encodings
> >
> > I think it is.
> >
> > The current form of the proposal, as per your suggestion, is:
> >
> >
> > "string1" _ "string2" or
> > "string1"_"string2"
> > etc.
> >
> > will represent the concatenation of string1 and string2
> > for any quoted strings, string1 and string2 using any
> > of the valid quote marks.
> >
> > The first form does not conflict with any valid cif2
> > or cif1 construct unless we accept underscore by itself as
> > a tag.  The second form does conflict with a cif1
> > quoted string and therefore should not be used if there
> > is any ambiguity as to whether the file in question
> > is a cif1 or a cif2 file.
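
A minimal sketch of evaluating the first (whitespace-delimited) form only.
It assumes plain single- or double-quoted strings on one line and ignores
text fields and any other CIF2 quoting; the names TOKEN and concat_value are
hypothetical.  The second form is deliberately rejected, since it is the one
that conflicts with a CIF1 quoted string.

```python
import re

# A quoted string, or a lone underscore with whitespace on both sides.
TOKEN = re.compile(r"""\s*(?:"([^"]*)"|'([^']*)'|(?<=\s)(_)(?=\s))""")


def concat_value(text):
    """Evaluate "s1" _ "s2" _ ... and return the concatenated string."""
    text = text.strip()
    parts = []
    pos = 0
    expect_string = True  # strings and operators must alternate
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        double_quoted, single_quoted, operator = match.groups()
        is_operator = operator is not None
        if is_operator == expect_string:
            raise ValueError("quoted strings and _ must alternate")
        if not is_operator:
            parts.append(double_quoted if double_quoted is not None else single_quoted)
        expect_string = not expect_string
        pos = match.end()
    if expect_string:
        raise ValueError("value must end with a quoted string")
    return "".join(parts)
```

For example, concat_value('"string1" _ "string2"') gives 'string1string2',
while '"string1"_"string2"' raises ValueError.
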
> > =====================================================
> > Herbert J. Bernstein, Professor of Computer Science
> >   Dowling College, Kramer Science Center, KSC 121
> >         Idle Hour Blvd, Oakdale, NY, 11769
> >
> >                 +1-631-244-3035
> >                 yaya@dowling.edu
> > =====================================================
> >
> > On Fri, 8 Oct 2010, SIMON WESTRIP wrote:
> >
> > > Dear all
> > >
> > > "Once we resolve the string concatenation operator issue..."
> > >
> > > Is this issue still on the table?
> > >
> > > Cheers
> > >
> > > Simon
> > >
> > > ____________________________________________________________________________
> > > From: James Hester <jamesrhester@gmail.com>
> > > To: ddlm-group <ddlm-group@iucr.org>
> > > Sent: Tuesday, 5 October, 2010 23:52:08
> > > Subject: [ddlm-group] Updated draft from subgroup discussing encodings
> > >
> > > Dear DDLm group,
> > >
> > > The encoding group that was split off this group and tasked with
> > > developing a mutually satisfactory approach to encodings in CIF2 has
> > > now produced an updated draft of the CIF2 'changes' document.  Brian
> > > has posted this on the IUCr website at
> > > http://www.iucr.org/__data/assets/pdf_file/0016/41911/cif2_syntax_changes_jrh20101005.pdf
> > > The changes relative to the July draft are in section 2 describing the
> > > character set, and some additional text in section 1.
> > >
> > > Once we resolve the string concatenation operator issue, I think we
> > > are in good shape to take CIF2 to COMCIFS for approval.  I would once
> > > again urge anybody with any outstanding issues regarding DDLm or dREL
> > > to bring those issues up as soon as possible.
> > >
> > > James.
> > > --
> > > T +61 (02) 9717 9907
> > > F +61 (02) 9717 3145
> > > M +61 (04) 0249 4148
> > > _______________________________________________
> > > ddlm-group mailing list
> > > ddlm-group@iucr.org
> > > http://scripts.iucr.org/mailman/listinfo/ddlm-group
> > >
> > >
> >
> >
> 
>
