Re: [ddlm-group] CIF 1.5

Dear Colleagues,

   I respectfully disagree.  Indeed, I strongly disagree with almost every
aspect of the recent CIF 2 decisions.  However, I have had my say.  Let
us finish the design of something and get it into use.


  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769


On Tue, 1 Dec 2009, David Brown wrote:

> Before we close off the discussion on CIF1.5 I just want to put in my final 2 cents
> worth.
> As I mentioned before, I see no point in introducing CIF1.5, which will only muddy the
> waters and lead to total confusion. CIF1.5 is quite unnecessary.
> 1. Legacy software will not be able to read CIF1.5 any more than it will be able to
> read CIF2.0 files, so we might as well go directly to CIF 2.0. And how quickly will
> the legacy software be converted to output CIF1.5 files anyway? You need a 10-year
> lead time for changes in packages such as SHELX and another decade before people
> upload the latest version. Even then they will produce CIF1.1 output that they can
> load into their other programs.
> 2. For the foreseeable future DDLm applications will have to have a CIF1.1 lexer and a
> preparser to convert legacy files into CIF2.0 mode.
> 3. What dictionaries will be used for files written in CIF1.5? It will be difficult
> enough to find volunteers to convert DDL1 to DDLm, and I have no idea if there are
> any plans to convert DDL2 to DDLm dictionaries. If CIF1.5 will use DDLm then why not
> just go straight to CIF2.0?
> 4. CIF2.0 datafiles can look almost exactly like CIF1.1 datafiles except for a few
> datanames and some undelimited data values that include forbidden characters. Most
> people will not notice the difference between CIF2.0 files and regular old fashioned
> CIFs. Indeed many CIF1.1 data files could probably be read in with CIF2.0 parsers
> without a problem. The biggest problem is the DDL2 datanames that contain
> 'U[1][2]', but these are not found in DDL1. Since the whole DDL2 data archive is
> centrally held I assume it could be easily converted (if it was thought
> worthwhile). If there are problems in DDL1 they are confined to one or two
> datanames. Undelimited data values containing illegal characters could be a
> problem. The CIF1.1 lexer and preparser mentioned in 2 above will deal with all of
> these.
> 5. DDLm does not require that its lists, vectors and matrices be entered as arrays.
> dREL allows all of these new CIF2.0 constructs to be reconstituted from their
> primitives as required.
> The future as I foresee it will see everyone carrying on with current software and
> CIF1.1 datafiles as long as they want. CIF2.0 software will be developed to take
> advantage of the new features, but with a CIF1.1 front end to carry out the minimal
> required conversion to CIF2.0, such applications will be able to read all existing
> and future CIFs of every stripe. Eventually CIF1.1 legacy software will die or be
> converted to CIF2.0 and the rest of the world will painlessly convert to CIF2.0
> data files, probably without the user even noticing.
> I think we are imagining monsters lurking behind trees even in a treeless desert.
> CIF1.5 should be dropped and not resurrected, and I am prepared to debate this with
> Herbert privately (so as not to waste everyone else's time) if he is not convinced.
> David
> Brian McMahon wrote:
> Dear Colleagues
> I agree with James. The remit of this group was to finalise DDLm. An
> early conclusion was that this necessarily involved syntax changes at
> the STAR level, and the consequent discussions have revolved around
> the idea of providing a specification for CIF (essentially at the
> syntax level) that took advantage of these syntactic changes and
> allowed uniform handling of CIF data files and DDLm dictionaries. For
> me, the immediate benefit of these discussions has been a much more
> complete account of what needs to be done upstream, at the STAR level,
> to accommodate the changes that are desirable in downstream (CIF and
> DDLm applications) at some point.
> So, for example, the STAR spec needs formally to be revised to allow
> Unicode character sets (certainly UTF-8, which is what we settled on
> for CIF; as far as I recall, it's still possible that the STAR
> revision could allow other Unicode encodings that Herbert needs
> for imgCIF, and I'd be interested in knowing whether the new spec
> could also allow the inclusion of full binary data streams so that
> CBF could properly become one of the STAR family of formats). There
> must also be the new delimiter characters and formal rules for
> handling list items.
> We've developed these conclusions by using various use cases and
> Gedankenexperimente, but we've not, in the main, been driven by the
> need to meet real problems currently difficult of solution in the
> community. Indeed, recent work with embedded visualisation scripts and
> incorporation of TeX mathematical fragments into CIFs destined for
> publication in Acta shows that there's much more that can still be
> achieved within the existing syntactic framework.
> So let us complete the job of finalising the specifications (STAR++,
> DDLm, CIF2.0), and then involve the wider community in discussing
> how, when and if they are to be implemented.
> Brian
> On Tue, Dec 01, 2009 at 02:30:09PM +1100, James Hester wrote:
> Dear Herbert and colleagues,
> Little quibble: I wrote 'one more type' rather than 'more than one type'.
> Anyway, I suggest that we concentrate on finalising CIF2.0 syntax, then put
> a draft out for discussion in the broader community, and if there is
> sufficient feedback to the effect of 'we need an intermediate format', then
> we can address the issue of CIF1.5.  Addressing it now distracts us from the
> task of putting CIF2.0 to bed, which we will still need to do in any case.
> On Tue, Dec 1, 2009 at 11:17 AM, Herbert J. Bernstein
> <yaya@bernstein-plus-sons.com> wrote:
> Dear James,
>  Please look at the following part of your first paragraph:
> "with a commitment to support CIF1.1 for the long term and a guaranteed way
> to distinguish the two types of data files."
> and please look at the following part of your second paragraph
> "Furthermore, they now have to support one more type of file going into the
> future."
> I seem to be missing something.  If we are going to support CIF 1.1 for
> the long term and we are going to have CIF 2 be a very different file type,
> then it is not CIF 1.5 that will cause software developers to have
> to support one more file type going into the future, but the fundamental
> decisions made by this group.
> If you support CIF 1.1 and a very different CIF 2, then you are going to
> end up with mixed files, i.e. multiple ad hoc CIF 1.5 (or actually CIF
> 1.55) files.  All I am doing is proposing to formalize what is going to
> happen anyway.
>  I've had my say.
>  Regards,
>    Herbert
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>   Dowling College, Kramer Science Center, KSC 121
>        Idle Hour Blvd, Oakdale, NY, 11769
>                 +1-631-244-3035
>                 yaya@dowling.edu
> =====================================================
> On Tue, 1 Dec 2009, James Hester wrote:
> (Note to those reading this later: this continues a thread started within the
> 'space as list item separator' thread.  I recommend reading those messages
> before continuing on here).
> (For those who came in late:
> We flirted with the idea of a minimally disruptive path from CIF1.1 to CIF2.0
> back in the beginning of this group (late September/early October, I believe),
> and ended up choosing to define one maximally disruptive CIF2.0 standard
> together with a commitment to support CIF1.1 for the long term and a guaranteed
> way to distinguish the two types of data files.)
> Picking up the CIF1.5 discussion...
> Introducing CIF1.5 is a further source of confusion.  Apart from this, it
> produces extra workload for software authors.  Herb has essentially defined
> CIF1.5 as CIF1.1 plus new syntactical elements (or in other words CIF2.0 minus
> character set limitations and UTF8).  So in order to support CIF1.5, authors of
> both CIF reading and CIF writing software have to add this new syntax.  Then
> when they decide to support CIF2.0, they have to once again revisit their
> software.  I would have thought it far more sensible to ask them to update and
> distribute their software only once.  Furthermore, they now have to support one
> more type of file going into the future.
> I see absolutely no benefit in this idea.
> On Tue, Dec 1, 2009 at 9:40 AM, Herbert J. Bernstein
> <yaya@bernstein-plus-sons.com> wrote:
>      Dear James,
>       The point is that we will need to make it easy for people working with
>      CIF 1 and CIF 1.1 based tools to cobble together valid CIF 2 data.  The
>      most important bit will be a way to include vectors and matrices in their
>      data.  This will allow them to do it.
>       Please note that it has taken several years to just get to the point
>      where we are beginning to rigorously define CIF 2.  If we are lucky, it
>      will only take a few years to have a full set of tools to allow users
>      and software writers to reliably produce true CIF 2 data.
>       Regards,
>         Herbert
>      =====================================================
>       Herbert J. Bernstein, Professor of Computer Science
>        Dowling College, Kramer Science Center, KSC 121
>             Idle Hour Blvd, Oakdale, NY, 11769
>                      +1-631-244-3035
>                      yaya@dowling.edu
>      =====================================================
> On Tue, 1 Dec 2009, James Hester wrote:
>      Dear Herbert: as CIF 1.1 doesn't define lists, I'm not sure why you
>      suggest that the example below is a valid tag.
>      On Tue, Dec 1, 2009 at 12:36 AM, Herbert J. Bernstein
>      <yaya@bernstein-plus-sons.com> wrote:
>           Sorry something got lost in the prior message.  It should have
>           read:
>                 Dear Colleagues,
>                  Back to the question of commas.  If you accept the
>                  desirability of having a CIF 1.5, commas in lists
>                  become very useful. Someone with
>                  a CIF 1.1 editor will be able to prepare a CIF 1.5 file
>                  for many useful cases by doing all lists with commas
>                  and no embedded blanks as long as they can make their
>                  lists fit on single lines.
>                  In CIF 1.1
>                 [[1,2,3],[4,5,6],[7,8,9]]
>                 is a valid value for a tag, but
>                 [[1 2 3] [4 5 6] [7 8 9]]
>                 is not.
>      No, neither example is a valid CIF 1.1 tag.  CIF 1.1 explicitly
>      excludes brackets as the first character of a non-delimited string.
>                 Having the option of commas in lists will help to smooth
>                 the transition for at least some people.
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
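
James's point in the oldest message of the thread, that CIF 1.1 excludes brackets as the first character of a non-delimited string, can be illustrated with a rough check. This is a minimal sketch and not a conforming CIF 1.1 lexer: the function name `is_cif11_unquoted` and its `at_line_start` parameter are hypothetical, the set of forbidden leading characters is taken from my reading of the CIF 1.1 syntax, and reserved words (`data_`, `loop_`, and so on) and character-set limits are deliberately ignored.

```python
# Characters that (by assumption, from the CIF 1.1 grammar) may not begin
# an unquoted, i.e. non-delimited, data value.
FORBIDDEN_FIRST = frozenset("_#$'\"[]")


def is_cif11_unquoted(value: str, at_line_start: bool = False) -> bool:
    """Rough test of whether `value` could be written without delimiters.

    Simplifications: reserved words and non-printable characters are
    not checked; only the leading character and whitespace are examined.
    """
    if not value:
        return False
    # An unquoted value is a single whitespace-free token.
    if any(ch.isspace() for ch in value):
        return False
    # Brackets, quotes, '_', '#', and '$' cannot open an unquoted value.
    if value[0] in FORBIDDEN_FIRST:
        return False
    # A semicolon in column one would open a multi-line text field instead.
    if at_line_start and value[0] == ";":
        return False
    return True


# Both of the examples from the thread fail, because each begins with '['.
print(is_cif11_unquoted("[[1,2,3],[4,5,6],[7,8,9]]"))   # False
print(is_cif11_unquoted("[[1 2 3] [4 5 6] [7 8 9]]"))   # False
# A comma-separated token without the leading bracket passes this check.
print(is_cif11_unquoted("1,2,3"))                       # True
```

Under these assumptions, the comma form and the space form are rejected for the same reason, so the presence or absence of embedded blanks is not what decides validity here.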
