

Re: [ddlm-group] New syntax: 'marker' characters

I am away at a meeting.  I'll try to comment on all this when I
get back, if you all have not resolved it. -- Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Thu, 5 Nov 2009, James Hester wrote:

> Hi Joe and others: based on what Joe reports, it would seem pointless
> to add markers to the syntax, as an application can simply do a first
> quick non-tokenising run through a CIF file and create an index for
> itself if it desires efficient data extraction.
>
> I'll take up the theme of ordering elsewhere.
>
> James.
>
> On Thu, Nov 5, 2009 at 5:53 AM, Joe Krahn <krahn@niehs.nih.gov> wrote:
>
>> I use a regexp that properly handles all comments and quoting types in
>> CIF1, so it does not just search for the 'data_' sub-string.
>>
>> I am actually surprised that it could parse so quickly. However, this
>> requires parsing characters without tokenizing them; a data block is a
>> single regexp. A normal CIF parser may not be designed to parse without
>> tokenizing and storing values, so it may take some redesign to get the
>> same performance even in a compiled program.
>>
>> Of course, there is no reason why a given CIF implementation could not
>> use comments as hints for faster parsing. Even granting the above
>> argument that fast parsing is possible, reading a large network-mounted
>> file could still be slow, simply because the intervening file data has
>> to be read. However, putting the hints in comments means that they do
>> not need to be part of the CIF spec.
>>
>> Ordering hints are a bit different, because they affect more than just
>> comments. Currently, order is supposed to be irrelevant, so you could
>> claim that it is also just a performance hint. I have always thought
>> that a canonical order is useful. Most CIF software writes out "pretty"
>> formatted text because organization is useful when being viewed by a
>> human. Herbert's suggestion is to make preferred ordering an integral
>> part of the DDL, but to avoid incorporating ordering rules into the CIF
>> syntax. Then people who want ordering rules can use them, without
>> complicating the CIF spec.
>>
>> Joe Krahn
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>
>
>
>
> -- 
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
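
Editorial illustration (not part of the original messages): the "quick non-tokenising run" that James and Joe describe could look roughly like the sketch below. It is not Joe's actual regexp, which is not shown in the thread. It assumes simplified CIF1 lexical rules (comments start with '#' at the start of a token, quoted strings stay on one line and close at a quote followed by whitespace, text fields are delimited by ';' in column one), and the names _SCAN and index_data_blocks are hypothetical. A single alternation skips over comments, quoted strings and text fields without storing their values, and records only where each data_ header begins.

```python
import re

# One alternation, tried left to right at each position. Only the last
# branch is of interest; the others exist so that a 'data_' substring inside
# a text field, comment, or quoted string is consumed and never reported.
# Assumption: simplified CIF1 lexical rules, not a full grammar.
_SCAN = re.compile(
    r"""
      ^;.*?^;                        # text field: ';' in column one to next such ';'
    | (?<!\S)\#[^\n]*                # comment: '#' at token start to end of line
    | (?<!\S)'[^\n]*?'(?!\S)         # single-quoted string, closed by quote + whitespace
    | (?<!\S)"[^\n]*?"(?!\S)         # double-quoted string, same closing rule
    | (?<!\S)(?P<block>data_\S+)     # data block header at token start
    """,
    re.MULTILINE | re.DOTALL | re.IGNORECASE | re.VERBOSE,
)


def index_data_blocks(text):
    """Return a list of (block_name, offset) pairs, in file order.

    The offset is the character position of the 'data_' header in `text`,
    so a caller can later seek to a block without re-reading earlier ones.
    No values are tokenised or stored.
    """
    index = []
    for match in _SCAN.finditer(text):
        header = match.group("block")
        if header:
            index.append((header[len("data_"):], match.start("block")))
    return index


if __name__ == "__main__":
    sample = (
        "# comment mentioning data_fake\n"
        "data_first\n"
        "_note 'quoted data_alsofake value'\n"
        "_text\n"
        ";\n"
        "data_inside_text_field\n"
        ";\n"
        "data_second\n"
        "_x 1\n"
    )
    for name, offset in index_data_blocks(sample):
        print(name, offset)
```

Whether such a scan is fast enough in practice depends on the file and the regexp engine; the sketch only shows that block boundaries can be located without a full tokenising parse.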
