Re: Problems with CIF BNF
- To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <[email protected]>
- Subject: Re: Problems with CIF BNF
- From: "Herbert J. Bernstein" <[email protected]>
- Date: Mon, 12 Mar 2007 22:12:08 -0400
- In-Reply-To: <[email protected]>
- References: <[email protected]><a06230902c21b5216f07d@[192.168.10.211]><[email protected]>
You may find the following links helpful:
http://arcib.dowling.edu/cifiucr/
especially
http://www.bernstein-plus-sons.com/software/ciftest/
and
http://arcib.dowling.edu/vcif/
At 4:30 PM -0400 3/12/07, Joe Krahn wrote:
>I realize that there are a few hacks in the BNF to deal with
>context-dependence, like productions defined as multiple symbols, which
>make it impossible to use as a working BNF. But there are other
>problems with the grammar. With the end-of-line example, the lexer can do
>something 'sensible', but it is still important to have a specific
>definition of whether missing a terminal <eol> makes the CIF invalid.
>
>I can look at CBFlib to see an interpretation of the CIF grammar, but
>someone else's parser may have a different interpretation. In fact, it
>would be good to have a collection of unusual CIF files for parser
>testing, with a consensus as to which ones are valid and which are invalid.
>
>Joe
>
>Herbert J. Bernstein wrote:
>> Without a defined lexer, you cannot do CIF as a BNF; it is context
>> sensitive in its use of whitespace. The question you are raising
>> about EOF should be handled by the lexer, which should deal sensibly
>> with the usual Unix problem of disambiguating the case of a final
>> line that ends with eof rather than eol-eof. There is a rather
>> complete bison grammar in CBFlib working on the level of tokens
>> after lexing the input. -- HJB
>>
>>
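One way a lexer could handle the EOF-versus-eol-EOF case described above is to synthesize a final end-of-line token, so the grammar sees the same token stream either way. The following is a minimal sketch under stated assumptions, not CBFlib's implementation: the token names (`LINE`, `EOL`, `EOF`), the function `lex_lines`, and the choice to accept `\n`, `\r\n` and `\r` as line terminators are all hypothetical. It illustrates only one possible policy; whether a missing final <eol> should instead make the file invalid is the question the thread leaves open.

```python
from typing import Iterator, Tuple

# Hypothetical token shape for this sketch: (kind, text).
Token = Tuple[str, str]


def lex_lines(source: str) -> Iterator[Token]:
    """Yield LINE and EOL tokens, then exactly one EOF token.

    Policy assumed here: a final line that ends at EOF without a
    terminator is treated as if it ended with eol followed by EOF.
    """
    # Assumption: treat \r\n and bare \r as equivalent to \n.
    text = source.replace("\r\n", "\n").replace("\r", "\n")

    # Synthesize the missing terminator on the last line, if any.
    if text and not text.endswith("\n"):
        text += "\n"

    # After normalization every line ends in "\n", so the last
    # element produced by split() is always empty and is dropped.
    for line in text.split("\n")[:-1]:
        if line:
            yield ("LINE", line)
        yield ("EOL", "\n")

    yield ("EOF", "")
```

With this policy, `list(lex_lines("data_x"))` and `list(lex_lines("data_x\n"))` produce identical token streams, so a token-level grammar never has to mention EOF as whitespace. A stricter lexer could raise an error at the point where the terminator is synthesized.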
>> At 1:44 PM -0400 3/12/07, Joe Krahn wrote:
>>> Some parts of CIF are vague. I hoped that the BNF syntax would be a
>>> precise syntax specification, but it has problems. It is central to
>>> properly defining the CIF format, and should therefore be very accurate.
>>>
>>> First, there are some plain syntax errors, like unbalanced braces in the
>>> production of <Float>, and an empty token in the TokenizedComments
>>> production.
>>>
>>> There are also a few hacks like <noteol>, and the lack of rules for the
>>> content of quoted strings. I think it is also a hack for a production
>>> unit to be defined for two elements, like "<eol><UnquotedString>".
>>>
>>> Does EOF count as whitespace? Normally, a text file ends with an <eol>
>>> on the last line, so it is not a problem. With Fortran, you may not be
>>> able to distinguish between them, so it seems that EOF probably should
>>> count as a whitespace token.
>>>
>>> There are also places where the grammar could be simplified, such as:
>>>
>>> { {'e' | 'E' } | {'e' | 'E' } { '+' | '- ' } } <UnsignedInteger>
>>>
>>> written as:
>>> {'e' | 'E' } { '+' | '-' }? <UnsignedInteger>
>>>
>>> Also note the error in the first form copied from the web page: the
>>> minus sign has a space included.
>>>
>>> Should the logical-OR symbol always be contained within braces? This
>>> appears to be inconsistent, but maybe the rule is to require braces when
>>> the members include a quoted character element.
>>>
>>> I will try to edit my own version of the BNF to produce what I think it
>>> is supposed to mean. Answers to some of the above questions will be
>>> helpful in getting it right.
>>>
>>> Thanks,
>>> Joe Krahn
>>> _______________________________________________
>>> comcifs mailing list
>>> [email protected]
>>> http://scripts.iucr.org/mailman/listinfo/comcifs
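The claim in the message above that the two exponent productions are equivalent can be checked mechanically by translating each into a regular expression. This is a rough sketch, assuming that <UnsignedInteger> means one or more decimal digits; the names `ORIGINAL`, `SIMPLIFIED`, `AS_PUBLISHED` and the sample strings are made up for illustration and are not part of the CIF specification.

```python
import re

# Assumption: <UnsignedInteger> is one or more decimal digits.
UNSIGNED = r"[0-9]+"

# Long form with the stray space removed:
#   { {'e'|'E'} | {'e'|'E'} {'+'|'-'} } <UnsignedInteger>
ORIGINAL = re.compile(rf"(?:[eE]|[eE][+-]){UNSIGNED}")

# Proposed short form:
#   {'e'|'E'} {'+'|'-'}? <UnsignedInteger>
SIMPLIFIED = re.compile(rf"[eE][+-]?{UNSIGNED}")

# Long form read literally, with '- ' (minus followed by a space).
AS_PUBLISHED = re.compile(rf"(?:[eE]|[eE](?:\+|- )){UNSIGNED}")

# Hypothetical sample inputs, both well-formed and malformed.
SAMPLES = ["e5", "E5", "e+5", "E-12", "e", "e+", "5", "e- 5", "ee5", "e+-5"]


def same_verdict(text: str) -> bool:
    """True if both forms agree on whether text is a valid exponent."""
    return bool(ORIGINAL.fullmatch(text)) == bool(SIMPLIFIED.fullmatch(text))


def accepted(pattern: re.Pattern, samples: list) -> list:
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if pattern.fullmatch(s)]
```

On these samples the long and short forms agree everywhere. The literal reading of the published form differs: it rejects `e-5` and accepts `e- 5`, which is the effect of the stray space noted in the message. Agreement on a handful of samples is a sanity check, not a proof of equivalence.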

