
Re: [ddlm-group] Relationship of CIF2 to legacy platforms

For clarity, I propose 2048 for consistency with the current spec. StarBase
used an 8192-byte buffer. I am not tied to 2048, but I agree it needs to be
some fixed number.
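
As an illustration only, here is a minimal Python sketch of how a CIF writer or
checker might enforce a fixed limit of this kind. The name overlong_lines and
the constant MAX_LINE_LENGTH are hypothetical, the value 2048 is simply the
number proposed above, and counting characters rather than bytes is an
assumption that the thread has not settled.

```python
MAX_LINE_LENGTH = 2048  # value proposed in this thread; an assumption, not a ratified limit


def overlong_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) for every line longer than `limit` characters.

    Trailing blanks are included in the count. Note that str.splitlines()
    also splits on a few Unicode line boundaries besides LF and CRLF.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]
```

A writer could call overlong_lines() on its output before saving and refuse to
emit the file if the returned list is not empty.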

On 19/11/09 10:34 AM, "James Hester" <jamesrhester@gmail.com> wrote:

> We should resolve the Fortran line length issue as I think we've got
> enough information on the table - could those who haven't indicated
> their preference please vote either
> (1) CIF2 should have a maximum line length specified or
> (2) no line length should be specified.
> For bonus points, you can indicate what this length should be.
> So (including Nick's recent email) I count the votes as:
> (1) Herbert (>=2048), Nick (2048), James (4096)
> (2) Joe
> I've added my vote to the fixed line length simply because I accept
> Herbert's argument that legacy Fortran programs are actually important
> in the crystallographic world, and a restriction on line length does
> not impose a burden on CIF readers.  It also imposes a bit of
> discipline on CIF writers and helps to produce a readable file.
> On Fri, Nov 13, 2009 at 3:47 AM, Joe Krahn <krahn@niehs.nih.gov> wrote:
>> Nick Spadaccini wrote:
>>> On 3/11/09 12:53 AM, "Joe Krahn" <krahn@niehs.nih.gov> wrote:
>>>> Herbert,
>>>> I am only suggesting that maintained Fortran code ought to be able to
>>>> utilize F2003 STREAM I/O, supported by current versions of GFortran,
>>>> Intel Fortran and Sun Fortran.
>>>> Of course, I probably am not considering all of the issues. STREAM I/O
>>>> avoids the need for a fixed maximum record length, but even the newest
>>>> Fortran compilers have very limited UTF-8 support. Even with STREAM I/O,
>>>> it is not trivial to count trailing blanks as significant.
>>>> Maybe the biggest problem is UTF-8. IMHO, it makes sense for UTF-8 to be
>>>> an optional encoding, rather than just declaring CIF2 is all UTF-8. This
>>> Not sure what you gain by doing this. If it is pure ASCII only then the
>>> declaration of UTF-8 inhibits nothing, since ASCII is a subset. If it is not
>>> pure ASCII, then it needs to be UTF-8. I can't see how knowing in advance
>>> that it is a subset of UTF-8 or possibly the full set of UTF-8 gives you
>>> anything.
>>> cheers
>>> Nick
>> A compiler/language not aware of UTF-8 could avoid errors by rejecting
>> CIF files that contain UTF-8. However, I think the approach being taken
>> is just to allow implementations to restrict usage, rather than put it
>> in the specifications. For example, the plan seems to be that
>> DDL/dictionary definitions will be used to avoid UTF-8 in data names,
>> where it is most likely to be a problem. So, you are right: there is no
>> reason for the CIF2 syntax to make UTF-8 optional when the dictionaries
>> can restrict characters to the ASCII subset.
>> The other potential legacy issues I know of are fixed maximum line
>> lengths, and significant trailing blanks. Dictionary definitions cannot
>> avoid these. It might be possible to take a similar approach, by
>> avoiding them by implementation conventions rather than making it part
>> of the spec. If these are only going to be an issue for a few more
>> years, it would avoid having to make another syntax change in the near
>> future.
>> My main interest here is to avoid incompatible implementations. I also
>> think that Fortran, and any other line-oriented I/O software, should be
>> able to do stream-oriented I/O in the near future.
>> Joe
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group



Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
MBDP  M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au
