Line Separators
- To: Multiple recipients of list <imgcif-l@bnl.gov>
- Subject: Line Separators
- From: Andy Hammersley <hammersl@esrf.fr>
- Date: Thu, 29 Feb 1996 08:20:50 -0500 (EST)
Hello,

I think that we're almost at a stage where the best way forward might be to prepare a detailed specification document of the proposed data format and centre discussion around the specification.

However, I think that the manner in which "lines" are separated in the header section deserves some careful consideration. (Please remember that the file is a BINARY file, and should not be confused with an ASCII text file. The following is only an attempt to make conversion of the header section to an ASCII text file, or viewing it with some editors, as simple as possible.)

I know of three basic ways in which different operating systems store ASCII text files (there are probably variants of at least some of these methods):

1. Variable-length records, where typically the first two bytes of each record specify the length of the record. This is how VMS used to store ASCII text, and I guess other "old" operating systems did the same. (This seems the most elegant to me, as programs do not need to examine the bytes in a line once they know that they want to jump to another line.)

2. "Stream-LF": the ASCII text is one long byte stream, and the line-feed character (ASCII byte value 10) is used to signal the end of a "line". This is the normal Un*x method, but VMS can recognise such files. (This seems to me the simplest method, but an inefficient one.)

3. On DOS, ASCII text is stored in a manner similar to "Stream-LF", but with an additional carriage-return character (ASCII byte value 13) before the line-feed. (If this has a name, I'm sorry, but I don't know it.)

So far we have talked about a header section which would follow the "Stream-LF" approach, but maybe it would be better to follow the DOS approach. Variable-length records, despite their elegance, seem to be gradually disappearing, and are very different from either the Un*x or DOS approaches. In this "modern" (?) world I don't see the variable-length record approach as being a viable choice.

I see three reasonable alternatives:

i.   Use ONLY Stream-LF
ii.  Use ONLY the DOS approach
iii. Allow either Stream-LF or the DOS approach to be used.

iii. is possible, but would complicate the task of parsing a file, e.g.

    if (no_carriage_return) then
       Do something
    else
       Do something slightly different, presumably jumping over the CR
    end if

As it makes the format more complicated, I think iii. is best avoided.

At the ESRF, this point was discussed and the following conclusion drawn:

"A DOS text file can be viewed as a Stream-LF file on a Un*x system, and the extra carriage-returns just make it look slightly messy (^M's I think). However, if a DOS editor looks at a file without the carriage-returns the result is far worse." (I'm not sure what happens, but this is what I am told.)

Thus it was concluded that taking the DOS approach was better. (Of course you can also wonder about which O.S. will be dominant in 10 years' time.)

Any other comments or suggestions?

Andy Hammersley
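
For illustration only, the extra branch that alternative iii. asks of a parser might look roughly like the following Python sketch. The function name split_header_lines and the sample bytes are hypothetical and not part of any proposed specification; the sketch simply assumes the header is already available as one byte string.

```python
from typing import List


def split_header_lines(header: bytes) -> List[bytes]:
    """Split header bytes into lines, accepting LF or CR LF separators.

    Hypothetical helper: it illustrates alternative iii. (either
    convention allowed), not a definitive parser for the format.
    """
    lines = []
    for raw in header.split(b"\n"):      # LF (byte value 10) ends every line
        if raw.endswith(b"\r"):          # DOS style: CR (byte value 13) before LF
            raw = raw[:-1]               # jump over the CR
        lines.append(raw)
    # A header that ends with a separator leaves one empty trailing item.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


# Both conventions give the same result for these made-up sample bytes.
print(split_header_lines(b"first line\r\nsecond line\r\n"))
print(split_header_lines(b"first line\nsecond line\n"))
```

Under alternative i. or ii. the test for a trailing carriage-return disappears, because the separator is then a single fixed byte sequence.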