(19) DDL types, filename handles and other matters
- To: COMCIFS@uk.ac.iucr
- Subject: (19) DDL types, filename handles and other matters
- From: bm@uk.ac.iucr (Brian McMahon)
- Date: Tue, 8 Feb 94 12:35:16 GMT
Dear Colleagues

Please forgive the delay since the last circular. This may be seen as
something of a blessing in disguise, but there are a couple of points I
should have brought out sooner.

Peter Murray-Rust sent the following inquiry:

PMR> How's things? I've just got the included message from Chris
PMR> Sander and I thought I'd better check with you before agreeing - as I'm a
PMR> new boy on COMCIF. How does this fit in with other efforts? And what is
PMR> the view of the committee? I think it arises as I agreed in general
PMR> terms to publicise the mmCIF in the biological newsgroups when it became
PMR> announceable. I haven't seen any mail recently so I'm not sure what
PMR> timescale we are at.
PMR>
PMR>> ---------- Forwarded message ----------
PMR>> Subject: COMCIF
PMR>>
PMR>> Dear Peter, just saw your message to Rob and thought of saying hello.
PMR>>
PMR>> We've been thinking off and on about summarizing the discussion
PMR>> of possible CIF extensions and hope that between you, Phil and
PMR>> us (Michael Scharf here with me) a discussion document can
PMR>> be drafted within the not too distant future. What do you think ?
PMR>>
PMR>> Best regards, Chris Sander

I should suppose that the thought police of COMCIFS would not go so far as
to block anyone from free and open discussion! It would probably be useful
for groups who set up discussion lists or e-mail conferences to invite
someone from COMCIFS to listen in, so that feedback could be supplied if
there was any likelihood of people putting forward proposals that might run
counter to the standard; but it would surely be beneficial to have a wider
community of people involved in software development working together. The
York and Tarrytown meetings sparked off a lot of useful ideas, but in the
longer term there will likely be input from other groups with different
interests and areas of expertise.
Phil Bourne is already doing an excellent job of coordinating feedback from
the mmCIF meetings, which is why I think it useful that he also has a place
in our discussion forum. Any other comments?

I hope that I can make some comments on the timescale for development of the
draft dictionaries later this week.

===============================
Agreements

A10.6 Any printable ASCII character other than white space will be permitted
in a CIF dataname. Only the leading underscore character '_' is
syntactically important. This is a direct application of the STAR principle,
as outlined by Syd:

S> The latest STAR specs paper in JCICS is in press and no character
S> restrictions other than <white space> and underscore exist
S> for data names. Ipso facto, no such restrictions exist for CIF.

For the record, we need to have an official definition of 'white space'. The
ASCII characters recognised as white space in the C Standard are: space,
form feed, newline, carriage return, horizontal tab and vertical tab.

We may also need an official definition of the convention for terminating a
line (the idea of lines in CIF arises from the rule that "Lines may not
exceed 80 characters"). A CIF on a Unix system has a newline character at
the end of every line; when transferred to a PC by ftp this newline will be
maintained if the transfer is in binary mode, but translated into a carriage
return/newline pair otherwise. I propose that the convention be that lines
in a CIF contain no more than 80 printable ASCII characters excluding the
terminating whitespace characters used by the local computer system. In
practice, it would be well to stop a character or two short of this. A line
written on a PC is terminated by ^M^J as the single end-of-line flag. If
transferred to a Unix system by binary ftp, the ^J maps to a newline, but
the ^M is incorporated as an extra white-space character on the line.
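
The dataname rule of A10.6 and the proposed line-length convention can be
illustrated with a minimal sketch. It is not part of the agreement itself.
The function names are hypothetical, the requirement of at least one
character after the underscore is an assumption, and the only terminator
residue handled is a trailing carriage return left by a PC-style ^M^J.

```python
from typing import List

# White space as recognised by the C Standard: space, form feed, newline,
# carriage return, horizontal tab and vertical tab.
C_WHITESPACE = " \f\n\r\t\v"
MAX_LINE = 80


def is_valid_dataname(name: str) -> bool:
    """Check a dataname: leading '_' then printable non-white-space ASCII.

    Assumption: at least one character must follow the underscore.
    """
    if len(name) < 2 or not name.startswith("_"):
        return False
    return all(
        0x20 <= ord(ch) <= 0x7E and ch not in C_WHITESPACE for ch in name
    )


def overlong_lines(data: bytes, limit: int = MAX_LINE) -> List[int]:
    """Return 1-based numbers of lines longer than `limit` characters.

    The line terminator is not counted: lines are split on newline and a
    trailing carriage return (from a ^M^J pair) is stripped first.
    """
    bad = []
    for number, raw in enumerate(data.split(b"\n"), start=1):
        line = raw.rstrip(b"\r")
        if len(line) > limit:
            bad.append(number)
    return bad
```

With this sketch, a PC-written line of exactly 80 characters followed by
^M^J is accepted even after a binary transfer to Unix, because the stray ^M
is treated as part of the terminator and not as content.
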

(Syd: May one presume that the formal BNF syntax of STAR is not affected by
these considerations, since the STAR format is considered as a byte stream
and not a sequence of records?)

A15.1 Standard prefixes
-----------------------
S> My only small contribution is that _xtal_* data names have existed in our
S> files for three years. This may also be true for _shelx_* items as well. It
S> probably would be sensible to tell COMCIFS that one has done this, but I
S> think any restrictions beyond informal notification is overkill.

=====================================
Current discussion topics

D15.1 New types
---------------
S> D15.1 At the time Tony Cook and I formulated _type_conditions we realised
S> that it was opening Pandora's box! We just wondered how long before the
S> enumeration list exceeded two score and ten! Judging from the mail it
S> won't be too long.
S>
S> OK, that's the intention of this attribute, but I must remind everyone that
S> core DDL definitions become part of the STAR restrictions and we are going
S> to look at these very carefully before including them in the initial DDL
S> core specifications. For example, the _enumeration_constraint (_construct)
S> item is as yet half-baked (as you point out) and will need much more work
S> and testing before it could be included in the core DDL definitions.

I hastened to point out to Syd that the _enumeration_constr*t was even less
than half-baked, but was intended to illustrate purely schematically how
such details might be encoded in a machine-parseable form. People will be
working on these ideas, but externally to the core DDL: they will become
definite proposals only when they have been demonstrated to work.

S> The 'date' and 'bool' conditions seem logical enough but do we really want
S> to freeze a date construction into STAR? We have accepted this construct for
S> CIF and it seems pretty logical but are there better ones?
S> I personally feel
S> a bit nervous about acting as God on such celestial matters. Also 'bool' is
S> interesting and logical but what about all of the existing definitions
S> involving (yes|no) -- presumably they remain as char (no-bool)!?
S>
S> So may this debate flourish but please take into account the global nature
S> of these items. If the proposers of additional _type_conditions enumeration
S> values convince Tony and me that something is a must for the core DDL, we
S> will include it in the initial publication (if they wish it). Otherwise the
S> extra enumeration values (and perhaps su/esd is in this category also) will
S> have to await extensions to the DDL for specific applications such as CIF.

A16.1 - Reopened by (18)D15.1
-----------------------------
B> David's comment on creating a date/time type is excellent. I would
B> be happy to adopt the "yyyy.mm.dd-hh:mm:ss" form. (I will be less
B> happy to make all the changes in the pd dictionary, but it would
B> clearly be an improvement.)
B>
B> I would appreciate comments on this in a rapid time frame as I would
B> like to see a decision before completing the next draft of the dictionary.

D16.1 e.s.d./s.u.
-----------------
S> Well, David put over my point succinctly. I am not too sure how to
S> interpret Howard's reply....these recommendations are in the pipeline. If
S> these recommendations are made then it will take a considerable time for
S> them to be adopted by the journals and the community. My inclination on
S> seeing these comments is to leave su/esd out of the core DDL entirely, or
S> at least until the matter is resolved.

D17.2 Revised DDL
-----------------
B> For the record I am quite comfortable with Brian's <iucr/mm/restr.lst>
B> syntax as is. But I am still concerned about namespace uniqueness, unless
B> we set rules for naming. My reason for including a date/time in the file
B> name was not for version tracking but to better insure uniqueness.
B> The name <private/xplor/defaults> can probably be counted on as being
B> unique but how about <private/smith/defaults> or <private/brown/defaults>?
B> (Interesting question: what is the most common surname for
B> crystallographers, stay tuned for the electronic world directory).

(Currently "Wang", with "Tanaka" in second place - but we haven't yet got
the UK and USA entries.)

B> Perhaps we can let the uniqueness issue go for include files, at least
B> for now, but I do need to resolve it very soon for inter-file references
B> in cifdic.P94 (e.g. _pd_dataset_id). It would be nice to have a single
B> method for addressing uniqueness for both applications.
B>
B> At a minimum, I would like to see two concepts implemented inside
B> include files:
B>    (1) a beginning of file marker
B>    (2) a file name label
B> The reason for (1) is that if CIFs will be concatenated, it will be
B> necessary to have a standardized (non-comment) mechanism for determining
B> where files begin. (2) allows the file name to be exchanged. Note that
B> both of these ideas could be implemented by requiring a line such as
B>       _file_name    iucr/mm/restr.lst
B> as the first non-comment line in the file to be included.

S> In short I am unconvinced by any of the arguments for complex filename
S> specifications to appear in the core DDL. Everyone has different views on
S> how this should be done -- and this is usually the telltale sign that it
S> must be kept simple in the general definition. If specific applications wish
S> to introduce their own special machine-specific site-specific constructs,
S> so be it. The _include_file value will be a character string -- no special
S> constructs whatsoever in the core DDL specifications.
S>
S> Unless there are convincing arguments to the contrary, _file_version_id will
S> be included in the DDL.
S> Brian T may not be pleased with my lack of
S> imagination with the filenames, but he must poutingly accept that I did
S> pick up on his plea for tighter identification of the included file!

Regards
Brian