Let us define more carefully what we want from this survey. Firstly, informed comment is always welcome, so to that end I will include a link to our discussion and an invitation to provide a comment, as you suggest. However, in the unlikely case that these comments turn out to be variations on the themes that we have already covered, and so provide no clearer path forward than we have already, we need to include the simple survey questions, which are designed to approximately answer the following questions:
1. What proportion of crystallographers would be significantly inconvenienced by having to manipulate CIF files in UTF-8 (this is a variation of the 'respect' argument but using less loaded words)? (Q1-5)
2. Which is more important: the 'text' nature of CIF files, or fidelity of file transfer and retrieval? (Q8)
I have rejigged the questions to avoid any appearance of seeking a predetermined answer (if that's what Herbert meant by 'push-polling'), so that we now have:
Herbert's introductory paragraph, followed by...
"The following questions relate to editing text files. Examples of such files include CIF files, Windows .INI files, and programming source code."
1. How often do you edit text files?
2. What scripts or languages do you use when editing text?
3. What operating systems do you use when editing text?
4. What text editors do you usually use for editing text?
5. How difficult would it be for you to input and output text files in UTF-8 encoding? (Very difficult/difficult/neutral/easy/very easy)
6. What is your preferred method (if any) of transmitting text files containing non-ASCII characters to colleagues or organisations in other countries?
7. How often has the method described in (6) led to incorrect display of the transmitted file?
8. How important do you think the following things are when designing changes to the CIF standard (rank 1-4, presented in random order)?
(a) Reliable transmission and retrieval of CIF file contents
(b) Ease of use in text tools such as editors and text-based search applications
(c) Backwards compatibility
(d) Availability of non-ASCII characters
9. Please comment (if you wish) on your rankings in question 8
10. Please comment (if you wish) on this encoding issue, and include a name and email address if you are prepared to discuss your comments further.
I have no doubt not expressed these questions as well as some of you may have, so I welcome improvements. Note that Q8 includes some gratuitous market research which we could also use as a baseline to determine the relative importance of our discussions.
Let's try to agree ahead of time how we will interpret the results. A small (<10%?) proportion answering 'difficult' or 'very difficult' for question 5, when restricted to those users who often deal with non-ASCII text (based on answers to (1) and (2)), suggests to me that restricting encoding to UTF-8 would not involve much inconvenience. We can check the OS and editor choice as well to cross-check difficulty. I would interpret a higher ranking for option (8.a) compared to option (8.b) as being in favour of the UTF-8 only option. Questions 6 and 7 are designed to gather information about the prevalence of problems in text file transfer, and solutions that others have found.
I'd like to get this done early next week, and I will post a link to the survey to this group for feedback before sending it further afield. I would then post requests to national crystallographic associations (perhaps via the European Crystallographic Association, AsCA and the ACA?) and wait a month for responses. In the meantime we will attempt to clear up some non-UTF-8 business.
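The interpretation rule above could be applied to the collected responses roughly as follows. This is only a sketch under stated assumptions: the `Response` fields, the way answers to questions 1 and 2 are reduced to yes/no flags, and the convention that rank 1 means "most important" are all hypothetical choices made for illustration, not something we have agreed.

```python
from dataclasses import dataclass

# Hypothetical coding of one survey response; field names are assumptions.
@dataclass
class Response:
    edits_often: bool        # reduced from the answer to question 1
    uses_non_ascii: bool     # reduced from the answer to question 2
    utf8_difficulty: str     # answer to question 5, lower case
    rank_transmission: int   # rank given to option (a) of question 8
    rank_text_tools: int     # rank given to option (b) of question 8

HARD = {"difficult", "very difficult"}

def summarise(responses, threshold=0.10):
    """Apply the proposed interpretation; threshold is the tentative 10%."""
    # Restrict to users who often deal with non-ASCII text.
    subset = [r for r in responses if r.edits_often and r.uses_non_ascii]
    if not subset:
        return None  # nothing to interpret
    hard_fraction = sum(r.utf8_difficulty in HARD for r in subset) / len(subset)
    # Assumption: a smaller rank number means a higher ranking.
    prefers_transmission = sum(
        r.rank_transmission < r.rank_text_tools for r in responses
    )
    prefers_text_tools = sum(
        r.rank_text_tools < r.rank_transmission for r in responses
    )
    return {
        "subset_size": len(subset),
        "hard_fraction": hard_fraction,
        "low_inconvenience": hard_fraction < threshold,
        "prefers_transmission": prefers_transmission,
        "prefers_text_tools": prefers_text_tools,
    }
```

Whether the comparison with the threshold should be strict, and how free-text answers to questions 1 and 2 are turned into the two flags, would need to be settled before the survey closes.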
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
Re: [ddlm-group] Community consulation regarding CIF2 encoding
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Community consulation regarding CIF2 encoding
- From: James Hester <jamesrhester@gmail.com>
- Date: Fri, 2 Jul 2010 16:54:16 +1000
- In-Reply-To: <alpine.BSF.2.00.1007011002070.66637@epsilon.pair.com>
- References: <AANLkTikJeNZ73b8I6oLVVVCq42reCzGLB8-F6wyAxiCq@mail.gmail.com><301204.88233.qm@web87002.mail.ird.yahoo.com><alpine.BSF.2.00.1007010617140.32815@epsilon.pair.com><AANLkTilpERNowTmkjKQOyOCqyT2oiytGWyvwZld0oMmf@mail.gmail.com><alpine.BSF.2.00.1007011002070.66637@epsilon.pair.com>
On Fri, Jul 2, 2010 at 12:15 AM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
Dear James,
Please point not just to your survey, but to the entire discussion. Please recall that the entire discussion is already public on the web. This is just a matter of calling community attention to it.
While I find the survey somewhat slanted if presented without the background, essentially a push-poll, if we can get the community involved in the discussion, and not just presented with structured questions, I have faith that we will get a good sense of what is workable in the current context. Those who just want to respond to your survey questions can do that. Those who wish to delve into the issue more deeply can do that. It is up to them, not to us.
In any case, I recommend putting something out to some assortment of lists very soon, so we can have the discussion well started before the ACA meeting.
Regards,
Herbert
P.S. To some extent the poll has a bit of the flavor of voting on the value of PI. What matters is not whether a huge majority likes solution A or B. What matters is whether choosing one solution or the other will impede some significant chunk of science from getting done, and, as with the value of PI, that is hopefully a factual determination, not a matter of opinion. We need informed commentary much more than we need to count votes on preferences.
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
yaya@dowling.edu
=====================================================
On Thu, 1 Jul 2010, James Hester wrote:
Hi Herbert: the idea would be to distribute an email with a pointer to the survey. Your suggested paragraph would be a reasonable text for that email, acting as an introduction to the questionnaire, although the mention of XML and HTML I think is slanting the question somewhat. And indeed we should include an open-ended question in the survey asking for their thoughts. The point of the short series of questions is that those who have no time to spend familiarising themselves with our discussion and formulating a thoughtful reply are still able to spend a few minutes and provide important information on which we can base our decision.
On Thu, Jul 1, 2010 at 8:37 PM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
Dear Colleagues,
Unless we are assuming that the CIF2 transition is not actually going to happen, that transition is going to involve a wide range of both software developers and users of crystallographic software throughout the community. Either we have the discussion with them on a UTF-8-only standard now, or we will have to have the discussion with them later, when it is much harder and more expensive to revise what we will have done.
If James is reluctant to post his own summary to the lists, then how about the following:
COMCIFS, the IUCr Committee for the Maintenance of the CIF Standard, is considering some important improvements and extensions to CIF. Among the extensions being considered is enlarging the character set allowed from simple ASCII to the full UNICODE character set (the same set of characters used in web browsers with HTML and in XML). There is strong disagreement on COMCIFS as to whether this would best be done by mandating just a single UNICODE encoding, UTF-8, or whether it would be best to follow the practices of HTML and XML in allowing alternate encodings. The full thread of the discussion thus far can be seen at:
http://www.iucr.org/__data/iucr/lists/ddlm-group/
Comments from interested members of the community would be appreciated.
Regards,
Herbert
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
yaya@dowling.edu
=====================================================
On Thu, 1 Jul 2010, SIMON WESTRIP wrote:
I agree this would probably be more productive.
Perhaps the IUCr could point its authors at such a survey - via its CIF author services pages (printCIF, checkCIF...)?
Cheers
Simon
____________________________________________________________________________
From: James Hester <jamesrhester@gmail.com>
To: ddlm-group <ddlm-group@iucr.org>
Sent: Thursday, 1 July, 2010 6:51:47
Subject: [ddlm-group] Community consulation regarding CIF2 encoding
Dear DDLm-ers,
I think Herbert's suggestion of sending a version of my summary out is unlikely to produce a great deal of enlightenment, because I expect the range of responses to simply mirror that which we have already seen in this group, with no ultimate resolution. I would like to propose instead a simple questionnaire that we can use to inform our decision. The questions I would like to see answered are:
1. Do you regularly use non-ASCII characters when editing text? Examples of such characters include accented ASCII characters, and the characters from Arabic, Japanese, Chinese, Cyrillic etc. (Yes/No/Don't know)
2. What languages do you usually deal with when editing text?
3. What text editing programs do you usually use?
4. Can the text editors that you usually use read and write files in UTF-8 format? (yes/no/don't know)
5. Which non-ASCII encoding do you think would result in the least problems when transferring your text files across the internet?
6. Would you object to a new CIF standard which allowed only UTF-8 encoded files? If so, why?
7. Do you have any comments regarding suitable choice of encoding(s) for the new CIF standard?
Once we have fine-tuned the questions, I would suggest creating the survey using www.surveymonkey.com, then posting requests for responses wherever crystallographers are to be found, but especially in groups where non-ASCII scripts are likely to be found (European Crystallographic Society, Japanese Crystallographic Society, Computing Commission etc.).
James.
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group