Discussion List Archives


Fwd: Re: A Search Engine for Searching Across Distributed Eprint Archives

  • To: epc@iucr.org
  • Subject: Fwd: Re: A Search Engine for Searching Across Distributed Eprint Archives
  • From: Pete Strickland <ps@iucr.org>
  • Date: Thu, 21 Oct 2004 09:11:57 +0100
  • Organization: IUCr


----------  Forwarded Message  ----------

Subject: Fwd: Re: A Search Engine for Searching Across Distributed Eprint Archives
Date: Wednesday 20 October 2004 4:42 pm
From: Barry Mahon <barry.mahon@IOL.IE>
To: ICSTI-L@DTIC.MIL

Dear All,

For those of you who don't see this list, a comment from Stevan Harnad about 
'cross server' searching of articles....

As we know, he doesn't think this presents a problem, something many would
disagree with, but here he argues, in effect, for a 'wait and see' policy
for when we have all the material in OA.

Bye, Barry

On Wed, 20 Oct 2004, Donat Agosti wrote:

> Something, which bothers me and doesn't show up in most of the
> discussion of open access, is the construction of search tools across
> digital publications (and potentially millions of pages of legacy
> information). In the end, this will be the real issue, not just reading
> another publication face to face.

The real issue -- and the 1st, 2nd, 3rd and Nth priority today -- is
Open Access (OA) *content*: The full-texts of the 2.5 million annual
articles published in the world's 24,000 peer-reviewed journals are
still not openly accessible online (only about 20% of them are).

It is merely distraction and dreaming to worry about search tools when the OA
content is not yet there for them to search!

Having said that, cross-archive search tools (for the little OA content
we have so far) already *do* exist (and they are already far more powerful
than their sparse content yet deserves!):

    http://oaister.umdl.umich.edu/o/oaister/
    http://citebase.eprints.org/
    http://www.scirus.com/srsapp/

And (I promise you), providing more OA content is guaranteed to inspire
the creation of more and more such tools, with more and more powerful
capacities.

So please, don't worry about more powerful search tools when the cupboards are
still bare: Fill the cupboards and the search tools will come, hungrily!

> What do you think about that? It seems, that the big publishing houses
> are already thinking about that, and that they developed such facilities.

The big publishing houses' cupboards are *not* bare: They have the 100% Toll
Access content on which to provide ever more powerful search tools. Let's
provide 100% Open Access content and then watch what happens!

> This of course is one of the most important tools, for data
> mining, extraction, or just finding the right piece of information. It
> also means, that we look beyond self-archived pdf documents to searchable
> documents with some mark up of their logic content included. Any ideas?

Two ideas:

(1) Provide the full-text Open Access content, and the tools for finding,
mining and extracting from it will come with the territory.

(2) The primary target is journal articles, which consist primarily of text.
The most powerful means of text-processing today is full-text inversion. (This
is part of the magic that Google does.) Enhance this with citation-linking
(in place of Google's ordinary linking), plus some hub/authority analysis,
citation and download ranking, co-citation analysis, co-text
(semantic/similarity) analysis, and full-text boolean search, and I think you
will have search capabilities to surpass your wildest dreams.
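
For illustration, full-text inversion and boolean search can be sketched in a
few lines of Python. This is a minimal sketch only: the toy corpus, the
document ids and the function names are hypothetical, and it does not describe
how any of the tools listed above are actually built.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids whose full text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search_and(index, *terms):
    """Boolean AND: ids of documents that contain every term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*postings) if postings else set()


def search_or(index, *terms):
    """Boolean OR: ids of documents that contain at least one term."""
    postings = [index.get(t.lower(), set()) for t in terms]
    return set.union(*postings) if postings else set()


def search_not(index, all_ids, term):
    """Boolean NOT: ids of documents that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())


# Hypothetical toy corpus standing in for harvested full texts.
docs = {
    "a1": "Open access archives and citation analysis",
    "a2": "Search tools for distributed eprint archives",
    "a3": "Citation ranking of journal articles",
}
index = build_inverted_index(docs)
```

Calling `search_and(index, "citation", "archives")` intersects the posting
sets of the two terms; the citation-based ranking mentioned above would be a
separate step layered on top of such an index.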

The only missing element is the content. Please let's not forget that, and
lapse into Oneirology instead of Open Access Provision!

Stevan Harnad




-------------------------------------------------------

-- 

Best wishes

Peter Strickland
Managing Editor
IUCr Journals

----------------------------------------------------------------------
IUCr Editorial Office, 5 Abbey Square, Chester CH1 2HU, England
Phone: 44 1244 342878   Fax: 44 1244 314888   Email: ps@iucr.org
Ftp: ftp.iucr.org   WWW: http://journals.iucr.org/

NEWSFLASH: Complete text of all IUCr journals back to 1948 
now online! Visit Crystallography Journals Online for more details
_______________________________________________
Epc mailing list
Epc@iucr.org
http://scripts.iucr.org/mailman/listinfo/epc
