Wednesday, July 1, 2009

Information Retrieval on the Web

Source Article:

Kobayashi, M. & Takeda, K. Information Retrieval on the Web. [Electronic version]. ACM Computing Surveys, 32 (2), June 2000.
Retrieved on June 30, 2009, from http://comminfo.rutgers.edu/~muresan/IR/Docs/Articles/csurKobayashi2000.pdf


Abstract

The article reviews studies on the growth of the Internet and its users, and on tools for web-based information retrieval and ranking. Specifically, the paper presents studies on how search engines are evaluated. It also details the different tools and technologies in information retrieval that are expected to make searching for specific information easier. The studies cited reveal that most Internet users consider the speed of information retrieval the primary problem in searching the Web; broken links and the inability to find relevant information follow closely. In light of these problems, the article also discusses the development of new strategies in web-based information retrieval.


Three Things I learned from the article

  • The article confirms my belief that nothing can replace a good book. With the vast amount of information available on the Web, searching for the most relevant data is like looking for a needle in a haystack.
  • Manual indexing is considered better than automatic indexing, but manual indexing has its own common drawbacks.
  • Comments on web documents such as blog posts are considered annotations, and these aid in indexing the document.

The studies cited in the article, though conducted quite a long time ago, still hold true today. Most, if not all, library clients rely on the free information available on the Internet. This makes them prone to information overload, which lowers their chances of obtaining the most relevant information. I was once assigned to the Graduate School of Business Library, which caters to professional clients. Because these clients are busy with their careers as well as their academics, they rarely visit the library, either to read or to borrow books. They depend on web-based information retrieval, which makes their searching, and sorting through the materials they eventually locate, more complicated than retrieval from a purpose-built database.

The index is the most important tool in information retrieval [Manber 1999]. Human, or manual, indexing is considered better than automatic indexing, which is done by machines (robots). But human indexing is not perfect either; the article mentions several drawbacks, including inconsistency among indexers. Moreover, in the absence of standards, an indexer tends to create a new subject term for a subject that already has one. In the DLSU library, editing the existing article indexes is a major project. The subjects assigned to past articles have been inconsistent. Though the differences are not that significant, they still confuse users.
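To make the point about inconsistent subject terms concrete, here is a minimal Python sketch of how a controlled vocabulary could collapse variant terms into one preferred heading before an index is built. Everything in it is an assumption for illustration: the names `PREFERRED_TERM`, `normalize_subject` and `build_subject_index`, the headings, and the sample records are all made up, and it does not describe the DLSU library's actual system or any method from the article.

```python
from collections import defaultdict

# Hypothetical authority list: variant subject term -> preferred heading.
PREFERRED_TERM = {
    "web searching": "Internet searching",
    "internet search": "Internet searching",
    "internet searching": "Internet searching",
    "search engines": "Web search engines",
    "web search engines": "Web search engines",
}


def normalize_subject(term):
    """Return the preferred heading for a term, or the cleaned term if unlisted."""
    cleaned = " ".join(term.split())  # collapse stray spaces left by indexers
    return PREFERRED_TERM.get(cleaned.lower(), cleaned)


def build_subject_index(records):
    """Map each preferred heading to the sorted titles filed under it."""
    index = defaultdict(set)
    for title, subjects in records:
        for subject in subjects:
            index[normalize_subject(subject)].add(title)
    return {heading: sorted(titles) for heading, titles in index.items()}


# Made-up sample records: (article title, subject terms assigned by different indexers).
records = [
    ("Article A", ["Web searching", "Search engines"]),
    ("Article B", ["Internet search"]),
    ("Article C", ["internet  searching", "Library users"]),
]

subject_index = build_subject_index(records)
print(subject_index["Internet searching"])
```

Without the lookup table, the three articles would be scattered across three different headings; with it, a user searching one heading finds all of them. Terms not on the list, such as "Library users" here, pass through unchanged, so a real project would still need a person to review them.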

Serious comments on web documents aid in information retrieval and also in evaluating documents. A good comment on a web document may entice the searcher to read the whole document. Aside from providing reviews and links that help in information retrieval, comments can also help the author improve his or her work, just as we can when our classmates critique our RAs. :-)

