- Information on the Web is inherently heterogeneous:
content is distributed across multiple servers and locations, in multiple
formats and languages, and is aimed at diverse audiences and purposes
- Even the largest search engines, such as Google or Yahoo, index only about 45% of all Web pages, so no single search engine can find everything.
- The “Hidden Web” of content databases (e.g. PubMed, Web of Science)
is estimated to be thousands of times larger than the Open Web.
- Both the Open Web and the Hidden Web are characterized by
problems of information coverage, quality, overload, relevancy,
currency and completeness, as well as language ambiguity
and incompatible user interfaces
- Meta-Search Engines may simultaneously search multiple Open Web and
Hidden Web sites in order to increase content coverage, precision,
relevance and/or search efficiency and effectiveness
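The meta-search idea in the last point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `open_web_engine` and `hidden_web_database` are hypothetical stand-ins that return fixed, ranked lists of URLs rather than calling real services, and the merge rule (prefer results returned by more sources, then by best rank) is one simple choice among many, not the method of any particular meta-search engine.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real search back ends: each takes a query and
# returns a ranked list of URLs. A real system would call remote services.
def open_web_engine(query):
    return ["http://a.example/1", "http://b.example/2", "http://c.example/3"]

def hidden_web_database(query):
    return ["http://d.example/9", "http://b.example/2"]

def meta_search(query, sources):
    """Query every source concurrently and merge the ranked result lists."""
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # pool.map returns results in the same order as `sources`.
        result_lists = list(pool.map(lambda search: search(query), sources))

    hits = {}  # url -> [number of sources returning it, best (lowest) rank]
    for results in result_lists:
        seen = set()  # count each URL at most once per source
        for rank, url in enumerate(results):
            if url in seen:
                continue
            seen.add(url)
            entry = hits.setdefault(url, [0, rank])
            entry[0] += 1
            entry[1] = min(entry[1], rank)

    # Assumed merge rule: more sources first, then better best rank,
    # then the URL itself so the ordering is deterministic.
    return sorted(hits, key=lambda u: (-hits[u][0], hits[u][1], u))

merged = meta_search("gene therapy", [open_web_engine, hidden_web_database])
```

Here the URL returned by both sources is ranked first, which shows how combining sources can raise both coverage (four distinct results instead of three or two) and confidence in the overlapping result.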