Adsense web, Tools, PLR articles, Ebooks SEBENAGHAU: Indexing

Friday, October 16, 2009

Indexing

Once the spiders have completed the task of finding
information on Web pages, the search engine must store the
information in a way that makes it useful. There are two key
components involved in making the gathered data accessible
to users:

* The information stored with the data
* The method by which the information is indexed

In the simplest case, a search engine could just store the
word and the URL where it was found. In practice, this would
make for an engine of limited use, since there would be no
way of telling whether the word was used in an important or
a trivial way on the page, whether the word was used once or
many times, or whether the page contained links to other
pages containing the word. In other words, there would be no
way of building the ranking list that tries to present the
most useful pages at the top of the list of search results.
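
As a rough illustration of this simplest case, the sketch below
maps each word to the set of URLs where it was found. The
function name, the sample pages and the example.com URLs are
made up for illustration; a real engine would also need proper
tokenizing and HTML parsing.

```python
from collections import defaultdict


def build_simple_index(pages):
    """Map each word to the set of URLs on which it appears."""
    index = defaultdict(set)
    for url, text in pages.items():
        # Naive tokenizing: lowercase the text and split on whitespace.
        for word in text.lower().split():
            index[word].add(url)
    return index


# Hypothetical sample pages (URL -> page text).
pages = {
    "http://example.com/a": "search engines index pages",
    "http://example.com/b": "engines engines engines",
}
simple_index = build_simple_index(pages)
```

Note that the second page repeats "engines" three times, yet
the index records only that the word occurs there. That is
exactly the information a ranking list would be missing.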

To make for more useful results, most search engines store
more than just the word and URL. An engine might store the
number of times that the word appears on a page. The engine
might assign a weight to each entry, with increasing values
assigned to words as they appear near the top of the
document, in sub-headings, in links, in the meta tags or in
the title of the page. Each commercial search engine has a
different formula for assigning weight to the words in its
index. This is one of the reasons that a search for the same
word on different search engines will produce different
lists, with the pages presented in different orders.
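
To show how a count and a weight might be stored with each
entry, here is a minimal sketch. The field names and the weight
values in FIELD_WEIGHTS are assumptions chosen for the example,
not any real engine's formula, and position within the body
text is ignored to keep it short.

```python
from collections import defaultdict

# Hypothetical weights: words in the title count more than body text.
FIELD_WEIGHTS = {
    "title": 5.0,
    "heading": 3.0,
    "link": 2.0,
    "meta": 2.0,
    "body": 1.0,
}


def index_page(index, url, fields):
    """Record a count and an accumulated weight for each word on a page.

    `fields` maps a field name (e.g. "title", "body") to its text.
    """
    for field, text in fields.items():
        weight = FIELD_WEIGHTS.get(field, 1.0)
        for word in text.lower().split():
            entry = index[word].setdefault(url, {"count": 0, "weight": 0.0})
            entry["count"] += 1
            entry["weight"] += weight


weighted_index = defaultdict(dict)
index_page(
    weighted_index,
    "http://example.com/a",
    {
        "title": "Indexing basics",
        "body": "indexing stores words and indexing ranks pages",
    },
)
```

Changing the numbers in FIELD_WEIGHTS changes which pages score
highest, which is one way two engines can order the same pages
differently.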

An index has a single purpose: it allows information to be
found as quickly as possible. There are quite a few ways for
an index to be built, but one of the most effective ways is
to build a hash table. In hashing, a formula is applied to
attach a numerical value to each word. The formula is
designed to evenly distribute the entries across a
predetermined number of divisions. This numerical
distribution is different from the distribution of words
across the alphabet, where some letters begin far more words
than others, so an alphabetical index would have some very
large sections and some nearly empty ones. Evening out the
divisions keeps every lookup short, and that is the key to a
hash table's effectiveness.
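
The idea of applying a formula to each word can be sketched as
follows. The formula here is a simple polynomial hash picked
for illustration; NUM_BUCKETS, the multiplier 31 and the
function names are assumptions, not what any particular search
engine uses. Python's built-in hash() is avoided because its
values for strings vary between runs by default.

```python
NUM_BUCKETS = 8  # the predetermined number of divisions (arbitrary here)


def bucket_for(word, num_buckets=NUM_BUCKETS):
    """Turn a word into a bucket number with a simple polynomial hash."""
    h = 0
    for ch in word:
        h = (h * 31 + ord(ch)) % num_buckets
    return h


def build_hash_table(words, num_buckets=NUM_BUCKETS):
    """Place each word in the bucket chosen by the hash formula."""
    table = [[] for _ in range(num_buckets)]
    for word in words:
        table[bucket_for(word, num_buckets)].append(word)
    return table


def contains(table, word):
    """Look in one bucket only, instead of scanning every word."""
    return word in table[bucket_for(word, len(table))]


words = ["spider", "search", "index", "hash", "table", "weight", "url"]
table = build_hash_table(words)
```

A lookup computes the bucket number for the word and searches
only that bucket, so the work stays small as long as the
formula spreads the entries evenly.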

The search engine software is the final part. When a person
requests a search on a keyword or phrase, the software
searches the index for relevant information. It then reports
back to the searcher with the most relevant web pages listed
first.

Is your website search engine friendly? If you have any
doubts, it may be time to take a look and make your own “big
break”.
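
The lookup step described above, where the software searches
the index and lists the most relevant pages first, might be
sketched like this. The index contents, the weights and the
scoring rule (summing the weights of the query words) are
assumptions for illustration, not a real engine's ranking
method.

```python
def search(index, query):
    """Return URLs containing every query word, highest total weight first."""
    words = query.lower().split()
    if not words:
        return []
    # One {url: weight} mapping per query word; empty if the word is unknown.
    postings = [index.get(word, {}) for word in words]
    # Keep only the pages that contain all of the query words.
    common = set(postings[0]).intersection(*postings[1:])
    scored = [(sum(p[url] for p in postings), url) for url in common]
    # Sort by descending score, breaking ties by URL for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [url for _, url in scored]


# Hypothetical index: word -> {url: weight}.
index = {
    "search": {"http://example.com/a": 6.0, "http://example.com/b": 1.0},
    "engine": {
        "http://example.com/a": 2.0,
        "http://example.com/b": 1.0,
        "http://example.com/c": 4.0,
    },
}
results = search(index, "search engine")
```

Here the page at /c is left out of the results for "search
engine" because it does not contain "search", even though it
has the highest weight for "engine".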
