Sunday, November 6, 2011

Google's new search index: Caffeine

Today, we are announcing the completion of a new web indexing system called Caffeine. Caffeine provides 50 percent fresher results for web searches than our last index, and it is the largest collection of web content we have offered. Whether it is an editorial, a blog or a forum post, you can now find links to relevant content much sooner after it is published than was ever possible before.
Some background for those of you who don't build search engines for a living like us: when you search Google, you are not searching the live web. Instead, you are searching Google's index of the web which, like the index in the back of a book, helps you pinpoint exactly the information you need. (Here's a lovely explanation of the way it all works.)
So why did we build a new search indexing system? Content on the web is blossoming. It is growing not just in size and numbers: with the advent of video, images, news and real-time updates, the average webpage is richer and more complex. In addition, people's expectations for search are higher than they used to be. Searchers want to find the latest relevant content, and publishers expect to be found the instant they publish.
To keep up with the evolution of the web and to meet rising user expectations, we have built Caffeine. The picture below illustrates how our old indexing system worked compared to Caffeine:
[Image: Google new search index: Caffeine]
Our old index had several layers, some of which were refreshed at a faster rate than others; the main layer would update every couple of weeks. To refresh a layer of the old index, we would analyze the whole web, which meant there was a significant delay between when we found a page and when we made it available to you.
With Caffeine, we analyze the web in small portions and update our search index on a continuous basis, globally. As we find new pages, or new information on existing pages, we can add these straight to the index. That means you can find fresher information than ever before, no matter when or where it was published.
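To make the contrast concrete, here is a minimal sketch of the two refresh styles described above. It is not Caffeine's actual design: `ToyIndex`, `rebuild`, `update` and the example URLs are hypothetical names chosen for illustration, and the index is a plain in-memory inverted index.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words (a deliberately naive tokenizer)."""
    return text.lower().split()


class ToyIndex:
    """Hypothetical inverted index: maps each term to the set of page URLs containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of URLs
        self.page_terms = {}              # URL -> terms currently indexed for that page

    def rebuild(self, pages):
        """Batch style: throw the index away and reprocess every page."""
        self.postings.clear()
        self.page_terms.clear()
        for url, text in pages.items():
            self.update(url, text)

    def update(self, url, text):
        """Incremental style: index (or re-index) one page without touching the rest."""
        new_terms = set(tokenize(text))
        old_terms = self.page_terms.get(url, set())
        # Remove the page from terms it no longer contains.
        for term in old_terms - new_terms:
            self.postings[term].discard(url)
            if not self.postings[term]:
                del self.postings[term]
        # Add the page to terms that are new for it.
        for term in new_terms - old_terms:
            self.postings[term].add(url)
        self.page_terms[url] = new_terms

    def search(self, term):
        """Return the URLs containing the term, sorted for stable output."""
        return sorted(self.postings.get(term.lower(), set()))


index = ToyIndex()
# Old approach: one full pass over everything known at refresh time.
index.rebuild({"a.example/1": "old news story", "b.example/2": "forum post"})
# Continuous approach: a newly found page becomes searchable right away.
index.update("c.example/3", "fresh news story")
print(index.search("news"))  # ['a.example/1', 'c.example/3']
```

In the batch style, the new page would stay invisible until the next full `rebuild`; with `update`, the cost is proportional to the one page that changed.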
Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper, it would grow three miles taller every second. Caffeine takes up 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for over 40 miles.
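As a quick sanity check on the iPod comparison, the figures above can be divided out. This is only back-of-the-envelope arithmetic on the numbers quoted in the post; the variable names are ours, and the 40-mile figure is treated as exact even though the post says "over 40 miles".

```python
TOTAL_GB = 100_000_000    # storage quoted in the post, in gigabytes
IPOD_COUNT = 625_000      # number of devices quoted in the post
STACK_MILES = 40          # stack length quoted in the post (a lower bound)
INCHES_PER_MILE = 63_360  # 5,280 feet x 12 inches

# Capacity each device must hold for the comparison to work out.
gb_per_ipod = TOTAL_GB / IPOD_COUNT

# Length each device contributes to the end-to-end stack.
inches_per_ipod = STACK_MILES * INCHES_PER_MILE / IPOD_COUNT

print(f"{gb_per_ipod:.0f} GB per device")           # 160 GB per device
print(f"{inches_per_ipod:.2f} inches per device")   # 4.06 inches per device
```

So the comparison implies a 160 GB device roughly four inches long, which makes the two quoted figures consistent with each other.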
We have built Caffeine with the future in mind. Not only is it fresher, it is a powerful foundation that makes it possible for us to build an even faster and more comprehensive search engine that scales with the growth of information online and delivers even more relevant search results to you. So stay tuned, and look for more improvements in the months to come.
