How often does Google crawl a page?
Crawling is the process of collecting documents from the web. For search engines, the purpose of crawling is to refresh pages that have changed since the previous crawl and to discover new pages that expand the index. The major problem crawlers face is the growth of the web: there are too many pages to crawl, so a search engine must choose wisely between crawling new pages and refreshing old ones. It is important to crawl documents that change frequently, and documents of high quality, as often as possible.
Crawling Priority
Search engines assign a crawling priority to every page. Crawling priority is a number that denotes how important it is to crawl a page. Pages with a higher crawling priority number are crawled before pages with a lower one.
Main Factors that determine Google's Crawling Priority
PageRank - pages with a higher PageRank have a higher crawling priority.
Number of slashes ('/') in the URL - pages with fewer slashes in their URLs have a higher crawling priority because they tend to change more often. In other implementations, Google uses the number of slashes ('/') in the URLs of the pages that link to a page: getting a link from a page with a lot of slashes in its URL results in a smaller crawling priority number.
New Sites and Crawling
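To make the two factors listed above (PageRank and the number of slashes in the URL) concrete, here is a minimal sketch of how a crawler could turn them into a crawl order. Google's actual formula is not public, so the `crawl_priority` function, the `slash_penalty` weight, the PageRank values and the example.com URLs are all assumptions made up for illustration.

```python
import heapq
from typing import Dict, List
from urllib.parse import urlparse


def slash_count(url: str) -> int:
    """Count the slashes in the path of a URL (the '//' after the scheme is ignored)."""
    return urlparse(url).path.count("/")


def crawl_priority(pagerank: float, url: str, slash_penalty: float = 0.5) -> float:
    """Hypothetical score: a higher PageRank raises it, every slash lowers it."""
    return pagerank - slash_penalty * slash_count(url)


def crawl_order(pages: Dict[str, float]) -> List[str]:
    """Return the URLs in the order a priority-driven crawler would fetch them."""
    # heapq is a min-heap, so the priority is negated to pop the highest first.
    heap = [(-crawl_priority(pr, url), url) for url, pr in pages.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


# Made-up PageRank values for a made-up site.
pages = {
    "http://example.com/archive/2006/08/post.html": 5.0,
    "http://example.com/about.html": 2.0,
    "http://example.com/products/": 5.0,
    "http://example.com/": 5.0,
}

for url in crawl_order(pages):
    print(url)
```

With these made-up numbers, the three pages that share the same PageRank are ordered by how shallow their URLs are, and the low-PageRank page comes last even though its URL has only one slash.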
It is really frustrating to release a new site and discover that, three months later, Google has crawled just 5% of its pages. In order for a new page A to get crawled:
The worst situation arises when a new site has a lot of pages that are more than 2 clicks away from the home page. These pages might not get crawled until months later, because the pages that link to third-level and deeper pages are also new: they have to be found and crawled first, and they have a low crawling priority themselves.
Tips to get a new site crawled faster
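Since pages more than 2 clicks away from the home page are the ones most likely to be crawled late, one practical check is to measure the click depth of every page on a site. The sketch below does this with a breadth-first search over an internal link graph. The `click_depths` function and the `site` dictionary are hypothetical; in practice the graph would have to come from your own crawl or CMS export.

```python
from collections import deque
from typing import Dict, List


def click_depths(links: Dict[str, List[str]], home: str) -> Dict[str, int]:
    """Breadth-first search: minimum number of clicks from home to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # the first visit in BFS is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link graph: page -> pages it links to.
site = {
    "/": ["/products/", "/about.html"],
    "/products/": ["/products/widgets/"],
    "/products/widgets/": ["/products/widgets/blue.html"],
}

depths = click_depths(site, "/")
# Pages more than 2 clicks from the home page are candidates for a more direct link.
deep_pages = sorted(page for page, depth in depths.items() if depth > 2)
print(deep_pages)
```

Any page that ends up in `deep_pages` could be linked from the home page or from a first-level page to shorten the path a crawler has to follow.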
Related Papers
Efficient Crawling Through URL Ordering