Process of website indexing by Google & other Search Engines

There is a lot of buzz about how search engines index websites. The exact algorithms remain something of a mystery, because most search engines disclose only limited information about how they plan and carry out the indexing of a web page or website. Website owners get some hints by studying their log reports. These reports record visits by crawlers, but even then it remains unclear how the indexing occurred or which pages of the website were actually crawled.

While speculation about how search engines index continues, we can certainly develop a theory based on our own experience and study, and on hints and tips from experts in internet marketing and search engine optimization. Such a theory helps explain how search engines might go about indexing 8-10 billion web pages so quickly and, conversely, why newly posted web pages can be slow to show up in the index.

Generally, when we talk about search engines, the discussion is really about the giant, Google! However, many SEO experts believe that other popular search engines such as Yahoo! and MSN index websites in quite a similar fashion. Let us look at the indexing strategy that Google is most commonly believed to follow.

Step 1: Google is run from about 10 IDCs (Internet Data Centers). Each data center has approximately 1,000 to 2,000 Pentium 3 or Pentium 4 servers running the Linux operating system.

Step 2: Google has a whopping 200 (some say ‘over 1000’) crawlers, or Googlebots, that scan the web every day. They do not necessarily coordinate their visits, which means that several crawlers might visit the same website on the same day without knowing about one another's visits. That is probably why we see a ‘daily visit’ record in the traffic log report. This, in turn, keeps website owners quite happy about the frequent visits.

Step 3: Some crawlers do nothing but grab new URLs (and are therefore called URL grabbers). These URL grabbers collect the links and URLs they detect on various websites, including links that point internet users to your website. While visiting a website, they also record the ‘date stamp’ on the uploaded files. In this way, they can distinguish a new page from an updated one.

Step 4: The URL grabbers honor the robots.txt file and the robots meta tags, so they include or exclude URLs according to whether you want them indexed. (Note: URLs that differ only in their session IDs are treated as different ‘unique’ URLs. Session IDs are therefore better avoided, or else the pages may mistakenly be treated as duplicate content.)
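To illustrate the robots.txt part of Step 4, here is a minimal sketch in Python using the standard library's robots.txt parser. The robots.txt rules, the URLs and the `ExampleBot` user agent are made up for illustration, and this is not a description of how Googlebot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the rule and path are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def allowed_urls(urls, robots_txt=ROBOTS_TXT, user_agent="ExampleBot"):
    """Return only the URLs that the robots.txt rules allow user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


candidates = [
    "http://www.example.com/index.html",
    "http://www.example.com/private/report.html",
]
print(allowed_urls(candidates))
```

Only the first URL survives, because the second falls under the disallowed `/private/` path. Robots meta tags live inside the page itself, so a grabber can only apply them after fetching the page.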

Step 5: The URL grabbers spend very little time and bandwidth on a website, as the job they are given is quite simple. Nevertheless, they need to scan about 8-10 billion URLs on the web every month. Even for 1000 crawlers, that is no small job!
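A quick back-of-the-envelope calculation shows what the figures in Step 5 imply. The sketch below assumes a 30-day month and an even split of the work across crawlers; both are simplifications, and the function name is only illustrative.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def urls_per_second_per_crawler(total_urls, crawlers):
    """Average URLs each crawler must scan per second to cover total_urls in a month."""
    return total_urls / crawlers / SECONDS_PER_MONTH


for total in (8_000_000_000, 10_000_000_000):
    rate = urls_per_second_per_crawler(total, 1000)
    print(f"{total:,} URLs a month over 1000 crawlers: {rate:.2f} URLs per second each")
```

With 1000 crawlers, each one has to scan roughly 3 to 4 URLs every second, around the clock, which is why the grabbers cannot afford to spend long on any single website.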

Step 6: The URL grabbers record the captured URLs, along with date stamps and other required information, in a ‘Master URL List’. Other, specialized crawlers later use this list to deep-index the websites.

Step 7: The Master URL List is later processed and categorized somewhat like this:

New URLs observed, then

Old URLs with latest date stamp, then

301 and 302 redirecting URLs, then

Old URLs having old date stamp, then

404 error URLs, and then

Other URLs
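The categories in Step 7 can be sketched as a small classifier. Everything here is an assumption made for illustration: the `UrlRecord` fields, the rule that a query string or a document extension marks an ‘Other URL’, and the order in which the checks run. The article does not say how Google actually decides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class UrlRecord:
    """One hypothetical entry in the Master URL List."""
    url: str
    http_status: int                     # status seen by the URL grabber
    date_stamp: date                     # file date stamp recorded on the visit
    last_indexed: Optional[date] = None  # None means the URL was never indexed


def categorize(record):
    """Place a record into one of the six categories from Step 7."""
    parts = urlsplit(record.url)
    if record.http_status == 404:
        return "404 error URL"
    if record.http_status in (301, 302):
        return "301/302 redirecting URL"
    # Assumed rule: dynamic URLs and non-HTML documents count as 'Other'.
    if parts.query or parts.path.lower().endswith((".pdf", ".doc", ".ppt")):
        return "Other URL"
    if record.last_indexed is None:
        return "New URL"
    if record.date_stamp > record.last_indexed:
        return "Old URL with latest date stamp"
    return "Old URL with old date stamp"


updated = UrlRecord("http://www.example.com/about.html", 200,
                    date(2009, 6, 1), last_indexed=date(2009, 5, 1))
print(categorize(updated))
```

The date comparison is what lets a grabber tell an updated page from an unchanged one, as described in Step 3.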

Step 8: The actual indexing is performed by (what we call) a deep crawler. A deep crawler’s job is to take URLs from the Master URL List, deep crawl each one, and capture all of its content: text, HTML, images, Flash and so on.

Step 9: Precedence is given to ‘Old URLs with latest date stamp’. This is because they are already-indexed pages with fresh content. ‘301 and 302 redirecting URLs’ come second, followed by ‘New URLs observed’.

Step 10: High priority is given to URLs that have useful and relevant links pointing to them from several other websites. These are classified as essential URLs. Websites and URLs whose date stamps and content are updated regularly, even hourly, are labeled ‘News’ websites and are indexed every hour or even every minute.

Step 11: Indexing of ‘Old URLs having old date stamp’ and ‘404 error URLs’ is skipped entirely. Because the search engine already has the unchanged content in its index, it does not waste resources re-indexing ‘Old URLs having old date stamp’.
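The ordering in Step 9 and the skipping in Step 11 can be sketched as a simple queue builder. The category labels, the `PRIORITY` table and the `deep_crawl_queue` function are hypothetical names chosen for this example, and the link-based boost from Step 10 is left out to keep the sketch short.

```python
# Lower number = crawled earlier, following the order given in Step 9.
PRIORITY = {
    "Old URL with latest date stamp": 0,
    "301/302 redirecting URL": 1,
    "New URL": 2,
}


def deep_crawl_queue(entries):
    """Order (url, category) pairs for deep crawling.

    Categories missing from PRIORITY, such as old URLs with old date
    stamps and 404 error URLs, are left out of the queue entirely.
    """
    eligible = [
        (PRIORITY[category], position, url)
        for position, (url, category) in enumerate(entries)
        if category in PRIORITY
    ]
    # Sorting on (priority, original position) keeps ties in their input order.
    return [url for _, _, url in sorted(eligible)]


master_list = [
    ("http://www.example.com/new.html", "New URL"),
    ("http://www.example.com/gone.html", "404 error URL"),
    ("http://www.example.com/updated.html", "Old URL with latest date stamp"),
    ("http://www.example.com/stale.html", "Old URL with old date stamp"),
    ("http://www.example.com/moved.html", "301/302 redirecting URL"),
]
print(deep_crawl_queue(master_list))
```

In this example the updated page comes first, then the redirect, then the new page, while the stale page and the 404 never reach the deep crawler.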

Step 12: 404 error URLs are URLs gathered from links on various websites that turn out to be broken. Because they lead to error pages, they have no content to index.

Step 13: The ‘Other URLs’ can include dynamic URLs, URLs with session IDs, documents in PDF or Word format, PowerPoint presentations, multimedia files and so on. Google needs to process these further and work out which ones are worth indexing and to what degree. It probably hands the indexing task to special crawlers.

Step 14: When Google schedules the deep crawlers to index the new URLs and the 301 and 302 redirected URLs, only the URL (not the content) starts appearing in the SERPs (Search Engine Results Pages) when you search for “site:www.<domain>.com” in Google.

Step 15: As deep crawlers have to crawl billions of web pages a month, they can take as long as 4-8 weeks to index even updated content. New URLs might take longer to get indexed.

Step 16: Once the deep crawlers have indexed the content, it goes to their originating IDCs. There the content is processed, sorted and replicated (synchronized) to the remaining IDCs. Some years back, when the volume of data was still manageable, this synchronization was done once a month and lasted about five days. It was known as the ‘Google Dance’. Today the synchronization runs continuously, which some people call ‘Everflux’.

Step 17: When you use Google from any browser, your query can land at any of the 10 IDCs, depending on their speed and availability. As the data at any given time differs slightly between IDCs, you might get different results at different times, or on repeated searches for the same word or term (a Google Dance).

Dynamic URLs might take longer to get indexed (sometimes they don’t get indexed at all!), as even a small amount of data can produce an unlimited number of URLs, which can clutter the Google index with duplicate content.
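To see why session IDs multiply URLs, consider the sketch below, which strips session parameters so that two visits to the same page collapse into one URL. The parameter names in `SESSION_PARAMS` are assumptions for this example; a real site would need to list whatever names its own software uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust to match the site's software.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_id(url):
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


first_visit = "http://www.example.com/page.php?id=7&PHPSESSID=abc123"
second_visit = "http://www.example.com/page.php?id=7&PHPSESSID=xyz789"
print(strip_session_id(first_visit))
print(strip_session_id(first_visit) == strip_session_id(second_visit))
```

Without this kind of clean-up, every visitor session produces a different URL for the same content, which is exactly the duplicate content problem described above.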

All in all, one needs to wait as long as 8-12 weeks to get fully indexed in Google. Consider this the ‘seed sowing time’ in ‘Google’s garden’. Apart from raising the importance of the web page with several incoming links from renowned websites, there is no technique to speed up the indexing process, unless you personally know Sergey Brin and Larry Page and have substantial influence over them!

  Copyright © 2009 Ayushveda Informatics - Search Engine Optimization, Content Writing and Web Development Company