If you study your web server logs, you will likely find that more than 50% of the traffic comes from bots, both search engine crawlers and spam bots. We recently tried to build a system to analyze our website's traffic data, and there seems to be a big difference between Google Analytics and our own reporting. Even after removing all the bot traffic we could identify, our numbers are still much higher than those in Google Analytics.

It is easy to identify search engine crawlers such as Google, Yahoo, and Bing, because they announce themselves in the User-Agent field. Many spam bots, however, never reveal their identity: if we check their User-Agent, it looks the same as that of a common browser.
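
As a rough illustration of the User-Agent check, here is a minimal Python sketch. The token list and the function name `classify_user_agent` are assumptions made for this example, not a complete catalogue of crawlers, and a spam bot that copies a browser User-Agent will still come back as "unknown".

```python
# Substrings that some well-known crawlers include in their User-Agent header.
# Illustrative only: this list is an assumption and far from exhaustive.
KNOWN_CRAWLER_TOKENS = {
    "googlebot": "Google",
    "bingbot": "Bing",
    "slurp": "Yahoo",
}


def classify_user_agent(user_agent: str) -> str:
    """Return the crawler name if the User-Agent self-identifies, else "unknown"."""
    ua = user_agent.lower()
    for token, name in KNOWN_CRAWLER_TOKENS.items():
        if token in ua:
            return name
    # Ordinary browsers and bots that hide their identity both end up here.
    return "unknown"
```

Note that the User-Agent header is supplied by the client, so this only catches bots that choose to identify themselves.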

We are still struggling to block all unwanted bots. Some of our ideas are:

  • Using JavaScript to determine the source of the traffic
  • Blocking the IPs that hit the server very frequently
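
Both ideas can be tried offline against parsed log data before blocking anything. The Python sketch below is one possible approach under stated assumptions: the log has already been parsed into tuples, the thresholds (`max_hits`, `window_seconds`) are arbitrary placeholders, and `/beacon.js` is a hypothetical script path standing in for whatever JavaScript file the pages would load. The second function relies on the assumption that many bots never fetch page scripts, which will also flag real visitors who have JavaScript disabled.

```python
from collections import defaultdict, deque


def find_frequent_ips(hits, max_hits=100, window_seconds=60):
    """Flag IPs with more than max_hits requests inside any sliding window.

    hits: iterable of (timestamp_in_seconds, ip), sorted by timestamp.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in hits:
        window = recent[ip]
        window.append(ts)
        # Discard timestamps that are window_seconds or more older than ts.
        while ts - window[0] >= window_seconds:
            window.popleft()
        if len(window) > max_hits:
            flagged.add(ip)
    return flagged


def find_ips_without_script(requests, script_path="/beacon.js"):
    """Return IPs that requested something but never requested script_path.

    requests: iterable of (ip, path). script_path is a hypothetical name.
    """
    all_ips = set()
    script_ips = set()
    for ip, path in requests:
        all_ips.add(ip)
        if path == script_path:
            script_ips.add(ip)
    return all_ips - script_ips
```

The flagged sets are only candidates for review. Shared proxies and legitimate crawlers can also exceed a hit threshold, so the limits would need tuning against real traffic.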

Post by Gishore J Kallarackal
