November 13, 2008
What You May Not Know About Google
If you are trying to rank in Google for competitive keyword phrases in order to turn a profit, then you likely know all about PageRank, link reputation, link popularity, keyword density, duplicate content, the Google Slap, the Google Sandbox, and a whole slew of other hot topics. However, do you know the primary challenge Google faces and what those 100+ Ph.D.s are working on? If you said the Google algorithm, you would be incorrect.
Turns out their biggest challenge is one of scale. To say their business grows rapidly is an understatement to the nth degree. Every year they make technology decisions that no longer look so good the following year. Exponential growth is a great problem to have, but it requires a tremendous amount of work. The algorithm would be easy if the internet were not growing at such an explosive pace.
Did you know that Google is actually in the business of power? Or that Google has about 500,000 machines used just for search? Their data centers consume about 4% of the North American power grid (more than our televisions). Over four years, the cost of the power is half the cost of the hardware. Google's own usage is likely greater than 50 MW ($5,000/hr). Google builds its own power supplies, which are 90% efficient versus 70% for commercial power supplies. Each data center costs approximately $600 million, is the size of three football fields (under cover), and has a four-story cooling tower.
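To see where a figure like $5,000/hr can come from, here is a rough sketch of the arithmetic. The electricity rate of $0.10 per kWh is an assumption (it is simply the rate that makes 50 MW work out to $5,000/hr), and the function name is made up for illustration.

```python
# Back-of-the-envelope power cost, using the 50 MW figure quoted above.
POWER_MW = 50

# ASSUMPTION: $0.10 per kWh is not stated in the article; it is the rate
# implied by pairing 50 MW with $5,000/hr.
RATE_USD_PER_KWH = 0.10


def hourly_power_cost(power_mw, rate_usd_per_kwh):
    """Return the cost in USD of running a constant load for one hour."""
    kilowatts = power_mw * 1000  # 1 MW = 1,000 kW
    return kilowatts * rate_usd_per_kwh


cost_per_hour = hourly_power_cost(POWER_MW, RATE_USD_PER_KWH)
cost_per_year = cost_per_hour * 24 * 365  # constant load, non-leap year

print(f"Per hour: ${cost_per_hour:,.0f}")
print(f"Per year: ${cost_per_year:,.0f}")
```

At that assumed rate, a constant 50 MW load comes to roughly $43.8 million per year in electricity alone.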
When was the last time you saw Google down? Um, like never! Yeah, they have achieved the vaunted five nines: 99.999% uptime. Interestingly, they do not invest in high-quality hardware. Instead, they have chosen to write software that can handle the constant failures of cheap hardware. With so many machines, it is inevitable that at any given time a ton of them are going to die, so there is no need to waste money on expensive machines that will also fail.
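To put five nines in perspective, this small sketch converts an availability percentage into the downtime it allows per year. The function name is illustrative, and a 365-day year is assumed.

```python
def allowed_downtime_minutes(availability, days=365):
    """Minutes of downtime permitted over `days` at a given availability.

    `availability` is a fraction, e.g. 0.99999 for five nines.
    """
    minutes_in_period = days * 24 * 60
    return (1 - availability) * minutes_in_period


# Compare a few common availability targets.
for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    minutes = allowed_downtime_minutes(availability)
    print(f"{label}: about {minutes:.2f} minutes of downtime per year")
```

Five nines leaves only a little over five minutes of downtime in an entire year, which is why it is so hard to reach on hardware that fails constantly.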
Do you have any idea how much disk space is required to store a copy of the web? Turns out it is 2 petabytes, and Google keeps three copies. In case you are not familiar, a petabyte equals a million billion bytes. Search is not hard because of the algorithm; it is hard because of scale. They have to maintain the biggest index with the fastest response time. There are more than one trillion URLs, growing by several billion per day (with at least 20 billion in their index). The total number of links is at least 10-20 trillion, with 30%+ of all pages being spam. Yet it only takes them a quarter of a second to return the ten best results for any query.
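The storage numbers are easier to grasp when written out. The sketch below uses only the figures quoted above; the per-page average is a rough illustration that assumes the 2 petabyte copy corresponds to the roughly 20 billion indexed pages, which the figures themselves do not confirm.

```python
PETABYTE = 10**15  # a million billion bytes, using the decimal definition

web_copy_bytes = 2 * PETABYTE  # one copy of the web, per the figure above
copies_kept = 3
total_bytes = web_copy_bytes * copies_kept

# ASSUMPTION: the 2 PB copy covers about 20 billion indexed pages.
indexed_pages = 20 * 10**9
avg_bytes_per_page = web_copy_bytes // indexed_pages

print(f"Total stored: {total_bytes // PETABYTE} petabytes")
print(f"Rough average per indexed page: {avg_bytes_per_page // 1000} KB")
```

Under that assumption, three copies come to 6 petabytes, or very roughly 100 KB per indexed page per copy.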
The most amazing part about Google is that 85% of user searches are non-commercial (no commercial intent, not monetizable). This means that 85% of their investment in search supports a giant loss leader. Think about that for a while. Truly amazing and hard to fathom. Lastly, did you know that 50%+ of all searches in a given month are unique? How about that almost 25% of all searches are new (never seen before in the history of Google)? No wonder keyword research and site optimization are so challenging!
If you remain paranoid about using Google Analytics or Webmaster Tools for fear that Google is going to use the information against you, get over yourself. If you think Google is out to get you every time your positions slip in Google, get real. You mean nothing to Google and are in fact the smallest needle in the largest haystack in the history of the world.
Image Credit: Matt McGee
Filed under: Uncategorized by BrockO




