Google - The biggest Big Brother

If it hasn’t happened already, I bet you will see companies blocking access to google.com from within corporate firewalls in the future.

Take a look at some of the “Zeitgeist” pages on google.com (for example the one for August 2004 at http://www.google.com/press/zeitgeist/zeitgeist-aug04.html). What you see are some serious statistics that Google compiled from all the searches it received during August 2004. Things at google.com work the same as elsewhere on the web: every visit leaves a footprint in the server's access logs that tells Google what you were looking for (the request) and where you came from (your IP address). Some information can be suppressed by your browser or by software that sits between your browser and google.com, for example details about the specific browser you are using or whether you followed a link to get to google.com. The other two pieces, IP address and request, cannot be hidden easily (yes, there are anonymizing proxies, but those are not commonly used from within corporate networks).
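To make the “footprint” concrete, here is a minimal sketch of pulling the IP address and the search terms out of a single access-log line. It assumes the common Apache “combined” log format and a search URL of the form /search?q=...; the sample line, the address 192.0.2.17 (a documentation-only address) and the function name parse_search_hit are made up for illustration. Nothing here is known about how Google actually stores or processes its logs.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed layout (Apache "combined" format):
# ip ident user [time] "METHOD target PROTOCOL" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_search_hit(line):
    """Return the IP, search terms, referer and user agent of one log line."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    # The search terms travel in the "q" parameter of the requested URL.
    params = parse_qs(urlsplit(match.group("target")).query)
    return {
        "ip": match.group("ip"),
        "query": params.get("q", [""])[0],
        "referer": match.group("referer"),
        "agent": match.group("agent"),
    }

# Hypothetical log line: referer and user agent are suppressed ("-"),
# but the IP address and the request are still there.
sample = (
    '192.0.2.17 - - [15/Aug/2004:10:12:01 +0000] '
    '"GET /search?q=natural+language+processing HTTP/1.1" 200 5120 "-" "-"'
)
hit = parse_search_hit(sample)
print(hit["ip"], "searched for:", hit["query"])
```

Even with the optional headers blanked out, the two fields that matter for the argument survive in every request.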

The IP address can be used to determine where the request came from (country, company, university, home DSL connection, phone connection, etc.). A lot of companies specialize in this field, and they will be happy to sell you solutions that match an IP address as accurately as possible to a location and/or company (an example can be found on this page on kahunaburger.com - look for “VisualRoute”).

Things start to get interesting when you correlate the where and the what. Big companies usually have only a few places where Internet traffic leaves the corporate network. So it is easy for Google to select all the queries that were made from, say, within microsoft.com. And even if a company has a lot of exit points to the public Internet, it is not too difficult to find all queries that originate from those places.
Most of those queries will be noise, but some will be directly related to Microsoft’s future business. Google may see requests from Microsoft’s R&D departments looking for information on “natural language processing”, or queries from Microsoft’s HR department looking for details on a candidate they are planning to hire.
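Correlating the where and the what takes very little code. The sketch below groups search queries by the organisation whose address range the request came from. The organisation names, the address ranges (documentation-only networks) and the sample queries are all invented for illustration; a real mapping from IP ranges to companies would have to come from registry data or a commercial geolocation product.

```python
import ipaddress
from collections import defaultdict

# Hypothetical mapping of organisations to the networks their traffic leaves from.
EGRESS_RANGES = {
    "example-corp": [ipaddress.ip_network("192.0.2.0/24")],
    "example-university": [ipaddress.ip_network("198.51.100.0/24")],
}

def owner_of(ip_text):
    """Return the organisation whose range contains the address, or None."""
    ip = ipaddress.ip_address(ip_text)
    for owner, networks in EGRESS_RANGES.items():
        if any(ip in network for network in networks):
            return owner
    return None

def queries_by_owner(hits):
    """Group (ip, query) pairs by the organisation the IP belongs to."""
    grouped = defaultdict(list)
    for ip_text, query in hits:
        owner = owner_of(ip_text)
        if owner is not None:
            grouped[owner].append(query)
    return dict(grouped)

# Made-up (ip, query) pairs, as they might be extracted from access logs.
hits = [
    ("192.0.2.17", "natural language processing"),
    ("203.0.113.5", "weather"),
    ("192.0.2.80", "cheap flights"),
    ("198.51.100.9", "thesis template"),
]
result = queries_by_owner(hits)
for owner, queries in result.items():
    print(owner, queries)
```

Separating the interesting queries from the noise is the harder part, but the selection by origin is a simple lookup.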

I would not be surprised if, besides those “Zeitgeist” pages above, there were internal reports that listed the “most interesting” queries that originated from “Fortune 500” companies and especially Google’s competition.

I’m not saying that Google would be the only place where things like this could happen, but as the dominant player in the search market, it’s more likely to happen there than anywhere else. And nothing in their privacy policy stops them from collecting and even selling this information (as long as it cannot be related to you as an individual).

So, to come back to my original hypothesis: Keeping all this in mind, don’t you think that companies would try to limit the information leakage by closing the connection to Google?
