So a compressed text file that weighs in at 72GB sounds like a lot, right? Especially if you have to churn through it line by line. Fine. Wow. A lot.
That pales in comparison to the (reported) volumes of data processed by Google and Facebook:
In December 2007 (!) Google was processing 400 PB (petabytes) per month, with an average job size of 180GB.
Facebook’s volumes have been steadily increasing, too: from 200GB of new data per day in March 2008, they moved up to 2TB per day in April 2009, before leveling off at 4TB per day in October 2009.
Most of which is, without a doubt, LolCat pictures 🙂
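To put those numbers side by side, here is a quick back-of-the-envelope sketch in Python. It only does arithmetic on the figures quoted above. The decimal units (1 PB = 1,000,000 GB) and the 30-day month are my own assumptions, and the variable names are made up for illustration, so treat the results as rough orders of magnitude rather than anything Google or Facebook reported.

```python
# Back-of-the-envelope comparison of the volumes quoted in this post.
# Assumptions (not from the sources): decimal units and a 30-day month.
GB_PER_TB = 1_000
GB_PER_PB = 1_000_000
DAYS_PER_MONTH = 30

my_file_gb = 72                      # the compressed text file from this post
google_month_gb = 400 * GB_PER_PB    # reported monthly volume at Google
google_avg_job_gb = 180              # reported average job size at Google

# Reported daily new data at Facebook, in GB, keyed by year-month.
facebook_daily_gb = {
    "2008-03": 200,
    "2009-04": 2 * GB_PER_TB,
    "2009-10": 4 * GB_PER_TB,
}

# How many average-sized jobs would add up to the monthly volume?
implied_jobs_per_month = google_month_gb / google_avg_job_gb

# The monthly volume spread evenly over the month, in TB per day.
google_daily_tb = google_month_gb / DAYS_PER_MONTH / GB_PER_TB

# How many 72GB files fit into one month of Google processing?
files_per_month = google_month_gb / my_file_gb

# Growth in Facebook's daily intake between the first and last data point.
facebook_growth = facebook_daily_gb["2009-10"] / facebook_daily_gb["2008-03"]

print(f"Implied Google jobs per month: {implied_jobs_per_month:,.0f}")
print(f"Google volume per day:         {google_daily_tb:,.0f} TB")
print(f"72GB files per Google month:   {files_per_month:,.0f}")
print(f"Facebook daily growth factor:  {facebook_growth:.0f}x")
```

Under those assumptions, Google's monthly volume works out to a bit over two million average-sized jobs, or more than five million copies of my 72GB file, while Facebook's daily intake grew twentyfold in about a year and a half.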