Factual Blog / Tagged:


The Humongous nfu Survival Guide

GitHub: github.com/spencertipping/nfu

A lot of projects I’ve worked on lately have involved an initial big Hadoop job that produces a few gigabytes of data, followed by some exploratory analysis to look for patterns. In the past I would have sampled the data before loading it into a Ruby or Clojure REPL, but increasingly I’ve started to...

nfu: Command-line Numeric Fu

Note: Explore nfu on GitHub.

We often use the UNIX command line for ad-hoc data crunching. Most of the time we have the good sense to use a better tool after the first 100 characters or so, but sometimes we’ll just blow past the right margin with a string of sort, uniq -c, sort -nr,...