Chris Essig

Walkthroughs, tips and tricks from a data journalist in eastern Iowa

A few reasons to learn the command line

leave a comment »

Note: This is my first entry for Lee Enterprises’ data journalism blog. Reporters at Lee newspapers can read the blog by clicking here.

As computer users, we have grown accustomed to what is known as the graphical user interface (GUI). What’s GUI, you ask? Here are a few examples: When you drag and drop a text document into the trash, that’s GUI in action. Or when you create a shortcut on your desktop, that’s GUI in action. Or how about simply navigating from one folder to the next? You guessed it: that’s GUI in action.

GUI, basically, is the process of interacting with images (most notably icons on computers) to get things done on electronic devices. It’s easy and we all do it all the time. But there is another way to accomplish many tasks on a computer: using text-based commands. Welcome to the command line.

So where do you enter these text-based commands and accomplish these tasks? There is a nifty little program called the Terminal on your computer that does the work. If you’ve never opened up your computer’s Terminal, now would be a good time. On my Mac, it’s located in the Applications > Utilities folder.

A scary black box will open up. Trust me, I know: I was scared of it just a few months ago. But I promise there are compelling reasons for journalists to learn the basics of the command line. Here are a few:


1. Several programs created by journalists for journalists require the command line.

Two of my favorite tools out there for journalists come from ProPublica: TimelineSetter and TableSetter.

The first makes it easy to create timelines. We’ve made a few at the Courier. The second makes easily searchable tables out of spreadsheets (more specifically, CSV files), which we’ve also used at the Courier. But to create the timelines and tables, you’ll need to run very basic commands in your Terminal.

It’s worth noting the LA Times also has its own version of TableSetter called TableStacker that offers more customizations. We used it recently to break down candidates running in our local elections. Again, these tables are created after running a simple command.

The New York Times has a host of useful tools for journalists. Some, like Fech, require the command line to run. Fech can help journalists extract data from the Federal Election Commission to show who is spending money on whom in the current presidential campaign cycle.


2. Other programs/tools that journalists can use:

Let’s say you want to pull a bunch of information from a website to use in a story or visualization, but copy and pasting the text is not only tedious but very time consuming.

Why not just scrape the data using a program made in a language like Python or Ruby and put it in a spreadsheet or Word document? After all, computers are great at performing tedious tasks in just a few minutes.

One of my favorite web scraping walkthroughs comes from BuzzData. It shows how to pull water usage rates for every ward in Toronto and can easily be applied to other scenarios (I used it to pull precinct information from the Iowa GOP website). The best way to run this program and scrape the data is to run it through your command line.

Another great walkthrough on data scraping is this one from ProPublica’s Dan Nguyen. Instead of using the Python programming language, like the one above, it uses Ruby. But the goal remains the same: making data manageable for both journalists and readers.

A neat mapping service that is gaining popularity at news organizations is TileMill. Here are a few examples to help get you motivated.

One of the best places to start with TileMill is this walkthrough from the application team at the Chicago Tribune. But beware: you’ll need to know the command line before you start (trust me, I learned the hard way).


3. You’ll impress your friends because the command line kind of looks like the Matrix

And who doesn’t want that?


Okay I’m sort of interested…How do I learn?

I can’t tell people enough how much these two command line tutorials from PeepCode helped me. I should note that each costs $12 but are well worth it, in my opinion.

Also, there is this basic tutorial from HypeXR that will help. And these shortcuts from LifeHacker are also great.

Otherwise, a quick Google or YouTube search will turn up thousands of results. A lot of programmers use the command line and, fortunately, many are willing to help teach others.


Written by csessig

January 31, 2012 at 9:21 am

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: