Chris Essig

Walkthroughs, tips and tricks from a data journalist in eastern Iowa

How automation can save you on deadline

leave a comment »

In the data journalism world, data dumps are pretty common occurrences. And often times these dumps are released periodically — every month, every six months, every year, etc.

Despite their periodic nature, these dumps can catch you off guard. Especially if you are working on a million things at once (but nobody does that, right?).


Case in point: Every year the Centers for Medicare & Medicaid Services releases data on how much money doctors received from medical companies. If I’m not mistaken, the centers release the data now because of the great investigative work of ProPublica. The data they release is important and something our readers care about.

Last year, I spent a ton of time parsing the data to put together a site that shows our readers the top paid doctors in Iowa. We decided to limit the number of doctors shown to 100, mostly because we ran out of time.

In all, I spent at least a week on the project last year. But this year, I was not afforded that luxury because the data dump caught us off guard. Instead of a few weeks, we had a few days to put together the site and story.

Fortunately, I’ve spent quite a bit of time in the last year moving away from editing spreadsheets in Excel or Google Docs to editing them using shell scripts.

The problem with GUI editors is they aren’t easily reproducible. If you make 20 changes to a spreadsheet in Excel one year and then the next year you get an updated version of that same spreadsheet, you have to make those exactly same 20 changes again.

That sucks.

Fortunately with tools like CSVKit and Agate, you can write A LOT of Excel-like functions in a shell script or a Python script. This allows you to document what you are doing so you don’t forget. I can’t tell you how many times I’ve done something in Excel, saved the spreadsheet, then came back to it a few days later and completely forgotten what the hell I was doing.

It also makes it so you can automate your data analysis by allowing you to re-run the functions over and over again. Let’s say you perform five functions in Excel and trim 20 columns from your spreadsheet. Now let’s say you want to do that again, either because you did it wrong the first time or because you received new data. Now would you rather run a shell script that takes seconds to perform this task or do everything over in Excel?

Another nice thing about shell scripts is you can hook your spreadsheet into any number of data processing programs. Want to port your data into a SQLite database so you can perform queries? No problem. You can also create separate files for those SQL queries and run them in your shell script as well.

All of this came in handy when this year’s most paid doctors data was released. Sure, I still spent a stressful day editing the new spreadsheets. But if I hadn’t been using shell scripts, I doubt I would have gotten done by our deadline.

Another thing I was able to do was increase the number of doctors listed online. We went from 100 total doctors to every doctor who received $500 or more  (totaling more than 2,000 doctors). This means it’s much more likely that readers will find their doctors this year.

The project is static and doesn’t have a database backend so I didn’t necessarily need to limit the number of doctors last year. We just ran out of time. This year, we were able to not only update the app but give our readers much more information about Iowa doctors. And we did it in a day instead of a week.

The project is online here. The code I used to edit the 15+ gigabytes worth of data from CMS is available here.



Written by csessig

September 7, 2016 at 4:59 pm

Building your first Leaflet.js map

leave a comment »

Earlier this year, I had the privilege of teaching a class on building your first Leaflet.js map at the NICAR conference. I just realized I forgot to post the code in this blog so I figured I’d post it now. Better later than never.

If you’re interested in building your first map, check out my Github repo for the presentation.

Written by csessig

August 17, 2016 at 7:58 am

Introduction to Javascript

with 2 comments

A few weeks ago, I presented on the basics of Javascript to my fellow developers at FusionFarm.

If you want to check out my slides for the presentation, they are available here. I went over variables, objects, for loops, functions, and a whole host of other fun stuff.

If I missed anything, let me know in the comments.

Written by csessig

July 12, 2016 at 9:13 am

Posted in Javascript, Uncategorized

Tagged with

D3 formula: Splitting elements into columns

leave a comment »

icons_columnsD3 can be a tricky — but powerful — beast. A month ago, I put together my most complex D3 project to date, which helped explain Iowa’s new Medicaid system

One of the first places I started on this project was building a graph that will take an icon and divide it into buckets. I didn’t see any openly available code that replicated what I was trying to do, so I’d figure I’d post my code online for anyone to use/replicate/steal.

For this project, I put the icons into three columns. In each column, icons are placed side by side until we get four icons in a row, then another row is created. You can see this in action by clicking the button several times. And all of this can be adjusted in the code with the “row_length and “column_length” variables.

To move the icons, I overlaid three icons on top of each other. When the button is clicked, each of the icons gets sent to one of the columns. The icons shrink as they reach their column. After this transition is finished, three more icons are placed on the DOM. And then the whole process starts over.

A bunch of math is used to determine where exactly on the DOM each icon needs to go. Also, we have to keep track of how many icons are on the DOM already, so we can break the icons into new rows if need be.

Data is required to make D3 run so my data is a simple array of [0,1,2]. The array has three values because we have three columns on the page. We do some calculations on the values themselves and then use SVG’s transform attribute to properly place the icons on the DOM. We use D3’s transition function to make it appear like the icons are moving into the icons, not just being placed there.

Hopefully this helps others facing similar problems in D3. If you have any questions, just leave them in the comments.

Written by csessig

May 2, 2016 at 12:30 pm

Save your work because you never know when it will disappear

with 2 comments

We are a few weeks into the new year, but I wanted to look back at the biggest project I worked on in 2015: The redesign of

While most of my blog posts are full of links, I can’t link to that site. Why? Because it’s gone.


In a series of very unfortunate events, the site we spent many, many months planning and developing is already gone.

The timeline: We started preparing for the redesign, which was BADLY needed, in early 2015. We then built it over the course of several months. Finally, it was launched in July. Then, in a move that surprised every one, KCRG was bought by another company in September.

At the time, I was optimistic that the code could be ported over to their CMS. And the site wouldn’t die.

My optimism was short lived. Gray has a standard website template for all its news sites, and they wanted that template on KCRG.

So in December, the website we built disappeared for good.


The KCRG website you see now is the one used and maintained by Gray.

Obviously, this was a big shock for our team. Even worse, the code we wrote was proprietary and requires Newscycle Solutions to parse and display. So even if I wanted to put it on Github, it wouldn’t do anyone any good.

I’m not used to the impermanence of web. When I had my first reporting job in Galesburg, I saved all the newspapers where my stories appeared. And unless my parents’ house catches on fire, those newspapers will last for a long time. They are permanent.

Not so online. Websites disappear all the time. And those who build them have barely any record of their existence.

Projects like PastPages and the Wayback Machine keep screenshots of old websites, which is better than nothing. But their archives are far cries from the living, breathing websites we build. A screenshot can’t show you nifty Javascript.

It’s an eery feeling. What happens in five years? Ten years? Twenty years? Will any of our projects still be online? Even worse: Will technology have changed so much that these projects won’t even be capable of being viewed online? Will online even exist?

Think about websites from 1996. They are long gone. Hell, many sites from two years ago have vanished.

I don’t have good answers. Jacob Harris has mulled this topic and offered some good tips for making your projects last.

But it’s worth pondering when you’re done with projects. What can I do to save my work for the future? I have a directory of all of my projects from my Courier days on an external hard drive. I have an in-process directory for The Gazette as well.

I hold onto them like I did my old newspaper clippings. Although, I’m confident those clippings will last a lot longer than my web projects.

Written by csessig

January 21, 2016 at 11:52 am

Quick intro to PJAX

leave a comment »

Every week, the developers at Fusion Farm meet and have a code review. Last week, I presented on PJAX, which I spent a few hours digging into.

The code behind my presentation is available on my Github page.

You really need to set up a back-end project to use PJAX to its fullest. For this presentation, I focused on just the front-end components, which will at least give you an idea of how the library works.

I’ve seen PJAX (or something like it) a few times in the wild. The Chicago Tribune, for instance, uses it (or something similar) on their website. If you scroll down their page long enough, you’ll notice a new page opens up in your browser. But the entire page doesn’t refresh. Instead the URL changes smoothly as you scroll down the page.

Anyways, check out the code on my site and if you have any questions, fire them my way.


Written by csessig

December 21, 2015 at 9:58 am

Posted in Uncategorized

Exploring D3.js

leave a comment »

network_graph_mary_ssI’ve been wanting to use the D3.js library for some time now but I haven’t up to this point mostly because the code that runs the charts scares the crap out of me (that’s only a slight exaggeration).

Well this weekend I finally took on the scary and published my first D3 chart. The final product is a network graph that shows how one woman is helping connect people within the community. It’s not crazy awesome like most of the other D3 graphics out there but it’s a start.

The code started out as this chart from Mike Bostock and was modified to meet my needs. The final code is on GitHub.

While it wasn’t quite as intimidating as I thought it would be, a few things tripped me up. Chaining methods in D3 was something that threw me off at first. I was also unfamiliar with SVGs. This is a good primer from Scott Murray. Finally, many of the D3 methods are still mysterious me, like diagonal and the layouts. But one step at a time, right?

Some of it, however, isn’t crazy complicated. I’m going to review some of what I learned about D3 by going through some of the code that powered this project.

The code snippet below is line 47 in the chart’s script.js file. It starts the graph. It selects the empty SVG that I put in index.html, sets it’s width and height and appends a G element to it:

// Create an svg
  svg ="svg")
  	.attr("width", width + margin.right + margin.left)
  	.attr("height", height + + margin.bottom)
  	.attr("transform", "translate(" + (margin.left - line_width) + "," + + ")");

A few lines down, we create a variable called node, select all the G elements and bind data to them.

// Update the nodes
  var node = svg.selectAll("g.node")
  	.data(nodes, function(d) {
      return || ( = ++i);


This G element contains everything in the chart and is placed just under the empty svg element in index.html

We then create a variable called nodeEnter, which appends another G element to the G element we just created. (confusing, I know).

// Enter any new nodes at the parent's previous position.
  var nodeEnter = node.enter().append("g")
    	// Set location of nodes
    	.attr("transform", function(d) {
    		if ( d['name'] === 'Mary Palmberg' && d['depth'] === 2) {
    			return "translate(" + circle_mary_location + "," + circle_mary_translate + ")";
        } else {

          if ( d['name'] === 'Thomas Jensen' ) {
            d.x -= 20;
          } else if ( d['name'] === 'Clarence Borck' ) {
            d.x += 20;
          } else if ( d['name'] === 'Jane Skinner' && d['connection'] === 5 ) {
            d.x -= 20;
          return "translate(" + d.y + "," + d.x + ")";
    	.attr("class", function(d) {
    		if (d['name'] === 'Mary Palmberg') {
    			return 'node-mary';
    		} else if (d['name'] === 'Top level') {
    			return 'node-top';
    		} else {
    			return 'node'

In our chart, these G elements are the nodes that contain our circles. Later in the code, I made lines that connect the circles. But we’re going to focus on just these G elements.

It may make more sense to inspect the graph to see what I’m talking about:


After we append the G elements to their parent G elements, we chain a series methods to it, which set attributes for the G elements. These methods are called D3 selectors. With D3 selectors, you can select any shape in the SVG and a do a bunch of things to it. We can append new nodes to it, set CSS properties, give it a certain class name, etc. And since we bound data to the shape, we can decide what those properties will be based on the data we gave it.

This is seriously awesome, by the way.

In the code above, we used a D3 selector to set the location of the shapes using the CSS attribute transform. You’ll notice how setting that attribute is a function with a “d” parameter. “D” is equal to the data behind the shape we are drawing (remember, we appended data to it earlier). It returns an object with different properties. In this case, the property “name” is equal to the name of the person we are drawing in the circle.

For this chart, I had 16 G elements on the page. I also have objects with properties about each G element in a separate JSON file. As we loop through each G element and create it on the page, the data parameter becomes whichever G element we are on. So when we are creating Mary Palmberg’s circle, the “D” parameter will equal data about Mary Palmberg.

“D” returns an object that looks like so:


In the code, I’m checking to see if the G element we are on is Mary Palmberg, because her circle is placed in the middle and all other circles point to her. If we are on Mary, we set the CSS property transform to the variables called “circle_mary_location” and “circle_mary_translate”, which I set earlier in the script.js file. These basically put her circle in the middle of the page.

If it’s not Mary, we run the else part of the if-else statement. We do a couple of checks on a few people and change their x values. This is done because some of the circles were a little bit too close to together. After that, I put the data’s x and y coordinates in CSS translate property and return it. This basically puts it in the correct position on the page.

The last part should be easy to understand. All it does is set a class for the G Element depending on the person’s name. We give Mary a different class name so we can apply different styles to her circle:


It’s important to note that we haven’t actually created any shapes yet. Instead, G elements are just groups of shapes. The next thing we’re going to do is actually append circles to the empty G element.

    	.attr("id", "circle-outer")
    	// Make radius smaller than inner circle
    	.attr("r", function(d) {
    		if (d['name'] === 'Mary Palmberg') {
    			if ( $(window).width() > 700) {
                    return circle_radius + 7;
                } else {
                    return circle_radius + 17;
    		} else {
    			return circle_radius + 2;
    	// Color code the border around the circle
    	.style('stroke', function(d) {
    		// Color Mary differently
    		if (d['name'] === 'Mary Palmberg') {
    			// Red
    			return '#e41a1c'
    		} else if (d['connection'] === 1) {
    			// Purple
    			return "#984ea3";
    		} else if (d['connection'] === 2) {
    			// Green
    			return "#4daf4a";
    		} else if (d['connection'] === 3) {
    			// Blue
    			return "#377eb8";
    		} else if (d['connection'] === 4) {
    			// Orange
    			return "#ff7f00";
    		} else if (d['connection'] === 5) {
    			// Brown
    			return "#a65628";
    		} else {
    			// Blue
    			return "#377eb8";

  // Create the inner circle
    	.attr("id", "circle-inner")
    	// Radius
    	.attr("r", function(d) {
    		if (d['name'] === 'Mary Palmberg') {
    			if ( $(window).width() > 700) {
                    return circle_radius + 5;
                } else {
                    return circle_radius + 15;
    		} else {
    			return circle_radius;
    	.style("fill", function(d) {
    		if ( $(window).width() > 700) {
                return "url(#" + d['name'].replace(' ', '') + ")"
            } else if ( d['name'] === 'Mary Palmberg') {
                return "url(#" + d['name'].replace(' ', '') + ")"

I appended two circles to the G elements. The first is the circle around a person. I gave it an id of “circle-outer” and set it’s radius (attribute “r”) to a variable called “circle_radius”, which is set at the top of the script.js file. Once again I did a check for Mary and gave her a bigger circle since she’s in the middle. I then set the color of the circle’s stroke based on data property “connection”.

Our network graph has five total connections, so there’s a possibility of five different colors. I set the connection number in the JSON file. This is done so every connection through Mary is color-coded the same and thus the reader can see they are part of the same story. Same color = same story. And Mary stays in the middle of it all.

The second circle is the image itself. It has an id of “circle-id”. The attribute fill is set to the image of the person. The urls for each image are set in index.html under the defs tag. Each image is a pattern with a “xlink:href” property set to the url of the image. More information on how this works can be found here.

After that, I append text to have their name appear the circle. And I used this library to create the tooltips. I won’t bore you with that code as well.

There’s a few more things going on in this chart but those are the basics. Or at least as I understand them.

As with learning any new technology, my advice is to pick a project that you want to be powered by D3 and go at it. I learned a lot more about D3 making my first chart than I have with walkthroughs, video tutorials, etc. You’ll never learn it until you get your hands dirty.

Written by csessig

April 1, 2015 at 4:42 pm

Posted in d3, Javascript