Chris Essig

Walkthroughs, tips and tricks from a data journalist in eastern Iowa

Archive for the ‘Data’ Category

Resources for moving away from Google maps

leave a comment »


Note: This is cross-posted from Lee's data journalism blog. Reporters at Lee newspapers can read my blog over there by clicking here.

For some time now, Google maps has been the go-to source for map-making journalists.

And for good reason. The maps load quickly, can be easily hooked up to other Google products like Fusion Tables and charts, are free (mostly) and feature an incredible API for developers.

But recently, many journalists have been moving away from Google Maps. Part of this was spurred by Google's announcement that it would start charging for its Maps API (although it backtracked a bit after receiving flak from news organizations).

Others are just sick of seeing the same backdrops on every map featured on news websites. A little variety never hurt anybody.

Mostly because of the latter, I've started experimenting with products like Leaflet, TileMill and Wax to make our interactive maps. Here are a couple of examples:

1. Public hunting grounds in Iowa

2. Child abuse reports by county

If you are like me and trying to spice things up, here are some resources that helped me and will hopefully help you as well:

Leaflet docs – Leaflet is very similar to Google Maps; it provides your map with a backdrop, highlights locations, gives you zoom controls, etc. But unlike Google Maps, Leaflet allows users to make their own backdrops and upload them online. Other users can use the backdrops on their own maps. Leaflet also allows map-makers to put points on the map similar to Google.

CloudMade Map Style Editor – Here’s a collection of user-generated Leaflet maps you can use with your next project. Be careful not to get sucked in for hours like I do.

AP: Bringing Maps to Fruition – If you are looking for a good example of Leaflet in action with source code and all, download this ZIP file provided by Michelle Minkoff, Interactive Producer for the Associated Press in Washington. She presented it at NICAR 2012 and inspired me to try it out.

TileMill – Not satisfied with any of the backdrops you’ve seen? Why not create your own? With TileMill, that is not only possible but easy (if you know basic CSS) and ridiculously addicting. You can also use TileMill, for instance, to color code counties using tiles, which are basically very small PNG image files. You can then use Google Maps as the backdrop (See below) for the map.

Chicago Tribune: Making maps (five-part series) – It's fair to say this is required reading for any journalist looking to make maps. This thorough read goes through the steps of preparing your data, putting it in a database, using the tile-rendering program TileMill to make a great-looking county population map, and deploying it with Google Maps. Warning: It's best if you familiarize yourself with the command line first, and if you are a Mac user, DOWNLOAD ALL THE DEPENDENCIES (you can) with Homebrew. See "Requirements" in the second part of the tutorial for a list of dependencies.

MinnPost: How we built the same-day registration map – After you are done styling your map in the above tutorial, the Trib guys will show you how to deploy your tiles with a Python script. That's great, but it's not how I did it. Instead, I used this fantastic walkthrough from MinnPost's Kevin Schaul to render the tiles, post them with Leaflet and make them interactive with Wax.

Geocoding with Google Spreadsheets – One downside of TileMill is that it doesn't recognize addresses the way Google Maps does. Use this site to find the latitude and longitude of your addresses. Warning: You can only do 99 addresses at a time. If you know of a better resource, I'd love to hear about it.

Leaflet recipe: Hover events on features and polygons – I have not gone through this walk-through from Ben Walsh at the L.A. Times, but it looks like a solid way to incorporate hover events with your Leaflet maps.

NRGIS Library for Iowa – Journalists at other Iowa newspapers will be happy to know this website exists with plenty of shapefiles of area roads, lakes, interstates, parks, you name it. It’s what I used to style all the points on the hunting grounds map. And if you don’t live in Iowa, just ask around and you may hit the jackpot like I did!

Maki Icons – TileMill also supports icons. Here's a great list of sharp-looking icons you can use with the program.

QGIS 1 and QGIS 2 – If you are looking to make static maps and don't need a zoom function, check out the (free!) QGIS program. With it, you can import polygons (with shapefiles) or points (using CSVs) and style them. You can then export them as PNG files and make them interactive. Check out this two-part tutorial for more information.
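A side note on the geocoding entry above: if you'd rather script lookups than paste addresses into a spreadsheet, Google also exposes a geocoding web service you can hit from Python. This is just a sketch (the sample address is made up, and Google's rate limits and terms apply):

```python
# Build a request URL for Google's geocoding web service.
# Fetching it (urllib2.urlopen in Python 2, urllib.request in Python 3)
# returns JSON whose results[0].geometry.location holds the lat/lng.
from urllib.parse import urlencode

def geocode_url(address):
    params = urlencode({"address": address, "sensor": "false"})
    return "http://maps.googleapis.com/maps/api/geocode/json?" + params

print(geocode_url("100 E 4th St, Waterloo, Iowa"))
```

You would still be subject to a daily lookup cap, but it beats copying coordinates by hand.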

As always, feel free to e-mail me with questions or if you know of any great resources not listed: chris.essig@wcfcourier.com

Written by csessig

April 16, 2012 at 9:51 am

Posted in Courier, Data, Leaflet, Maps, TileMill, Wax

Turning Blox assets into timelines: Last Part

with 2 comments

Note: This is cross-posted from Lee’s data journalism blog. Reporters at Lee newspapers can read my blog over there by clicking here.

Also note: You will need to be running the Blox CMS for this to work. That said, you could probably learn a thing or two about web scraping even if you don't use Blox.

For part one of this tutorial, click here. For part two, click here

 

If you’ve followed along thus far, you’ll see we’re almost done turning a collection of Blox assets — which can include articles, photos and videos — into  timelines using a tool made available by ProPublica called TimelineSetter.

Right now, we are in the process of creating a CSV file, which we will use with the program TimelineSetter. This program takes the CSV file and creates a nice-looking timeline out of it.

The CSV file is created by using a Python script that scrapes a web page we created with all of our content on it. The full code for this script is available here.

In the second part of the series, we went ahead and created a Python scraper that will pull all the information on the page we need.

1. If you’ve gone through the second part, you can go ahead and run python timeline.py now on your command line (More information on how to run the scraper is available in the first part of the blog).

You’ll notice the script will output a CSV that has all the information we need. But some of it is ugly. We need to delete certain tags, for instance, that appear before and after some of the text.

But before we can run Python commands to delete these unnecessary tags, we need to convert all of the scraped values into strings, which are basically just text. Integers, by contrast, are numbers.

So let’s do that first:

# Extract that information in strings
    date2 = str(date)
    link2 = str(link)
    headline2 = str(headline)
    image2 = str(image)

    # These are pulled from outside the loop
    description2 = str(description)

Note: This code and the following chunks should go inside the loop statement we created in the second part of the series. This is because we want these changes to take effect on every item we scrape. If you are wondering, I tried to explain loop statements in my second blog. Your best bet though is to ask Google about loop statements.
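To make the string/integer distinction concrete, here's a tiny standalone example (the values are made up):

```python
count = 99             # an integer: you can do math with it
count_str = str(count) # a string: text you can join with other text

print(count + 1)                 # prints 100
print(count_str + " addresses")  # prints: 99 addresses
```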

2. Now that we've converted everything into strings, we can get rid of the ugly tags. You'll notice that the descriptions of our stories start and end with a "p" tag. Similarly, all dates start and end with an "em" tag.

Because of this, I added a few lines to the Python script. These lines will replace the ugly tags with nothing…effectively deleting them before they are put into the CSV file:

# Extra formatting needed for dates to get rid of em tags and unnecessary formatting
    date4 = date3.replace('[<em>', "")
    date5 = date4.replace('</em>]', "")
    date6 = date5.replace('- ', "")
    date7 = date6.replace("at ", "")

    # Extra formatting is also need for the description to get rid of p tags and new line returns
    description4 = description3.replace('[<p>', "")
    description5 = description4.replace('</p>]', "")
    description6 = description5.replace('\n', " ")
    description7 = description6.replace('[]', "")

3. For images, the width attribute comes through as some three-digit number, which the regular expression "\d\d\d" matches. We will replace it so every image on the page is 300 pixels wide. Also, we will delete the text "None," which shows up if an image doesn't exist for a particular asset.

# We will adjust the width of all images to 300 pixels. Also, Python spits out the word 'None' if it doesn't find an image. Delete that.
    image4 = re.sub(r'width="\d\d\d"', 'width="300"', image3)
    image5 = image4.replace('None', "")
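To see what that regular expression is matching, here's the substitution run on a sample image tag (the URL and sizes are made up for the example):

```python
import re

image3 = '[<img src="http://example.com/photo.jpg" width="525" height="350" />]'
image4 = re.sub(r'width="\d\d\d"', 'width="300"', image3)
print(image4)  # the width attribute is now 300; height is left alone
```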

4. If you are at all familiar with the way articles work in Blox, you know that when you update them, red text shows up next to the headline telling you when the story was last updated. The code that formats that text and makes it red is useless to us now. So we will delete this and replace it with the current time and date using the Python datetime module.

To use the datetime module, we have to first import it at the top of our Python script. We then need to call the object within the module that we want. A good introduction to Python modules is available here.

The object we want is called "datetime.now()". As the name suggests, it returns the current date and time. I then put it in a variable called "now", which makes it easier to use later in the script.

So the top of our page should look like this:

import urllib2
from BeautifulSoup import BeautifulSoup
import datetime
import re

now = datetime.datetime.now()

Inside the loop we call the “datetime.now()” object and replace the text “Updated” with the current date and time:

# If the story has been updated recently, an em class tag will appear on the page showing the time but not the date. We will delete the class and replace it with today's date. We can change the date in the CSV if we need to.
    date8 = date7.replace('[<em class="item-updated badge">Updated:', str(now.strftime("%Y-%m-%d %H:%M")))
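For reference, here's what that format string produces on its own (the exact value depends on when you run it):

```python
import datetime

now = datetime.datetime.now()
stamp = now.strftime("%Y-%m-%d %H:%M")  # e.g. 2012-03-16 08:38
print(stamp)
```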

5. Now there is just one last bit of cleaning up we will need to do. For those who don’t know, CSV stands for comma-separated values. This means that basically the columns we are creating in our spreadsheet are separated by commas. This is the preferred type of spreadsheet for most programs because it’s simple.

We can run into problems, however, if some of our data includes commas. So these next few lines in our script will replace all of the commas we scraped with dashes. You can change it to whatever character or characters you want:

# We'll replace commas with dashes so we don't screw up the CSV. You can change the dash to whatever character you want
    date3 = date2.replace(",", " -")
    link3 = link2.replace(",", " -")
    headline3 = headline2.replace(",", " -")
    description3 = description2.replace(",", " -")
    image3 = image2.replace(",", " -")

If you want to put the commas back into the timeline, you can do so after the final HTML file is created (after you run python timeline.py). Typically I’ll replace all the commas with “////” and then do a simple find and replace on the final HTML file with a text editor and change every instance of “////” back into a comma.
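If you'd rather skip the find-and-replace dance entirely, Python's built-in csv module is another option: it wraps any field containing a comma in quotes, so the column structure survives. A quick sketch (the row values are made up):

```python
import csv
import io  # in Python 2 you would write straight to the open file instead

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["2012-03-16", "A headline, with a comma", "http://example.com"])
print(buf.getvalue())  # the second field comes out quoted, comma intact
```

TimelineSetter reads standard CSV, so quoted fields should pass through cleanly.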

Now we have clean, concise data! We will now put this into the CSV file using the “write” command. Again, all these commands are put inside the loop statement we created in the second part of the series, so every image, description, date, etc. we scrape will be cleaned up and ready to go:

# Write the information to the file. The HTML code is based on coding recognized by TimelineSetter
    f.write(date8 + "," + description7 + "," + link3 + "," + '<h2 class="timeline-img-hed">' + headline3 + '</h2>' + image5 + "\n")

The headlines, you'll notice, are put into an HTML "h2" tag. This will bold and increase the size of the headlines so they stick out when a reader clicks on a particular event in the timeline.

That’s it for the loop statement. Now we will add a simple line outside of the loop statement that closes the CSV file when all the loops we want run are done:

#You're done! Close file.
f.close()
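A small aside: Python's "with" statement can handle the closing for you, so the file is closed even if the script errors out mid-loop. Something like:

```python
# The file closes automatically when the indented block ends.
with open("timeline.csv", "w") as f:
    f.write("date,description,link,headline\n")
    # ... the loop and its f.write() calls would go here ...
# no f.close() needed
```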

And there you have it folks. We are now done with our Python script. We can now run it (see part 1 of the series) and have a nice, clean looking CSV file we can turn into a timeline.

If you have any questions, PLEASE don’t hesitate to e-mail or Tweet me. I’d be more than happy to help.

Written by csessig

March 16, 2012 at 8:38 am

Multiple layers and rollover effects for Fusion Table maps

with 3 comments

Note: Because of a change in the Fusion Tables API, the method for using rollover effects no longer works.

Note: This is cross-posted from Lee’s data journalism blog. Reporters at Lee newspapers can read my blog over there by clicking here.

If you haven’t noticed by now, a lot of journalists are in love with Google Fusion Tables. I’m one of them. It’s helped me put together a ton of handy maps on deadline with little programming needed.

For those getting started, I suggest these two walkthroughs on Poynter. For those who have some experience with FT, here are a couple of options that may help you spruce up your next map.

Multiple Fusion tables on one map

Hypothetically speaking, let’s say you have two tables with information: one has county data and the other city data. And you want to display that data on just one map.

Fusion Tables makes it very easy to display both at the same time. We’ll start from the top and create a simple Javascript function to display our map (via the Google Maps API):

function initialize() {
	map = new google.maps.Map(document.getElementById('map_canvas'), {
	    center: new google.maps.LatLng(42.5, -92.2),
		zoom: 10,
		minZoom: 8,
		maxZoom: 15,
	    mapTypeId: google.maps.MapTypeId.TERRAIN
	});
	loadmap();
}

At the end of the initialize function we call a function, "loadmap();". With this function, we will actually pull in our Fusion Tables layers. For this example we'll bring in two layers instead of one. Notice how strikingly similar the two are:

function loadmap() {
	layer2 = new google.maps.FusionTablesLayer({
		query: {
			select: 'geometry',
			from: 2814002
		}
	});
	layer2.setMap(map);

	layer = new google.maps.FusionTablesLayer({
		query: {
			select: 'Mappable_location',
			from: 2813443
		}
	});
	layer.setMap(map);
}

That’s it! You now have one map pulling in two sets of data. To see the full code for this map, click here.

Rollover effects for Fusion Tables

One feature often requested in the Fusion Tables forums is to enable mouse rollover events for Fusion Table layers. Typically, readers who look at a map have to click on a point, polygon, etc. to open up new data about that point, polygon, etc. A mouseover event would allow new data to pop up if a reader hovers over a point, polygon, etc. with their mouse.

A few months ago, some very smart person rolled out a “workable solution” for the rollover request. Here’s the documentation and here’s an example of it in the wild.

Another example is this map on poverty rates in Iowa. The code below is from this map and is very similar to the code on the documentation page:

layer.enableMapTips({
		select: "'Number', 'Tract', 'County', 'Population for whom poverty status is determined - Total', 'Population for whom poverty status is determined - Below poverty level', 'Population for whom poverty status is determined - Percent below poverty level', 'One race - White', 'One race - Black', 'Other', 'Two or more races'", // list of columns to query; typically you need only one column.
		from: 2415095, // fusion table name
		geometryColumn: 'geometry', // geometry column name
		suppressMapTips: true, // optional, whether to show map tips. default false
		delay: 1, // milliseconds mouse pause before send a server query. default 300.
		tolerance: 6 // tolerance in pixel around mouse. default is 6.
		});

	//here's the pseudo-hover
	google.maps.event.addListener(layer, 'mouseover', function(fEvent) {
		var NumVal = fEvent.row['Number'].value;
	layer.setOptions({
		styles: [{
			where: "'Number' = " + NumVal,
			polygonOptions: {
				fillColor: "#4D4D4D",
				fillOpacity: 0.6
			}
		}]
	});
});

Note: It’s easiest to think of your Fusion Table as a list of polygons with certain values attached to it. For this poverty map, each row represents a Census Tract with values like Tract name, number of people within that tract that live in poverty, etc. And for this map, I made it so each polygon in the Fusion Table has its own, unique number in the “Number” column.

Here’s a run through of what the above code does:

1.  We’ve already declared “layer” as a particular Google Fusion Table layer (see the second box of code above). Now the “layer.enableMapTips” will allow us to add rollover effects to that Fusion Table layer. The “select” option represents all the columns in that Fusion Table layer that you want to use with the rollover effect.

For instance, here's the Fusion Table I'm calling in the above "enableMapTips" function. Notice how I've called all the columns with data ('Tract', 'County', etc.). I then told it which Fusion Table to look in with "from: 2415095." Each Fusion Table has its own unique ID number; mine is 2415095. To find your Fusion Table's number, click File > About.

Finally, I’ve told it what column contains the geometry information for this Fusion Table (Again, go through this Poynter walkthrough to find out all you need to know about the geometry field). Mine is simply called “geometry.” Each row in the “geometry” column represents one polygon.

2. The second step is "google.maps.event.addListener(layer, 'mouseover', function(fEvent))." Basically this says: "Anytime the reader rolls over a polygon, the following will happen."

In this function, "fEvent" represents the polygon that the reader is currently hovering over. Let's say I'm rolling over the polygon that is represented by the first row in the Fusion Table. It's Census Tract 9601 and has the value of "1" in the "Number" column.

Every time a reader rolls over Census Tract 9601, the code “fEvent.row[‘Number’].value” goes into the Google Fusion Table, finds the Census Tract 9601 row and returns the value of the Number column, which is “1.” So var “NumVal” would become “1” when a reader rolls over Census Tract 9601.

The next part changes that polygon's color. This happens with the "where" statement. This is saying: "When I roll over a polygon, find the polygon in the Fusion Table that represents 'NumVal' and change its color." Since the variable "NumVal" represents the polygon currently being hovered over, this is the polygon that changes colors. For a reader, the output is simple: I roll over a polygon. It changes colors.

In short: I roll over Census Tract 9601, firing off the "google.maps.event.addListener" function. This function finds the value of the "Number" column. It returns "1." The code then says change the color of the polygon that has a value of "1" in the "Number" column. This is Census Tract 9601, which I rolled over. It changes colors and life is good.

MapTips

If you go back up to "layer.enableMapTips" in the third box of code, you'll notice there is an option for "suppressMapTips." For the poverty map, I have it set to true. But what if you set it to false? Basically, any time a reader hovers over a point or polygon, a small box shows up next to it containing information on that point or polygon. Notice the small yellow box that pops up on this example page.

This is a nifty feature and a great replacement for the traditional InfoBox (the box that opens when you click on a point in a Google map). The only problem is the default text size is almost too small to read. How do we change that? Fairly easily:

1. Download a copy of the FusionTips javascript file.

2. Copy the file to the same folder your map is in and add this at the top of your document header:

<script type="text/javascript" src="fusiontips.js"></script>

3. Open the FusionTips file and look for “var div = document.createElement(‘DIV’).” It’s near the top of the Javascript file.

4. This ‘DIV’ represents the MapTips box. By editing this, you can change how the box and its text will display when a reader hovers over a point on the map. For instance, this map of historical places in Iowa used MapTips but the text is larger, the background is white, etc. Here’s what the DIV looks like in my FusionTips javascript file:

FusionTipOverlay.prototype.onAdd = function() {
    var div = document.createElement('DIV');
    div.style.border = "1px solid #999999";
    div.style.opacity = ".85";
    div.style.position = "absolute";
    div.style.whiteSpace = "nowrap";
    div.style.backgroundColor = "#ffffff";
    div.style.fontSize = '13px';
    div.style.padding = '10px';
    div.style.fontWeight = 'bold';
    div.style.margin = '10px';
    div.style.lineHeight = '1.3em';
    if (this.style_) {
      for (var x in this.style_) {
        if (this.style_.hasOwnProperty(x)) {
          div.style[x] = this.style_[x]
        }
      }
    }

Much better! Here’s the code for this map. And here are my three Fusion Tables.

I hope some of these tips help and if you have any questions, send me an e-mail. I’d be more than happy to help.

Written by csessig

February 7, 2012 at 4:34 pm

A few reasons to learn the command line

leave a comment »

Note: This is my first entry for Lee Enterprises’ data journalism blog. Reporters at Lee newspapers can read the blog by clicking here.

As computer users, we have grown accustomed to what is known as the graphical user interface (GUI). What’s GUI, you ask? Here are a few examples: When you drag and drop a text document into the trash, that’s GUI in action. Or when you create a shortcut on your desktop, that’s GUI in action. Or how about simply navigating from one folder to the next? You guessed it: that’s GUI in action.

GUI, basically, is the process of interacting with images (most notably icons on computers) to get things done on electronic devices. It’s easy and we all do it all the time. But there is another way to accomplish many tasks on a computer: using text-based commands. Welcome to the command line.

So where do you enter these text-based commands and accomplish these tasks? There is a nifty little program called the Terminal on your computer that does the work. If you’ve never opened up your computer’s Terminal, now would be a good time. On my Mac, it’s located in the Applications > Utilities folder.

A scary black box will open up. Trust me, I know: I was scared of it just a few months ago. But I promise there are compelling reasons for journalists to learn the basics of the command line. Here are a few:

 

1. Several programs created by journalists for journalists require the command line.

Two of my favorite tools out there for journalists come from ProPublica: TimelineSetter and TableSetter.

The first makes it easy to create timelines. We’ve made a few at the Courier. The second makes easily searchable tables out of spreadsheets (more specifically, CSV files), which we’ve also used at the Courier. But to create the timelines and tables, you’ll need to run very basic commands in your Terminal.

It’s worth noting the LA Times also has its own version of TableSetter called TableStacker that offers more customizations. We used it recently to break down candidates running in our local elections. Again, these tables are created after running a simple command.

The New York Times has a host of useful tools for journalists. Some, like Fech, require the command line to run. Fech can help journalists extract data from the Federal Election Commission to show who is spending money on whom in the current presidential campaign cycle.

 

2. Other programs/tools that journalists can use:

Let’s say you want to pull a bunch of information from a website to use in a story or visualization, but copy and pasting the text is not only tedious but very time consuming.

Why not just scrape the data using a program made in a language like Python or Ruby and put it in a spreadsheet or Word document? After all, computers are great at performing tedious tasks in just a few minutes.

One of my favorite web scraping walkthroughs comes from BuzzData. It shows how to pull water usage rates for every ward in Toronto and can easily be applied to other scenarios (I used it to pull precinct information from the Iowa GOP website). The best way to run this program and scrape the data is to run it through your command line.

Another great walkthrough on data scraping is this one from ProPublica’s Dan Nguyen. Instead of using the Python programming language, like the one above, it uses Ruby. But the goal remains the same: making data manageable for both journalists and readers.

A neat mapping service that is gaining popularity at news organizations is TileMill. Here are a few examples to help get you motivated.

One of the best places to start with TileMill is this walkthrough from the application team at the Chicago Tribune. But beware: you’ll need to know the command line before you start (trust me, I learned the hard way).

 

3. You’ll impress your friends because the command line kind of looks like the Matrix

And who doesn’t want that?

 

Okay I’m sort of interested…How do I learn?

I can’t tell people enough how much these two command line tutorials from PeepCode helped me. I should note that each costs $12 but are well worth it, in my opinion.

Also, there is this basic tutorial from HypeXR that will help. And these shortcuts from LifeHacker are also great.

Otherwise, a quick Google or YouTube search will turn up thousands of results. A lot of programmers use the command line and, fortunately, many are willing to help teach others.

Written by csessig

January 31, 2012 at 9:21 am

Graph: Hispanic population growing in Iowa

leave a comment »

This graph I put together with a weekend story breaks down Iowa's Hispanic population by county. It was based almost entirely on a JavaScript chart walkthrough put together by Michelle Minkoff, Interactive Producer for the Associated Press. So check it out now because it's FANTASTICO!

I did make a couple of minor tweaks, which may be helpful for others so I will outline them here.

My initial version of the final product looked like this. Go ahead and roll over the bars like the blue one, for instance, which represents the number of Hispanic people in Iowa who said they were also white. Notice how the bars in 2000 and 2010 are both the same length, even though the value of the 2000 bar (38,000) is almost half that of the 2010 bar (80,000).

The data was retrieved from Census.IRE.org. I just selected “Iowa,” then “County,” then “Hispanic or Latino origin by race.” You can then download a CSV of the data and chop it up as you see fit. Also click an option after “Browse data for…” on the Census.IRE.org page for a great breakdown of what each of the headers in the CSV files means. Here’s Iowa’s page, for instance.

The Javascript file that makes these graphs calls a JSON file containing information retrieved from Census.IRE.org. My JSON file initially looked like this:

headers = ["White","Black / African American","American Indian / Alaska Native","Asian","Native Hawaiian / Other Pacific Islander","Some other race","Two or more races"]
allCountyData = [
  ["Iowa",38296,1109,1034,290,121,35317,6306,80438,2242,2503,497,206,54000,11658],
  (Enter data for every county in Iowa here)

The headers represent the different races of Hispanic people. The line for "Iowa" lists the number of people of each race, first for 2000 and then for 2010: 38,296 Hispanic people in 2000 said they were white, 1,109 said they were black, etc.; 80,438 Hispanic people in 2010 said they were white, 2,242 said they were black, etc.
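Since the JavaScript below leans on that first-half/second-half layout, here's the same split sketched quickly in Python using the Iowa row:

```python
# One row holds both censuses: the name, seven 2000 values, then seven 2010 values.
iowa = ["Iowa", 38296, 1109, 1034, 290, 121, 35317, 6306,
        80438, 2242, 2503, 497, 206, 54000, 11658]

values = iowa[1:]        # drop the state/county name
half = len(values) // 2  # seven racial categories per census
data_2000 = values[:half]
data_2010 = values[half:]
print(data_2000)  # [38296, 1109, 1034, 290, 121, 35317, 6306]
print(data_2010)  # [80438, 2242, 2503, 497, 206, 54000, 11658]
```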

Here’s my tweak: Instead of having both 2000 and 2010 data on the same line, I broke this out onto two separate lines. So my new JSON file includes this:

allCountyData = [
  ["Iowa",38296,1109,1034,290,121,35317,6306],
  (Enter data for every county in Iowa here)

And this:

allCountyData2 = [
  ["Iowa",80438,2242,2503,497,206,54000,11658],
  (Enter data for every county in Iowa here)

From there, I needed to edit the Javascript file to call the data contained in allCountyData2. Here’s my Javascript file. Go ahead and click on it. I know you want to. And scroll down to “function changeGraph(stateText)” and notice how it is calling both selectedData, which includes the value for everything in allCountyData (in the JSON file), AND selectedData2, which contains the value for everything in allCountyData2 (in the JSON file). The “drawVisualization();” inside the function also contains two arguments now: “selectedData, selectedData2.”

As a result, the “function drawVisualization()” at the very top of the Javascript file must also contain two arguments: “newData, newData2.” Initially, my drawVisualization function (here’s my first Javascript file) contained just one argument because it was only calling newData, which contains data from allCountyData. At the very end of the function, it broke newData (allCountyData) into two because remember, allCountyData contained data for both 2000 and 2010:

//the first half of our data is for 2000, so we fill the row in with appropriate numbers
//we start at 1 to leave out the years; remember, data structured this way starts at 0
for (var i = 1; i <= (newData.length-1)/2; ++i) {
  data.setValue(0, i, newData[i]);
  data.setFormattedValue(0, i, numberFormat(newData[i]));
}
//now, the second half of the data is for 2010, so we'll fill that in
for (var i = 1; i <= (newData.length-1)/2; ++i) {
  data.setValue(1, i, newData[i]);
  data.setFormattedValue(1, i, numberFormat(newData[i+(newData.length-1)/2]));
}

I needed to change those "for" loops because we no longer want newData to be broken in half. Instead, we want to call that second argument (newData2) in the drawVisualization function because it holds the data from allCountyData2:

//newData now holds only the 2000 values, so we fill the first row with it
//we start at 1 to leave out the years; remember, data structured this way starts at 0
for (var i = 1; i <= (newData.length-1); ++i) {
  data.setValue(0, i, newData[i]);
  data.setFormattedValue(0, i, numberFormat(newData[i]));
}
//and newData2 holds the 2010 values, so we fill the second row with it
for (var i = 1; i <= (newData2.length-1); ++i) {
  data.setValue(1, i, newData2[i]);
  data.setFormattedValue(1, i, numberFormat(newData2[i]));
}

If I’m not mistaken, that’s all I did. If you are interested, my HTML file is here and my CSS file is here. Again, Michele Minkoff deserves all the credit in the world for putting together her great walkthrough. So get over there and check it out!

Written by csessig

November 28, 2011 at 11:29 pm

ProPublica to the rescue

leave a comment »

ProPublica, known for producing excellent, investigative journalism, also has a wonderful staff of developers that have put out several tools to help fellow journalists like myself. Here’s a quick run through two of their tools I’ve used in the last month.

TimelineSetter – I'm a big fan of timelines. They can help newspapers show the passage of time, obviously, as well as keep their stories on a particular subject in one central location. This is exactly how we used ProPublica's TimelineSetter tool when 'Extreme Makeover: Home Edition' announced they were going to build a new home for a local family. From the print side, we ran several stories, including about one a day during the week of the build. The photo department also put out four photo galleries on the build and a fundraiser. Finally, our videographer shot several videos. Our audience ate up our coverage, racking up more than 100,000 page views on the photo galleries alone. But unless you wanted to attach every story, gallery and video to any new story we did (which would be both cumbersome and unattractive), it would have been hard to get a full scope of our coverage. That's where the ProPublica tool came into play. Simply put, it helped compile all of our coverage of the event on one page.

I’m not going to go into detail on how I put together the timeline. Instead, I will revert you to their fantastic and easy to use documentation. Best of all, the timeline is easy to customize and upload to your site. It’s also free, unlike the popular timeline-maker Dipity. Check it out!

TableSetter – This tool is equally impressive and fairly easy to use. The final output is basically an easy-to-navigate-and-sort spreadsheet. And, again, the documentation is comprehensive. Run through it and you'll have a sorted table in no time! I've put together two already in the last week or so.

The first is a list of farmers markets in Iowa, with links to their home page (if available) and a Google map link, which was formatted using a formula in Microsoft Excel. The formula for the first row looked like this: =CONCATENATE("http://www.google.com/maps?q=", " ", B2, " ", C2, " ", E2, " ", "Iowa")

The first part is the Google Maps link, obviously. B2 represented the cell with the street address; C2 = city; E2 = ZIP code; and finally "Iowa," so Google Maps knows where to look. In between each field I put a space so Google can read the text and try to map it out using Google Maps (I should note that not every location was able to be mapped). Then I just copied and pasted this for every row in the table. At this point, I had a standard XLS Excel file, which I saved as a CSV file. TableSetter uses that CSV file and formats it using a YML file to produce the final output. Go and read the docs…It's easier than having me try to explain it all. Here's what my CSV looked like; here's my YML file; and finally the table, which was posted on our site.
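For anyone more comfortable in Python than Excel, the same link-building can be scripted; this sketch mirrors the CONCATENATE formula above (the sample address is made up):

```python
def maps_link(address, city, zip_code, state="Iowa"):
    # Join the pieces with spaces, as in the Excel formula, so Google Maps
    # can try to geocode the whole string.
    return "http://www.google.com/maps?q=" + " ".join([address, city, zip_code, state])

print(maps_link("100 E 4th St", "Waterloo", "50703"))
```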

In the same vein, I put together this table on what each state department is requesting from the government in the upcoming fiscal year.

I should also note here that the Data Desk at the LA Times has a variation of ProPublica's TableSetter that offers more features (like embedding photos in the table, for instance). It's called Table Stacker and works in a very similar fashion to TableSetter. I recommend checking it out after you get a feel for ProPublica's TableSetter.

Learning the Command Line: Both of these tools require the use of the command line via the Terminal program installed on your computer. If you are deathly afraid of that mysterious black box like I was, I highly recommend watching PeepCode's video introduction to the command line called "Meet the Command Line." And when you're done, go ahead and jump into the second part, called "Advanced Command Line." Yes, they both cost money ($12 apiece), but there is so much information packed into each hour-long screencast that they are both completely worth it. I was almost instantly comfortable with the command line after watching both screencasts.

Written by csessig

October 21, 2011 at 3:23 pm