- The Table of Contents
- Part 1: Context
- II. So... Why Not D3?
- III. What We'll Build
- IV. The 5 "Steps" of Data Visualization
- Part 2: Build
- V. Setup
- VI. Step 1: Fetch
- VII. Step 2: Transform
- VIII. Step 3: Display
- IX. Step 4: Interaction
- X. Step 5: Publish
- Part 3: Closing
- XI. Additional Resources
1. Interactive & Dynamic
Most data visualizations you'll see are static and non-interactive. However, new possibilities open up when you start to explore animation and interactivity. Your data's story can be far more impactful if these extra dimensions are used well. Hans Rosling's famous TED Talk would not have been nearly as compelling if he had just shown static images.
2. Browsers & Accessibility
A visualization that's never seen is useless.
3. Modules & npm
The less code you have to write, the better. If someone has already written code that does what you need, use it!
Right now, as I write this, there are 48,566 modules on npm, and there will be more by the time you read this. Using a tool like Browserify, you can use these modules very easily in the browser. Chances are very good that something you'd want for your visualization is available for use right now, just one npm install away.
And yes, D3 is on npm too ;)
So... Why not D3?
D3 has a whole lot of functionality to help you create some pretty amazing visualizations, but the techniques I'll cover should translate to creating any browser-based visualization, regardless of tool.
While D3 is an amazing tool, it's important to remember it's just one tool in our toolbox. It can be helpful to think about what we want to achieve at more of a conceptual level first before thinking about how to use a specific tool.
Lastly, I will most likely cover D3 in a future talk/article. So stay tuned for that =)
What We'll Build
We're going to build a very simple interactive visualization, a horizontal bar chart that:
1. visualizes data from Hollywood movies from 2011
2. fetches movie data from a Google Drive spreadsheet
3. shows additional information in a side overlay when hovering over a bar
Here's the live demo. Hover over a bar to see more detailed info:
The 5 "Steps" of Data Visualization
When creating data visualizations, it's helpful to think of the process as containing 5 "steps." I put "steps" in quotes because it's much more of an iterative process than it is a sequence.
1. Fetch: Get the source data into the browser.
2. Transform: Ensure the correct format and add any additional data you want.
3. Display: Add visual elements to the page.
4. Interaction: Add interaction events.
5. Publish: Get it out there so other people can see it!
Let's apply the five steps to creating our visualization above:
1. Fetch: Make an AJAX request to the Google Drive spreadsheet to get its csv and parse it.
2. Transform: Change the format to something easier to work with and add an additional metric, "quality per dollar."
3. Display: Append html divs to the page, each representing a movie in the dataset, sized relative to its "quality per dollar."
4. Interaction: Listen to mouse hover events and populate the sidebar with more of the movie's data.
5. Publish: Pipe the Browserify bundle to Hedwig to get a url we can share.
Let's get started!
Setup
First, make sure you have Node.js installed. This project was made using version 0.10; anything higher will probably work.
Next, create a new directory for this project and initialize it with npm (you can just accept all the defaults). Run this in your terminal:
mkdir hollyviz
cd hollyviz
npm init
Next, we'll want to install Browserify, to package our modules up into a "bundle", and Beefy, an awesome local development server that will automatically bundle up our visualization (with Browserify) and serve it up with a barebones index.html for us.
npm install -g browserify beefy
To verify everything is working, create a new file in that directory called index.js with the contents alert('Beefy is working!'). Then, run Beefy like so:
Beefy should now be running on port 9966 (and it should tell you as much). Open your browser to http://localhost:9966. If you see that alert message, we're ready to go!
Step 1: Fetch
Since we're storing our data in Google Drive, the first thing we have to do is use the "Publish to the web..." feature so that our visualization can access the data.
Once we have a link to the data as CSV, we can create a module to fetch and parse it.
Let's modify our index.js file. Empty it out and store that url (I've also made a note of the 5 steps):
As mentioned before, we're going to use two modules from npm to do the fetching and parsing: superagent and csv. Let's install them now. From the command line, tell npm to install those packages and save them as dependencies of the project:
npm install --save superagent csv
Now that we have both of those installed, we're ready to make a "fetch" module that uses them. Create a new file called fetch.js. This module will have a simple API: it will take two arguments, a csv's url and a callback to return the parsed data to. Here's how fetch.js should look:
Now we can use it in index.js. I've also created a simple "show" helper that will put the data in a textarea for us to see:
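A helper along these lines might look like the sketch below. The textarea size and the JSON formatting are my own choices, and the wiring shown in the trailing comment assumes that fetch.js exports a function taking (url, callback) with a node-style (err, data) callback.

```javascript
// "show" helper (sketch): dump any data structure into a textarea on the page.
function show(data) {
  var textarea = document.createElement('textarea');
  textarea.rows = 20; // arbitrary size, just big enough to read comfortably
  textarea.cols = 80;
  textarea.value = JSON.stringify(data, null, 2); // pretty-print with 2-space indent
  document.body.appendChild(textarea);
  return textarea;
}

// Possible wiring in index.js (assumes fetch.js exports a (url, callback) function):
//   var fetch = require('./fetch');
//   fetch(url, function (err, data) {
//     if (err) throw err;
//     show(data);
//   });
```

Returning the textarea is not strictly needed, but it makes the helper easier to poke at from the browser console.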
This is what you should see if you have Beefy running and you visit http://localhost:9966:
I don't love that data format. Also, I'm interested in a metric that isn't in the data. I want to know about "quality per dollar" -- how many Rotten Tomatoes points a movie received per dollar spent. Let's fix this in step 2.
Step 2: Transform
Our transform step will be very simple. We'll first change the data format from an array of arrays to an array of objects, which will make working with it in the next steps easier. Once that's done, we'll use two of the existing metrics, Rotten Tomatoes Score and Budget, to calculate a new one: "Quality per Dollar".
Create a new file called transform.js. Here's how it should look:
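Here is one way the two transformations described above could be written, as a sketch rather than the original listing. I'm assuming the first row of the fetched data holds the column headers, and that the score and budget column names are passed in as arguments (scoreKey and budgetKey are hypothetical parameters, since the actual header names in the spreadsheet aren't shown here). Rows with a missing or zero budget get a qualityPerDollar of 0, which is also my choice.

```javascript
// transform.js (sketch): array of arrays -> array of objects, plus a derived metric.

// Turn [[header, ...], [value, ...], ...] into [{header: value, ...}, ...].
function toObjects(rows) {
  var headers = rows[0];
  return rows.slice(1).map(function (row) {
    var obj = {};
    headers.forEach(function (header, i) {
      obj[header] = row[i];
    });
    return obj;
  });
}

// Add qualityPerDollar = score / budget to every movie.
// scoreKey and budgetKey name the columns to read (hypothetical parameters).
function transform(rows, scoreKey, budgetKey) {
  return toObjects(rows).map(function (movie) {
    var score = parseFloat(movie[scoreKey]);
    var budget = parseFloat(movie[budgetKey]);
    // Guard against missing values and division by zero.
    movie.qualityPerDollar = isFinite(score) && budget > 0 ? score / budget : 0;
    return movie;
  });
}

module.exports = transform;
```

Because CSV values arrive as strings, the parseFloat calls matter: dividing the raw strings would work by accident for clean numbers, but it would silently produce NaN for empty cells.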
Now that our transform module is ready, let's put it to use.
Here's how it will look in Beefy. Notice that the format has changed, and we now have a new property: "qualityPerDollar":
Now that the data is taken care of, let's move on to the visualization!
Step 3: Display
For our display step, for each movie we'll add a div to the page, relatively sized according to its "qualityPerDollar" value.
To do this, we'll create a new module for the display step.
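A display module of this kind could be sketched as follows. I'm assuming it lives in its own file (display.js is my guess at the name), that bar widths are percentages of the largest value, and that a few inline styles are enough to make the bars visible. The toPercentWidths helper, the bar class name, and the colors are all illustrative.

```javascript
// display.js (sketch): one div per movie, sized relative to the chosen property.

// Scale each value to a percentage (0-100) of the largest value.
function toPercentWidths(values) {
  var max = Math.max.apply(null, values.concat(0));
  return values.map(function (value) {
    return max > 0 ? (value / max) * 100 : 0;
  });
}

// el: container element, movies: array of objects, prop: property to visualize.
function display(el, movies, prop) {
  var widths = toPercentWidths(movies.map(function (movie) {
    return movie[prop];
  }));

  movies.forEach(function (movie, i) {
    var bar = document.createElement('div');
    bar.className = 'bar';
    bar.style.width = widths[i] + '%';
    bar.style.height = '10px';
    bar.style.marginBottom = '2px';
    bar.style.background = 'steelblue';
    el.appendChild(bar);
  });

  return el;
}

module.exports = display;
```

Keeping the scaling in its own small function makes it easy to swap in a different scale later, for example a logarithmic one, without touching the DOM code.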
Now, let's make a couple of changes to index.js. First, we'll get rid of that "show" helper. Since we'll be visualizing the data now, we won't need it. Second, we'll require our display module so we can use it. Then, we'll create an html element for our module to use. Finally, we'll pass our display module three things: the newly created html element to hold the movie "bars", the array of movie data, and the property we want to visualize, "qualityPerDollar":
Here's how it should look now:
Alright, we're getting somewhere! Let's add some interaction so we can see what those movies are!
Step 4: Interaction
Much like before, we'll create a new file called interaction.js for our interaction module. Like display, it will take three arguments, but slightly different ones this time:
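Since the argument list isn't spelled out here, the sketch below guesses at it: the element holding the bars, the element that will show the details, and the array of movie data. It also assumes the bars were appended in the same order as the movies, so that the bar at index i belongs to the movie at index i. The describe helper and the plain-text "key: value" layout are illustrative choices of mine.

```javascript
// interaction.js (sketch): show a movie's details when its bar is hovered.

// Build a plain-text summary of one movie, one "key: value" pair per line.
function describe(movie) {
  return Object.keys(movie).map(function (key) {
    return key + ': ' + movie[key];
  }).join('\n');
}

// barsEl: element whose children are the bars (assumed to match movies by index).
// infoEl: element that receives the details. movies: array of movie objects.
function interaction(barsEl, infoEl, movies) {
  infoEl.style.whiteSpace = 'pre-line'; // keep the line breaks visible

  Array.prototype.forEach.call(barsEl.children, function (bar, i) {
    bar.addEventListener('mouseover', function () {
      infoEl.textContent = describe(movies[i]);
    });
  });
}

module.exports = interaction;
```

In index.js this could be called right after the display step, with a second newly created element passed in as infoEl. Using textContent rather than innerHTML means movie titles containing characters like < or & are shown as-is instead of being interpreted as markup.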
Now let's use it in index.js. This will be very similar to Step 3, since we'll need to create a new element to store the detailed info:
And there we have it! This is what you should now see in Beefy:
Step 5: Publish
Alright, so now that we have it all working, how do we publish this so others can see our work?
For this, we'll use Hedwig, a hosting tool for Browserify bundles. So, like Browserify and Beefy, install Hedwig globally with npm:
npm install -g hedwig
And now, all we'll have to do is pipe the output of Browserify to Hedwig, and Hedwig will take care of the rest:
browserify index.js | hedwig
Hedwig will give you a url so you can easily share your visualization. For example, you can see this one here: http://hedwig.in/g/7683051