Data Visualization with JavaScript without D3
A companion article to my "Data Visualization with JavaScript without D3" talk at DataVis LA.
UPDATE: Sadly, Google changed something with how their spreadsheets are accessed, and now fetching the data causes an error/alert. So much for thinking that Google Spreadsheets would be the easiest way to access this data... If you're interested, here are code changes that can get it to work again.
tl;dr: We'll take a dataset of Hollywood movies, and step by step, we'll build up a simple visualization using only JavaScript in the browser.
- The Table of Contents
- Part 1: Context
- I. So... Why JavaScript?
- II. So... Why Not D3?
- III. What We'll Build
- IV. The 5 "Steps" of Data Visualization
- Part 2: Build
- V. Setup
- VI. Step 1: Fetch
- VII. Step 2: Transform
- VIII. Step 3: Display
- IX. Step 4: Interaction
- X. Step 5: Publish
- Part 3: Closing
- XI. Additional Resources
So... Why JavaScript?
JavaScript is a really powerful tool for creating data visualizations. Here are three of the strongest arguments for it:
1. Interactive & Dynamic
Most data visualizations that you'll see are static and non-interactive. However, new possibilities open up when you start to explore animation and interactivity. Your data's story can be way more impactful if these extra dimensions are used well. Hans Rosling's famous TED Talk would not have been nearly as compelling if he had just shown static images.
2. Browsers & Accessibility
Clearly, as proved above, you don't need JavaScript to make a dynamic or interactive visualization (Rosling's tool "Gapminder" is not JavaScript based).
However, JavaScript is the only language that runs natively in the browser. For mobile and tablets, this is huge. Once your visualization is on the web, just share a link, and that's it. With nothing to download and nothing to install, JavaScript is unrivaled when it comes to accessibility.
A visualization that's never seen is useless.
3. Modules & npm
The less code you have to write, the better. If someone has already written code that does what you need, use it!
Right now, as I write this, there are 48,566 modules on npm, and there will be more by the time you read this. Using a tool like Browserify you can use these modules very easily in the browser. Chances are very good that something you'd want for your visualization is available for use right now, just one npm install away.
We're only going to use two modules from npm in this example: superagent, for making AJAX requests, and csv, for parsing CSV data.
And yes, D3 is on npm too ;)
So... Why Not D3?
Let's get this out of the way: D3 is awesome. There's a very good reason it's the de-facto standard library for doing complex data visualization with JavaScript. We use it pretty extensively at Interlincx. I know there's a lot of demand for D3 resources, so here's my reasoning for not covering it here:
I want to concentrate on the fundamentals of creating a browser-based visualization without getting lost in any library-specific way of doing things. Even if you're very experienced with both JavaScript and creating data visualizations, D3 has a bit of a learning curve.
D3 has a whole lot of functionality to help you create some pretty amazing visualizations, but the techniques I'll cover should translate to creating any browser-based visualization, regardless of tool.
While D3 is an amazing tool, it's important to remember it's just one tool in our toolbox. It can be helpful to think about what we want to achieve at more of a conceptual level first before thinking about how to use a specific tool.
Lastly, I will most likely cover D3 in a future talk/article. So stay tuned for that =)
What We'll Build
We're going to build a very simple interactive visualization, a horizontal bar chart that:
1. visualizes data from Hollywood movies from 2011
2. fetches movie data from a Google Drive spreadsheet
3. shows additional information in a side overlay when hovering over a bar
Here's the live demo. Hover over a bar to see more detailed info:
The 5 "Steps" of Data Visualization
When creating data visualizations, it's helpful to think of the process as containing 5 "steps." I put "steps" in quotes because it's much more of an iterative process than it is a sequence.
1. Fetch: Get the source data into the browser.
2. Transform: Ensure the correct format, add any additional wanted data.
3. Display: Add visual elements to the page.
4. Interaction: Add interaction events.
5. Publish: Get it out there so other people can see it!
Let's apply the five steps to creating our visualization above:
1. Fetch: Make an AJAX request to the Google Drive spreadsheet to get its csv and parse it.
2. Transform: Change the format to something easier to work with and add an additional metric "quality per dollar."
3. Display: Append html divs to the page, each representing a movie in the dataset, sized relative to its "quality per dollar."
4. Interaction: Listen to mouse hover events and populate the sidebar with more of the movie's data.
5. Publish: Bundle up our JavaScript and host it on a server.
Let's get started!
Setup
First, make sure you have Node.js installed. This project was made using version 0.10; anything higher will probably work.
Next, create a new directory for this project and initialize it with npm (you can just accept all the defaults). Run this in your terminal:
mkdir hollyviz
cd hollyviz
npm init
Next we'll want to install Browserify, to package our modules up into a "bundle", and Beefy, an awesome local development server that will automatically bundle up our visualization (with Browserify) and serve it up with a barebones index.html for us.
npm install -g browserify beefy
To verify everything is working, create a new file index.js in that directory with the contents alert('Beefy is working!'). Then, run Beefy like so:
beefy index.js
Beefy should now be running on port 9966 (and it should tell you as much). Open your browser to http://localhost:9966. If you see that alert message, we're ready to go!
Step 1: Fetch
Since we're storing our data in Google Drive, the first thing we have to do is use the "Publish to the web..." feature so that our visualization can access the data.
Once we have a link to the data as CSV, we can create a module to fetch and parse it.
Let's modify our index.js file. Empty it out and let's store that url (I've also made a note of the 5 steps):
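A minimal sketch of index.js at this point might look like the following. The URL here is only a placeholder, not the real spreadsheet link; swap in the CSV link that "Publish to the web..." gives you.

```javascript
// index.js (sketch)
// Placeholder only: replace with the CSV link from "Publish to the web...".
var url = 'https://example.com/your-published-spreadsheet.csv'

// 1. Fetch
// 2. Transform
// 3. Display
// 4. Interaction
// 5. Publish
```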
As mentioned before, we're going to use two modules from npm to do the fetching and parsing: superagent and csv. Let's install them now. From your command line, we'll tell npm to install those packages and store them as dependencies in the package.json file.
npm install --save superagent csv
Now that we have both of those installed, we're ready to make a "fetch" module that uses them. Create a new file called fetch.js. This module will have a simple API: it will take two arguments, a csv's url and a callback to return the parsed data to. Here's how fetch.js should look:
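As a rough sketch, fetch.js could look something like this. A few assumptions are baked in: the callback is Node-style (error first, data second), superagent exposes the raw response body as res.text, and the installed version of csv provides csv.parse(text, callback) (older releases used a different, chained API). The createFetch wrapper is a hypothetical convenience: it takes the two libraries as arguments so the function can be tried out with stand-ins.

```javascript
// fetch.js (sketch)
// createFetch is a hypothetical wrapper: it receives superagent and csv as
// arguments and returns the two-argument fetch function described above.
function createFetch(request, csv) {
  return function fetchCsv(url, callback) {
    request.get(url).end(function (err, res) {
      if (err) return callback(err)
      // res.text holds the raw response body, here the CSV as one string.
      csv.parse(res.text, function (parseErr, rows) {
        if (parseErr) return callback(parseErr)
        callback(null, rows) // rows is an array of arrays
      })
    })
  }
}
```

In the project itself, fetch.js would probably end with something like module.exports = createFetch(require('superagent'), require('csv')), so that index.js can call it as fetch(url, callback).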
Now we can use it in index.js. I've also created a simple "show" helper that will put the data in a textarea for us to see:
This is what you should see if you have Beefy running and you visit http://localhost:9966:
[ Issue accessing Google Spreadsheet data ]
I don't love that data format. Also, I'm interested in a metric that isn't in the data. I want to know about "quality per dollar" -- how many Rotten Tomatoes points a movie received per dollar spent. Let's fix this in step 2.
Step 2: Transform
Our transform step will be very simple. We'll first change the data format from an array of arrays to an array of objects, which will make working with it in the next steps easier. Then we'll use two of the existing metrics, Rotten Tomatoes Score and Budget, to calculate a new one: "Quality per Dollar".
Create a new file transform.js. Here's how it should look:
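Here is one way the transform could be sketched. It assumes the first row of the parsed csv holds the column headers, and the column names 'Rotten Tomatoes' and 'Budget' are guesses; match them to whatever your spreadsheet's header row actually says.

```javascript
// transform.js (sketch)
// The default column names are assumptions; pass your own if they differ.
function transform(rows, scoreColumn, budgetColumn) {
  scoreColumn = scoreColumn || 'Rotten Tomatoes'
  budgetColumn = budgetColumn || 'Budget'

  var headers = rows[0]

  return rows.slice(1).map(function (row) {
    // Array of arrays -> array of objects, keyed by header name.
    var movie = {}
    headers.forEach(function (header, i) {
      movie[header] = row[i]
    })

    // CSV values arrive as strings, so convert before dividing.
    var score = parseFloat(movie[scoreColumn])
    var budget = parseFloat(movie[budgetColumn])

    // Fall back to 0 when the score or budget is missing or not usable.
    movie.qualityPerDollar = isFinite(score) && budget > 0 ? score / budget : 0
    return movie
  })
}
```

Falling back to 0 for rows with a missing budget is a design choice made for this sketch; it keeps those movies in the list with an empty bar rather than dropping them.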
Now that our transform module is ready, let's use it in index.js:
Here's how it will look in Beefy. Notice that the format has changed, and we now have a new property: "qualityPerDollar":
[ Issue accessing Google Spreadsheet data ]
Now that the data is taken care of, let's move on to the visualization!
Step 3: Display
For our display step, for each movie we'll add a div to the page, relatively sized according to its "qualityPerDollar" value.
To do this, we'll create a new module in display.js:
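A minimal sketch of such a display module follows. It takes the container element, the array of movies, and the name of the property to visualize. The 'bar' class name, the inline styles, and the data-index attribute are all choices made for this sketch, not requirements.

```javascript
// display.js (sketch)
function display(container, movies, property) {
  // The largest value becomes a full-width bar; the rest scale against it.
  var max = movies.reduce(function (largest, movie) {
    return Math.max(largest, movie[property])
  }, 0)

  movies.forEach(function (movie, index) {
    var bar = document.createElement('div')
    bar.className = 'bar'
    bar.style.height = '10px'
    bar.style.marginBottom = '2px'
    bar.style.background = 'steelblue'

    // Width as a percentage of the container, relative to the largest value.
    var percent = max > 0 ? (movie[property] / max) * 100 : 0
    bar.style.width = percent + '%'

    // Remember which movie this bar stands for, for later lookups.
    bar.setAttribute('data-index', index)
    container.appendChild(bar)
  })

  return container
}
```

Sizing bars as a percentage of the largest value means the chart fills whatever width the container has, so no pixel math is needed.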
Now, let's make a couple changes to index.js. First, we'll get rid of that "show" helper. Since we'll be visualizing the data now, we won't need it. Second, we'll require our display module so we can use it. Then, we'll create an html element for our module to use. Finally, we'll pass our display module three things: the newly created html element to hold the movie "bars", the array of movie data, and the property we want to visualize, "qualityPerDollar":
Here's how it should look now:
[ Issue accessing Google Spreadsheet data ]
Alright, we're getting somewhere! Let's add some interaction so we can see what those movies are!
Step 4: Interaction
Much like before, we'll create a new file interaction.js for our interaction module. Like display, it will take three arguments, but a little different this time:
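One possible shape for the interaction module is sketched here. The three arguments are an assumption: the element holding the bars, the element that will show the details, and the array of movies. It also assumes each bar carries a data-index attribute that points at its movie in the array.

```javascript
// interaction.js (sketch)
// Assumed arguments: the element holding the bars, the element that shows
// the details, and the array of movie objects.
function interaction(barsElement, infoElement, movies) {
  // Let the "\n" separators render as line breaks.
  infoElement.style.whiteSpace = 'pre-line'

  // One listener on the container handles every bar (event delegation).
  barsElement.addEventListener('mouseover', function (event) {
    var index = event.target.getAttribute('data-index')
    if (index === null) return // hovered the container itself, not a bar

    // List every property of the hovered movie, one per line.
    var movie = movies[index]
    infoElement.textContent = Object.keys(movie).map(function (key) {
      return key + ': ' + movie[key]
    }).join('\n')
  })
}
```

Listening once on the container instead of on every bar keeps the setup small and keeps working if bars are added later.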
Now let's use it in index.js. This will be very similar to Step 3 since we'll need to create a new element to store the detailed info:
And there we have it! This is what you should now see in Beefy:
[ Issue accessing Google Spreadsheet data ]
Step 5: Publish
Alright, so now that we have it all working, how do we publish this so others can see our work?
For this, we'll use Hedwig, a hosting tool for Browserify bundles. So, like Browserify and Beefy, install Hedwig globally with npm:
npm install -g hedwig
And now, all we'll have to do is pipe the output of Browserify to Hedwig, and Hedwig will take care of the rest:
browserify index.js | hedwig
Hedwig will give you a url so you can easily share your visualization. For example, you can see this one here: http://hedwig.in/g/aaf8a57de9fdef33f8ac