GDP and life expectancy visualization

Countries compared by GDP and life expectancy in 1827. Each country is drawn as a circle sized by its population and colored by its continent.

To solidify what I’ve learned so far in a Udemy course, I’ve decided to write about a visualization of changes in GDP and life expectancy. And since I am a bit of a rebel, I made a couple of changes that weren’t in the original assignment but sounded interesting.

Loading and cleaning data

Let’s start with getting the data. D3.js has its own functions for fetching data from a remote location in several formats, including JSON, CSV and TSV, but if you want to, you can fetch the data yourself using the Fetch API or AJAX. Version 5 of D3.js returns a promise with the data, so any work with the data has to happen once that promise resolves.

In our case we will be dealing with countries, and countries have one problem: they have their own lives and life spans. A personal example: I was born almost 29 years ago in a country which doesn’t exist anymore (Czechoslovakian Socialist Republic). When I was three, I visited with my parent a part of the succeeding country (Czechoslovakian Federative Republic) which is now part of a different country than the one I live in (Czech Republic).

So as a first step we must filter out countries with no data at all. Then we clean the data that we got from the JSON, because numbers in this dataset are saved as strings and not as numbers, as I would have expected.
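The two cleaning steps can be sketched in plain JavaScript. This is only an illustration: the field names (year, countries, income, life_exp, population) and the sample records are assumptions, because the article does not show the shape of the dataset.

```javascript
// Hypothetical field names; adjust them to the real dataset.
const NUMERIC_FIELDS = ["income", "life_exp", "population"];

// A country is usable only if every numeric field holds a real number.
function hasAllNumbers(country) {
  return NUMERIC_FIELDS.every((field) => {
    const raw = country[field];
    // null, undefined and blank strings would otherwise become 0 via Number().
    if (raw === null || raw === undefined || String(raw).trim() === "") {
      return false;
    }
    return Number.isFinite(Number(raw));
  });
}

// Drop countries without data, then convert the string values to numbers.
function cleanYears(rawYears) {
  return rawYears.map((entry) => ({
    year: Number(entry.year),
    countries: entry.countries.filter(hasAllNumbers).map((country) => ({
      ...country,
      income: Number(country.income),
      life_exp: Number(country.life_exp),
      population: Number(country.population),
    })),
  }));
}

// Made-up sample: country "B" has no income or life expectancy.
const sampleRaw = [
  {
    year: "1800",
    countries: [
      { country: "A", continent: "europe", income: "1000", life_exp: "35.5", population: "2000000" },
      { country: "B", continent: "asia", income: null, life_exp: null, population: "500000" },
    ],
  },
];
const cleanedSample = cleanYears(sampleRaw);
```

With D3 version 5 a function like this would be called inside the resolved promise, before any scale or axis touches the data.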

Building the place for data

When we have the data ready, we can start with the most important thing: labels and descriptions of what the data represents. I didn’t understand why it was important when I was at high school and neither did my classmates, because every week in labs, physics or math class at least one person got scolded for improperly labeling a graph. These days I consider labels [1] as important as the data, because without them the data is just a set of random numbers.

Size setup

At the beginning of the file with my D3 code I like to define dimensions and margins for the graph and scales for the data.

I define the scales outside of the scope of the resolved promise because I’ll reuse them in a function defined in the same scope as those variables.
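A minimal sketch of such a setup, with made-up numbers (the article does not list the actual dimensions or margins):

```javascript
// Illustrative values only; these are not the article's real dimensions.
const chartMargin = { top: 50, right: 20, bottom: 100, left: 80 };
const outerSize = { width: 800, height: 500 };

// The inner size is what remains for the data once the margins
// (room for axes, ticks and labels) are subtracted.
const chartWidth = outerSize.width - chartMargin.left - chartMargin.right;
const chartHeight = outerSize.height - chartMargin.top - chartMargin.bottom;
```

Keeping these values at the top of the file means that the scales, the axes and the update function can all read the same numbers.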

Since we have the data and we know how big our chart should be, we need to get an HTML element which will hold our graph and append an svg element to it. If you’ve ever used jQuery, you’re all set because the syntax is very similar. Once we have defined our svg element, we can add attributes to it.

As in HTML, in SVG we could append all child nodes to one main element, but that would make any changes quite hard to manage. So we will create a group element in which we will display the data.
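The svg element and its group could be created roughly like this. The function name createChartRoot and the "#chart-area" selector in the usage note are hypothetical, and d3 is passed in as a parameter only so that the sketch does not depend on how the library is loaded.

```javascript
// Appends an <svg> to the element matched by `selector` and returns a <g>
// shifted by the left and top margins, which will hold the data.
function createChartRoot(d3, selector, size, margin) {
  const svg = d3
    .select(selector)
    .append("svg")
    .attr("width", size.width)
    .attr("height", size.height);

  return svg
    .append("g")
    .attr("transform", `translate(${margin.left}, ${margin.top})`);
}
```

In a page where D3 is loaded globally, this might be called as createChartRoot(d3, "#chart-area", { width: 800, height: 500 }, { left: 80, top: 50 }).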

Setting up scales

Now that we have a drawing area reserved for the data, we can do something about the scales that we defined earlier. We start with the x-axis, which describes GDP per capita. Since we can guess that the GDP today and the GDP 10, 20 or 50 years ago are significantly different, we probably shouldn’t use a linear scale. A logarithmic scale fits our problem a bit better.

The first function after scaleLog(), base(10), defines the number used as the base of the logarithm. range() defines the target dimension for the scale and domain() defines the range of the data. I must say that I struggled with defining meaningful bounds for domain() and I am still not fully comfortable with it. The lower boundary has to be greater than zero because log(0) is not defined for any base. I experimented with 1, 10 and 100; the teacher of the course used 142, which I adopted because it looks nice.

The upper boundary is a bit simpler: we have to find the largest income in our data. To do that we can use D3’s max() function. The min() function works the same way.
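To see what a logarithmic scale does with these boundaries, here is a plain JavaScript sketch of the mapping. It is not D3’s implementation, and the incomes and the 700px width are made up.

```javascript
// Maps a value from [domainMin, domainMax] to [rangeMin, rangeMax] on a
// logarithmic scale. The base cancels out in the ratio below, so it does not
// change where a value lands on the axis.
function makeLogScale(domainMin, domainMax, rangeMin, rangeMax) {
  const logMin = Math.log(domainMin);
  const logSpan = Math.log(domainMax) - logMin;
  return (value) =>
    rangeMin + ((Math.log(value) - logMin) / logSpan) * (rangeMax - rangeMin);
}

// Made-up incomes; in the article the maximum comes from D3's max().
const sampleIncomes = [600, 2500, 41000, 150000];
const largestIncome = Math.max(...sampleIncomes);

// 142 is the lower boundary adopted in the article.
const xPosition = makeLogScale(142, largestIncome, 0, 700);
```

Every tenfold increase in income moves a country the same distance to the right, which is why poor and rich countries can share one axis without the poor ones collapsing into a single point.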

On the y-axis, which represents life expectancy, we can use a linear scale because we don’t expect humans to live 1000 years (Methuselah was only 969 years old when he died), but we still need to find the maximum life expectancy for the upper boundary. The lower boundary is a bit easier because we can expect that it’ll be at least 0.
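A linear scale is simpler. The sketch below is again plain JavaScript rather than D3 code, with made-up values. It assumes that the range is flipped, from the chart height down to 0, because in SVG the y coordinate grows downward; the article does not say which range was actually used.

```javascript
// Maps a value from [domainMin, domainMax] to [rangeMin, rangeMax] linearly.
function makeLinearScale(domainMin, domainMax, rangeMin, rangeMax) {
  const domainSpan = domainMax - domainMin;
  return (value) =>
    rangeMin + ((value - domainMin) / domainSpan) * (rangeMax - rangeMin);
}

// Made-up life expectancies; the real maximum comes from the dataset.
const sampleLifeExpectancies = [31.2, 48.9, 67.4, 90];
const longestLife = Math.max(...sampleLifeExpectancies);

// Assumed range [350, 0]: a life expectancy of 0 sits at the bottom of a
// 350px tall chart and the maximum sits at the top.
const yPosition = makeLinearScale(0, longestLife, 350, 0);
```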

Since we want to display population sizes as circles (it makes more sense than trying to visualize them as squares), we need to think more mathematically. The area of a circle is A = 𝜋⋅r², so the radius is r = √(A/𝜋), and our ranges have to be modified accordingly.

Those values were chosen by observation and kept simple. Some countries which aren’t as populous are omitted from the graph, which makes it easier to display more relevant data. What counts as relevant is a topic for another discussion.

Axes

Next on our menu are axes. First we need to define them, so let’s start with the x-axis:

Here we are saying that we want the axis at the bottom with the scale xScale, that tick values should be between 1 and 100,000, and that tick labels should have a ‘$’ sign in front of them. In reality the graph won’t display a tick at 1, but it will display the tick at 100 despite the lower boundary of xScale being set to 142.

Graph showing ticks at 100, 1000, 10000 and 100000

Now that we have defined the axis, we need to add it to the SVG element. To do that we will create a new group element, translate it to the desired place, and apply the x-axis. The translation is needed because D3 can work out where the text for the ticks goes, but it can’t move the axis to the correct location.

But before we are finished with the x-axis we should make the text more readable. We do that by rotating the text by 45 degrees (𝜋/4, if the angle were defined in radians).

And we will do the y-axis similarly:

The anonymous function in tickFormat() tells D3 to use every number it finds, but as we could see in the picture above, D3 is somewhat intelligent and reduces the ticks to the most significant ones.

Axes’ labels

At the moment we are missing the labels for the axes and the legend for the data. Let’s continue with the axes’ labels because we want to know what is what.

We want to center the label on the chart. To do that we’ll take half of our inner dimension and set the value of text-anchor to middle.

The label for the y-axis is similar, but you’ll need to know about SVG’s coordinate system and what happens when you rotate elements. There’s little space to expand on it here, and Sara Soueidan has written about it a lot. (A link to her post is in the resources at the bottom.)

Legend

First we need to decide how we’ll classify our countries. In this case, by continent. The definition below is a simplification, and if we had a different dataset, we’d have to choose differently. Preferably we would generate it from our dataset.
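Generating the list from the dataset could look like the sketch below. The function name listContinents and the field names (countries, continent) are assumptions about the data’s shape, and the sample records are made up.

```javascript
// Collects every distinct continent name that appears in any year.
function listContinents(years) {
  const seen = new Set();
  for (const entry of years) {
    for (const country of entry.countries) {
      if (country.continent) {
        seen.add(country.continent);
      }
    }
  }
  // Sorted so that the legend order stays stable between runs.
  return [...seen].sort();
}

// Made-up sample data.
const sampleYears = [
  { year: 1800, countries: [{ country: "A", continent: "europe" }, { country: "B", continent: "asia" }] },
  { year: 1801, countries: [{ country: "C", continent: "africa" }, { country: "A", continent: "europe" }] },
];
const continentNames = listContinents(sampleYears);
```

The resulting array could then serve both as the rows of the legend and as the domain of the color scale.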

As usual our legend will be in its own group element.

At this point we can’t really say where it’ll be most useful. I adjusted the values for translate() after watching the graph in action.

Next we’ll add more elements to the legend group. Each continent will be in its own group/row and will have a text description and a color indicator. Assigning colors to continents is done behind the scenes inside D3.js with the continentColor scale.

Legend. Europe purple box. Asia red box. America blue box. Africa yellow box.

Now we can move on to displaying the data.

Displaying data

We have the axes, labels and legend set up and now we need to display our data. Preferably the graph should animate as well, because our dataset has data for the past 214 years. We will call update() inside the initial promise and then set up an interval.

The update() function is not part of D3.js, so we need to define it ourselves. My preference at the moment is to define it outside of the promise.

A note on the interval: there is a rule of thumb that animations should take somewhere between 200 and 300ms. I’ve tested the animation with those numbers, and 100ms looks smoother than timing in the recommended range.

Also, choosing to display the data first and then set the interval is an engineering decision. Initializing the index variable to -1 would create a similar situation.

Update function

First we’ll need to give the data to D3.js. We do it by selecting all elements with the desired class or element name. Since we used rectangles in the legend, we can select all circles. Because this query goes through D3.js and not the normal querySelector() function, we should expect added properties on the returned object.

The data() function is one such property. It takes the data and creates virtual circles for it.

The next step is to get rid of the old data in the graph. This sounded counter-intuitive to me at first but it’s a logical step. It helps with transitioning from the old values to the new ones.

And the last step before we are finished is initializing the new data with the help of the merge() function. The transition there helps with moving from one state to another.
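The three steps (join the data, remove the old, merge the new) can be illustrated without D3. The sketch below is not how D3 implements its join; it only shows which records end up in which group when circles are matched by a key. Matching by country name is an assumption here: without a key function, D3 matches data to elements by index.

```javascript
// Splits the next frame's data into the three groups of a keyed data join.
//   enter:  records with no circle yet (circles get appended)
//   update: records whose circle already exists (circles get new values)
//   exit:   keys of circles with no record anymore (circles get removed)
function splitJoin(drawnKeys, nextData, keyOf) {
  const drawn = new Set(drawnKeys);
  const nextKeys = new Set(nextData.map(keyOf));
  return {
    enter: nextData.filter((d) => !drawn.has(keyOf(d))),
    update: nextData.filter((d) => drawn.has(keyOf(d))),
    exit: drawnKeys.filter((key) => !nextKeys.has(key)),
  };
}

// Made-up frames: "A" disappears, "B" stays, "C" is new.
const drawnCountries = ["A", "B"];
const nextFrame = [{ country: "B" }, { country: "C" }];
const joinGroups = splitJoin(drawnCountries, nextFrame, (d) => d.country);
```

In D3 terms, exit is what gets removed, and merge() combines the freshly entered circles with the updating ones so that both receive the same transition and attributes.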

The population there follows the formula A = 𝜋⋅r² to calculate the radius of the circles. The X and Y coordinates need to be scaled to fit the dimensions of the chart using xScale and yScale.
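As a sketch of that calculation: the population is first mapped to an area, and the radius is then derived from the area with r = √(A/𝜋). The domain and range below are illustrative assumptions, not the values used in the article.

```javascript
// Radius of a circle with the given area, from A = pi * r^2.
function radiusFromArea(area) {
  return Math.sqrt(area / Math.PI);
}

// Illustrative values: populations between 2,000 and 1.4 billion are mapped
// to areas whose radii run from 5px to sqrt(1500)px, roughly 38.7px.
const populationDomain = [2000, 1400000000];
const areaRange = [25 * Math.PI, 1500 * Math.PI];

// Linear mapping from population to circle area.
function areaForPopulation(population) {
  const share =
    (population - populationDomain[0]) /
    (populationDomain[1] - populationDomain[0]);
  return areaRange[0] + share * (areaRange[1] - areaRange[0]);
}

function radiusForPopulation(population) {
  return radiusFromArea(areaForPopulation(population));
}
```

Scaling the area rather than the radius keeps the circles honest: a country with twice the population gets roughly twice the ink, not four times.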

Control buttons

I wanted to make it a bit fancy, so I added to the HTML file my own control buttons and a year indicator. Please, disregard the abomination known as Bootstrap.

Step back, forward, play and pause buttons. Year indicator

In the code I decided to update the interval function first by adding a year to my year indicator, a span element with the id “year”.

Since I have four buttons there, I have four event listeners. The first two are simple steppers which step either forward or backward without clearing or stopping the interval.
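The stepping itself comes down to moving an index through the array of years. A minimal sketch, assuming that the index wraps around at both ends (the article does not say what happens at the first and last year):

```javascript
// Moves the index by delta (+1 forward, -1 backward) and wraps around,
// so stepping back from the first year lands on the last one.
function stepIndex(current, delta, length) {
  return (((current + delta) % length) + length) % length;
}
```

Each stepper’s listener could then do something like index = stepIndex(index, 1, data.length), followed by update(data[index]), where index and data are hypothetical names for the current position and the cleaned dataset.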

If I wanted to be a bit more fancy, I could clear the interval when either of those buttons is pushed, but I’d have to think a bit about whether it’s even useful.

The “Play animation” button starts a new interval. By default this button is disabled because my current interval handling is not perfect and remembers only the last started interval. If it were always enabled, every click on the button would make the animation run faster, and the animation would be unstoppable with the current implementation of the stop button.

Inside the event listener we toggle the disabled attributes on the Play and Stop buttons and set a new interval.

On the stop button we toggle the disabled attributes as well and clear the interval.
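One possible way around the stacking intervals, shown here only as a sketch and not as the code behind the demo, is to keep the interval handle in one place and refuse to start a second interval while one is running. The name createPlayer is made up, and the timers parameter exists only so that the timer functions can be swapped out.

```javascript
// Wraps setInterval/clearInterval so that at most one interval runs at a time.
function createPlayer(
  tick,
  delayMs,
  timers = {
    set: (fn, ms) => setInterval(fn, ms),
    clear: (handle) => clearInterval(handle),
  }
) {
  let handle = null;
  return {
    // Starts the animation; returns false if it is already running.
    start() {
      if (handle !== null) {
        return false;
      }
      handle = timers.set(tick, delayMs);
      return true;
    },
    // Stops the animation; returns false if nothing was running.
    stop() {
      if (handle === null) {
        return false;
      }
      timers.clear(handle);
      handle = null;
      return true;
    },
    isRunning() {
      return handle !== null;
    },
  };
}
```

With a player created as createPlayer(step, 100), where step is a hypothetical function that advances the year and calls update(), the Play listener would call start() and the Stop listener would call stop(). Extra clicks on Play would then do nothing instead of speeding the animation up.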

Conclusion

In this article I showed you how I created a “simple” data visualization using D3.js. You can take a look at the finished thing here. (The code is not minified, so browsing with dev tools is possible.)

Finished data visualization

Honestly, I had to look at the instructor’s solution occasionally because after spending a lot of time with Angular, React and Vue.js it was a bit hard to switch back to normal JavaScript. This data visualization article won’t be the last.

Resources

Demo

Color palette for Color Blindness

Sara Soueidan’s Part 1 post about SVG and coordinates

  1. Same goes for documentation