2. Economic Growth Evidence#

2.1. Overview#

In this lecture we use Python, Pandas, and Matplotlib to download, organize, and visualize historical data on GDP growth.

In addition to learning how to deploy these tools more generally, we’ll use them to describe facts about economic growth experiences across many countries over several centuries.

Such “growth facts” are interesting for a variety of reasons.

Explaining growth facts is a principal purpose of both “development economics” and “economic history”.

And growth facts are important inputs into historians’ studies of geopolitical forces and dynamics.

Thus, Adam Tooze’s account of the geopolitical precedents and antecedents of World War I begins by describing how Gross National Products of European Great Powers had evolved during the 70 years preceding 1914 (see chapter 1 of [Too14]).

Using the very same data that Tooze used to construct his figure, here is our version of his chapter 1 figure.


(This is just a copy of our figure Fig. 2.6. We describe how we constructed it later in this lecture.)

Chapter 1 of [Too14] used his graph to show how US GDP started the 19th century way behind the GDP of the British Empire.

By the end of the nineteenth century, US GDP had caught up with the GDP of the British Empire, and during the first half of the 20th century, US GDP surpassed it.

For Adam Tooze, that fact was a key geopolitical underpinning for the “American century”.

Looking at this graph and how it set the geopolitical stage for “the American (20th) century” naturally tempts one to want a counterpart to his graph for 2014 or later.

(An impatient reader seeking a hint at the answer might now want to jump ahead and look at figure Fig. 2.7.)

As we’ll see, reasoning by analogy, this graph perhaps set the stage for an “XXX (21st) century”, where you are free to fill in your guess for country XXX.

As we gather data to construct those two graphs, we’ll also study growth experiences for a number of countries for time horizons extending as far back as possible.

These graphs will portray how the “Industrial Revolution” began in Britain in the late 18th century, then migrated to one country after another.

In a nutshell, this lecture records growth trajectories of various countries over long time periods.

While some countries have experienced long-term rapid growth that has lasted a hundred years, others have not.

Since populations differ across countries and vary within a country over time, it will be interesting to describe both total GDP and GDP per capita as they evolve within a country.

First, let’s import the packages needed to explore what the data says about long-run growth:

import pandas as pd
import os
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
from collections import namedtuple
from matplotlib.lines import Line2D

2.2. Setting up#

A project initiated by Angus Maddison has collected many historical time series related to economic growth, some dating back to the first century.

The data can be downloaded from the Maddison Historical Statistics webpage by clicking on the “Latest Maddison Project Release”.

For convenience, here is a copy of the 2020 data in Excel format.

Let’s read it into a pandas dataframe:

data = pd.read_excel("datasets/mpd2020.xlsx", sheet_name='Full data')
countrycode country year gdppc pop
0 AFG Afghanistan 1820 NaN 3280.00000
1 AFG Afghanistan 1870 NaN 4207.00000
2 AFG Afghanistan 1913 NaN 5730.00000
3 AFG Afghanistan 1950 1156.0000 8150.00000
4 AFG Afghanistan 1951 1170.0000 8284.00000
... ... ... ... ... ...
21677 ZWE Zimbabwe 2014 1594.0000 13313.99205
21678 ZWE Zimbabwe 2015 1560.0000 13479.13812
21679 ZWE Zimbabwe 2016 1534.0000 13664.79457
21680 ZWE Zimbabwe 2017 1582.3662 13870.26413
21681 ZWE Zimbabwe 2018 1611.4052 14096.61179

21682 rows × 5 columns

We can see that this dataset contains GDP per capita (gdppc) and population (pop) for many countries and years.

Let’s look at how many and which countries are available in this dataset
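
One way to answer this is with `nunique()` and `unique()` on the `country` column. The sketch below is only an illustration under stated assumptions: `sample` is a hypothetical stand-in for `data`, built from a few of the rows printed above so that it runs without the Excel file.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `data`, using a few rows shown in the output above
sample = pd.DataFrame({
    'countrycode': ['AFG', 'AFG', 'ZWE', 'ZWE'],
    'country': ['Afghanistan', 'Afghanistan', 'Zimbabwe', 'Zimbabwe'],
    'year': [1820, 1950, 2017, 2018],
    'gdppc': [np.nan, 1156.0, 1582.3662, 1611.4052],
    'pop': [3280.0, 8150.0, 13870.26413, 14096.61179],
})

# Number of distinct countries
n_countries = sample['country'].nunique()

# Distinct country names, in order of first appearance
country_list = sample['country'].unique()

print(n_countries)
print(country_list)
```

Applied to the full `data` frame, the same two calls should report the number and names of the countries in the dataset.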


We can now explore some of the 169 countries that are available.

Let’s loop over each country to understand which years are available for each country

cntry_years = []
for cntry in data.country.unique():
    cy_data = data[data.country == cntry]['year']
    ymin, ymax = cy_data.min(), cy_data.max()
    cntry_years.append((cntry, ymin, ymax))
cntry_years = pd.DataFrame(cntry_years, columns=['country', 'Min Year', 'Max Year']).set_index('country')
Min Year Max Year
Afghanistan 1820 2018
Angola 1950 2018
Albania 1 2018
United Arab Emirates 1950 2018
Argentina 1800 2018
... ... ...
Yemen 1820 2018
Former Yugoslavia 1 2018
South Africa 1 2018
Zambia 1950 2018
Zimbabwe 1950 2018

169 rows × 2 columns
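
The same table can probably be built without an explicit loop by grouping on `country`. This is a sketch of an alternative, not the approach used in the lecture; `sample` is a hypothetical stand-in for `data` with a few country and year pairs taken from the tables above.

```python
import pandas as pd

# Hypothetical stand-in for `data`: (country, year) pairs taken from the tables above
sample = pd.DataFrame({
    'country': ['Afghanistan', 'Afghanistan', 'Afghanistan', 'Zimbabwe', 'Zimbabwe'],
    'year': [1820, 1870, 1951, 2014, 2018],
})

# Group by country, then take the earliest and latest year in each group
span = (sample.groupby('country')['year']
              .agg(['min', 'max'])
              .rename(columns={'min': 'Min Year', 'max': 'Max Year'}))

print(span)
```

Note that `groupby` sorts the group keys alphabetically by default, whereas the loop keeps countries in order of first appearance; passing `sort=False` to `groupby` keeps the original order.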

Let’s now reshape the original data into some convenient variables to enable quicker access to countries’ time series data.

We can build a useful mapping between country codes and country names in this dataset

code_to_name = data[['countrycode','country']].drop_duplicates().reset_index(drop=True).set_index(['countrycode'])

Then we can quickly focus on GDP per capita (gdppc)

gdppc = data.set_index(['countrycode','year'])['gdppc']
gdppc = gdppc.unstack('countrycode')
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
730 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1000 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1090 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1120 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2014 2022.0000 8673.0000 9808.0000 72601.0000 19183.0000 9735.0000 47867.0000 41338.0000 17439.0000 748.0000 ... 19160.0000 51664.0000 9085.0000 20317.0000 5455.0000 4054.0000 14627.0000 12242.0000 3478.0000 1594.0000
2015 1928.0000 8689.0000 10032.0000 74746.0000 19502.0000 10042.0000 48357.0000 41294.0000 17460.0000 694.0000 ... 19244.0000 52591.0000 9720.0000 18802.0000 5763.0000 2844.0000 14971.0000 12246.0000 3478.0000 1560.0000
2016 1929.0000 8453.0000 10342.0000 75876.0000 18875.0000 10080.0000 48845.0000 41445.0000 16645.0000 665.0000 ... 19468.0000 53015.0000 10381.0000 15219.0000 6062.0000 2506.0000 15416.0000 12139.0000 3479.0000 1534.0000
2017 2014.7453 8146.4354 10702.1201 76643.4984 19200.9061 10859.3783 49265.6135 42177.3706 16522.3072 671.3169 ... 19918.1361 54007.7698 10743.8666 12879.1350 6422.0865 2321.9239 15960.8432 12189.3579 3497.5818 1582.3662
2018 1934.5550 7771.4418 11104.1665 76397.8181 18556.3831 11454.4251 49830.7993 42988.0709 16628.0553 651.3589 ... 20185.8360 55334.7394 11220.3702 10709.9506 6814.1423 2284.8899 16558.3123 12165.7948 3534.0337 1611.4052

772 rows × 169 columns
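
To see what `set_index` followed by `unstack` does, here is a minimal sketch on a four-row frame. The name `long_df` is hypothetical; its values are taken from the Afghanistan and Zimbabwe rows shown earlier.

```python
import pandas as pd

# Long format: one row per (countrycode, year) pair
long_df = pd.DataFrame({
    'countrycode': ['AFG', 'AFG', 'ZWE', 'ZWE'],
    'year': [1950, 1951, 2014, 2015],
    'gdppc': [1156.0, 1170.0, 1594.0, 1560.0],
})

# Wide format: years down the rows, one column per country code.
# Year and country combinations absent from the long table become NaN.
wide = long_df.set_index(['countrycode', 'year'])['gdppc'].unstack('countrycode')

print(wide)
```

Each country is now a column, so a single series can be selected with `wide['AFG']`, which is the access pattern the plots below rely on.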

We create a color mapping between country codes and colors for consistency

country_names = data['countrycode'].unique()

# Generate a colormap with the number of colors matching the number of countries
colors = cm.tab20(np.linspace(0, 0.95, len(country_names)))

# Create a dictionary to map each country to its corresponding color
color_mapping = {country: color for country, color in zip(country_names, colors)}

2.3. GDP plots#

Looking at the United Kingdom we can first confirm we are using the correct country code

fig, ax = plt.subplots(dpi=300)
cntry = 'GBR'
_ = gdppc[cntry].plot(
    ax = fig.gca(),
    ylabel = 'International $\'s',
    xlabel = 'Year')

Fig. 2.1 GDP per Capita (GBR)#


International Dollars are a hypothetical unit of currency that has the same purchasing power parity that the U.S. Dollar has in the United States at any given time. They are also known as Geary–Khamis dollars (GK Dollars).

We can see that the data is non-continuous for long stretches during the first 250 years of this millennium, so we could choose to interpolate to get a continuous line plot.

Here we use dashed lines to indicate interpolated trends

fig, ax = plt.subplots(dpi=300)
cntry = 'GBR'

ax.set_ylabel('International $\'s')

Fig. 2.2 GDP per Capita (GBR)#
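
A minimal, self-contained sketch of the dashed-for-interpolated idea is given below. It assumes nothing about the Maddison file: the series `s` and its values are hypothetical placeholders, and the color is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical series with gaps (placeholder values, not Maddison data)
s = pd.Series([100.0, np.nan, np.nan, 130.0, 140.0, np.nan, 160.0],
              index=[1000, 1050, 1100, 1150, 1200, 1250, 1300])

fig, ax = plt.subplots()

# Dashed line through the linearly interpolated series, drawn first
ax.plot(s.interpolate(), linestyle='--', lw=2, color='tab:blue')

# Solid line through the observed values only; NaNs break the line
ax.plot(s, linestyle='-', lw=2, color='tab:blue')

ax.set_ylabel("International $'s")
ax.set_xlabel('Year')
```

Because the solid line is drawn on top of the dashed one, dashes remain visible only across the stretches where observations are missing.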

We can now put this into a function to generate plots for a list of countries

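def draw_interp_plots(series, ylabel, xlabel, color_mapping, code_to_name, lw, logscale, ax):

    # Loop over the country codes (columns) of the series passed in
    for c in series.columns:
        # Get the interpolated data
        df_interpolated = series[c].interpolate(limit_area='inside')
        interpolated_data = df_interpolated[series[c].isnull()]

        # Plot the interpolated data with dashed lines (exact styling is an assumption)
        ax.plot(interpolated_data, linestyle='--', lw=lw, color=color_mapping[c])

        # Plot the non-interpolated data with solid lines
        ax.plot(series[c], linestyle='-', lw=lw, color=color_mapping[c],
                label=code_to_name.loc[c, 'country'])

    if logscale:
        ax.set_yscale('log')

    ax.set_ylabel(ylabel)
    ax.set_xlabel(xlabel)

    # Draw the legend outside the plot
    ax.legend(loc='center left', bbox_to_anchor=(1, 0.5), frameon=False)
    return ax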
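
The `limit_area='inside'` argument used in `draw_interp_plots` restricts interpolation to gaps that are surrounded by observed values. The sketch below illustrates this on a hypothetical five-point series; the names `s`, `filled` and `interpolated_only` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical series: missing at both ends and in the middle
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan],
              index=[1800, 1810, 1820, 1830, 1840])

# Only the interior gap (1820) is filled; the leading and trailing NaNs stay NaN
filled = s.interpolate(limit_area='inside')

# Keep only the positions that were missing in the original series
interpolated_only = filled[s.isnull()]

print(filled)
print(interpolated_only)
```

This keeps the dashed segments from extending a country's series before its first observation or after its last one.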

As you can see from the chart for the UK (Fig. 2.2), economic growth started in earnest in the 18th century and continued for the next two hundred years.

How does this compare with other countries’ growth trajectories?

Let’s look at the United States (USA), United Kingdom (GBR), and China (CHN)

# Define the namedtuple for the events
Event = namedtuple('Event', ['year_range', 'y_text', 'text', 'color', 'ymax'])

fig, ax = plt.subplots(dpi=300, figsize=(10, 6))

cntry = ['CHN', 'GBR', 'USA']
ax = draw_interp_plots(gdppc[cntry].loc[1500:],
    'International $\'s','Year',
    color_mapping, code_to_name, 2, False, ax)

# Define the parameters for the events and the text
ylim = ax.get_ylim()[1]
b_params = {'color':'grey', 'alpha': 0.2}
t_params = {'fontsize': 9, 
            'va':'center', 'ha':'center'}

# Create a list of events to annotate
events = [
    Event((1650, 1652), ylim + ylim*0.04, 
          'the Navigation Act\n(1651)',
          color_mapping['GBR'], 1),
    Event((1655, 1684), ylim + ylim*0.13, 
          'Closed-door Policy\n(1655-1684)', 
          color_mapping['CHN'], 1.1),
    Event((1848, 1850), ylim + ylim*0.22,
          'the Repeal of Navigation Act\n(1849)', 
          color_mapping['GBR'], 1.18),
    Event((1765, 1791), ylim + ylim*0.04, 
          'American Revolution\n(1765-1791)', 
          color_mapping['USA'], 1),
    Event((1760, 1840), ylim + ylim*0.13, 
          'Industrial Revolution\n(1760-1840)', 
          'grey', 1.1),
    Event((1929, 1939), ylim + ylim*0.04, 
          'the Great Depression\n(1929–1939)', 
          'grey', 1),
    Event((1978, 1979), ylim + ylim*0.13, 
          'Reform and Opening-up\n(1978-1979)', 
          color_mapping['CHN'], 1.1)
]

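def draw_events(events, ax):
    # Iterate over events and add annotations and vertical lines
    for event in events:
        event_mid = sum(event.year_range)/2
        # Label the event above the plot, centered on the middle of its year range
        ax.text(event_mid,
                event.y_text, event.text,
                color=event.color, **t_params)
        # Shade the event's year range
        ax.axvspan(*event.year_range, color=event.color, alpha=0.2)
        # Connect the shaded band to its label with a faint vertical line
        ax.axvline(event_mid, ymin=1,
                   ymax=event.ymax, color=event.color,
                   linestyle='-', clip_on=False, alpha=0.15)
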
# Draw events
draw_events(events, ax)

Fig. 2.3 GDP per Capita, 1500- (China, UK, USA)#

The preceding graph of per capita GDP strikingly reveals how the spread of the industrial revolution has gradually lifted the living standards of substantial groups of people:

  • most of the growth happened in the past 150 years after the industrial revolution.

  • per capita GDP in the US and UK rose and diverged from that of China from 1820 to 1940.

  • the gap has closed rapidly after 1950 and especially after the late 1970s.

  • these outcomes reflect complicated combinations of technological and economic-policy factors that students of economic growth try to understand and quantify.

It is fascinating to see China’s GDP per capita levels from 1500 through to the 1970s.

Notice the long period of declining GDP per capita levels from the 1700s until the early 20th century.

Thus, the graph indicates

  • a long economic downturn and stagnation after the Closed-door Policy by the Qing government.

  • how different China’s experience was from the UK’s after the onset of the industrial revolution in the UK.

  • how the Self-Strengthening Movement seemed mostly to help China to grow.

  • how stunning have been the growth achievements of modern Chinese economic policies by the PRC that culminated with its late 1970s reform and liberalization.

fig, ax = plt.subplots(dpi=300, figsize=(10, 6))

cntry = ['CHN']
ax = draw_interp_plots(gdppc[cntry].loc[1600:2000],
    'International $\'s','Year',
    color_mapping, code_to_name, 2, True, ax)

ylim = ax.get_ylim()[1]

events = [
Event((1655, 1684), ylim + ylim*0.06, 
      'Closed-door Policy\n(1655-1684)', 
      'tab:orange', 1),
Event((1760, 1840), ylim + ylim*0.06, 
      'Industrial Revolution\n(1760-1840)', 
      'grey', 1),
Event((1839, 1842), ylim + ylim*0.2, 
      'First Opium War\n(1839–1842)', 
      'tab:red', 1.07),
Event((1861, 1895), ylim + ylim*0.4, 
      'Self-Strengthening Movement\n(1861–1895)', 
      'tab:blue', 1.14),
Event((1939, 1945), ylim + ylim*0.06, 
      'WW 2\n(1939-1945)', 
      'tab:red', 1),
Event((1948, 1950), ylim + ylim*0.23, 
      'Founding of PRC\n(1949)', 
      color_mapping['CHN'], 1.08),
Event((1958, 1962), ylim + ylim*0.5, 
      'Great Leap Forward\n(1958-1962)', 
      'tab:orange', 1.18),
Event((1978, 1979), ylim + ylim*0.7, 
      'Reform and Opening-up\n(1978-1979)', 
      'tab:blue', 1.24)
]

# Draw events
draw_events(events, ax)

Fig. 2.4 GDP per Capita, 1500-2000 (China)#

We can also look at the United States (USA) and United Kingdom (GBR) in more detail

In the following graph, please watch for

  • impact of trade policy (Navigation Act).

  • productivity changes brought by the industrial revolution.

  • how the US gradually approaches and then surpasses the UK, setting the stage for the ‘‘American Century’’.

  • the often unanticipated consequences of wars.

  • interruptions and scars left by business cycle recessions and depressions.

fig, ax = plt.subplots(dpi=300, figsize=(10, 6))

cntry = ['GBR', 'USA']
ax = draw_interp_plots(gdppc[cntry].loc[1500:2000],
    'International $\'s','Year',
    color_mapping, code_to_name, 2, True, ax)

ylim = ax.get_ylim()[1]

# Create a list of events to annotate
events = [
    Event((1651, 1651), ylim + ylim*0.15, 
          'Navigation Act (UK)\n(1651)', 
          'tab:orange', 1),
    Event((1765, 1791), ylim + ylim*0.15, 
          'American Revolution\n(1765-1791)',
          color_mapping['USA'], 1),
    Event((1760, 1840), ylim + ylim*0.6, 
          'Industrial Revolution\n(1760-1840)', 
          'grey', 1.08),
    Event((1848, 1850), ylim + ylim*1.1, 
          'Repeal of Navigation Act (UK)\n(1849)', 
          'tab:blue', 1.14),
    Event((1861, 1865), ylim + ylim*1.8, 
          'American Civil War\n(1861-1865)', 
          color_mapping['USA'], 1.21),
    Event((1914, 1918), ylim + ylim*0.15, 
          'WW 1\n(1914-1918)', 
          'tab:red', 1),
    Event((1929, 1939), ylim + ylim*0.6, 
          'the Great Depression\n(1929–1939)', 
          'grey', 1.08),
    Event((1939, 1945), ylim + ylim*1.1, 
          'WW 2\n(1939-1945)', 
          'tab:red', 1.14)
]

# Draw events
draw_events(events, ax)

Fig. 2.5 GDP per Capita, 1500-2000 (UK and US)#

2.4. The industrialized world#

Now we’ll construct some graphs of interest to geopolitical historians like Adam Tooze.

We’ll focus on total Gross Domestic Product (GDP) (as a proxy for ‘‘national geopolitical-military power’’) rather than focusing on GDP per capita (as a proxy for living standards).

data = pd.read_excel("datasets/mpd2020.xlsx", sheet_name='Full data')
data.set_index(['countrycode', 'year'], inplace=True)
data['gdp'] = data['gdppc'] * data['pop']
gdp = data['gdp'].unstack('countrycode')
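
As a quick check of the arithmetic, total GDP is per capita GDP multiplied by population. The sketch below reproduces the same steps on a hypothetical four-row frame called `sample`, whose values are taken from the Afghanistan and Zimbabwe rows shown earlier.

```python
import pandas as pd

# Hypothetical stand-in for `data`, using rows shown in the output earlier
sample = pd.DataFrame({
    'countrycode': ['AFG', 'AFG', 'ZWE', 'ZWE'],
    'year': [1950, 1951, 2017, 2018],
    'gdppc': [1156.0, 1170.0, 1582.3662, 1611.4052],
    'pop': [8150.0, 8284.0, 13870.26413, 14096.61179],
}).set_index(['countrycode', 'year'])

# Total GDP = GDP per capita x population, row by row
sample['gdp'] = sample['gdppc'] * sample['pop']

# One column per country, indexed by year
gdp_wide = sample['gdp'].unstack('countrycode')

print(gdp_wide)
```

The magnitudes of `pop` suggest that it is recorded in thousands of people. If that assumption holds, the resulting `gdp` values are in thousands of international dollars.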

2.4.1. Early industrialization (1820 to 1940)#

We first visualize the trend of China, the Former Soviet Union, Japan, the UK and the US.

The most notable trend is the rise of the US, surpassing the UK in the 1860s and China in the 1880s.

The growth continued until the large dip in the 1930s when the Great Depression hit.

Meanwhile, Russia experienced significant setbacks during World War I and recovered significantly after the February Revolution.

fig, ax = plt.subplots(dpi=300)
ax = fig.gca()
cntry = ['CHN', 'SUN', 'JPN', 'GBR', 'USA']
start_year, end_year = (1820, 1945)
ax = draw_interp_plots(gdp[cntry].loc[start_year:end_year],
    'International $\'s','Year',
    color_mapping, code_to_name, 2, False, ax)

Fig. 2.6 GDP in the early industrialization era#

2.5. Constructing a plot similar to Tooze’s#

In this section we describe how we have constructed a version of the striking figure from chapter 1 of [Too14] that we discussed at the start of this lecture.

Let’s first define a collection of countries that make up the British Empire (BEM) so we can replicate that series in Tooze’s chart.

BEM = ['GBR', 'IND', 'AUS', 'NZL', 'CAN', 'ZAF']
gdp['BEM'] = gdp[BEM].loc[start_year-1:end_year].interpolate(method='index').sum(axis=1) # Interpolate incomplete time-series
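
The aggregation above interpolates each member series using the year index and then sums across columns. A minimal sketch of those two steps on hypothetical data follows; the frame `members`, its columns `A` and `B`, and all values are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical member series observed at unevenly spaced years
members = pd.DataFrame({'A': [10.0, np.nan, 40.0],
                        'B': [1.0, 2.0, 3.0]},
                       index=[1820, 1830, 1850])

# method='index' weights by the distance between index values:
# 1830 is one third of the way from 1820 to 1850, so A is filled with 20
filled = members.interpolate(method='index')

# Row-wise sum gives the aggregate series
total = filled.sum(axis=1)

print(filled)
print(total)
```

One design point worth keeping in mind is that `sum` skips NaN values by default, so a year in which a member is still missing after interpolation contributes nothing for that member rather than making the total NaN.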

Let’s take a look at the aggregation that represents the British Empire.

gdp['BEM'].plot() # The first year is np.nan due to interpolation
<Axes: xlabel='year'>
AFG Afghanistan
AGO Angola
ALB Albania
ARE United Arab Emirates
ARG Argentina
... ...
YEM Yemen
YUG Former Yugoslavia
ZAF South Africa
ZMB Zambia
ZWE Zimbabwe

169 rows × 1 columns

Now let’s assemble our series and get ready to plot them.

# Define colour mapping and name for BEM
color_mapping['BEM'] = color_mapping['GBR']  # Set the color to be the same as Great Britain
# Add British Empire to code_to_name
bem = pd.DataFrame(["British Empire"], index=["BEM"], columns=['country'])
bem.index.name = 'countrycode'
code_to_name = pd.concat([code_to_name, bem])
fig, ax = plt.subplots(dpi=300)
ax = fig.gca()
cntry = ['DEU', 'USA', 'SUN', 'BEM', 'FRA', 'JPN']
start_year, end_year = (1821, 1945)
ax = draw_interp_plots(gdp[cntry].loc[start_year:end_year],
    'Real GDP in 2011 $\'s','Year',
    color_mapping, code_to_name, 2, False, ax)
plt.savefig("./_static/lecture_specific/long_run_growth/tooze_ch1_graph.png", dpi=300, bbox_inches='tight')

At the start of this lecture, we noted how US GDP came from “nowhere” at the start of the 19th century to rival and then overtake the GDP of the British Empire by the end of the 19th century, setting the geopolitical stage for the “American (twentieth) century”.

Let’s move forward in time and start roughly where Tooze’s graph stopped after World War II.

In the spirit of Tooze’s chapter 1 analysis, doing this will provide some information about geopolitical realities today.

2.5.1. The modern era (1950 to 2020)#

The following graph displays how quickly China has grown, especially since the late 1970s.

fig, ax = plt.subplots(dpi=300)
ax = fig.gca()
cntry = ['CHN', 'SUN', 'JPN', 'GBR', 'USA']
start_year, end_year = (1950, 2020)
ax = draw_interp_plots(gdp[cntry].loc[start_year:end_year],
    'International $\'s','Year',
    color_mapping, code_to_name, 2, False, ax)

Fig. 2.7 GDP in the modern era#

It is tempting to compare this graph with figure Fig. 2.6 that showed the US overtaking the UK near the start of the “American Century”, a version of the graph featured in chapter 1 of [Too14].

2.6. Regional analysis#

We often want to study historical experiences of countries outside the club of “World Powers”.

Fortunately, the Maddison Historical Statistics dataset also includes regional aggregations

data = pd.read_excel("datasets/mpd2020.xlsx", sheet_name='Regional data', header=(0,1,2), index_col=0)
data.columns = data.columns.droplevel(level=2)

We can save the raw data in a more convenient format to build a single table of regional GDP per capita

regionalgdppc = data['gdppc_2011'].copy()
regionalgdppc.index = pd.to_datetime(regionalgdppc.index, format='%Y')

Let’s interpolate based on time to fill in any gaps in the dataset for the purpose of plotting

regionalgdppc.interpolate(method='time', inplace=True)
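
With a datetime index, `method='time'` fills gaps in proportion to elapsed time rather than row position. The sketch below shows this on a hypothetical three-point series; the years and values are placeholders, and the years are converted to strings before parsing as a cautious choice.

```python
import numpy as np
import pandas as pd

# Hypothetical observations at 1820 and 1920, with 1870 missing
years = pd.Index([1820, 1870, 1920])
s = pd.Series([100.0, np.nan, 200.0],
              index=pd.to_datetime(years.astype(str), format='%Y'))

# Time-weighted linear interpolation across the gap
filled = s.interpolate(method='time')

print(filled)
```

Since 1870 lies almost exactly midway between the two observed dates, the filled value is close to the midpoint of 100 and 200.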

and record a dataset of world GDP per capita

worldgdppc = regionalgdppc['World GDP pc']
fig = plt.figure(dpi=300)
ax = fig.gca()
ax = worldgdppc.plot(
    ax = ax,
    ylabel='2011 US$')

Fig. 2.8 World GDP per capita#

Looking more closely, let’s compare the time series for Western Offshoots and Sub-Saharan Africa and, more broadly, for a number of different regions around the world.

Again we see the divergence of the West from the rest of the world after the industrial revolution and the convergence of the world after the 1950s.

fig = plt.figure(dpi=300)
ax = fig.gca()
line_styles = ['-', '--', ':', '-.', '.', 'o', '-', '--', '-']
ax = regionalgdppc.plot(ax = ax, style=line_styles)
plt.legend(loc='lower center', 
ncol=3, bbox_to_anchor=[0.5, -0.4])

Fig. 2.9 Regional GDP per capita#