A question I pose to my grad students: should we trust data, especially historic data?
Take historic international trade data. Should we trust it?
[THREAD]
What do I mean by "historic"? I mean data going back to the early 20th and late 19th century.
You know, the type of products J.M. Keynes describes here (in a passage about pre-1914 London):
A great recent source of such data is the Correlates of War project, which has bilateral trade flows from 1870 to 2014
https://correlatesofwar.org/data-sets/bilateral-trade
The data look like this: flows from country A to country B in year t (in this case, US to Austria-Hungary in the early 20th century)
These data, compiled by Katherine Barbieri & Omar Keshk, draw on data compiled by Raymond Hicks and Joanne Gowa, which they describe in their 2013 @IntOrgJournal paper https://www.cambridge.org/core/journals/international-organization/article/politics-institutions-and-trade-lessons-of-the-interwar-era/6B634FD11D54E125E519FADAB44EB619
Interesting note: while Barbieri & Keshk cite Hicks and Gowa, Hicks and Gowa cite an EARLIER version of Barbieri & Keshk's data to fill in some of their observations
The primary sources for Hicks and Gowa are national-level trade yearbooks and statistical yearbooks.
They provide a full list of the sources consulted in their 2017 @BJPolS piece
https://www-cambridge-org.proxy.uchicago.edu/core/journals/british-journal-of-political-science/article/commerce-and-conflict-new-data-about-the-great-war/8F6B8410BDB042EFA5E43DA17E72FC70
For example, for US trade data, they consulted the Statistical Abstract of the United States, produced by @uscensusbureau https://www.census.gov/library/publications/time-series/statistical_abstracts.html
Let's say one just happened to be interested in US trade data for 1915 (cc @rosellacappella).
Scroll down and you can find, for instance, US trade with Austria-Hungary. Notice the sharp decline of exports in 1915. Hmmm...something must have been happening "over there"

These data are great from an aggregate standpoint. But what if you want them at the product level? After all, isn't that how these aggregates are built?
But there's a hitch: what if you want to know the products exported to specific countries, like wheat to Britain?
That requires another source.
One needs to turn to the Monthly Summary of Foreign Commerce (b/c of their age, many of these volumes are available open access via @googlebooks)
But how did we obtain these data (see a pattern here)?
Well, the table has a VERY interesting note at the beginning: "completeness and accuracy...are dependent on the active cooperation of the manufacturers and exporters in rendering adequate manifests."
In other words, we have to hope that the firms accurately filled out their export manifests at the port.
If you want to go REAL deep, you could visit the National Archives to see these manifests https://www.archives.gov/research/guide-fed-records/groups/036.html#36.3.1
At the end of the day, the trade data we use rely on the hope that (1) firms properly filled out the paperwork, (2) the paperwork was properly collected, and (3) the paperwork was properly aggregated.
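To make that chain concrete, here is a toy sketch in Python of how product-level manifests roll up into the country A to country B, year t totals. Everything in it is an assumption for illustration: the record layout, the country codes, and every number are invented and are not real trade figures or the actual procedure used by any of the sources above.

```python
from collections import defaultdict

# Hypothetical manifest records: (exporter, importer, year, product, value).
# All values are invented for illustration; they are NOT real trade figures.
manifests = [
    ("USA", "GBR", 1915, "wheat", 120.0),
    ("USA", "GBR", 1915, "cotton", 80.0),
    ("USA", "AUH", 1915, "wheat", 5.0),
]


def aggregate(records):
    """Sum product-level manifest values into (exporter, importer, year) totals."""
    totals = defaultdict(float)
    for exporter, importer, year, _product, value in records:
        totals[(exporter, importer, year)] += value
    return dict(totals)


# All three steps go right: every manifest is filed, collected, and summed.
complete = aggregate(manifests)

# Step (1) goes wrong: suppose the cotton manifest was never filed.
missing_one = aggregate([m for m in manifests if m[3] != "cotton"])

print(complete[("USA", "GBR", 1915)])
print(missing_one[("USA", "GBR", 1915)])
```

The aggregate total gives no hint that a manifest is missing; the bilateral number just comes out smaller, which is why the quality of the underlying paperwork matters.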
This highlights that historic trade data, like all historic data, are highly contingent upon proper record keeping.
That's a safe assumption for some data, from certain countries, at particular times.
But it's important to consider when that assumption isn't safe!
[END]
Addendum: for a terrific evaluation of historic trade data accuracy, see this Explorations in Economic History piece (h/t @BobbyGulotty) https://www.sciencedirect.com/science/article/abs/pii/0014498391900076
Addendum 2: very recent pieces by @dmugge & @lukaslinsi that raise questions about the accuracy of trade statistics (h/t @whinecough) are in @RIPEJournal... https://www.tandfonline.com/doi/full/10.1080/09692290.2018.1560353