Numbers do not lie

Data science and statistics have proven valuable in providing tractable insights into the complexities of reality when applied judiciously. Historically, however, the degrees of freedom available in data analysis have not always been used with the best of intentions.

Published

May 1st, 2025

1 Numbers do not lie

  • Today’s science is empirical.
  • Even for highly complicated systems, data-driven analyses can help us understand the underlying relationships.
  • They can also provide insights with a clarity that is otherwise unattainable.

2 Numbers do not lie

  • Today’s science is empirical.
  • In practice, however, data analysis is only as impartial as the person conducting it.
  • Like other technological advances, data analysis can be used for deception as much as for enlightenment.

3 Many shades of wrong

  • Aragão and Linsi (2022) studied cases where governments used statistics to misrepresent or opportunistically interpret data.
  • They introduced a typology of data manipulation:
    • outright manipulation (type 1),
    • politically motivated guesstimating (type 2),
    • the opportunistic use of methodology space (type 3),
    • and indicators-management through indirect means (type 4).

Many shades of wrong

  • Aragão and Linsi (2022) studied three high-profile cases:
    • the Argentinian government’s inflation statistics between 2007 and 2015,
    • Brazilian debt figures between 2012 and 2015, and
    • Greece’s public finance statistics in the 2000s.

4 Argentinian inflation statistics

  • After the 2001 default, the government introduced the “Coeficiente de Estabilización de Referencia” (CER) mechanism.
  • Creditors had the option to exchange foreign currency debt for CER-denominated bonds.
  • CER was based on a daily inflation index calculated by the National Institute of Statistics and Censuses (INDEC).
  • By 2007, 39% of the public debt was CER-denominated.

Argentinian inflation statistics

  • Problem: High inflation rates led to greater interest payments.
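To make the problem concrete, here is a minimal sketch of how an inflation-indexed bond reacts to the measured inflation rate. It assumes, as a simplification, that the outstanding principal is scaled by the index and that the coupon is a fixed share of the adjusted principal. The function names and all figures are hypothetical and are not taken from the actual CER rules.

```python
def indexed_principal(principal, annual_inflation, years=1):
    """Principal after compounding the measured annual inflation rate."""
    return principal * (1 + annual_inflation) ** years


def coupon_payment(adjusted_principal, coupon_rate):
    """Interest due on the inflation-adjusted principal."""
    return adjusted_principal * coupon_rate


# Hypothetical figures, for illustration only.
principal = 100.0
coupon_rate = 0.02

for inflation in (0.08, 0.25):
    adjusted = indexed_principal(principal, inflation)
    coupon = coupon_payment(adjusted, coupon_rate)
    print(f"inflation {inflation:.0%}: principal {adjusted:.2f}, coupon {coupon:.2f}")
```

Under these assumptions, every percentage point shaved off the reported index lowers both the indexed principal and the interest owed on it, which gives the debtor a direct stake in the published figure.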

Argentinian inflation statistics

  • In 2005, Guillermo Moreno took office as Secretary of Domestic Trade, appointed by President Néstor Kirchner.
  • In May 2006, Moreno met INDEC technicians, inquiring how the inflation indicator was being measured.
  • He asked for particular details of shops and products used in the index (why?).
  • Later, he argued that the current methodology of calculating the index was unpatriotic! (why?)

Argentinian inflation statistics

  • In January 2007, prices of lettuce and pre-paid mobile cards exhibited unexpected variation in the data.
  • Moreno argued that this unexpected variation had implications for the index calculation.
  • The government replaced the INDEC director, and the data and methodology used to calculate the index were revised.
  • The changes lowered measured inflation… but not enough, and not for long.

Argentinian inflation statistics

  • Gradually, all products with price variations greater than 15% were removed from the index!
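The effect of such a rule can be sketched with a toy index. The example assumes an unweighted average of item-level price changes, which is a deliberate simplification and not INDEC's methodology. The products, numbers, and the `inflation_index` helper are hypothetical.

```python
from statistics import mean


def inflation_index(price_changes, cap=None):
    """Unweighted mean of item-level price changes.

    If `cap` is given, items whose price change exceeds it are dropped
    before averaging, mimicking the exclusion rule described above.
    """
    kept = [change for change in price_changes.values()
            if cap is None or change <= cap]
    return mean(kept)


# Hypothetical yearly price changes per product.
changes = {
    "bread": 0.10,
    "lettuce": 0.40,
    "milk": 0.12,
    "mobile cards": 0.30,
    "rice": 0.08,
}

full = inflation_index(changes)
trimmed = inflation_index(changes, cap=0.15)
print(f"all products: {full:.1%}, after dropping items above 15%: {trimmed:.1%}")
```

With these made-up numbers the trimmed index is roughly half of the full one. Because the rule removes exactly the items that drive inflation, the reported figure can never exceed the cap, whatever happens to actual prices.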

Argentinian inflation statistics

[Figure: Argentinian inflation statistics. Source: Aragão and Linsi (2022)]

5 Even in the best of families

  • Cases of data manipulation (of any type) are not limited to governments and politicians.
  • High-profile cases have also been discovered in academia (see Rebel talent case).
  • They have also surfaced in the private sector (see Ekphrasis case).

References

Aragão, Roberto, and Lukas Linsi. 2022. “Many Shades of Wrong: What Governments Do When They Manipulate Statistics.” Review of International Political Economy 29 (1): 88–113. https://doi.org/10.1080/09692290.2020.1769704.