Blog

Technical writing on data science methods, software development, and system design. I explore everything from statistical analysis and programming techniques to building intelligent data systems.

May 10, 2024

Statistics

Using the Beveridge curve to visualise trends in Germany's labour market

The Beveridge curve is a valuable tool for analysing labour markets. In this post, I use it to identify trends in the German job market from 2010 to 2023. Using the D3.js visualisation library, I show how Germany's Beveridge curve evolved over that period, which offers a useful perspective on the country's labour market dynamics.

Keywords: Beveridge curve, D3.js, Eurostat, Labour market, official statistics

Continue reading

Dec. 13, 2023

Statistics

Working with data from official providers: a brief tour of pandaSDMX

I have always been an avid user of official statistics. Eurostat, the OECD, the World Bank, other international organisations and national statistics agencies such as the French INSEE are valuable sources of data. In this post, I present a simple and efficient way of interacting with data from these official providers via the SDMX standard using the pandaSDMX Python library.

Keywords: official statistics, sdmx, Eurostat, Python

Continue reading

Sept. 2, 2023

Programming

A functional approach to web scraping with Python’s singledispatch decorator - Part II: practice

In this second part, I present a practical example of how to write code for web scraping in a functional programming style. The example illustrates how to use immutable data structures, functions and Python’s @functools.singledispatch decorator to build a resilient data collection pipeline for web scraping.
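As a rough taste of the idea, here is a minimal sketch rather than the code from the post itself. The record types `HtmlPage` and `JsonPayload` and the function `parse` are hypothetical names chosen for illustration; only `functools.singledispatch` and frozen dataclasses come from the standard library.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass(frozen=True)
class HtmlPage:
    """Hypothetical immutable record for a fetched HTML page."""
    url: str
    body: str


@dataclass(frozen=True)
class JsonPayload:
    """Hypothetical immutable record for a fetched JSON document."""
    url: str
    data: tuple


@singledispatch
def parse(response):
    """Fallback: reject input types that have no registered parser."""
    raise TypeError(f"no parser registered for {type(response).__name__}")


@parse.register
def _(response: HtmlPage) -> tuple:
    # Toy rule (an assumption): keep every non-empty line of the body.
    return tuple(
        line.strip() for line in response.body.splitlines() if line.strip()
    )


@parse.register
def _(response: JsonPayload) -> tuple:
    # These records are already structured, so pass them through unchanged.
    return response.data
```

Calling `parse(HtmlPage(...))` or `parse(JsonPayload(...))` picks the matching implementation from the type of the first argument, so a new source type only needs a new registered function and no changes to existing ones.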

Keywords: Functional Programming, Web Scraping, Python

Continue reading

Sept. 2, 2023

Programming

A functional approach to web scraping with Python’s singledispatch decorator - Part I: theory

For a long time, I associated web scraping projects with a heavy dose of object-oriented programming. Python developers might be familiar with classes of spiders similar to those used in the web scraping framework Scrapy. In this two-part article, I present a way to write code for web scraping in a functional programming style. In this first part, I explain what I mean by that and what advantages I see in that approach.

Keywords: Functional Programming, Web Scraping, Python

Continue reading

Jan. 24, 2023

Statistics, Machine Learning

Using quantitative methods to build a typology of aid donors: Part II - Clustering

This is the second part of a project that illustrates the use of quantitative methods for developing a typology of aid donors. We'll use agglomerative hierarchical clustering and spectral clustering to identify groups of aid donors.

Keywords: Clustering

Continue reading

Dec. 1, 2022

Statistics

Using quantitative methods to build a typology of aid donors: Part I - Principal Components Analysis

This is the first post in a two-part series about using quantitative methods to develop typologies. The first part deals with Principal Components Analysis; the second will be about Clustering. I use the topic of development cooperation as a practical example.

Keywords: PCA

Continue reading

Oct. 24, 2022

Programming

Leverage the power of Python’s data model in your classes

Special methods do not just power Python’s built-in objects; they can also give custom classes the same look and feel. In this post, I illustrate the use of special methods by defining a simple class of polynomials.
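To give a flavour of what that can look like, here is a minimal sketch, not the class from the post. The name `Polynomial`, the choice to store coefficients from the constant term upwards, and the selection of special methods are all assumptions made for illustration.

```python
from itertools import zip_longest


class Polynomial:
    """Toy polynomial; coefficients run from the constant term upwards."""

    def __init__(self, *coefficients):
        self.coefficients = tuple(coefficients)

    def __repr__(self):
        # Show the object the way it would be constructed.
        return f"Polynomial{self.coefficients!r}"

    def __len__(self):
        # Number of stored coefficients.
        return len(self.coefficients)

    def __call__(self, x):
        # Evaluate with Horner's scheme, starting from the highest power.
        result = 0
        for coefficient in reversed(self.coefficients):
            result = result * x + coefficient
        return result

    def __add__(self, other):
        if not isinstance(other, Polynomial):
            return NotImplemented
        # Pad the shorter polynomial with zeros before adding term by term.
        pairs = zip_longest(self.coefficients, other.coefficients, fillvalue=0)
        return Polynomial(*(a + b for a, b in pairs))
```

With these methods in place, `Polynomial(1, 2, 3)` can be called like a function, measured with `len()` and added with `+`, just like a built-in type.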

Keywords: OOP, Python

Continue reading
