Using quantitative methods to build a typology of aid donors: Part II - Clustering

Published on Jan. 24, 2023 by Sebastian Paulo in
Statistics, Machine Learning

1. Introduction

This is the second part of a project that illustrates the use of quantitative methods for developing a typology of aid donors. In the previous part, we used Principal Components Analysis (PCA) to produce a 3-dimensional representation of providers of official development assistance (ODA). PCA helped us gain insights into which donor countries might be similar to each other. However, we also saw that several donors are not represented well enough with just three Principal Components (PCs). In this second part, we’ll use more PCs and perform clustering on them.

Clustering partitions observations into groups (i.e. clusters) with the goal to assign similar observations to the same group and dissimilar ones to different groups. We’ll use the PCA data from the first part as input for the clustering.

The PCA was conducted on 27 variables from data provided by the OECD Development Assistance Committee (DAC) for the year 2020. Consult the notebook of part I to learn more about the data, the selected variables and the transformations applied to them. The first part also contains an annexe explaining how many PCs we should retain for clustering to sufficiently represent all donors. We’ll use a complete data set with 47 donors that uses 10 PCs and a smaller data set with only DAC donors that uses 9 PCs. The 10 and 9 PCs explain around 87% of the variance contained in the original 27 variables.

Now let’s find out what kind of clusters we can extract from this data. We'll use two clustering methods: 1) Agglomerative, hierarchical clustering and 2) Spectral clustering. We’ll first try them out on the data set for all donors and then move on to a more detailed analysis in which we focus only on intra-DAC clusters. The code related to this blog post can be found here.

2. Hierarchical clustering

We use agglomerative, hierarchical clustering as a first step to get an idea of the clusters that might come out of the data. As hierarchical clustering does not require determining the number of clusters beforehand, it is often used in preparation for other clustering techniques that require knowledge of the expected number of clusters from the start.

Agglomerative, hierarchical clustering starts with all donors being their own individual cluster. For instance, in the case of the data set with all donors, the algorithm starts with 47 individual nodes, one for each donor. Then, donors are successively merged into clusters in an iterative process, starting with the two closest nodes. This process is repeated until there is only one large cluster left, which comprises all donors.

Variants of this type of clustering differ with regard to the method used to calculate similarity or distance between observations. We use Ward’s method to iteratively merge nodes. This method minimises the increase in total within-cluster variance at each merging step.

The incremental creation of nodes is typically presented in a dendrogram (a hierarchical tree). The tree structure allows us to visually explore the clustering process and to decide where to "cut" the tree. This cut determines the number of clusters. Here is the dendrogram for the data set with all donors.
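The steps just described can be sketched with scipy. The random matrix below is only a stand-in for the 47 donors with 10 PCs (the actual data comes from the notebook); the `color_threshold` argument of `dendrogram` is what controls where the colouring changes:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

rng = np.random.default_rng(0)
X = rng.normal(size=(47, 10))  # stand-in for the 47 donors x 10 PCs

# Ward linkage: each merge minimises the increase in within-cluster variance
Z = linkage(X, method="ward")

# "cut" the tree into 3 clusters
labels = fcluster(Z, t=3, criterion="maxclust")

# tree structure for plotting; pass ax=... and drop no_plot to draw it,
# and color_threshold=... to move the colour cut
tree = dendrogram(Z, orientation="left", no_plot=True)
```

`Z` is the (n-1) × 4 linkage matrix: each row records the two nodes merged, the merge distance, and the size of the new cluster.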

Based on data from OECD stats (see part I).

On the left, all 47 donors are individual nodes. The x-axis starts at an intra-cluster variance of 0 (there is no difference within clusters, as each cluster consists of a single donor). Moving right, increasing values on the x-axis indicate the growing difference between members of the same cluster as more and more donors are grouped together. At the right side of the dendrogram, all the differences in the data set are contained within a single cluster.

You can see, for instance, that the pairs of donors that are merged first, and are therefore closest together, are the United Kingdom and the Netherlands, Sweden and Norway, and Canada and Switzerland. Other donors get added to other nodes much later in the tree. Turkey is the last single donor to be joined with another node.

The default colour threshold used by the scipy Python library indicates three clusters. Usually, the threshold should be placed where intra-cluster differences become distinctly larger. We can change the threshold and the colouring of the dendrogram. The dendrogram allows us to reason about our choice of clusters.

In this particular case, the threshold is not easy to determine as the intra-cluster variance starts to shoot up at very different points. For instance, the section of the tree with Sweden, the United Kingdom, etc. seems to be more compact than the branches with Kuwait, Portugal, Thailand, etc. where the first pairings of donors happen much later. The densities of the potential clusters are very different.

If we give data to a clustering method, it will allocate all observations to clusters (unless we use methods like DBSCAN that also identify outliers that do not belong to any cluster). But not all donors necessarily belong to a "donor type". Moreover, a more detailed typology might require smaller clusters than provided by the default colouring of the dendrogram above. In theory, even just two or three donors might constitute a "donor type".

For the moment, we can use the results of hierarchical clustering to delete some "isolated" donors that could be considered as not being part of any cluster. These are usually donors that are only linked to other nodes very late in the dendrogram. In this case, we take out Turkey, Kuwait, Azerbaijan, Kazakhstan and Cyprus from the data set with all donors and apply spectral clustering to this reduced data set.

3. Spectral Clustering

Spectral clustering refers to a family of algorithms that look at clustering from the perspective of graph theory. Spectral clustering first converts the data to be clustered into a similarity graph (i.e. a network of nodes connected by edges representing the similarity or connectivity between nodes). In our case, the nodes are donors and the edges between donors reflect similarities between them. Clustering then becomes a question of partitioning the graph so that edges between nodes in different groups have a lower weight than edges between nodes in the same group (see Luxburg (2007)).

Such a graph can be represented as an adjacency matrix on which standard linear algebra operations can be performed. Spectral clustering decomposes this matrix (or typically its graph Laplacian) into eigenvalues and eigenvectors. Subsequently, any clustering algorithm, e.g. KMeans, can be used to perform clustering on the first k eigenvectors.
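These steps can be written out from scratch. The sketch below uses the unnormalised Laplacian and a tiny hand-made affinity matrix (two tightly connected pairs joined by weak edges) just to make the mechanics visible; it is not the implementation used in the notebook:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_from_affinity(A, k):
    """Cluster from a symmetric affinity (adjacency) matrix A."""
    D = np.diag(A.sum(axis=1))   # degree matrix
    L = D - A                    # unnormalised graph Laplacian
    _, vecs = eigh(L)            # eigenvectors, eigenvalues in ascending order
    U = vecs[:, :k]              # first k eigenvectors as new coordinates
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)

# toy graph: nodes 0-1 and 2-3 are strongly connected, cross edges are weak
A = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.1, 1.0, 0.0]])
labels = spectral_from_affinity(A, 2)  # separates {0, 1} from {2, 3}
```

The second eigenvector (the Fiedler vector) already encodes the cheap cut through the two weak edges, which is why KMeans on the eigenvector rows recovers the two pairs.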

Spectral clustering has the advantage that it does not make any assumptions about the shape of the clusters. With our data set of aid donors, it is not clear what the shape of the clusters should be. Given this uncertainty about what the "true" clusters might be, spectral clustering produces more stable results than applying KMeans directly. However, spectral clustering comes with its own challenges: results can differ strongly according to the approach used to construct the similarity graph. Moreover, we do not get around the challenge of defining the number of clusters in advance.

One common way to construct the similarity graph is the k-nearest-neighbours approach. Donor x is connected to donor y if donor y is among its k nearest neighbours. The result is a sparse adjacency matrix of 1s and 0s. The main parameter for this approach is the number of neighbours.

Alternatively, we can build a fully connected graph with weights on the edges indicating the similarity between observations. The Gaussian kernel (radial basis function, rbf) is often used to calculate the weights, as in the SpectralClustering implementation of the scikit-learn Python library (np.exp(-gamma * d(X,X) ** 2) where X is our data and d(X, X) is the Euclidean distance between observations). The main parameter for this approach is gamma. Values of gamma below 1 make clustering more tolerant to greater distances between observations; values of gamma above 1 require observations to be closer together to be part of the same cluster.
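In scikit-learn this amounts to a couple of lines. The blob data below is a stand-in for the PC scores, chosen well-separated so the example is deterministic; gamma=0.4 mirrors the value used later in the post:

```python
from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_blobs

# toy stand-in for the PC scores: three well-separated groups
X, y = make_blobs(n_samples=42, centers=[[0, 0], [20, 0], [0, 20]],
                  cluster_std=0.5, random_state=0)

# fully connected rbf graph; gamma controls how fast similarity decays
model = SpectralClustering(n_clusters=3, affinity="rbf", gamma=0.4,
                           assign_labels="kmeans", random_state=0)
labels = model.fit_predict(X)
```

Passing affinity="nearest_neighbors" (with n_neighbors) instead switches to the k-nearest-neighbours graph from the previous paragraph.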

To determine the number of clusters, we can use some of the techniques that also exist for other clustering methods, such as the silhouette score. There is also a method specific to spectral clustering, the eigengap heuristic: look at the eigenvalues ordered from the lowest and identify a sudden jump where initially very small values become larger. According to the eigengap heuristic, the number of eigenvalues before this jump is a good guess for the number of clusters.

Figure 2: Eigengap heuristic - example 1

Based on the spectral clustering on the complete data set, without the 5 donors deleted above.

Using the Gaussian kernel with a gamma value of 0.4, we find a clear eigengap that suggests three clusters. With three clusters, we get the following result.

Based on the spectral clustering result for the complete data set, without the 5 donors deleted above. See part I for underlying data from OECD stats.

This clustering result is obviously not nuanced enough for a typology. The cluster with Australia and New Zealand is very small; the large cluster with most of the DAC donors contains 26 countries. However, some useful observations can be made: the clustering makes a difference between a group around "core" DAC countries and a cluster with donors that come mainly from Central and Eastern Europe/South-Eastern Europe. Moreover, the result confirms that there might be some degree of closeness between Arab donors and DAC countries. Overall, however, we would like to get a more detailed picture.

The notebook contains the code for comprehensively testing different parameters and numbers of clusters, including the eigengap heuristic and the silhouette score for the nearest neighbour approach and the Gaussian (rbf) kernel. Using a higher number of clusters on the complete data set usually leads to worse silhouette scores, with many donors having negative scores. This essentially means that there is a great deal of uncertainty over the cluster membership of some donors.

One option would be to create a more targeted data set, deleting some more donors that might not actually be part of any clusters (as we have already done after the hierarchical clustering step). However, deleting donors in that way can also be quite arbitrary. Instead, we are going to look at the DAC-only data set to get a cleaner clustering solution in the remaining part of this post.

Intra-DAC clustering

This section briefly presents the clustering result for the data set with only DAC members. Finding clusters within the DAC membership is an interesting question: in discussions about development cooperation, DAC donors are often referred to as a monolithic bloc.

We follow the same steps as above. First, we have a look at the dendrogram resulting from hierarchical clustering.

Based on data from OECD stats (see part I).

The default colour threshold proposes three clusters, but we might try four or five to get a more detailed picture this time.

Trying out different configurations for spectral clustering shows that Germany and France are very sensitive to varying clustering parameters. The two countries could be considered "swing donors". Due to their central position (see the 3D plots in part I), they get associated with different clusters around them depending on the parameters used for clustering. We do not use them as input to the final clustering and instead treat them as individual observations.

Figure 5: Eigengap analysis - example 2

Based on the spectral clustering on the DAC-only data set, without France and Germany.

Here we use the k-nearest-neighbour method to build the similarity graph, with 3 as the number of neighbours. The eigengap does not come out very clearly, but there is an increase in the values after the fifth eigenvalue. So we pick five clusters.

Based on the spectral clustering result for the DAC data set, without France and Germany. See part I for underlying data from OECD stats.

This clustering result is more balanced and provides more detail. The largest cluster has seven members, including the United States, United Kingdom, the Netherlands, and other donors. The Nordic donors (Sweden, Norway, Denmark, Finland) form a cluster together with Belgium and Ireland. Next, there seems to be an Asia-Pacific cluster that also includes Portugal. Italy, Spain, Hungary, Austria and the Czech Republic compose a cluster of Central and South European donors. Poland, Greece, Slovenia and the Slovak Republic are the intra-DAC equivalent of the Eastern/South-Eastern European cluster we have seen previously with the complete data set.

This result appears more stable, as confirmed by the silhouette score. The average score is not very high. However, only two donors have a slightly negative score: Belgium and Portugal (both -0.03). A negative score indicates that a donor might also fit into a different cluster.
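Both the average score and the per-donor scores come from scikit-learn; `silhouette_samples` is what exposes the negative values that flag uncertain members. The blob data below stands in for the DAC donors and their cluster labels:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score, silhouette_samples

# toy stand-in for a 5-cluster solution over 27 DAC donors
X, _ = make_blobs(n_samples=27, centers=5, cluster_std=1.0, random_state=1)
labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)

avg = silhouette_score(X, labels)          # overall clustering quality in [-1, 1]
per_donor = silhouette_samples(X, labels)  # one score per observation
uncertain = np.where(per_donor < 0)[0]     # indices of possible "swing" members
```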

Based on clustering result presented in figure 6.

Interpretation of results

Now that we have identified clusters for the DAC-only data set, we have to find out what these clusters actually mean.

One way to interpret clusters is to visualize cluster means on two-dimensional planes representing PCs. Here is the visualisation for the first and second PC.
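Computing the cluster means behind such a plot is a one-liner per cluster. The arrays below are random stand-ins for the PC scores and cluster labels from the steps above:

```python
import numpy as np

rng = np.random.default_rng(0)
pcs = rng.normal(size=(27, 9))   # stand-in: 27 DAC donors x 9 PCs
labels = np.arange(27) % 5       # stand-in labels for 5 clusters

# mean position of each cluster on the PC1/PC2 plane
means = {int(c): pcs[labels == c, :2].mean(axis=0) for c in np.unique(labels)}
```

Plotting is then a matter of scattering the values in `means` and annotating each point with its cluster number; the same dictionary, sliced on other PC pairs, gives the planes for the remaining components.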

Based on spectral clustering result presented in figure 6.

Comparing this plot with the correlation circles from part I, we can interpret the positions of the clusters:

For instance, the lower right corner, where cluster 4 is located, is associated with more spending in Europe, in-donor expenditures and a focus on education as a sector. In contrast, the left side of the plot, where we find clusters 0 and 3, is associated with more spending in Sub-Saharan Africa and a stronger focus on the poorest countries. Moreover, the area around the "Nordic" cluster (3) is related to higher spending in terms of GNI, more budget support and pooled funding, NGOs and civil society as a delivery channel, and the sector Government, Society and Peace. Towards the top of the plot, around the "Asia-Pacific" cluster (2), we find more spending in the Asia-Pacific region, Central and South Asia, more project-type interventions and a higher share of country-programmable aid (CPA).

A similar analysis can be made for the other PCs as well. However, some PCs are difficult to interpret as they are less clearly correlated with the original variables (see annexe in the notebook of part I).

To make up for this limitation in our ability to interpret some of the PCs, we can add other ways of finding out what the different clusters stand for.

One is to identify the typical representative of each cluster: since we have the original data for each donor, we can learn about a cluster by looking at the original data of its typical representative.
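The post does not spell out how the representative is chosen; one common definition, sketched below with a toy example, is the member closest to its cluster's mean in the clustering space:

```python
import numpy as np

def typical_representatives(X, labels, names):
    """For each cluster, the member closest (Euclidean) to the cluster mean."""
    reps = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        centre = X[idx].mean(axis=0)
        closest = idx[np.argmin(np.linalg.norm(X[idx] - centre, axis=1))]
        reps[int(c)] = names[closest]
    return reps

# toy data: two clusters of three points each
X = np.array([[0.0, 0], [0.3, 0], [0.4, 0], [5.0, 5], [5.1, 5], [5.5, 5]])
labels = np.array([0, 0, 0, 1, 1, 1])
reps = typical_representatives(X, labels, ["A", "B", "C", "D", "E", "F"])
```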

Here are the typical representatives for the intra-DAC clusters:

The limitation of this method is that the typical representative might not be a good proxy for more heterogeneous clusters. For instance, it can be doubted that South Korea is as representative of Portugal as Sweden is of Norway.

Finally, we can show strip plots that divide donors into their clusters and relate them to the original variables like in the following plot.

Figure 9: Strip plot with clusters and original variables

Based on the original variables from OECD stats presented in part I.

This plot is interactive. You can explore the variables by using the dropdown menu. The dropdown provides only a reduced list of the 27 variables. Hover over the dots to see the donors’ names.

Some variables, such as those related to geographic regions (e.g. Europe or Sub-Saharan Africa), clearly show the geographic focus of each cluster. Other variables like budget support and pooled funding, project-type interventions, or the share of multilateral aid also show visible patterns across the different clusters. However, not all variables tell us a clear story about the clusters we have found.

Conclusion and ideas for improvement

This concludes the two-part project about using quantitative methods for developing a typology of aid donors. The final clustering might not yet be a robust typology. After all, we would have to do more to relate this project to a specific research question. However, the two parts of this post illustrate a process that can be applied to develop such a typology, or at least confirm or refute commonly used categorisations of aid donors. More iterations of this process would be necessary to arrive at a more advanced solution. It is also important to recognize that there is no guarantee of finding a perfectly clean clustering structure, as we are working with a real-world data set.

Among the various steps in the process, I would highlight three aspects that deserve particular attention as they influence the clustering result the most:

First, the selection and transformation of variables can have a strong impact on the final clustering result. Variable selection presupposes a rigorous literature review and the formulation of a specific research question. This is the most time-consuming part of the work. Part I already contains a long section about data exploration; a more exhaustive implementation is beyond the scope of a blog post. Investing more work in the preparation of the data can improve the results of the PCA by capturing more of the explained variance in fewer PCs and making the clustering result more interpretable.

Second, the analysis of the complete data set with all available donors struggled with the heterogeneity of donors. The initial data set is composed of DAC members, "Participants" and other donors reporting data to the DAC. It was difficult to integrate the largest number of donors possible without getting a result that simply opposes "core" DAC donors versus the rest. That is why the intra-DAC clustering yielded more interesting results. This boils down to the challenge of identifying outliers without simply considering all non-DAC donors as outliers.

Third, the choice of the clustering method and the tuning of relevant parameters obviously influences the clustering result. As an additional step, sensitivity analysis with regard to alternative clustering solutions could improve the quality of the result. In this post, we have touched upon this by omitting France and Germany from the intra-DAC clustering to see if we get a more stable result. This could be done more systematically and would help to see what groups of donors persist across changes applied through different clustering methods and parameters.
