Recombination and other samplers

Redistricting is the problem of partitioning a set of geographical units into a fixed number of districts, subject to a list of often-vague rules and priorities. In recent years, the use of randomized methods to sample from the vast space of districting plans has been gaining traction in courts of law for identifying partisan gerrymanders, and it was used as an analytical tool by several legislatures and independent commissions.

The mathematics of sampling graph partitions is interesting in its own right, and there has been enormous progress in recent years. Most of the new methods use an idea that comes from this research group: that spanning trees provide a computationally efficient way to partition a graph, because deleting a single edge cuts the graph into two pieces. We introduced the idea in 2018 and released a supporting software package called GerryChain the same year. Now the spanning tree partition step can be found in several different kinds of samplers, including those introduced by Jonathan Mattingly’s group at Duke and Kosuke Imai’s group at Harvard.

Here are a few key papers from the lab on samping graph partitions…

Recombination

This survey paper by DeFord-Duchin-Solomon introduces recombination (or “ReCom”), a big-step Markov chain method with a spanning-tree step for bipartitioning. We set up redistricting as a graph partition problem and introduce a new family of Markov chains on the space of graph partitions, all of which enable approximate sampling from the spanning tree distribution on partitions, which we argue is a natural target for the redistricting problem. The main point of comparison is the “Flip” chain that was in common use before this time, in which the proposal randomly changes the assignment label of a single node at a time. We present evidence that ReCom mixes efficiently, especially in contrast to the slow-mixing Flip, and provide experiments that demonstrate its qualitative behavior. We demonstrate the advantages of ReCom on real-world data and explain both the challenges of the Markov chain approach and the analytical tools that it enables. We close with a short case study involving the Virginia House of Delegates — paper

Reversible ReCom

The next paper by Cannon-Duchin-Randall-Rule introduces an updated Recombination chain, addressing the fact that the simpler recombination methods from the original survey had only an approximate description of their steady state. In this paper, we modify ReCom slightly to give it a property called reversibility, resulting in a new Markov chain, nicknamed RevReCom. This new chain converges to the simple, natural distribution that ReCom was originally designed to approximate: a plan’s stationary probability is proportional to the product of the number of spanning trees of each district. This spanning tree score is a measure of district “compactness” (or shape) that is also aligned with notions of community structure from network science. After deriving the steady state formally, we present diagnostic evidence that the convergence is efficient enough for the method to be practically useful. In addition to the primary application of benchmarking of redistricting plans (i.e., describing a normal range for statistics), this chain can also be used to validate other methods that target the spanning tree distribution — paper

Comparing Samplers