The Barnes-Hut Approximation

Efficient computation of N-body forces

Computers can serve as exciting tools for discovery, with which we can model and explore complex phenomena. For example, to test theories about the formation of the universe, we can perform simulations to predict how galaxies evolve. To do this, we could gather the estimated mass and location of stars and then model their gravitational interactions over time.

Another avenue for discovery is to visualize complex information to reveal structure and patterns. Consider this network diagram, showing connections between people in a social network. We can use such diagrams to examine community groups and identify people who bridge between them.

Though they may seem quite different at first, these two examples share a common need: they require computing forces that arise from pairwise interactions among a set of points, often referred to as an N-body problem. In the case of astronomical simulation, we seek to model the gravitational forces among stars. In the case of network visualization, we compute the layout using a similar physical simulation: nodes in the network act as charged particles that repel each other, while links act as springs that pull related nodes together.

To get a sense of how this force-directed layout works, drag the nodes or use the slider to adjust the force strength. Negative values indicate repulsive forces, while positive values indicate attractive forces.

Force Strength -30.00

A straightforward approach to computing N-body forces is to consider all pairs of individual points and add up the contributions of each interaction. This naïve scheme has quadratic complexity: as the number of points n increases, the running time grows proportionally to n², quickly leading to intractably long calculations. How might we do better?

The Barnes-Hut Approximation

To accelerate computation and make large-scale simulations possible, the astronomers Josh Barnes and Piet Hut devised a clever scheme. The key idea is to approximate long-range forces by replacing a group of distant points with their center of mass. In exchange for a small amount of error, this scheme significantly speeds up calculation, with complexity n log n rather than n².

Central to this approximation is a spatial index: a “map” of space that helps us model groups of points as a single center of mass. In two dimensions, we can use a quadtree data structure, which recursively subdivides square regions of space into four equal-sized quadrants. (In three dimensions, one can use an analogous octree that divides a cubic volume into eight sub-cubes.)

The Barnes-Hut approximation involves three steps:

Construct the spatial index (e.g., quadtree)
Calculate centers of mass
Estimate forces

Let’s explore each step in turn. We will assume we are computing repulsive forces for the purposes of network layout. This setup is akin to modeling anti-gravity or electric forces with similarly-charged particles. While we will use the term “center of mass”, this could readily be replaced with “center of charge”.

As you read through, click the action links to update the diagram!

Step 1: Construct the Quadtree

We begin with a set of two-dimensional points as input. When we insert the first point into the quadtree, it is added to the top-level root cell of the tree.

Next we insert another point, which requires expanding the tree by subdiving the space. Upon each subsequent insertion, more fine-grained cells may be added until all points reside in their own cell.

Advance the slider to add each point and produce the full quadtree.

Inserted Points 0

Step 2: Calculate Centers of Mass

After quadtree construction, we calculate centers of mass for each cell of the tree. The center of mass of a quadtree cell is simply the weighted average of the centers of its four child cells.

We visit the leaf node cells first and then visit subsequent parent cells, merging data as we pass upwards through the tree. Once the traversal completes, each cell has been updated with the position and strength of its center of mass.

Step 3: Estimate N-Body Forces

Now we are ready to estimate forces!

To measure forces at a given point, let's add a "probe" to our diagram (). The purple line extending from the probe indicates the direction and magnitude of the total force at that location. (To promote visibility, the purple line is three times longer than the actual pixel distance the probe would be moved in a single timestep of the force simulation.) The dotted lines extending to the probe represent the force components exerted by individual points.

Move the probe (click or drag in the visualization) to explore the force field.

Ignoring the quadtree, we can naïvely calculate forces by summing the contributions of all individual points. Of course, we would instead like to use the quadtree to accelerate calculation and approximate long-range forces. Rather than compute interactions among individual points, we can compute interactions with centers of mass, using smaller quadtree cells for nearer points and larger cells for more distant points.

At this point we’ve skipped a critical detail: what constitutes “long-range” versus “short-range” forces? We consider both the distance to the center of a quadtree cell and that cell’s width. If the ratio width / distance falls below a chosen threshold (a parameter named theta), we treat the quadtree cell as a source of long-range forces and use its center of mass. Otherwise, we will recursively visit the child cells in the quadtree.

When theta = 1, a quadtree cell’s center of mass will be used — and its internal points ignored — if the distance from the sample point to the cell’s center is greater than or equal to the cell’s width.

Adjust the theta parameter to view its effect on force estimation. How does the number of considered points change based on the probe location and theta? How does the direction and magnitude of the total force vary with theta?

Theta 0.00

We can now perform force estimation for each individual point, using the Barnes-Hut approximation to limit the total number of comparisons!

Performance Analysis

To assess the performance of the Barnes-Hut approximation, let’s look at both the running time and accuracy of force estimation. We will compare naïve (n²) calculation to different settings of the theta parameter.

We will take measurements using different point sets, ranging from 500 to 10,000 points. For each point count, we average the results from 50 separate runs of force estimation, each time using a different set of points placed at uniformly random coordinates within a 900 x 500 pixel rectangle.

The running time results confirm that the Barnes-Hut approximation can significantly speed-up computation. As expected, the naïve approach exhibits a quadratic relationship, whereas increasing the theta parameter leads to faster calculations. A low setting of theta=0.5 does not fare better than the naïve approach until processing about 6,000 points. Until that point, the overhead of quadtree construction and center of mass calculation outstrips any gains in force estimation. For theta=1 and theta=1.5, however, we see a significant improvement in running time, with similar performance for each.

To evaluate approximation error, we measure the average vector distance between the results of the naïve scheme and Barnes-Hut. In the context of a force-directed graph layout, this error represents the difference (in pixels) between node positions after applying the naïve and approximate methods.

Looking at the error results, we first see that the average error is relatively small: only ~5% of a single pixel in difference! However, we should take care in interpreting these results, as we use the average error per point and the maximum error may be substantially higher. While theta=1 and theta=1.5 exhibit similar running times, here we see notably higher error rates for theta=1.5 versus theta=1 and theta=0.5.

These results suggest that a good default value for theta — with low running time and low approximation error — is around 1.0. Indeed, in practice it is common to see default settings slightly below 1. In visualization applications, where errors on the order of a few pixels are not a problem, even higher theta values may be used without issue.

Conclusion

The Barnes-Hut approximation has had a major impact on both physical simulation and network visualization, enabling n-body calculations to scale to much larger data sets than naïve force calculation permits.

Returning to our initial network diagram, we can use Barnes-Hut to efficiently compute repulsive forces at each timestep. For each animation frame, we perform the approximation anew, creating a new quadtree, accumulating centers of mass, and (approximately) estimating forces.

Want to learn more or apply Barnes-Hut in your own work?

The popular D3.js library by Mike Bostock includes Barnes-Hut for performing force-directed layout as part of the d3-force module. The examples in this article are based on the D3 implementation.
For more details, including pseudo-code and performance analysis, Prof. James Demmel of UC Berkeley has online lecture notes on the Barnes-Hut approximation. These notes provided my first introduction to the technique!
While the error of the Barnes-Hut approximation is acceptable in a variety of applications, sometimes greater precision is needed. In such cases, more complicated schemes such as the Fast Multipole Method can be used in lieu of Barnes-Hut.

This article was created using Idyll, with visualizations powered by D3 and Vega. The source code is available on GitHub.