New Algorithm Breaks Speed Limit for Solving Linear Equations
Introduction
Grade school math students are likely familiar with teachers admonishing them not to just guess the answer to a problem. But a new proof establishes that, in fact, the right kind of guessing is sometimes the best way to solve systems of linear equations, one of the bedrock calculations in math.
As a result, the proof establishes the first method capable of surpassing what had previously been a hard limit on just how quickly some of these types of problems can be solved.
“This is one of the most fundamental problems in computing,” said Mark Giesbrecht of the University of Waterloo. “Now we have a proof that we can go faster.”
The new method, by Richard Peng and Santosh Vempala of the Georgia Institute of Technology, was posted online in July and presented in January at the annual ACM-SIAM Symposium on Discrete Algorithms, where it won the best-paper award.
Linear systems involve two or more equations with variables that specify the different ways things relate to each other. They’re “linear” because the variables appear only to the first power, and the graphs of their solutions form planes.
A common example of a linear system — also likely familiar to math students — involves a barnyard filled with chickens and pigs. If you only know there are 10 heads and 30 feet, how many chickens are there, and how many pigs? As algebra students learn, there’s a set procedure for figuring it out: Write down two algebraic equations and solve them together.
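For that simpler case, the two equations (with c for chickens and p for pigs) are:

c + p = 10 heads
2c + 4p = 30 feet

Substituting c = 10 – p into the second equation gives 2(10 – p) + 4p = 30, so p = 5 and c = 5: five pigs and five chickens.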
But linear systems can do more than just count chickens and pigs. They crop up in many practical settings, where building a sturdier bridge or a stealthier aircraft can involve solving systems with millions of interdependent linear equations. More fundamentally, linear systems feature in many basic optimization problems in computer science that involve finding the best values for a set of variables within a system of constraints. If we can solve linear systems faster, then we can solve those problems faster too.
“Linear systems are the workhorse of modern computation,” said Vempala.
The new proof finds a quicker way of solving a large class of linear systems by sidestepping one of the main techniques typically used in the process. That technique, called matrix multiplication, previously set a hard speed limit on just how quickly linear systems could be solved. It still features in the work, but in a complementary role. The authors couple it with a new approach that, in essence, is a form of trained divination.
“You can guess your way to solutions,” said Peng. And no teacher would be mad at you for it.
Barnyard Math
To get a feel for linear systems and how you might solve them, return to the barnyard, but imagine it’s more of a menagerie now: chickens, 1-horned rhinos and 2-horned goats. You do a quick count and determine there are 12 heads, 38 feet and 10 horns. Can you figure out how many there are of each animal?
To proceed, assign a variable to each animal (c for chickens, r for rhinos, g for goats) and write an equation for each attribute. The numbers, or coefficients, in front of each variable reflect the quantity of that attribute possessed by each animal.
c + r + g = 12 heads
2c + 4r + 4g = 38 feet
0c + 1r + 2g = 10 horns
Now you have three equations and three unknowns.
One way to solve them is to manipulate one equation and define one variable in terms of the other two. For example, 0c + 1r + 2g = 10 turns into r = 10 – 2g. Substitute that value for r in the other two equations and continue like this until you’ve defined all variables in terms of just one variable, which you can then solve for exactly. Then you can repeat the process, leveraging the one variable you’ve solved for to solve for the next.
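Carried out for the barnyard numbers, the chain of substitutions runs as follows. With r = 10 – 2g, the heads equation becomes c + (10 – 2g) + g = 12, so c = 2 + g. The feet equation then becomes 2(2 + g) + 4(10 – 2g) + 4g = 38, which simplifies to 44 – 2g = 38, so g = 3. Working backward, r = 10 – 2(3) = 4 and c = 2 + 3 = 5: five chickens, four rhinos and three goats, which indeed account for 12 heads, 38 feet and 10 horns.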
But another, more sophisticated, way to proceed is to create a matrix whose entries are the coefficients of the equations. The three equations turn into this matrix.
$latex \left[\begin{array}{ccc}
1 & 1 & 1 \\
2 & 4 & 4 \\
0 & 1 & 2
\end{array}\right]$
Next, we represent the unknown number of chickens, rhinos and goats with another matrix.
$latex \left[\begin{array}{c}
c \\
r \\
g
\end{array}\right]$
Finally, we represent the observed number of heads, feet and horns with a third matrix.
$latex \left[\begin{array}{c}
12 \\
38 \\
10
\end{array}\right]$
We can combine these three matrices into a single linear system, where the first matrix multiplied by the unknown values of the second matrix equals the third matrix — at which point we can use linear algebra to solve for the second matrix.
$latex \left[\begin{array}{ccc}
1 & 1 & 1 \\
2 & 4 & 4 \\
0 & 1 & 2
\end{array}\right]$ × $latex \left[\begin{array}{c}
c \\
r \\
g
\end{array}\right]$ = $latex \left[\begin{array}{c}
12 \\
38 \\
10
\end{array}\right]$
Whether you manipulate the equations or take the matrix route, you’ll end up performing the same total number of computational steps to solve the problem. That number is the cube of the number of variables in the system ($latex n^3$). In this case we have three variables, so it takes $latex 3^3$, or 27, computational steps. If we had four animals and four equations, it would take $latex 4^3$, or 64, steps to solve them.
Over the last 50 years researchers have found ways to perform this procedure more efficiently. Often there are shortcuts they can employ — ways of reusing or combining operations — that let them solve linear systems in fewer steps.
In 1969 Volker Strassen devised an algorithm for performing matrix multiplication in only $latex n^{2.81}$ steps. Since then mathematicians and computer scientists have jockeyed to lower the exponent further. The most recent advance, made last October by Virginia Vassilevska Williams of the Massachusetts Institute of Technology and Josh Alman, a postdoctoral researcher at Harvard University, proved that it’s possible to perform matrix multiplication in $latex n^{2.37286}$ steps, an improvement in the exponent of 0.00001 over the previous best mark.
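To give a flavor of where that first speedup comes from: Strassen noticed that the product of two 2-by-2 blocks can be assembled from seven smaller multiplications rather than the obvious eight. Here is a minimal recursive sketch in Python, assuming for simplicity that the matrices are square with power-of-two size (a simplification of ours, not a requirement of the real algorithm):

import numpy as np

def strassen(A, B):
    """Multiply two square matrices whose size is a power of two, Strassen-style."""
    n = A.shape[0]
    if n == 1:
        return A * B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Seven block products instead of the naive eight.
    M1 = strassen(A11 + A22, B11 + B22)
    M2 = strassen(A21 + A22, B11)
    M3 = strassen(A11, B12 - B22)
    M4 = strassen(A22, B21 - B11)
    M5 = strassen(A11 + A12, B22)
    M6 = strassen(A21 - A11, B11 + B12)
    M7 = strassen(A12 - A22, B21 + B22)
    # Reassemble the four blocks of the answer from those seven products.
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

A = np.arange(16.0).reshape(4, 4)
B = np.eye(4)
print(np.allclose(strassen(A, B), A @ B))  # True

Because each halving of the problem spawns seven subproblems instead of eight, the total work scales as roughly $latex n^{\log_2 7} \approx n^{2.81}$ rather than $latex n^3$.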
The upshot of all this is that any linear system you want to solve can be reduced to a question about matrix multiplication, and as of now, matrix multiplication can at least theoretically be performed in $latex n^{2.37286}$ steps.
But various technical characteristics suggest it should be possible to solve linear systems faster than this — in potentially $latex n^2$ steps. We use matrix multiplication because it’s been the best available tool, but that doesn’t mean there isn’t an even better tool waiting to be discovered.
“There’s no reason for this problem of solving linear systems to be dependent on improvements in matrix multiplication,” said Vempala.
Guessing Solutions
To understand the new and improved tool, you need to have in mind another established method of solving linear systems. It’s an intuitive one, the one you might reach for when first confronted with a flock of chickens, a crash of rhinos and a trip of goats all mixed together: Guess numbers for each, plug them into the equations, see how far off you are, and guess again.
This “iterative approach” is one that engineers and scientists often employ. It works well for many practical problems because experts typically don’t guess blindly, which cuts down on the number of iterated guesses they need to make before finding the solution.
“For real-world scientific computing problems, humans have very good intuition for what the answers should be,” said Peng.
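The article doesn’t name a particular scheme, but one classic example of this guess-and-refine style is Jacobi iteration, sketched minimally here in Python. (The example system is our own, chosen to be strictly diagonally dominant, a standard condition under which this particular method homes in on the answer.)

import numpy as np

def jacobi(A, b, iterations=100):
    """Guess a solution, measure how far off it is, and refine, over and over."""
    D = np.diag(A)             # the diagonal coefficients of A
    R = A - np.diagflat(D)     # everything off the diagonal
    x = np.zeros_like(b)       # initial guess: all zeros
    for _ in range(iterations):
        # Plug the current guess into the "other" variables of each equation,
        # then re-solve each equation for its own variable.
        x = (b - R @ x) / D
    return x

# A small, strictly diagonally dominant system whose exact answer is (1, 1, 1).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
b = np.array([6.0, 8.0, 9.0])
print(jacobi(A, b))  # converges to [1. 1. 1.]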
Iterative methods are useful in specific instances where intuition can provide some support. They’re also useful more generally whenever the linear system you’re trying to solve contains a large number of coefficients that are zero.
This feature is present — and helpful — in the barnyard example, where the easiest attribute to solve for is horns. Why? Because chickens don’t have horns, which zeroes out the chicken term and reduces a problem involving three animals to a problem really involving two. And once you’ve gotten horns out of the way you can use that information to quickly solve for feet and heads.
In more complicated linear systems, this type of relationship, in which not all attributes pertain to all variables, can be pervasive. You might have millions of variables and millions of equations, but each equation might only involve a small number of the overall variables. These types of linear systems are called “sparse,” reflecting the way most variables appear with a coefficient of zero in most of the equations. This is a situation that comes up often in real-world linear systems. And it’s one in which iterative methods can beat matrix multiplication.
“It only works when your matrix is sparse enough,” said Williams.
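To make “sparse” concrete, here is a small sketch using SciPy’s sparse-matrix tools (again, just an illustration of the general setting, not anything from the new paper). Only the nonzero coefficients are stored, and an iterative solver never needs to touch the rest:

import numpy as np
from scipy.sparse import identity, random as sparse_random
from scipy.sparse.linalg import gmres

n = 10_000
# Almost every coefficient is zero: about 0.05% of the entries are nonzero.
# The boosted diagonal keeps this toy system well behaved.
A = 10 * identity(n, format="csr") + sparse_random(n, n, density=5e-4, format="csr")
b = np.ones(n)

print(A.nnz, "nonzero coefficients out of", n * n)
x, info = gmres(A, b)  # an iterative Krylov-style solver; info == 0 means it converged
print("converged" if info == 0 else "did not converge")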
But prior to this new work no one had managed to prove that iterative methods are always faster than matrix multiplication for all sparse linear systems.
Coordinated Randomness
Peng and Vempala’s new technique employs an enhanced version of the iterated guessing strategy: Instead of making just a single guess, their algorithm makes many guesses in parallel. This approach speeds up the search, just as you’ll find a gem in a forest faster if you have many people looking at once.
“The parallelism is where the magic happens,” said Giesbrecht.
It may seem obvious that marshaling multiple guesses at once is useful, but making the strategy work isn’t so straightforward. The effectiveness of the new algorithm relies in large part on being smart about how to make the initial guesses that seed the iterative process and on finding clever ways to combine the fruits of the parallel guesses into a single final answer.
To return to the barnyard example, the algorithm might make three initial guesses, where each guess is a 3-by-1 matrix that specifies a number of chickens, rhinos and goats. The algorithm would observe how far off each guess was, then make improved guesses, keeping the parallel threads of guessing going.
A key to the algorithm’s ultimate success is that it makes the three initial guesses at random. Randomness might not seem like a good basis for guessing, but as a general-purpose method it has its advantages, especially when you’re working with huge problems. Namely, randomness ensures you don’t accidentally end up biasing your search toward one part of the problem, potentially neglecting the space where the actual solution lies.
“I need to make sure all my guesses are sufficiently random so that they cover all possible combinations,” said Peng. “It’s this very terrible way of making guesses which ends up being the preferred method as the problem gets very large.”
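As a cartoon of what “many random guesses refined in parallel” can look like (emphatically a toy of ours, not Peng and Vempala’s algorithm), here is the barnyard system attacked from several random starting guesses at once, all nudged toward smaller error in lockstep:

import numpy as np

rng = np.random.default_rng(0)

A = np.array([[1.0, 1.0, 1.0],
              [2.0, 4.0, 4.0],
              [0.0, 1.0, 2.0]])
b = np.array([12.0, 38.0, 10.0])

# Three random starting guesses, stored side by side as the columns of one matrix.
X = rng.uniform(0, 12, size=(3, 3))

# Refine every guess at once: compute how far off each one is (its residual),
# then nudge each guess in a direction that shrinks its own error.
# This is plain gradient-style refinement, chosen purely for illustration.
step = 0.04
for _ in range(8000):
    residuals = b[:, None] - A @ X        # one residual column per guess
    X = X + step * (A.T @ residuals)      # update all guesses in one matrix operation

print(X.round(2))  # every column settles on (5, 4, 3): chickens, rhinos, goats

The point of the cartoon is only the bookkeeping: the guesses live together in one matrix, and a single matrix operation updates all of them at once.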
Much of the difficult technical work in Peng and Vempala’s paper involves proving that the different strands of random guesses work together, and that collectively they encompass the particular guess that is in fact the answer to the problem.
“There is coordinated randomness,” said Vempala.
That means the search doesn’t account for just the exact guesses themselves; it also covers all the potential guesses that lie between them. It’s similar in spirit to how two people searching in a forest don’t just search the ground they walk on; they also cover the full line of sight between them.
“Anything in between two [guesses] is also covered,” said Vempala.
This search feature ensures that the algorithm is going to encounter the solution somewhere. But it doesn’t by itself identify what the solution actually is. To do that — to actually put their hands on the solution — Peng and Vempala have to prove something else.
The algorithm keeps track of its random guesses as entries in a matrix. Finding the solution among the entries in the matrix becomes a question of matrix multiplication, which of course is the roadblock they’d set out to circumvent. But here again they take advantage of the randomness that they used to seed the entries in the matrix.
Because the entries in the matrix are random and coordinated with one another, the matrix itself ends up with certain symmetries. Those symmetries enable computational shortcuts. Just like with any highly symmetric object, you only need to know what one part of it looks like in order to deduce the whole.
As a result, Peng and Vempala’s algorithm can find the solution within the matrix faster than it could in a matrix with the same number of entries, but none of the useful symmetries. The symmetries of the matrix convey another important benefit as well: They help ensure that the guesses never grow so big that they become unwieldy from the perspective of algorithmic efficiency.
“We had to control how big a number shows up as we do this guessing and coordination,” said Peng.
Peng and Vempala prove that their algorithm can solve any sparse linear system in $latex n^{2.332}$ steps. This beats the exponent for the best algorithm for matrix multiplication ($latex n^{2.37286}$) by about four-hundredths. Edging out matrix multiplication won’t matter for practical applications anytime soon, but as a proof of concept, this slight improvement is a chasm: It shows there’s an entirely better way of solving linear systems.
“Philosophically we didn’t know before if you can go faster than matrix multiplication,” said Vempala.
Now we do.