Genetic Algorithms

Jim Manzi continues to write interesting articles over at NRO and its Corner blog. His take on science is pretty interesting, and is a breath of fresh air when the Corner (too often) seems to be dominated by all things Catholic.

However, I would argue with him regarding Genetic Algorithms. My opinion of them is much lower than his.

So-called Genetic Algorithms (GAs) are computer-software implementations of the same kind of mathematics that takes place in the biological process of evolution. [...] Today, GAs are widely deployed in artificial-intelligence computer programming to solve such prosaic engineering problems as optimally scheduling trucks on a delivery route or identifying the best combination of process-control settings to get maximum output from a factory.

In a surprisingly technical detail, he describes a method for toggling 100 switches at an industrial plant to maximize output:


Our goal, then, is to find the “genome” that will lead the plant to run at maximum output. The algorithm creates an initial bunch of guesses — genomes — by randomly generating, say, 1,000 strings of 100 zeros and ones. We then do 1,000 sequential production runs at the factory, by setting the switches in the plant to the combination of settings indicated by each genome and measuring the output of the plant for each; this measured output is termed the “fitness value.” (Typically, in fact, we construct a software-based simulation of the factory that allows us to run such tests more rapidly.) Next, the program selects the 500 of the 1,000 organisms that have the lowest fitness values and eliminates them. This is the feedback measurement in our algorithm — and it is directly analogous to the competition for survival of biological entities.
Next comes the algorithmic process for generating new guesses, which has two major components: crossover and mutation. These components are directly modeled on the biological process of reproduction. First, the 500 surviving organisms are randomly paired off into 250 pairs of mates. The GA then proceeds through these pairs of organisms one at a time. For each pair it flips a coin. If the coin comes up heads, then organism A “reproduces” with organism B by simply creating one additional copy of each; this is called direct replication. If it comes up tails, then organism A reproduces with organism B via “crossover”: The program selects a random “crossover point,” say at the 34th of the 100 positions, and then creates one offspring that has the string of zeroes and ones from organism A up to the crossover point and those from organism B after the crossover point, and an additional offspring that has the string of zeroes and ones from organism B up to the crossover point and those from organism A after the crossover point. The 500 resulting offspring are added to the population of 500 surviving parents to create a new population of 1,000 organisms. Finally, a soupçon of mutation is added by randomly flipping roughly every 10,000th digit from zero to one or vice versa.
The new generation is now complete. Fitness is evaluated for each, the bottom 500 are eliminated, and the surviving 500 reproduce through the same process of direct replication, crossover, and mutation to create the subsequent generation. This cycle is repeated over and over again through many generations. The average fitness value of the population moves upward through these iterations, and the algorithm, in fits and starts, closes in on the best solution..

Whew! Indeed, this is the sort of thing that GA practitioners do.

I think it's nonsense. There is no guarantee you will find the optimum setting this way. Further, there is absolutely no reason to believe a priori that this procedure will work any better than a much simpler non-GA greedy approach would work:

1) Start with an initial random positioning of all 100 switches.
2) Put your finger on the first switch.
3) Toggle the switch.
4) If production decreases, toggle the switch back.
5) Put your finger on the next switch (loop back to the first if you are at the 100th switch), and go back to step 3.
6) Exit when you can't increase production by toggling a switch.

It's clear that this algorithm increases production and must terminate. But like the GA algorithm described, it doesn't necessarily find the best setting. You could run it several times with different random initial states, but even then you are just hoping that you are not getting stuck short of the best setting. You could try flipping 2 switches at a time, or 3, or N. But without more information on the function you are minimizing, it's pretty much hopeless. Only checking all 2^100 possibilities guarantees you will find the optimum setting.

GA has always seemed to me to be hocus pocus. When more is known about the function, you are likely better off with standard descent methods. When nothing is known, GA is likely no better than any other way of repeatedly guessing.

2 comments:

Anonymous said...

Thanks, that's a great comment / issue.

A couple of points:

1. No search algorithm (in any practical application) gets the optimum answer, just a "very good" answer. If you know the optimum, findable by exhaustion of possible solutions, you usually don't need any fancy algorithm, you just try all the possibilities. This is why I was very careful to say "tends toward" the best possible answer (even this is, of course, a simplification).

2. GAs are general a terrible method, unless nothing else works. Your method (as you indicate) will work wel if the interation effects are not large vs. the primary effects. So, GAs are only relevant in such high interation effects environments. Second, if you have any useful a priori knowledge of the system that lets you set up an LP, QP or similar algorithm, this will always get you to a better solution faster (as you say, a greedier algorithm always works better when it's "informed greed".) It is empirically demonstrable that GAs will outperform "random guessing" on one hand or "poorly informed" greedy algorithms for many types of high-interaction optimization problems with no known structure. A classic case is setting the weights for a neural network.

Best,
Jim Manzi

SteveBrooklineMA said...

Hi Jim-

It's very nice of you to reply! I didn't know the Corner had track-back.

I agree that only in very special cases can you be sure to get to ``the'' global optimum, using any algorithm. For a system with infinitely many states, there might not even be a global optimum.

I think, as you seem to, that GAs should be thought of as a tool to pull out when you are trying to crack a tough nut. There are others you could try as well. Perhaps experience would help guide you as to which tool is most likely to get the job done.

GAs are often sold as being effective because they act in a way that is similar to evolutionary processes that are found in nature. I wonder those evolutionary processes are effective because they are often massively parallel.

Again, I really enjoy your work over at National Review. Thanks for writing,

-Steve