© Copyright JASSS

  JASSS
logo --------

The Uses of Genetic Programming in Social Simulation: A Review of Five Books

Genetic Programming: On the Programming of Computers by Natural Selection
John R. Koza
Cambridge, MA: The M.I.T. Press
1992
Cloth: ISBN 0-262-11170-5
Order this book

Genetic Programming II: Automatic Discovery of Reusable Programs
John R. Koza
Cambridge, MA: The M.I.T. Press, A Bradford Book
1994
Cloth: ISBN 0-262-11189-6
Order this book

Advances in Genetic Programming
Edited by Kenneth E. Kinnear Jr.
Cambridge, MA: The M.I.T. Press, A Bradford Book
1994
Cloth: ISBN 0-262-11188-8
Order this book

Advance in Genetic Programming Volume 2
Edited by Peter J. Angeline and Kenneth E. Kinnear Jr.
Cambridge, MA: The M.I.T. Press, A Bradford Book
1996
Cloth: ISBN 0-262-01158-1
Order this book

Genetic Programming and Data Structures
William B. Langdon
Dordrecht: Kluwer Academic Publishers
1998
Cloth: ISBN 0-792-38135-1
Order this book

--------

Reviewed by
Bruce Edmonds
Centre for Policy Modelling, Manchester Metropolitan University.

* 1. Introduction

Genetic Programming (GP) is a technique which permits automatic search for complex solutions using a computer. It goes beyond previous techniques in that it discovers the structure of those solutions. Previously, if one were trying to find an equation to fit a set of data, one would have had to provide the form of the equation (for example a fourth degree polynomial) and the computer could then find the appropriate parameters. By contrast, GP can experiment with a whole range of different functional forms, building equations from a menu of functions, symbols and arithmetic operations. Thus GP can be seen as an essentially creative technique. It is good at finding novel solutions where not much is known about the form of the solution a priori.

To be more precise, it is a powerful method which allows the automatic induction of a wide range of abstract entities given a method for evaluating these entities. It works in a way that is analogous to the evolution of genomes in living organisms: a population of 'genes' is repeatedly mutated, sexually recombined and selected for desirable features. GP does not require the programmer to specify anything about the structure of the solutions, merely some criteria for measuring the performance of each solution. The algorithm itself finds structures that produce the best performance given these criteria. Many types of entities can be 'evolved' using GP. These include computer programs, functions, networks, mathematical expressions, logical statements and circuits. For example, GP might be used to find the symbolic solution for a set of equations or the structure of an electronic circuit for filtering out noise. In certain formal domains it has (marginally) outperformed all human efforts.

As with other techniques and methodologies developed in the fields of computer science and artificial intelligence, GP is now being used in social simulation. It follows a long line of others techniques including stochastic programming, object-orientated programming, declarative programming, cellular automata and neural networks. Like these, GP has the potential to enrich social simulation, by enabling researchers to model aspects of social behaviour in a way that was not previously possible. It also provides another metaphor for us to wield in our search for an understanding of social processes. However, if these metaphors are used in an unthinking way they can do more to obscure than enlighten.

I will provide a brief explanation of the workings of the basic technique in section 2, followed by a classification the ways in which GP may be applied to the field of social simulation (section 3). I will then review five books about GP from the viewpoint of the social modeller (section 4) before concluding in section 5.

* 2. A Brief Introduction to Genetic Programming

GP is an extension of the Genetic Algorithm which was invented by John Holland (1975). Although the idea of evolving programs was first suggested by Forsyth (1981) and Cramer (1985) among others, it was proved, promoted and developed into a practical tool by John Koza. Genetic Programming is one technique (albeit an important one) amongst a whole range of possible evolutionary algorithms. The whole field is now called Evolutionary Computation.

In common with many search techniques, the GP algorithm has three basic components.

In GP, each candidate solution is stored in the form of a tree structure. Two examples of these trees are shown in Figure 1. The first of these might be interpreted as the function

equation 1

and the second as the logical expression (agent-4 saidYes) OR (agent-3 DidBetterThan me). Initially, the population of candidate solutions is generated randomly from a specification of the possible nodes and terminals which can be used to construct a legal tree.


Figure 1

Figure 1. Two examples of genes that might occur in GP

The GP algorithm proceeds by acting upon the members of this population over a number of generations. Each generation involves three steps.

  1. Each of the candidate solutions is evaluated by trying it out on the target problem to see how good it is. Usually each solution is given a fitness, which is a numerical measure of its quality.
  2. An interim collection of solutions is generated by selecting from the population in a way which takes account of solution quality. Typically, solutions are selected with a probability which is proportional to their fitness, so that better solution are more likely to survive in the population.
  3. Finally, members of this interim population are acted upon to introduce some new solutions which are variation of the old ones. Usually this action involves a mixture of three genetiv operations called 'tree crossover', 'propagation' and 'mutation'. The interim population then replaces some or all members of the current population to form the new population.

These three steps are repeatedly applied until either an acceptable solution is found or some other termination condition is fulfilled. This process is illustrated in Figure 2. It is the combination of fitness dependent selection and blind introduction of variation into the population of solutions that gives the algorithm its evolutionary flavour. Better solutions are literally evolved from the existing population.


Figure 2

Figure 2. An illustration of the GP algorithm

There are several possible ways of evaluating the candidate solutions. The most common of these is to interpret each solution as a program and measure its success rate when it is used in the intended context - this success rate is then used as the fitness of the solution. For example, if a GP was being used to find good procedures for scheduling work, then the evaluation might involve seeing how efficient it was at taking scheduling decisions in some randomly generated test situations. There are other methods available apart from using an explicit fitness measure. For example, tournament selection can be used. In this approach, several candidates solutions are chosen at random and 'played' against each other. The winner of the tournament is then the one that is selected.

The principal way of introducing variety into the solution population is tree crossover. In this operation, two 'parent' solutions are chosen and a node in each tree is randomly selected. Then the subtrees rooted at these nodes are each cut and grafted onto the chosen node in the other 'parent' tree. This is analogous to the crossing over of genetic material that occurs in the production of gametes during sexual reproduction. The process is illustrated in Figure 3. This operation has the effect of 'shuffling' subtrees around the solutions in such a way that loss of variety in the overall population is minimised.


Figure 3

Figure 3. The tree crossover operation in GP

Mutation is the other main operation used to introduce variation into the population. Mutation can be thought of as choosing a solution (A), generating a new completely random solution (B), performing a crossover operation on A and B and keeping only one child: the one with the original root from A. The remaining operation which I mentioned was propagation. Here the selected solution is simply copied unchanged into the next generation. This allows the frequency of successful solutions to increase more quickly as a proportion of the population than would otherwise be the case.

There is now a large literature on the variations and applications of the GP technique. Some of this can be found in the books reviewed below. In addition, useful resources on the WWW include:

It must be said that the reasons why GP works are not yet totally understood. Thus there are currently no general guidelines as to when this technique will be more efficient than other search methods. There are also good theoretical reasons for supposing that any particular search technique will not be the best for all tasks.

* 3. A Classification of Genetic Programming Applications in Social Simulation

Before launching into a review of the five GP books, it will be helpful to provide a rough classification of some different reasons for using this technique in social simulation.

3.1 GP as an Inductive Tool for Instrumental Problem Solving

The most straightforward use of GP is as an instrumental search technique. If you know the desired performance of an entity and the sort of elements it should be composed of, but nothing about its structure, then GP will have a good go at finding a suitable solution for you. GP is thus a good technique in situations where you, as a programmer, have little domain knowledge. In this, it is more suitable than the Genetic Algorithm because the coding of the possible solutions can be done in a more natural and expressive manner. (Genetic Algorithms encode their candidate solutions into fixed length strings.) On the other hand, GP requires more computational power than the GA. The critical factors in using GP are the computational time taken to evaluate a solution once, the size of your population and the number of generations you intend to run the GP for.

An example of such a use is the non-linear fitting of equations to a data set. Such an application is described in chapter 17 of Advances in Genetic Programming.

If you want to use GP as an instrumental problem solving tool, there are a number of commercial and public domain software packages available - see the links listed above.

3.2 GP as an Integral Component of a Social Simulation

A more interesting use of GP is as an integral part of a social simulation. Here the GP process stands in for a part of the process in the system being modelled. This can basically be done in two ways. The GP can be used as a functional component of the simulation or as a descriptive component. For a critique of the realism of evolutionary algorithms (including GP) in social simulation see Banzhaf (1997).

3.2.1 GP as a Functional Component of a Social Simulation

Here the GP is used to produce results that are desired in the simulation. These results are either verifiable against data from the social system under study or else are chosen for other a priori reasons. Here the detail of the GP algorithm is not intended to correspond to the modelled system. There is a considerable danger in using GP as a stand in for a learning process regardless of whether its characteristics are appropriate. This is discussed in more detail by Chattoe (1998). In economics there are now many published papers which take an existing model based on optimisation and plug in a Genetic Algorithm as an 'optimising engine'. No doubt these will be followed by a host of papers which plug in a GP algorithm instead.

An example of this kind of application is provided by chapter 22 of Advances in Genetic Programming Volume 2.

3.2.2 GP as a Descriptive Component of a Social Simulation

In this case, the GP algorithm is used so that not only the results but also the structure and functioning of the algorithm correspond to parts of the social system being modelled. This gives a more bottom-up flavour to use of the technique. Usually the adoption of such an approach requires that the GP algorithm be tailored so that it matches the system under consideration, for example by biasing the algorithm heavily towards propagation or by replacing the crossover operator as I do in (Edmonds 1997).

An example of this sort of application is provided by chapter 8 of Advances in Genetic Programming.

3.3 GP as a Programming Tool

The next possible use for GP is as a programming technique. That is, instead of writing programs yourself, you merely specify what you want the program to do and set a GP to generate a suitable piece of code. At the moment this is probably not practical for social simulators. So far, this approach has only been realised using fairly simple data structures applied to small 'toy' problems. This approach may require huge computational resources and even then, you will still have the difficulty of specifying the task for the program. This might turn out to be as complex as programming the code yourself and you would still have the problem of verifying that the resulting program did match your specification. Automated programming is what the Genetic Programming and Data Structures book is aiming towards.

3.4 GP as Source of Suggestive Analogies

The last use of GP occurs when its operation suggests an analogy with a social process, but where there is no question of the algorithm being a simulation of it. As ever, drawing conclusions from analogies is a dangerous pastime, and any arguments from such analogies should be deeply mistrusted. Having said that, an enrichment of our store of analogies is always welcome and evolutionary algorithms (including GP) do provide a fertile source. In particular, I find GP invaluable for demonstrating how assumptions might be mistaken, due to the serendipitous nature of its solutions.

An example of this kind of use is provided by chapter 11 of Advances in Genetic Programming.

* 4. A Review of Five Genetic Programming Books

In this section, I review five books about GP. None of them are directly relevant to GP in social simulation but one must bear in mind that there are currently few enough papers on this topic, let alone books. The most interesting is perhaps the original Advances in Genetic Programming collection, published in the early days of GP after creative people had piled into the field but before it had become established.

Cover of
book Genetic Programming: On the Programming of Computers by Natural Selection
John R. Koza
Cambridge, MA: The M.I.T. Press
1992
Cloth: ISBN 0-262-11170-5

Order this book

This is the book that defines the field. It is (almost always) a straight-forward and readable book, which should be accessible to anyone with a little experience of programming. The rationale and approach are slowly and clearly described and many examples are provided. Without doubt, the accessibility and thoroughness of this book go some way to explaining the rapidity with which GP has been taken up outside computer science.

The first nine chapters motivate, introduce and verify the basic technique. The rest of the book looks at different aspect of GP and its potential uses. The appendices present and describe the LISP code used in the book. The more interesting and useful chapters for the social simulator include:

At some stage in the future, someone will write a better introduction to GP which also includes some of the newer developments in the field, but this has not happened yet. Cover of book Genetic Programming II: Automatic Discovery of Reusable Programs
John R. Koza
Cambridge, MA: The M.I.T. Press, A Bradford Book
1994
Cloth: ISBN 0-262-11189-6
Order this book

This book introduces the technique of "Automatically Defined Functions" (ADFs) to GP. This amounts to allowing the GP algorithm to develop subroutine-like structures at the same time as the main structure (which has terminals that call these 'subroutines'). Both the 'subroutines' and the main program evolve in the same way as they do in a simple GP. This technique improves the performance of the GP with respect to problems that have an innate hierarchical structure, so it should be used wherever the same structure or behaviour recurs in identical or similar form at different levels.

Of course, there is no general way of telling whether a randomly selected problem from the real world will have any such structure, but one can take a guess from the problem source. For example, it seems that social systems rarely produce unstructured problems or data, presumably because the humans and institutions that comprise them have limited computational capacities, which means that they have to use substantial structure in order to deal with information at all. There are a few possible exceptions which occur when individuals or institutions are in close competition and trying to creatively out think each other as they are in financial markets. In this case, the results can be effectively structureless because if there were any accessible pattern someone would begin to exploit this and in the process disrupt it. (This behaviour can be compared to the co-evolutionary generation of pseudo-random number generators described in chapter 20 of Advances in Genetic Programming.)

This is not a light book. It goes systematically into the evidence and operation of GP with ADFs. The applications it considers are either benchmark ('toy') problems from the world of computer science or from molecular biology. However, if a social scientist wishes to use GP on substantially ordered or even hierarchical problems of a realistic size, then using these techniques will be necessary. For such a person this book will be very useful. It may be a bit boring to read, but it does not assume much prior knowledge. It takes you slowly through the whole story and gives many case studies and examples.

Cover
of book Advances in Genetic Programming
Edited by Kenneth E. Kinnear
Cambridge, MA: The M.I.T. Press, A Bradford Book
1994
Cloth: ISBN 0-262-11188-8
Order this book

This book contains nothing that directly concerns the use of GP in social simulation, but it has some useful chapters which will aid researchers in applying the technique. A few of the chapters are also interesting in that they provide suggestive analogies for social processes. Most of the chapters in this book either deal with ways of extending and improving the basic GP algorithm and/or with other possible application areas. Some of the more interesting chapters include:

Overall, this is not a book that a social simulator would buy as a work of reference, but it does provide some stimulating suggestions. It is worth reading if one has already successfully tried GP out for oneself.

Cover of book

Advance in Genetic Programming Volume 2
Edited by Peter J. Angeline and Kenneth E. Kinnear
Cambridge, MA: The M.I.T. Press, A Bradford Book
1996
Cloth: ISBN 0-262-01158-1
Order this book

This book was published two years after the first collection, when the field of GP had settled down more into what might be called 'normal science'. It is thus far more concerned with technical issues and is generally of far less interest to the social simulator. What it does do is start to examine and compare the various varieties of GP in a more methodical manner. Thus the various chapters do provide the reader with many ideas for refining and improving the efficiency of GP in different domains. However, being merelya collection of papers, it does not do this in a systematic manner, as a handbook might. Three chapters are worth mentioning.

The book ends with a useful survey of web resources and a detailed classified bibliography. This book is really only worth reading if you have already used GP techniques for some time and are hungry for more technical discussion.

Cover of book Genetic Programming and Data Structures
William B. Langdon
Dordrecht: Kluwer Academic Publishers
1998
Cloth: ISBN 0-792-38135-1
Order this book

This book attempts to bring GP back into the fold of computer science by describing, in detail, how to evolve programs that use the standard data structures. Thus the bulk of the book deals with the techniques for each of these structures, leading to chapter titles like Evolving a Stack. This book will be useful for those who are interested in actually evolving computer programs directly and not so much for those concerned with the use of the GP paradigm for social simulation.

However, having said this, William Langdon is good at writing textbooks. Thus the first two chapters survey and introduce GP competently and clearly. Then in the third chapter (Advanced Programming Techniques), he provides the best up-to-date survey of the current field that I have read. In addition to this, he provides an accessible formal analysis of GP in chapter 8. Finally, he provides a readable and comprehensive glossary of terms used in GP as well as an introduction to the (freely accessible) C++ package for doing GP called GPQuick.

Forthcoming GP Books

There are at least two new volumes on GP about to be published. Firstly, a new book by John Koza and David Andre (forthcoming) which will include a survey of the field so far. Secondly, there is a third volume due in the Advances in Genetic Programming series. Finally, there is also what is described as a 'textbook' on GP, by Banzhaf et al. (1997), but I have not seen this.

* 5. Conclusion

Apart from its straightforward instrumental uses, the paradigm of GP opens up a new range of possibilities for social simulators - that of models based on a learning technique where the structure of what is learnt can come to the fore. It is a technique which can be used to simulate some aspects of the creative learning capabilities of humans (Edmonds 1997).

Of course, the GP algorithm is not a perfect mirror of human cognition. To be used effectively as a descriptive element in a social simulation, it needs to be adapted to ensure that it is as realistic as possible. The books reviewed above only touch on such subjects, which reflects the fact that GP was not originally designed with social simulation in mind. (See Chattoe 1998 and Edmonds 1997 for more on this issue.)

GP is not yet a completely mature technique. As such, its impact in the field of social simulation has just begun. No doubt its impact will be at least as great as those of previous paradigms such as neural networks or genetic algorithms, not so much in that it introduces a new computational analogy but because it is unparalleled as a creative computational technique - apply it and you may be genuinely surprised at the results.

--------

* References

ARIFOVIC, J. 1994. Genetic Algorithm Learning and the Cobweb Model. Journal of Economic Dynamics and Control, 18:3-28.

BANZHAF W., P. Nordin, R. E. Keller and F. D. Francone 1997. Genetic Programming - An Introduction: On the Automatic Evolution of Computer Programs and its Applications. Morgan Kaufmann Publishers, San Francisco, CA and dpunkt.verlag, Heidelberg.

CHATTOE E. 1998. Just How (Un)realistic are Evolutionary Algorithms as Representations of Social Processes? Journal of Artificial Societies and Social Simulation, 1, http://www.soc.surrey.ac.uk/ JASSS/1/3/2.html

CRAMER N. L. 1985. A Representation for the Adaptive Generation of Simple Sequential Programs. In J. J. Grefenstette, editor, Proceedings of an International Conference on Genetic Algorithms and their Applications, Carnegie-Mellon University, Pittsburgh, PA, 24-26 July 1985, Lawrence Erlbaum Associates, Hillsdale, NJ, 183-187.

EDMONDS B. 1997. Modelling Bounded Rationality using Evolutionary Techniques. In D. Corne and J. Shapiro, editors, Evolutionary Computing: Selected Papers from the 1997 AISB Workshop, Springer Verlag, Berlin, Lecture Notes in Computer Science 1305, 31-42. (An earlier version is available as Discussion Paper CPM-98-33, Centre for Policy Modelling, Manchester Metropolitan University.)

FORSYTH R. 1981. BEAGLE: A Darwinian Approach to Pattern Recognition. Kybernetes, 10:159-166.

HOLLAND J. H. 1975. Adaption in Natural and Artificial Systems. University of Michigan Press, Ann Arbor, MI.

KOZA J. R. and D. Andre forthcoming. Genetic Programming III. Morgan Kaufmann, New York, NY.

--------

ButtonReturn to Contents of this issue

© Copyright Journal of Artificial Societies and Social Simulation, 1998