Reviewed by Michael Möhring
Institute of Computer Applications in the Social Sciences, University of Koblenz-Landau, Rheinau 1, 56075 Koblenz, Germany.
This volume comprises a collection of articles which deal with the validation of knowledge generated by computer models. In 1997, researchers from different disciplines were invited by the Dutch Federation of Social Science Methodology to explain how they dealt with this question in their field.
The resulting book contains seven articles. To improve their quality, invited reviewers commented on draft versions of each one.
The first article (by van Dijkum and van Kuijk) begins by discussing the question of validation from a historical perspective. The authors show that opinions are diverse and controversial, ranging from the view that science will generate valid knowledge to the belief that in science 'anything goes'. Starting from a criticism of some of these opinions, they attempt to formulate a new methodology for validity in the social sciences. Arguing that Popper's methodology of validation is too simple for this purpose, they claim that new validation procedures are required, in particular procedures able to deal with feedback and non-linear processes. These need not be constructed from scratch, because a number of quantitative techniques (statistical measures of similarity for linear and partly non-linear models) already exist and can be used. Qualitative validation, combined with theoretical knowledge, is at least as important for understanding the outcomes of complex models. With the help of sensitivity and error analysis, for example, qualitative strategies can detect constraints on complex model behaviour (such as equilibria), which may allow a number of candidate models to be eliminated from consideration. Even users of linear models can therefore benefit from qualitative validation.
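To make the idea of a sensitivity-based qualitative check concrete, here is a minimal sketch in Python; the logistic map, the parameter range and the tolerance threshold are illustrative assumptions, not an example from the book. The sweep classifies parameter settings by whether the model settles to an equilibrium, the kind of qualitative constraint the authors mention.

```python
# A minimal sketch (not from the book): sweep one parameter of a simple
# non-linear model and record qualitatively whether each run settles to
# an equilibrium. The logistic map and the tolerance are illustrative.
import numpy as np

def simulate(r, x0=0.1, steps=500):
    """Iterate the logistic map x -> r * x * (1 - x)."""
    x, trajectory = x0, []
    for _ in range(steps):
        x = r * x * (1 - x)
        trajectory.append(x)
    return np.array(trajectory)

def settles_to_equilibrium(trajectory, tail=50, tol=1e-6):
    """Qualitative check: has the trajectory stopped moving near the end?"""
    tail_values = trajectory[-tail:]
    return tail_values.max() - tail_values.min() < tol

for r in np.linspace(2.5, 4.0, 7):
    print(f"r = {r:.2f}: equilibrium reached = {settles_to_equilibrium(simulate(r))}")
```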
Ferdinand Verhulst, the author of the second article, introduces scientific models as metaphors with quantitative aspects added to their descriptions of reality. Validating such models ideally requires carefully controlled experiments whose measurements can be matched with the model's quantitative predictions, which are the outcomes of mathematical calculations. In practice, 'carefully controlled experiments' are often difficult or impossible to achieve, because the data are incomplete and qualitative aspects carry considerable weight. To show that the validation process typically differs from the textbook examples provided by classical physics, he reviews a number of concrete modelling problems: the pollution of the North Sea, the chaotic flow-field of the Wadden Sea, drillstring dynamics and the use of metaphors in psychoanalysis. These examples can be placed on a sliding scale according to the ratio between their quantitative and qualitative elements. Finally, the author claims that in contemporary science the ideal of the controlled experiment is hardly ever reached, and argues even more strongly that problems where this ideal can be reached are not fundamental and rarely interest the research scientist.
In the third chapter, Jansen and de Vries discuss validation problems in global modelling, also known as integrated assessment modelling. In contrast to historical global modelling approaches such as the World Models of Forrester and Meadows, integrated assessment modelling is characterized by a multi-disciplinary perspective and the integration of different levels of complexity (physical environment, human behaviour, information flows, human values, beliefs and ideas). The usefulness of this approach is demonstrated by the integration of natural science and economics in the modelling of global climate change, and by the introduction of human response in the form of perspective-based rules for agents. Such complexity makes it nearly impossible to validate this kind of model (and the theory it represents) in the strict sense, because too many subjective assumptions have to be made and far too little data is available. Nevertheless, the authors argue that it is the modelling process that matters, not the model itself: modelling as a way of structuring knowledge for use in decision-making is intrinsically valuable. The construction of transparent models in an interactive way is proposed as a technique to accompany the use of expert-validated metamodels. The chapter also argues for a more explicit recognition of the problems arising from multi-disciplinarity.
In chapter four, Hoede and Weening introduce a new method for validating theories using knowledge graphs, which were originally used to extract knowledge from medical texts. Discussion about validation usually focuses on the measurement of the concepts in a model, as parts of the underlying theories. Because complex concepts can be problematic in this regard, new conceptualisation tools are necessary. A knowledge graph consists of vertices, which represent concepts, and arcs, which represent the relationships between those concepts. The chapter provides a meta-level discussion of this methodology and applies it by analysing the knowledge graphs of three given definitions of the term 'imperialism', pointing out contradictions between them.
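As a rough illustration of the idea (not the authors' formalism), a definition's knowledge graph can be encoded as a set of (concept, relation, concept) arcs and two definitions compared arc by arc; the triples below are invented placeholders, not the book's actual analysis of 'imperialism'.

```python
# A minimal sketch, assuming a (concept, relation, concept) triple encoding;
# the triples are invented placeholders, not the book's analysis.
definition_a = {
    ("imperialism", "caused_by", "economic expansion"),
    ("imperialism", "involves", "political control"),
}
definition_b = {
    ("imperialism", "caused_by", "power politics"),
    ("imperialism", "involves", "political control"),
}

def conflicting_arcs(graph_a, graph_b):
    """Crude heuristic: flag arcs where both graphs use the same head
    concept and relation but point to different target concepts."""
    return [((h, r, t_a), (h, r, t_b))
            for (h, r, t_a) in graph_a
            for (h2, r2, t_b) in graph_b
            if h == h2 and r == r2 and t_a != t_b]

for arc_a, arc_b in conflicting_arcs(definition_a, definition_b):
    print("potential contradiction:", arc_a, "vs", arc_b)
```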
In chapter five, de Vos and Bosker study educational effects by means of a computer simulation. Because 'traditional' statistical methods cannot cope with the reciprocal influences assumed to be at work, a three-level (school, class, pupil) simulation model has been developed to describe how the learning of students in Dutch schools evolves over time. Validation by comparing simulation output with real data shows that, in general, the model describes the actual Dutch secondary education process well enough. The chapter also shows that analysing the same data with traditional validation methods such as multivariate regression analysis can fail to demonstrate relations that are actually present. Relations of which researchers are convinced, but which are not confirmed by traditional methods, might therefore still be picked up by more appropriate techniques.
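The chapter itself gives no code, but the nested structure it describes might look roughly like the following sketch; the three levels come from the chapter, while the growth rule and all parameter values are illustrative assumptions, not the model of de Vos and Bosker.

```python
# A minimal sketch of a three-level (school, class, pupil) simulation loop.
# The nesting mirrors the chapter; growth rule and parameters are invented.
import random

random.seed(0)
N_SCHOOLS, CLASSES_PER_SCHOOL, PUPILS_PER_CLASS, YEARS = 3, 2, 25, 4

# pupils' achievement scores, nested in classes, nested in schools
schools = [[[random.gauss(0, 1) for _ in range(PUPILS_PER_CLASS)]
            for _ in range(CLASSES_PER_SCHOOL)]
           for _ in range(N_SCHOOLS)]

for year in range(YEARS):
    for school in schools:
        school_mean = sum(sum(cls) for cls in school) / (CLASSES_PER_SCHOOL * PUPILS_PER_CLASS)
        for cls in school:
            class_mean = sum(cls) / len(cls)
            for i, achievement in enumerate(cls):
                # reciprocal influence: a pupil's growth depends on the class
                # and school context, which in turn depend on the pupils
                growth = 0.5 + 0.2 * (class_mean - achievement) + 0.1 * school_mean
                cls[i] = achievement + growth + random.gauss(0, 0.1)

overall = sum(sum(sum(cls) for cls in school) for school in schools)
print("mean achievement after", YEARS, "years:",
      overall / (N_SCHOOLS * CLASSES_PER_SCHOOL * PUPILS_PER_CLASS))
```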
In chapter six, Kleijnen describes validation methods based on mathematical statistics, because other types of validation, such as animation, give only 'face' validity. The choice of statistical tests depends on the type of data available. The author distinguishes three situations (no real data available, real output data available, real input/output data available) and discusses each in detail. If no real data is available, validation should be guided by 'Design of Experiments' (DOE) principles, which describe how to select a limited set of combinations of variable levels to be observed in simulation runs, and how to identify the important variables using regression analysis. If real output data is available, real and simulated output can be compared with a Student t-test or with distribution-free alternatives such as the rank test or bootstrapping. Finally, if real input and output data are available, a trace-driven simulation is possible, in which real and simulated output are analysed by a novel regression analysis that uses not only the differences but also the sums of real and simulated outputs.
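Here is a minimal sketch of two of these checks, run on synthetic placeholder data: the t-test comparison for the second situation, and, for the trace-driven case, one plausible reading of the difference-and-sum idea, regressing the differences d_i = w_i - v_i on the sums s_i = w_i + v_i, where a valid model should yield a slope and intercept near zero. The data and the exact form of the regression test are assumptions for illustration, not taken from the chapter.

```python
# A minimal sketch of the statistical checks named in the chapter, run on
# synthetic placeholder data; the exact regression form is an assumption.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
real = rng.normal(10.0, 2.0, size=30)               # real system output w_i
simulated = real + rng.normal(0.0, 0.5, size=30)    # simulated output v_i

# Situation 2: real output data available -> compare means with a t-test
t_stat, p_value = stats.ttest_ind(real, simulated)
print(f"t-test: t = {t_stat:.3f}, p = {p_value:.3f}")

# Situation 3: trace-driven simulation -> regress differences on sums;
# for a valid model, slope and intercept should not differ from zero
d, s = real - simulated, real + simulated
fit = stats.linregress(s, d)
print(f"regression of d on s: slope = {fit.slope:.3f} (p = {fit.pvalue:.3f}), "
      f"intercept = {fit.intercept:.3f}")
```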
In the final chapter, DeTombe introduces the concept of outside validity to reflect on the decisions a researcher makes before formulating the research question. In contrast to the well-known concepts of external and internal validity, outside validity concerns the boundaries of the research, the selection of research methods and tools (whose view of reality is represented in the research outcomes) and the presentation of results.
In summary, this book contains an interesting collection of papers which describe different perspectives on validation, originating in diverse scientific fields. For readers who want to learn more about the practical problems of validation in simulation studies, beyond mere textbook examples, this book will be a good resource. Kleijnen is right when he notes in his article that the literature on validation is abundant (almost every book on simulation has a chapter on validation). Unfortunately, this literature is largely restricted to descriptions of basic quantitative validation methods applied to small and well-known modelling examples. The diversity of modelling examples and case studies provided in this book shows that there is no standard way to solve validation problems, and that the real difficulties lie in the appropriate selection of existing validation methods and in their adequate adaptation to a concrete model. It is also noteworthy that several of the papers emphasise that qualitative validation is at least as important for understanding the outcomes of models. This is all the more true for the complex problems found in the social sciences and in interdisciplinary contexts.
© Copyright Journal of Artificial Societies and Social Simulation, 2000