

Connectionist Models of Social Reasoning and Social Behavior

edited by Stephen J. Read and Lynn C. Miller
Lawrence Erlbaum Associates, Mahwah, NJ
1998
Paper: ISBN 0-8058-2216-X


Reviewed by
Deborah Vakas-Duong, Agent Based Learning Systems, 5019 King Richard Drive, Annandale, VA 22003, USA.


When Stephen Read and Lynn Miller's book Connectionist Models of Social Processing was first announced in the Erlbaum mailings, I admit that my heart skipped a beat because it sounded almost exactly like the research I was doing independently. We have both used IAC neural networks with hierarchical layers to simulate social processes and we both claim that our work gets at the mechanisms behind the emergence of meaning (Vakas-Duong 1996). However, I was delighted to find, when the book actually came out, that our work is quite complementary. Read and Miller attack the problem of meaning from the psychological side of social psychology, while I address the sociological side. The difference is more than simply a matter of which academic department one belongs to. It fundamentally affects the way that social processes are seen. I have heard it said that social psychology in sociology departments is all theory and no experiment, while social psychology in psychology departments is all experiment and no theory. These differences seem to have carried over into many early attempts to bring new computational tools to social psychology. I would like to see these differences gradually disappear and for social science to become truly integrated, as we learn how to use the computer to tease out what is true from the many theories and sets of experimental data in the separate disciplines. My intention is to use this review as a forum to explore the epistemology of social simulation - to ask what we can know, to be careful about what questions our experiments are really asking and to see how the work presented in this volume stands up to that exploratory standard. I could equally have applied the same standards to any work, including my own.

The first chapter, "Making Sense of People: Coherence Mechanisms" by Thagard and Kunda, is a good introduction to the book in that it explains why connectionist models are good tools for interpreting the experimental findings of social psychology. The authors also give a summary of how their own computational models make sense of the data. The chapter is aimed at psychologists, being heavy on social psychology and light on modelling science. It reviews some of the ideas of schema theory in social cognition and advocates the use of constraint satisfaction (CS) neural networks for modelling social schema. Of course, Rumelhart and McClelland (1988) used networks like these for modelling cognition (and even social cognition) in their Parallel Distributed Processing volumes, but the research presented here is an advance in that it applies these ideas to real data from social psychology. The chapter also presents the authors' constraint satisfaction model.

In CS models, constraints usually represent relations between concepts. For example, Thagard and Kunda present a CS model of stereotypes, where "black skin", "white skin", "violent push" and "jovial shove" are nodes. The model tries to explain data in which black people exhibiting the same behaviour as white people were interpreted as more violent. It encompasses a model of the prejudiced person's thinking process, where the node "black skin" has a positively weighted connection to the node "aggressive" and "white skin" has a negatively weighted connection to the same node. The "aggressive" node is then positively associated with the node "violent push" and negatively associated with "jovial shove". Well, of course, if you excite the node "black skin", it will light up the "aggressive" node and this will in turn inhibit the node "jovial shove" and excite the node "violent push". The obviousness of the result leads to the question - of what use is such a model?
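
To see why the result is foreordained, consider a minimal sketch of such a constraint satisfaction net. This is not Thagard and Kunda's actual programme: the node names follow their description, but the weights, decay term and update rule below are illustrative values of my own choosing. Once the relations of influence are hand-wired, spreading activation can only reproduce them.

```python
import numpy as np

# Nodes of the hand-wired constraint satisfaction net described above.
nodes = ["black skin", "white skin", "aggressive", "violent push", "jovial shove"]
W = np.zeros((len(nodes), len(nodes)))

def connect(a, b, w):
    """Create a symmetric weighted connection between two named nodes."""
    i, j = nodes.index(a), nodes.index(b)
    W[i, j] = W[j, i] = w

connect("black skin", "aggressive",  0.5)    # illustrative weights, not taken
connect("white skin", "aggressive", -0.5)    # from the chapter itself
connect("aggressive", "violent push",  0.5)
connect("aggressive", "jovial shove", -0.5)

# Clamp "black skin" on and let activation spread with a simple decay rule.
act = np.zeros(len(nodes))
act[nodes.index("black skin")] = 1.0
for _ in range(50):
    act = np.clip(act + 0.1 * (W @ act) - 0.05 * act, -1.0, 1.0)
    act[nodes.index("black skin")] = 1.0     # keep the input node clamped

for name, a in zip(nodes, act):
    print(f"{name:>12s}: {a:+.2f}")   # "violent push" ends up positive and
                                      # "jovial shove" negative, by construction
```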

Thagard and Kunda do not claim to have a full theory of social cognition. They recognise that their model only explains how existing concepts are applied in new situations and that it does not address how new concepts are formed. However, I am not sure if it is possible to make a good model of "what is" without including a model of "how it came to be". This is particularly true when it comes to modelling dynamic, living societies. In its present form, their model amounts to little more than a restatement of the data that blacks are seen as more aggressive than whites. It thus has no explanatory power. It would have been a better model if the positive connection between skin colour and aggressiveness had somehow emerged from the dynamics of the system, as has been achieved in previous constraint satisfaction models of social behaviour (Vakas-Duong and Reilly 1995).

In chapter 2, Read and Miller identify some of the most important traits of connectionist models which make them appropriate for modelling social schema. Connectionist models fill in where inputs are lacking and resolve inconsistent input data by giving less strength to activations which don't make sense. Like people, they also become more sensitive near the bifurcation points that determine which way the network will go when dealing with inconsistent input. In the same way that human perception is influenced by interpretation (and vice versa), connectionist models are appropriate for schema because their inputs are influenced by internal activation states which those inputs in turn influence. These are important points about the value of connectionist models for social psychological modelling. However, when it comes to Read and Miller's own computational models, the problem is the same as it was in the first chapter. Their models merely encode a restatement of the problem; they do not answer a question. Read and Miller present a CS net which acts like human beings because it is wired that way, not because it uses some parts of human behaviour to explain others. For example, the chapter attempts to explain Trope's model of dispositional inference (Trope and Lieberman 1993), in which situational factors change our interpretation of ambiguous behaviour. Read and Miller's model simply takes some situational factors, makes nodes out of them, and hooks them up to the nodes which represent interpretations. However, these nodes only represent such things because the authors say they do: they have nothing to do with real situational factors or real interpretations. There is nothing "situational" about the situational nodes and thus they have no explanatory power.

I found the third chapter to be somewhat misleading. Kashima et al. used a tensor product net as the basis for a model of group impression formation. The tensor product net was used because it allows multiple associations to be encoded. However, it is still a linear associator, which removes the "fuzziness" of neural networks, the very property that is their advantage. Even though such a network is a poor model of schema, which are essential to group impression formation, it is good at linear tasks like adding independent features. An example is given where subjects would add features in order to categorise, which has little to do with real-life categorisations they might make. The subjects added the categories and so did the net. Does this make it a good model of group impression formation? No, because impressions involve schema, not linear adding of components. Also present is a line graph which shows a close match between the computer simulation and the people's categorisations. However, this line graph is inappropriate for the type of data presented, with meaningless swings that appear meaningful. The match between computer experiment and data appears close because the cases along the x-axis do not vary smoothly, not because of the accuracy of the model. In any case, the attempt to model experimental data is to be commended.
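
The linearity complaint can be made concrete with a toy linear associator. The sketch below is not Kashima et al.'s tensor product model; the vectors, names and dimensionality are arbitrary choices of mine. It simply shows that retrieval from such a memory is a matrix product, so a "group impression" is nothing more than the sum of the impressions of its members.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64

# Arbitrary feature vectors for two group members and two traits (my own choices).
members = {m: rng.standard_normal(dim) / np.sqrt(dim) for m in ["A", "B"]}
traits  = {t: rng.standard_normal(dim) / np.sqrt(dim) for t in ["friendly", "hostile"]}

# A plain linear associator: store member -> trait pairs as summed outer products.
M = np.outer(traits["friendly"], members["A"]) + np.outer(traits["hostile"], members["B"])

def recall(cue):
    """Retrieval is a single matrix product, hence strictly linear in the cue."""
    out = M @ cue
    return {t: round(float(out @ v), 2) for t, v in traits.items()}

print(recall(members["A"]))                    # mostly "friendly"
print(recall(members["A"] + members["B"]))     # group cue = linear sum of the parts
```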

Chapter 4, on Smith and DeCoster's model of causal attribution, takes a plain auto-associator and draws analogies to different social phenomena. From a modelling standpoint the results are rather obvious again. If one presents a pattern on which the network was trained 20% of the time, that same pattern comes out again. That is just how the auto-associator works. Explanatory power only enters the picture when many phenomena in the world come out in the dynamics of a simulation, not when a single phenomenon in the simulation is analogous to a single phenomenon in the world.
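
The point that "that is just how the auto-associator works" is easy to demonstrate. The sketch below is a generic Hopfield-style auto-associator of my own, not Smith and DeCoster's model: pattern completion falls out of the architecture itself, with no social content needed.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Store three random +/-1 patterns with the standard Hebbian outer-product rule.
patterns = rng.choice([-1, 1], size=(3, n))
W = sum(np.outer(p, p) for p in patterns) / n
np.fill_diagonal(W, 0)

# Probe with a degraded copy of the first pattern: 20 of its 100 units flipped.
probe = patterns[0].copy()
probe[rng.choice(n, 20, replace=False)] *= -1

for _ in range(10):                  # iterate the update rule until it settles
    probe = np.sign(W @ probe)
    probe[probe == 0] = 1

print("overlap with the stored pattern:", int(probe @ patterns[0]), "out of", n)
```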

Chapter 5 is better in terms of its scientific validity. In it, Van Overwalle and Van Rooy model two particular mistakes that occur in our perception, discounting and augmentation, which contradict probability theory. These mistakes, because they appear to contradict evolutionary advantage, might cast light on the mechanisms behind the way we attribute cause.

Discounting and augmenting are opposite ways of attributing cause. If five runners were to break a track record on the same day, then the skill of the runners would be discounted as a cause and we would say that the wind was the crucial factor. However, if only one runner in a hundred were to break the record, then it would typically be attributed to the runner's skill, and this cause would be augmented. It just so happens that the Widrow-Hoff model, already widely used in neural network research, accounts not only for the existence of association but also for the same errors in association that people make. Because at least three real world phenomena come from one simple model, the model has greater explanatory power. In this case, the model is like the world not only in its association, augmentation and discounting abilities, but also in its use of the Hebbian rule found in the nervous system. It is good in the same sense that Rumelhart and McClelland's (1988, vol. 2, p. 215) classic model of the mistakes children make while learning language is good. Rumelhart and McClelland found a simple way to model how children learn a correct version of a verb, such as "went", then unlearn it when they learn grammatical rules ("goed"), and then relearn the correct version. They did this not by adding to an existing model, but by simplifying one. Both the Rumelhart and McClelland model and the Van Overwalle and Van Rooy model are good because they explain a mysterious mistake that we make, as well as our normal functioning, using the same simple model.
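
For readers unfamiliar with it, the Widrow-Hoff (delta) rule produces discounting through simple cue competition. The sketch below uses cue names, a learning rate and training regimes of my own invention rather than anything from the chapter, but it shows the mechanism: when two causes always co-occur with the outcome, the learned credit is split between them, whereas a cause that appears alone takes full credit.

```python
import numpy as np

def widrow_hoff(trials, n_cues=2, lr=0.2, epochs=200):
    """Train one linear output unit with the Widrow-Hoff (delta) rule."""
    w = np.zeros(n_cues)
    for _ in range(epochs):
        for cues, outcome in trials:
            x = np.asarray(cues, dtype=float)
            w += lr * (outcome - w @ x) * x    # error-driven weight update
    return w.round(2)

# Cue order: [runner's skill, strong wind]; outcome 1.0 means the record was broken.
compound = [([1, 1], 1.0)]    # record always broken when skill and wind co-occur
alone    = [([1, 0], 1.0)]    # record broken with skill alone

print("skill trained alongside wind:", widrow_hoff(compound))   # credit shared, ~[0.5, 0.5]
print("skill trained on its own:    ", widrow_hoff(alone))      # full credit, ~[1.0, 0.0]
```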

Van Overwalle and Van Rooy then go on to explain other cognitive errors. They use Pearce's configurational network (Pearce 1994) to model over-generalisation. This is a cognitive error that is displayed, for example, when a dog salivates not only at the bell in a Pavlovian experiment, but also at the ring of a telephone. This time, however, the back-propagation algorithm commonly used in neural networks does not make the over-generalisation mistake as readily as the network designed to generalise. Since the authors' second model becomes more complicated as a result of the attempt to make it generalise, it has less explanatory power than their first. The authors then present an interesting discussion which advocates the use of feed-forward networks to explain illusory correlation (Hamilton and Gifford 1976). This is the effect whereby perceivers judge minority groups more negatively than majority groups, even when the proportion of their positive and negative behaviours is identical. This has been done with constraint satisfaction networks (Vakas-Duong and Reilly 1995). Such research leads us to believe that many of the world's problems (such as prejudice) do not stem from instincts for power or greed, but from natural cognitive limitations. If so, this shift in perspective would be an important step towards solving problems previously thought to be insurmountable. It might be that all we need is a simple computational prosthesis to solve age-old problems.

Chapter 6, on Shoda and Mischel's computational model of personality structure, is a well-argued presentation of culture as a constraint satisfaction system. It simulates the change in patterns of personality under different circumstances, something that is hard to model. It is interesting how the authors play with the way constraint satisfaction networks settle on a solution and introduce new input before complete settling occurs, thus modelling temporal change. Although the settling of our nervous system on a single solution and the influence of recent solutions on the next one may not take place on the same time scale, the model assumes that they do. The model is still interesting food for thought despite this inaccurate assumption. However, it lacks an essential ingredient of change and adaptation: learning. Even though circumstances change, the way that agents react does not.

Chapter 7 deals with one of the central theories of psychology, that of cognitive dissonance. Cognitive dissonance is a well-documented (and somewhat humorous) phenomenon arising from the primacy of sense-making. Because of cognitive dissonance, people will, for example, actually enjoy doing something more if they are paid less for doing it, because they have to explain to themselves why they are doing it. Constraint satisfaction networks have great potential for explaining this central aspect of cognition, but are not given the chance to display it in this chapter. Shultz and Lepper's model has the same problem as most of the others in this book. It has little explanatory power because the answer to the question they ask is encoded in the question itself. You can't use a CS net to explain how different concepts come to influence each other by hardwiring the relations of influence into the model before it is even run. It doesn't get at the mystery of cognitive dissonance because it restates what is, without telling how it dynamically comes and continues to "be".

Chapter 8 presents an interesting model of explanatory coherence applied to the social realm, in this case the debate on abortion. Ranney and Schank's model seemed slightly off-topic to me, because sociality did not seem to be involved in the problem itself, even if the solution has social implications. The model is interesting because it seems like a good tool to help us to reason generally, but not socially. Schema theory and its immediate categorisations seem more appropriate than this well-reasoned model for the distinctively social realm.

Chapter 9 takes an interesting turn into cellular automata modelling, but suffers the same epistemological problems as the constraint satisfaction networks. These occur not only in this cellular automata model, but also in better-known ones such as Sugarscape (Epstein and Axtell 1996). The answer is encoded in the question itself, and is thus an artefact. Here, Nowak and Vallacher claim to simulate the emergence of macro-level phenomena from micro-level rules, but a closer look reveals that the macro phenomenon they are trying to explain is "clustering into groups" and the micro-level rule they have encoded is "go near someone similar to yourself". Although they bring in social psychological data showing that individuals will change their beliefs in order to maintain group cohesion, this cannot help the poor epistemology of their model.
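
To illustrate why the clustering "finding" is baked into the rule, here is a Schelling-style sketch of my own; it is not Nowak and Vallacher's actual model, and the grid size, density and similarity threshold are arbitrary. If dissatisfied agents simply relocate until enough of their neighbours share their type, clusters of like-minded agents are guaranteed to appear.

```python
import numpy as np

rng = np.random.default_rng(2)
size, n_agents = 20, 300

# A toroidal grid: -1 marks an empty cell, 0 and 1 are two belief "types".
grid = np.full((size, size), -1)
occupied = rng.choice(size * size, n_agents, replace=False)
grid.flat[occupied] = rng.integers(0, 2, n_agents)

def similar_fraction(r, c):
    """Fraction of occupied neighbours sharing the type of the agent at (r, c)."""
    nbrs = [grid[(r + dr) % size, (c + dc) % size]
            for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    nbrs = [v for v in nbrs if v != -1]
    return 1.0 if not nbrs else sum(v == grid[r, c] for v in nbrs) / len(nbrs)

# Micro rule: a dissatisfied agent (fewer than half its neighbours alike) moves away.
for _ in range(20000):
    r, c = rng.integers(size), rng.integers(size)
    if grid[r, c] == -1 or similar_fraction(r, c) >= 0.5:
        continue
    target = rng.choice(np.flatnonzero(grid == -1))   # relocate to a random empty cell
    grid.flat[target] = grid[r, c]
    grid[r, c] = -1

avg = np.mean([similar_fraction(r, c)
               for r in range(size) for c in range(size) if grid[r, c] != -1])
print("average same-type neighbour fraction:", round(float(avg), 2))   # rises well above 0.5
```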

In chapter 10, Eiser et al. begin to address another fundamental problem with this type of modelling and even with social psychology in general - that it is not social. All of the models in this book concern the reactions of single minds to the existence of others, but nowhere do we see how these minds came into existence by virtue of those others. Social psychology, as represented here, is all about how an individual reacts to society, not how she creates society nor how society has created her. Eiser et al. say that communication and the emergence of symbols are important. They ask important questions about the mutual effects of culture and cognitive dissonance and then present their model of the emergence of shared representations. Unfortunately, they use virtually the same kind of cellular automaton as that presented in chapter 9, in which agents come to a consensus about the meanings of words because the programmer forces them to do so by making them copy their neighbours. This contradicts hermeneutic theory, and does not address the real issue of how people, who are forced to induce meanings on their own, can come to share meaning.

On the whole, this book is a good "first of its kind" in the field of social psychology. Its main strength is that it brings the data in the field to computational models. In places, these models are almost as good as the Rumelhart and McClelland classics in terms of their explanatory power. However, as a whole, they need improvement in this regard, as do many other works of computational social science. I have tried to suggest a few possible standards to hold these and other models to. One important point I have tried to make is that we must be very careful that our findings are not just artefacts of the initial assumptions. We need to be more vigilant that the answers to the questions we ask are not contained in the questions themselves. By the principle of Occam's razor, a simulation which uses a few known principles in the world to explain many more is better than one which restates fewer principles in a more complicated manner. Perhaps some of the problem in adhering to this standard lies in the segregation of the disciplines. It is very hard to do this kind of integration: what usually happens is that we are very good in the field we are trained in and rather weak in the ones where we are not. This book is stronger in social theory than it is in the epistemology of its models, as we should expect. Perhaps the development of standards of simulation will help guide these and other interdisciplinary explorations.

* References

EPSTEIN J. M. and R. Axtell 1996. Growing Artificial Societies: Social Science from the Bottom Up, The M.I.T. Press, Cambridge, MA.

HAMILTON D. L. and R. Gifford 1976. Illusory Correlation in Interpersonal Perception: A Cognitive Basis for Stereotypic Judgement. Journal of Experimental Social Psychology, 12:392-407.

PEARCE J. M. 1994. Similarity and Discrimination: A Selective Review and a Connectionist Model. Psychological Review, 101:587-607.

RUMELHART D. E. and J. L. McClelland 1988. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, two volumes, The M.I.T. Press, Cambridge, MA.

TROPE Y. and A. Lieberman 1993. The Use of Trait Conceptions to Identify Other People's Behavior and to Draw Inferences About Their Personalities. Personality and Social Psychology Bulletin, 19:553-562.

VAKAS-DUONG D. A. 1996. Symbolic Interactionist Modeling: The Coevolution of Symbols and Institutions. Intelligent Systems: A Semiotic Perspective, Proceedings of the 1996 International Multidisciplinary Conference, volume 2:349-354, NIST, Washington, DC.

VAKAS-DUONG D. A. and K. D. Reilly 1995. A System of IAC Neural Networks as the Basis for Self-Organization in a Sociological Dynamical System Simulation. Behavioral Science, 40:275-303.


(c) Copyright Journal of Artificial Societies and Social Simulation, 1998