Review of Alvarez, R. Michael: Computational Social Science (Analytical Methods for Social Research)

Computational Social Science (Analytical Methods for Social Research)

Alvarez, R. Michael
Cambridge University Press: Cambridge, 2016
ISBN 9781107518414 (pb)

Reviewed by Karandeep Singh
Korea University of Science and Technology (UST)

This book presents advances in computational social science with a focus on political science and social policy. It is divided into two parts: chapters 1 through 6 focus on the development of various statistical, computational and machine learning techniques, while the second part (chapters 7 through 11) focuses on applications of new tools and methodologies to important questions and problems in political and social science, all in the wake of the availability of new types and quantities of associated data.

The first part of the book is solution oriented; the authors generally state the research problem and proceed to give a possible solution. For instance, chapter one tackles the issue of linking multiple poll surveys, chapter two deals with the problem of multimodality in topic models, chapter three turns to the machine coding of event data, and chapter four delineates network modelling; chapter five is about measuring and estimating political philosophies, and chapter six introduces the application of machine learning in biomedical research.

In chapter 1, the author proposes using data from multiple poll surveys to address, understand and predict important matters in political science, provided that appropriate weighting and statistical adjustments are made. Using the proposed technique, the chapter covers topics such as small area estimation, the measurement of ideology and latent variables, political representation and elections. In chapter 2, the authors analyse a corpus of 13,246 posts from 2,008 political blogs using topic models such as Latent Dirichlet Allocation (LDA) and Structural Topic Modelling (STM), and apply potential solutions to the problem of multimodality in topic models. The authors explain these solutions in appropriate depth and demonstrate the results of extending them to applied research. The chapter clearly reflects the authors’ experience and expertise in this research area. The authors of chapter 3 discuss the machine coding of event data and set out its various advantages and challenges. They propose building the Open Event Data Alliance (OEDA), an open data and open source approach based on the principles of depth, validity, transparency, consistency and community participation (like the R language in statistics). Such event data can be useful for analysis and prediction, for example in forecasting inter- and intrastate conflicts (p. 106).

Chapter 4 deals with network modelling. Using clear examples, the author sheds light on various issues and concepts related to network estimation: quantifying networks can lead to the discovery of a latent network structure that is not otherwise fully observable (p. 129). Network analysis can be computationally very demanding, because there are up to 2N potential connections with N individuals. Two modelling approaches for topological data analysis are discussed: statistical models of spatial data and exponential random graph models (ERGM). In chapter 5, the author discusses the measurement and estimation of political philosophies. A new model is developed, based on the available literature, which permits simultaneous approximation of ideological philosophy and salience weights (the weights people attach to different preferences). This model enables researchers to analyse how party messages affect the preferences of voters and the importance voters attach to various issues. Chapter 6 discusses the application of random forest (RF) techniques to biomedical research. The authors discuss extensions of RF, such as the fuzzy random forest, and compare their results by applying them to a sample problem.
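To give a sense of the scale behind the computational demands of network analysis noted above, the following minimal sketch counts the potential ties and the possible network configurations among N individuals. It assumes an undirected network without self-ties; this counting convention and the function names are this illustration's own and are not taken from the book.

```python
from math import comb


def possible_ties(n: int) -> int:
    """Number of distinct unordered pairs (potential ties) among n individuals."""
    return comb(n, 2)


def possible_networks(n: int) -> int:
    """Number of distinct undirected networks on n labelled individuals.

    Each potential tie is either present or absent, so the count is
    2 raised to the number of potential ties.
    """
    return 2 ** possible_ties(n)


if __name__ == "__main__":
    # Illustrative group sizes only; not drawn from the book.
    for n in (3, 5, 10, 20):
        print(f"N={n}: ties={possible_ties(n)}, networks={possible_networks(n)}")
```

Under these assumptions the number of possible configurations grows far faster than the number of individuals, which is one reason why estimation over the space of networks quickly becomes expensive.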

The second part of the book focuses on applications of computational social science techniques to appropriate problems. In chapter 7, the authors argue that social media has become an important aspect of political protests. They postulate five topics for assessing the impact of social media on such protests, followed by an investigation of about forty million tweets relating to two separate political events. Finally, they subject the data to their postulated criteria and demonstrate that social media does indeed play an important role in these events. In chapter 8, the author follows an approach similar to the LDA topic model, extensively analyses the texts of press releases from members of the US House of Representatives, and demonstrates how representatives communicate with constituents. The author focuses on approximating the topic of a text by nesting granular topics within coarse topics. Chapter 9 discusses how data science, social marketing, and government policy are changing in the wake of technological change. It covers two US government-funded projects, for the Animal and Plant Health Inspection Service (APHIS) and the Federal Voting Assistance Program (FVAP), and explains how different kinds of big data (“tall” and “fat”) were handled to help the government make better use of available resources. The next chapter addresses the important issue of detecting fraud in elections. The authors use machine learning to detect anomalies in the 2013 Argentinian elections. A variety of supervised and unsupervised algorithms are employed, and the results demonstrate that these algorithms are able to detect anomalies in the election data. Chapter 11 presents an interesting research story. The authors explain how they collected and pieced together disparate sources of data and developed a residential radon gas risk model for the entire United States of America; the agencies concerned, however, did not adopt their model recommendations in spite of the obvious benefits. They conclude that more data is not necessarily better and that selecting the right kind of data is important.

This book encompasses the changes, developments, and advances that are happening in the field of computational social science, especially due to the availability of data from newer electronic means. It presents a good mix of informed approaches, developments and implementations of these techniques. Technically speaking, the book has both substantial chapters and relatively lighter ones. The practical implementation of new approaches and techniques demonstrates how technology can augment the desired impact of policies and programs. The chapters are generally well written, each with a suitable introduction to its content.

The book touches upon important and practical issues such as election fraud, event-data coding, the relation between representatives and constituents, changes in government policies due to technological change, and the practical implementation of research findings in public policy, to name but a few. It broadens the scope of scholars in the fields of political science and social policy. The methods and approaches used could be relevant for scientists and scholars in other fields as well, but they could have a hard time keeping up with the technically heavy content. Overall, it is a good book; readers will very likely be inspired and will want to explore more.
