PyNetLogo: Linking NetLogo with Python

: Methods for testing and analyzing agent-based models have drawn increasing attention in the literature, inthecontextofeffortstoestablishstandardframeworksforthedevelopmentanddocumentationofmod-els. This process can benefit from the use of established software environments for data analysis and visualization. For instance, the popular NetLogo agent-based modelling software can be interfaced with Mathematica and R, letting modellers use the advanced analysis capabilities available in these programming languages. To extendthesecapabilitiestoanadditionaluserbase, thispaperpresentsthepyNetLogoconnector, whichallows NetLogo to be controlled from the Python general-purpose programming language. Given Python’s increasing popularity for scientific computing, this provides additional flexibility for modellers and analysts. PyNetLogo’s features are demonstrated by controlling one of NetLogo’s example models from an interactive Python environment, then performing a global sensitivity analysis with parallel processing.


Introduction
. Agent-based models (ABMs) are a well-established method for the study of complex adaptive systems, in which the interactions of heterogeneous entities yield emergent large-scale behaviors.As such, this approach has been applied across a wide variety of fields such as economics, ecology, or socio-technical systems (e.g.Tesfatsion & Judd ; Grimm & Railsback ; Nikolic et al. ).
. However, the computational nature of ABMs can make them more di icult to understand and communicate than analytical models (Grimm et al. ).Without the use of standard frameworks to structure their analysis and documentation, ABMs may yield ad hoc, poorly reproducible results (Thiele ).Di erent initiatives are attempting to address this gap, such as the ODD and TRACE protocols for documentation (Grimm et al. ; Schmolke et al. ).
. In practice, these documentation protocols are easier to apply when supported by suitable computational tools -for instance to generate experimental designs for uncertain inputs, visualize output data, or apply standard statistical methods.While many agent-based modelling platforms include basic analysis tools, these are typically not su icient to meet the requirements of a comprehensive analysis and documentation process.Conversely, using standalone analysis so ware to process input and output data files can quickly become unwieldy for complex models -making the analysis workflow more di icult to reproduce.
. The literature therefore presents di erent connectors to directly interface agent-based modelling so ware with analysis environments.In particular, the popular open-source NetLogo modelling so ware can be linked at runtime with Mathematica (Bakshy & Wilensky ) and R (Thiele et al. ), which allows modellers to use the comprehensive analysis and visualization functionalities available in these programming languages.

.
As a complement to these connectors, this work introduces the pyNetLogo library, which can be used to control NetLogo through the Python programming language.Python is a general-purpose language which is consistently ranked as one of the five most popular languages on the TIOBE Programming Community index (TIOBE ); it is increasingly used for scientific computing, and o ers a variety of libraries which can support ABM development and testing.It should be emphasized that pyNetLogo is not intended as a replacement for the existing R and Mathematica connectors, or as a comment on the suitability of these various environments for ABM analysis.However, given the popularity of the Python language, pyNetLogo extends the benefits of a specialized analysis environment to a broader audience.

.
The following section of this paper describes the di erent so ware platforms used in this work.A so ware implementation section then introduces pyNetLogo and its key features, and illustrates these mechanisms for a simple predator-prey model.As an example of the analysis workflow which is enabled by pyNetLogo, this model is controlled interactively from a Python environment, then tested using a global sensitivity analysis.

So ware Description
NetLogo .

NetLogo (Wilensky
) is an open-source environment for the design and testing of agent-based models.While NetLogo was initially intended as an educational tool, its ease of use, robust performance and active user community have made it a pragmatic choice for a wide range of research applications (Kravari & Bassiliades ; Railsback et al. ).It has therefore established itself as a leading platform for agent-based modelling (Thiele ).
. NetLogo is primarily implemented in Java and Scala, and includes a range of functions and methods to support the rapid development of spatially-explicit agent-based models.Railsback et al. ( ) further discuss strategies and techniques to improve the performance of more complex NetLogo models.In addition to connectors for Mathematica and R, di erent extension modules are available, for instance to interface NetLogo models with GIS datasets.In particular, an extension for Python (Head ) o ers a converse functionality to the pyNetLogo connector, by allowing Python code to be executed from a NetLogo model.

Python .
Python is a widely used high-level, general-purpose open source programming language that supports various programming paradigms.Python places a strong emphasis on code readability and code expressiveness.A large collection of libraries for many typical programming tasks is readily available.Python is increasingly popular for scientific computing purposes due to the rapidly expanding scientific computing ecosystem available for Python.
. This ecosystem includes NumPy (Walt et al. ) and pandas (McKinney ) for data manipulation, SciPy (Jones et al.
) for general numerical tasks, Matplotlib (Hunter ) for plotting and visualization, as well as Jupyter and IPython (Pérez & Granger ) for interactive analysis.These libraries are pre-packaged in several scientific distributions for Python, such as Continuum Anaconda.Additional libraries can be installed through standard package managers such as pip and conda.

.
Python is o en used as a "glue"Ãĺ language, meaning that it connects pieces of so ware written in di erent languages together into a bigger application.For instance, the JPype library (Menard & Nell ) can be used to access Java class libraries through interfacing the Python interpreter and the Java Virtual Machine.PyNetLogo therefore relies on JPype for interacting with NetLogo.

So ware Implementation
. This section first describes basic interactions between the Python environment and a NetLogo model, using the pyNetLogo connector.These interactions are demonstrated using the simple wolf-sheep predation example which is available in NetLogo's model library.This functionality is then extended to illustrate a typical model analysis workflow, using the SALib Python library (Herman & Usher ) to perform a global sensitivity analysis.
. The model files used for these examples are available from the pyNetLogo repository at https://github.com/quaquel/pyNetLogo, along with interactive Jupyter notebooks which replicate the analysis and visualizations presented in this paper.Detailed documentation and installation notes for pyNetLogo are provided at http: //pynetlogo.readthedocs.io.The pyNetLogo connector can be installed using the pip package manager, using the following command from a terminal or command prompt: p i p i n s t a l l pyNetLogo .
The pyNetLogo connector has been tested with NetLogo ., .and .using the -bit Continuum Anaconda .and .Python distributions.Using these distributions, pyNetLogo requires the additional installation of JPype (available through the conda package manager).The pyNetLogo connector is currently also included in the Exploratory Modeling Workbench Python package (Kwakkel ), which o ers support for experiment design and exploratory modeling and analysis.

Controlling NetLogo through Python with pyNetLogo
.
The pyNetLogo package is composed of a Python module (core.py)and a Java JAR file (netlogolink.jar).The Python module defines a NetLogoLink class; an instance of this class is used to handle interactions on the Python side.The Python and Java environments are linked with the JPype package through the Java Native Interface (JNI).On the Java side, the JAR file provides a corresponding NetLogoLink Java class in two versions, for NetLogo .xand . .An instance of the appropriate Java class in turn communicates with the NetLogo API.This allows for bidirectional data exchanges between a Python environment (which can for instance be an interactive Jupyter notebook) and a NetLogo model at runtime, with appropriate data type conversions between the two environments. .Table summarizes the basic methods available through the NetLogoLink Python class.These are intended to provide "building blocks" for the interactive analysis of NetLogo models with Python, and largely replicate the basic capabilities of the RNetLogo connector for the R environment (Thiele et al. ).Further details are provided at http://pynetlogo.readthedocs.io. .To illustrate the functionality of pyNetLogo, a simple example follows below, using the wolf-sheep predation model which is included in the NetLogo .example library.The Jupyter notebook available from the pyNetLogo repository replicates this example and demonstrates the key methods of the pyNetLogo connector in more detail, using a slightly modified version of the model to test a broader range of data types.

Name
. First, a link to NetLogo is instantiated.This involves starting a Java VM, followed by starting NetLogo.All interactions with NetLogo are handled by an instance of the NetLogoLink class.Note that when using Linux, the NetLogoLink class requires the netlogo_home and netlogo_version parameters to be set manually.If these parameters are not set on Mac or Windows, the class will attempt to identify and use the most recent NetLogo version found in the default program directory.
. Next, we can load a model using the load_model method, followed by basic commands to set up the model and run it for ticks.The report method is then used to return NumPy arrays to Python, containing the NetLogo coordinates of the "sheep" agents, and the energy attribute of the "sheep" and "wolf" agents.These arrays can then for instance be used with conventional Python functions to plot the coordinates of the agents, or the distribution of energy across agents (Figure ). .Building on this functionality, the repeat_report method returns a pandas DataFrame containing reported values over a given number of ticks, for one or multiple NetLogo reporters.The DataFrame is structured using columns for each reporter, and indexed by NetLogo ticks.By default, this assumes the model is executed with the NetLogo go command; this command can be changed by specifying an optional go argument when calling the method.
. In this case, we can first track the count of both agent types over ticks.The outcomes are first plotted as a function of time on the le panel of Figure .On the right panel, the number of sheep agents is then plotted as a function of the number of wolf agents, to approximate a phase-space plot.

.
The repeat_report method can also be used with reporters that return a NetLogo list.In this case, the list will be converted into a NumPy array, which is formatted according to the data type returned by the reporter (i.e.numerical or string data).for a comprehensive overview).This is especially useful for models in which interactions between parameters can be expected to be significant.A simple example of GSA would be generating a Monte Carlo sample of all uncertain inputs, then applying a multiple linear regression to the model output.

.
For more complex, non-linear models, variance-based approaches such as Sobol indices (Sobol' ) can accurately capture each parameter's contribution to the variance of model output.Sobol indices are computed using variance decomposition; first-order and total indices respectively estimate the fractional contribution of each input to output variance on its own, and inclusive of interactions with other inputs.Second-order indices can also be computed to estimate the contribution of pairwise variable interactions towards output variance.However, this type of variance-based analysis requires specific techniques for input sampling and output analysis.

.
In this context, the SALib library provides sampling and analysis modules for methods including Sobol indices, Morris elementary e ects (Campolongo et al.
; Morris ), and derivative-based global sensitivity measures (Sobol' & Kucherenko ). Integrating these methods within a NetLogo workflow significantly extends the functionality of NetLogo's BehaviorSpace tool, which has limited sampling options.This example will use SALib to estimate Sobol indices; although these indices accurately represent input importance, their calculation may require a large input sample size to yield stable results.For complex models which may be too timeconsuming to simulate over such an ensemble of experiments, the Morris elementary e ects technique can instead be used from SALib to "screen" non-influential variables at a smaller sample size, while still accounting for parameter interactions and non-linearities which may be missed by a "one-at-a-time" approach.Ayllón et al. ( ) describe an application of this method for a complex NetLogo model.
. The SALib sampler will then generate an appropriate experimental design based on the analysis technique to be used.To calculate first-order, second-order and total Sobol sensitivity indices, this gives a sample size of n(2p + 2), where p is the number of input parameters, and n is a baseline sample size which should be large enough to stabilize the estimation of the indices.
. For this example, we use n = 1000, for a total of experiments.The next subsection will demonstrate the use of ipyparallel to parallelize the simulations and reduce runtime. .
Assuming we are interested in the mean number of sheep and wolf agents over a timeframe of ticks, we first create an empty DataFrame to store the results.We then simulate the model over the experiments, by reading input parameters from the param_values array generated by SALib and using the repeat_report method to track the outcomes of interest over time.   .
The sheep-gain-from-food parameter has the highest S and ST indices, indicating that it contributes roughly % of output variance on its own, and over % when accounting for interactions with other parameters.However, the first-order confidence bounds are overly broad due to the relatively small n value used for sampling (i.e. ), so that a larger sample would be required for reliable results.
. We can use a more sophisticated visualization to include the second-order pairwise interactions between inputs, shown in Figure .The size of the ST and S circles correspond to the normalized total and first-order indices, and the width of connecting lines between variables indicates the relative importance of their pairwise interactions on output variance. .In this case, the sheep-gain-from-food variable has strong interactions with the wolf-gain-from-food and wolfreproduce inputs in particular, as indicated by their thicker connecting lines.

Using ipyparallel for parallel simulation
. ipyparallel is a standalone package (available through the conda package manager) which can be used to interactively run parallel tasks from IPython on a single PC, but also on multiple computers.On machines with multiple cores, this can significantly improve performance: for instance, the multiple simulations required for a sensitivity analysis are easy to run in parallel.This subsection will repeat the global sensitivity analysis presented in the previous subsection, this time using ipyparallel to distribute the simulations across multiple cores on a single computer.The code fragments assume the analysis is executed from a Jupyter notebook; as with the previous examples, the full notebook is available from the pyNetLogo repository.
. ipyparallel first requires starting a controller and multiple engines, which can be done from a terminal or command prompt with the following: i p c l u s t e r s t a r t −n .
The optional −n argument specifies the number of processes to start ( in this case).By default, the number of logical processor cores will be used.
. Next, we can connect the interactive notebook to the cluster by instantiating a client (within a notebook), and checking that client.idsreturns a list of available engines.
i m p o r t i p y p a r a l l e l c l i e n t = i p y p a r a l l e l .C l i e n t ( ) p r i n t ( c l i e n t .i d s ) .
A er defining the SALib problem dictionary and input sample as in the previous subsection, we can then set up the engines so that they can run the simulations, using a "direct view" that accesses all engines.We first need to ensure the engines can access the current working directory in order to find the NetLogo model.We can then also pass the SALib problem definition dictionary to the engines. .
The %%px command can be added to a notebook cell to run it in parallel on each of the engines.Here the code first involves some imports and a change of the working directory.We then start a link to NetLogo, and load the example model (assumed to be in the working directory) on each of the engines. .
We can then use ipyparallel's map functionality to run the sampled experiments, now using a "load balanced" view to automatically handle the scheduling and distribution of the simulations across the engines.This is useful when simulations may take di erent amounts of time.
. We first slightly modify the simulation code used previously, setting up a simulation function that takes a single experiment (i.e. a vector of input parameters) as an argument, and returns the outcomes of interest in a pandas Series.
d e f r u n _ s i m u l a t i o n ( e x p e r i m e n t ) : # S e t t h e i n p u t p a r a m e t e r s f o r i , name i n enumerate ( problem [ ' names ' ] ) : i f name == ' random−seed ' : # The NetLogo random seed r e q u i r e s a d i f f e r e n t s y n t a We then create a load balanced view, and run the simulation with the view's map_sync() method.This method takes a function and a Python sequence as arguments, applies the function to each element of the sequence, and returns results once all computations are finished.In this case, we pass the simulation function and the array of experiments (param_values), so that the function will be executed for each row of the array.
. The DataFrame constructor is used to immediately build a DataFrame from the results (which are returned as a list of Series). .
We can then proceed with the analysis as in the previous subsection.Figure compares the runtimes obtained with ipyparallel and a sequential simulation (using an Intel i -HQ CPU) for experiments.The elapsed parallel runtime is approximately one-third of the sequential runtime; given that we were using engines, this is slightly more than could be expected from a perfectly parallel computation, due to the overhead involved in data exchanges, etc.

Conclusions
. The analysis and communication of agent-based models can benefit from the comprehensive analysis features which are available in specialized so ware environments.To this end, this paper first introduced the pyNetLogo connector, which interfaces the NetLogo agent-based modelling so ware with a Python environment.This connector provides basic command and reporting functionalities similar to the RNetLogo package in R.These features were illustrated using one of NetLogo's sample models.As an example of the more complex analyses which are enabled by a Python interface, the SALib Python library was then used for a Sobol variance-based global sensitivity analysis of the model.This analysis was performed using sequential simulations, then parallelized for improved performance using the ipyparallel library.
. The current implementation of pyNetLogo relies on a Java Native Interface (JNI) through the JPype library, which allows Java classes (and thus NetLogo) to be called from Python.However, this does not support a bidirectional linkage through which a NetLogo model could also directly execute Python code.For applications in which this functionality would be helpful (for instance by using more advanced statistical or geospatial functions in NetLogo models), the Python extension for NetLogo can instead be used to execute Python code from NetLogo through a JSON interface.As a complement to existing interfaces which link NetLogo with R or Mathematica, the combination of these tools thus allows modellers to extend NetLogo's capabilities with Python's extensive ecosystem for scientific computing.

Figure :
Figure : Interactions between Python and NetLogo.
Figure : Basic plots generated in Python: agent coordinates (le ); distribution of energy attribute across agents (right).
Figure : Python plots using repeat_report method: number of agents as a function of time (le ); number of sheep agents as a function of wolf agents (right).

Figure :
Figure : Python plot using patch_report method: distribution of the countdown patch attribute across the NetLogo environment.
from S A L i b .sample i m p o r t s a l t e l l i from S A L i b .a n a l y z e i m p o r t s o b o l n = # G e n e r a t e s an i n p u t a r r a y o f shape ( n * ( p + ) , p ) w i t h rows f o r each # e x p e r i m e n t and columns f o r each i n p u t param_values = s a l t e l l i .sample ( problem , n , c a l c _ s e c o n d _ o r d e r = True ) r e s u l t s = pd .DataFrame ( columns = [ ' Avg .sheep ' , ' Avg .wolves ' ] ) f o r run i n r a n g e ( param_values .shape [ ] ) : # S e t t h e i n p u t p a r a m e t e r s f o r i , name i n enumerate ( problem [ ' names ' ] ) : i f name == ' random−seed ' : # The NetLogo random seed r e q u i r e s a d i f f e r e n t s y n t a x n e t l o g o .command ( ' random−seed { } ' .f o r m a t ( param_values [ run , i ] ) ) e l s e : # O t h e r w i s e , assume t h e i n p u t p a r a m e t e r s # a r e g l o b a l v a r i a b l e s n e t l o g o .command ( ' s e t { } { } ' .f o r m a t ( name , param_values [ run , i ] ) ) n e t l o g o .command ( ' setup ' ) # Run f o r t i c k s and r e t u r n t h e number o f sheep and w o l f a g e n t s a t # each time s t e p c o u n t s = n e t l o g o .r e p e a t _ r e p o r t ( [ ' count sheep ' , ' count wolves ' ] , ) # F o r each run , s a v e t h e mean v a l u e o f t h e a g e n t c o u n t s o v e r time r e s u l t s .l o c [ run , ' Avg .sheep ' ] = c o u n t s [ ' count sheep ' ] .v a l u e s .mean ( ) r e s u l t s .l o c [ run , ' Avg .wolves ' ] = c o u n t s [ ' count wolves ' ] .v a l u e s .mean ( ) .We can then proceed with the analysis, first using a histogram to visualize output distributions for each outcome as shown in Figure .

Figure : .
Figure : Output distributions for the average number of sheep agents (le ) and wolf agents (right) over ticks.

Figure : .
Figure : Scatter plots with linear trendlines for the average number of sheep agents as a function of each input parameter.

Figure :
Figure : First-order, second-order and total Sobol indices for the average number of sheep agents.
d i r e c t _ v i e w = c l i e n t [ : ] i m p o r t os # Push t h e c u r r e n t w o r k i n g d i r e c t o r y o f t h e notebook t o a # " cwd " v a r i a b l e on t h e e n g i n e s t h a t can be a c c e s s e d l a t e r d i r e c t _ v i e w .push ( d i c t ( cwd= os .getcwd ( ) ) ) # Push t h e " problem " v a r i a b l e from t h e notebook t o a # c o r r e s p o n d i n g v a r i a b l e on t h e e n g i n e s d i r e c t _ v i e w .push ( d i c t ( problem = problem ) ) %%px i m p o r t os os .c h d i r ( cwd ) i m p o r t pyNetLogo i m p o r t pandas as pd n e t l o g o = pyNetLogo .N e t L o g o L i n k ( g u i = F a l s e ) n e t l o g o .load_model ( r ' W o l f Sheep P r e d a t i o n _ v .nlogo ' ) x n e t l o g o .command ( ' random−seed { } ' .f o r m a t ( e x p e r i m e n t [ i ] ) ) e l s e : # O t h e r w i s e , assume t h e i n p u t p a r a m e t e r s a r e g l o b a l v a r i a b l e s n e t l o g o .command ( ' s e t { } { } ' .f o r m a t ( name , e x p e r i m e n t [ i ] ) ) n e t l o g o .command ( ' setup ' ) # Run f o r t i c k s and r e t u r n t h e number o f sheep and w o l f a g e n t s a t each time # s t e p c o u n t s = n e t l o g o .r e p e a t _ r e p o r t ( [ ' count sheep ' , ' count wolves ' ] , ) r e s u l t s = pd .S e r i e s ( [ c o u n t s [ ' count sheep ' ] .v a l u e s .mean ( ) , c o u n t s [ ' count wolves ' ] .v a l u e s .mean ( ) ] , i n d e x = [ ' Avg .sheep ' , ' Avg .wolves ' ] ) r e t u r n r e s u l t s .
l v i e w = c l i e n t .l o a d _ b a l a n c e d _ v i e w ( ) r e s u l t s = pd .DataFrame ( l v i e w .map_sync ( s i m u l a t i o n , param_values ) )

Figure :
Figure : Comparison of runtimes for sensitivity analysis ( total experiments), using sequential and parallel simulations.