Catch Me if You Can: Using a Threshold Model to Simulate Support for Presidential Candidates in the Invisible Primary

The invisible primary is an important time in United States presidential primary politics, as candidates gain momentum for their campaigns before they compete formally in the first state caucus (Iowa) and primaries (e.g. New Hampshire). This critical period has not been possible to observe, hence the name. However, by simulating networks of primary followers, we can explicate hypotheses for how messages travel through networks to affect voter preferences. To do so, we use a threshold model to drive a simulated network analysis that tests the spread of public support for candidates in invisible primaries. We assign voter thresholds for candidates and vary the number of voters, attachment to candidates, and decay. We also vary the social graph structure and model. Results of the algorithm show effects of the size of the lead, an unwavering base of support, and information loss.


Introduction
When Donald Trump descended the escalator in Trump Tower in June to announce his candidacy for the Republican Party nomination for president, few pundits gave him much of a chance of winning the party primary, much less the general election. But shortly after his entrance in the race, he jumped into the polling lead (Schleifer ) and dominated media coverage of the race (Patterson ). Trump was able to parlay this consistent edge in support during the invisible primary (the time between when people announce their candidacy and when the first votes are cast in the first state, Iowa) into securing the nomination.
On one hand, Trump's ability to win the nomination was not surprising given his level of support in the invisible primary. Candidates with higher polling numbers during the invisible primary are more likely to win the nomination (Adkins & Dowdle ; Dowdle et al. ; Steger , ; Donovan & Hunsaker ). Furthermore, a lead in the polls often aligns with a significant amount of media coverage. While Steger ( ) notes that the media does not tell people what to think, it does influence what primary voters think about. Although media attention may not directly factor into who wins a nomination (Steger ), other scholars note that it can indirectly influence the contest through fundraising during the invisible primary (Mutz ; Damore ).
On the other hand, many scholars and data journalists were skeptical of Trump's chances of winning the Republican Party nomination because they predicted that party elites would rally against him by endorsing another candidate (Cohn ; Bycoffe ; Prokop ; Sides ). Winning endorsements is important because it signals to donors, party activists, and party organizers that a particular candidate is both viable and preferred by the party elite (Cohen et al. ). While receiving endorsements can help a candidate win the nomination (Summary ; Steger ; Whitby ), Steger ( ) notes that party officials should be generally unified in order to maximize this effect.
These scholars make it clear that where candidates stand at the end of the invisible primary is important for understanding which candidates win the major parties' presidential nominations. As such, it is worthwhile to explore how candidates gain an edge during this critical time period, particularly since there is a high level of volatility in the public's candidate preferences during presidential primaries, in part due to shifts in public attention from one candidate to another (Steger ). While some scholars explore which factors impact the length of a candidacy and include the invisible primary in their analysis (e.g. Shen ; Norrander ), the preponderance of the presidential primary literature focuses on the nomination outcome.
The nature of the invisible primary is such that conducting simulations can be helpful. First, primary elections lack the traditional partisan cues that are present in general elections, leaving voters to rely on other mechanisms to decide which candidate to support, such as candidate characteristics (e.g. Jackman & Vavreck ; Norrander ; Campbell ), ideology (e.g. Wattier ; Bartels ), issues (e.g. Bartels ), viability (e.g. Abramson et al. ; Collingwood et al. ; Bartels ), or some combination thereof (e.g. Stone et al. ; Abramowitz ). Second, no actual voting occurs during the invisible primary, making it difficult for candidates to create momentum surrounding their campaigns (e.g. Steger ; Norpoth & Perkins ; Adkins & Dowdle ).
In this manuscript, we utilize an interdisciplinary approach that uses a threshold model of social interaction with a political outcome: which candidate wins at least a plurality of support at the end of the invisible primary.
To do this, we conduct a network analysis simulation using three candidates while varying the size of the primary electorate. Such analyses are not unprecedented when exploring election dynamics (e.g. Lou et al. ) but we have not seen many models that include flexibility of supporter intensity (though Rolfe , models unconditional cooperators in her models of voting turnout) and the existence of decay.
We find that frontrunners are more likely to lead at the end of the invisible primary than non-frontrunners, but only under certain conditions. Specifically, there are three components to a candidate holding the lead prior to the Iowa caucuses: a sizable lead in the polls at the onset of the race, an environment in which there is little information decay, and an unwavering base of support. The absence of any of these factors can lead at least a plurality of voters to be undecided, thus creating uncertainty as to who will win the nomination. These results suggest that campaigns must use tools to keep their supporters informed and strengthen their allegiance to the candidate.

Public attention and networks
Campaigns are "attempts by competing partisan elites to reach citizens with political communications and persuade them to a point of view" (Zaller , p. ). However, the transmission of these communications is mediated by social factors and influence (Lazarsfeld et al. ; Berelson et al. ). Thus, although most communications are elite-initiated, the dynamics of mass communication deserve attention as well.
Recently, Swearingen et al. ( ) posited that the analysis of primary campaigns relies excessively upon elite-level variables and found that a mass component called public attention (Ripberger ) is important in predicting primary results. Public attention can be understood as the result of information flows through individuals' networks. Individuals are organized into social networks with ongoing patterns of relationship with other members (Granovetter , ) that "encourage some views and opinions while discouraging others" (Huckfeldt & Sprague ). The decisions of most actors are conditional on the behavior of others, and people tend to care more about the actions and opinions of those closer to them (Hu et al. ; Siegel ; Rolfe ; Siegel ).
An individual's position in a network, the size of their network (Rolfe ; Siegel ) and the consequent types of information flows have been found to have an array of economic (Rolfe ; Granovetter ), social (Takács moderate to high levels of structural consolidation (Centola ). Actually casting a vote for one primary candidate vs. another is a behavior rather than only an adoption of complex communication. Centola ( ) finds that adoption of behaviors is much more likely in a lattice network, where reinforcement is common, than in other types of networks, which may have many loose ties.
Supporting the notion of a political campaign as complex communication, Romero et al. ( ) find significant differences based on content in the way messages on Twitter diffuse. Relative to other types of content, political messages have a higher "stickiness," which is the probability of adoption after one or more exposures. They also find that political messages have higher "persistence," which is the extent to which repeated exposures continue to have significant marginal effects on the probability of adoption. The transmission of political and other kinds of potentially controversial information through a network may also subject it to selective disclosure (Cowan & Baldassarri ), where members of networks do not express disagreement with an idea but only agreement. Collective action is also less likely if networks connected by information and connective technology (ICT) are not strong (Hu et al. ).

Threshold model
We posit that a type of social contagion model called a threshold model (Granovetter ) explains levels of public attention. In this model, each actor displays inertia but will act after his or her threshold (the number or proportion of others who first make a similar decision) has been reached (Watts ). For example, a primary voter may be willing to vote for a candidate only if he or she knows at least two people who will vote for that candidate. In this example, the voter's threshold for that candidate is two. This primary voter may have different thresholds for different candidates, and a different primary voter may have a different threshold for each candidate.
A threshold model allows a focus not only on individuals and their preferences but also on how those individuals interact and how preferences aggregate (Granovetter ). Such a model also fits well with the dynamics of mass political information and voting behavior. Voters possess variable but on average low amounts of poorly integrated political information (see, for example, Carpini & Keeter or for a more current example Bowyer et al. ). As a consequence, the typical voter has little ability to absorb, integrate or resist political messages (Zaller ). Adoption of political messaging, then, becomes a matter of repeated exposure. Although individuals vary in their susceptibility to influence based on political knowledge and their values-driven dispositions, more exposure increases the likelihood of adoption.
The exposure must be reinforced, however, or a decay process sets in wherein people forget the messages they have adopted unless they have heard them recently (Zaller ). Thus, exposure to political messages immediately prior to a campaign should figure most prominently in spurring voting (Centola & Macy ). Further, the process of decay may cause an unraveling of support across the network. If one actor in the network fails to regularly communicate a political message, then those other actors whose thresholds have been reached in part by the first actor may not support the candidate any longer and so on.

Threshold Models and Primaries
In this section, we explain why our model is suited for the primary election context. To that end, our model is appropriate for the specific contexts of primaries as elections with low structure, low salience, and low information, characteristics often exacerbated by multiple candidates and lengthy campaigns. Because of these features, the activated networks tend to be of small size but higher impact because of the low likelihood of exit and greater diversity of messages. We explain these unique factors of primaries and their effects on networks below.
A threshold model is especially useful in analyzing situations "where many actors behave in ways contingent on one another, where there are few institutionalized precedents and little preexisting structure" (Granovetter ). Primaries fit this description well, as they are low salience campaigns with low amounts of information and messaging where party labels cannot provide a heuristic for vote choice.

Elections, general or primary, "involve equal members of a generic, loose-knit community" (Rolfe , p. ). However, there are also some key differences between the two types of elections. People tend to organize themselves into homophilous networks, meaning they connect themselves to people similar to themselves (Santoro & Beck ; Hu & Keller ; Centola & Macy ). There is evidence that the two major American political parties have unique social networks (Grossmann & Dominguez ). This does not mean that the structures are inherently different; rather, they have separate constituencies and coalitions. Primaries are an interesting case in that, because of lower salience, we would not ordinarily expect primary voters to select only other network members who agree with their primary candidate preferences. In other words, they have selected into party networks based on the general election, not for a primary. Similarly, network exit based on different primary candidate preferences is unlikely.
There is also an interaction between salience and size of the relevant network. In less salient elections, the number of relevant actors in a member's network is smaller than in more salient elections. Candidates and supporters have less incentive to reach out and campaign as hard and so network members are less likely to be exposed to new messages. Thus, in selecting a candidate, voters are more likely to rely on the opinions of their family members and perhaps a close friend or two (Rolfe ).
Due to the low information available in a primary, we also expect networks to be more influential during primaries than during general elections. Consequently, network members are less able to resist campaign messages (Zaller ). Since network members are not exiting (because they are in the network because they are homophilous with respect to partisan preferences) and since they are less able to resist the messages they do receive (due to low information), they will continually receive high impact messages from their networks. Thus, we reason that networks during primaries provide a potent mechanism for persuasion.
As another consequence of members not opting out of networks with diverse messages, they are more likely to encounter such messages in the primaries than in the general election. The general election vote choice literature finds that network members exposed to diverse communication are more likely to change their minds than those exposed only to repeated similar messages, although possibly less likely to be able to decide (Santoro & Beck ). As a result, loyal supporters or unconditional actors (those who support a candidate regardless of the messaging around them) may be critical to primary success. The percentage of unconditional actors in a network is critical to vote turnout (Rolfe ) and we expect that factor to be important in primary success as well.
Another important difference between general elections and primary elections is that in the United States, general elections generally provide a binary choice to voters (voter turnout is also a binary choice) while primary voters often must select from several candidates. Models based on binary opinion such as the threshold q-voter model (Vieira & Anteneodo ) may serve better to emulate general election behavior. Another binary variant is the consensus model, where opinions are real numbers between and and an agent takes the average of compatible neighbors having a difference of opinion less than some threshold value. Fortunato relates the convergence threshold to the behavior of the average degree of the social graph as the population increases ( ). Lou et al. ( ) note the importance of creating a model that can account both for multiple candidates and for heterogeneity in individual preferences. By allowing for multiple candidates within the same network, our model is suited for a primary election rather than a general election with two major party candidates. To date, the literature describing threshold models for diffusion in a voter network with multiple candidates often concentrates on the Influence Maximization Problem, that is, how to select the necessary seeds at the beginning in order to get a maximum of the voters to prefer one candidate over the others. For example, see Lu et al. ( ) or Kim et al. ( ). While this information would doubtless be very useful, we see our contribution as a logically prior step to maximizing influence.

Finally, the presidential primary process is much longer and drawn out than its general election counterpart.
Major candidates routinely announce their candidacies -months prior to the Iowa caucuses and upwards of -months before the major party nomination conventions. By comparison, the general election formally kicks off (per Federal Election Commission finance regulations) after the conventions, leaving roughly -months of campaigning. This is important for our decay parameter, as decay is more relevant over the primary's drawn-out process. While this model may be modified for a general election context, its current specification is better suited for the longer, slower-paced invisible primary.

Theory and model expectations
Based on the above discussion, we derive the following expectations. Elites direct primary campaigns and other types of political campaigns at individuals in an attempt to persuade them to adopt the elites' messages. Primary voters are situated variably within communication networks, which act as a filter upon the type and amount of information they receive. Each person in the network has a threshold for each primary candidate, which is determined by his political information, awareness and values. The level of political information across voters varies but is on average quite low. Thus, most voters are unable to resist a campaign message when it has reached a certain threshold (which varies stochastically across voters). Some voters will have a very low adoption threshold for one candidate and a very high threshold for other candidates (i.e. true believers).
When a voter's threshold has been reached for a candidate, he becomes persuaded by that candidate. He signals his support to others in his network, which counts towards reaching the threshold of his neighbors. Because of the process of decay described above, however, the message he receives from his neighbors must be reinforced or his threshold may no longer be met and he may withdraw support. This loss of support could cause cascading decreases in support among his neighbors if their thresholds are no longer met. Thus a true believer, though in the minority of cases, is quite valuable to a candidate because he or she can both spread support and resist the effects of decay.
We explicate the following model expectations (E -):
E : Ceteris paribus, the initial frontrunner is more likely to win the simulated invisible primary than other candidates.
E : As the size of the initial lead between the frontrunner and runners-up decreases, undecided voters are more likely to become a plurality of the invisible primary electorate.
E : As a candidate's proportion of true believers increases, that candidate is more likely to win the simulated invisible primary.
E : (a) As a candidate's proportion of true believers decreases, the proportion of undecideds increases. (b) This relationship will be more pronounced under conditions of decay.
E : Under conditions of no decay, the frontrunner is more likely to win the simulated invisible primary than under conditions of decay.
E : As decay increases, the likelihood that the frontrunner wins the simulated invisible primary decreases.

Methods
In this section, we describe the underlying algorithm for the Threshold Network Diffusion Model (see model and output data at https://github.com/lseiter/threshold-model). This model consists of a network of voters with various threshold levels for the candidates, and an iterative sequence of preference matrices, which record the preferred candidates for the voters per iteration. The preferred candidate for each voter is determined by the percentages of support that the voter's neighbors have for each candidate in comparison with the voter's threshold levels.

Definition and initialization of elements of the threshold network diffusion model
The model begins with $n$ voters deciding on $k$ candidates. The voters are put in arbitrary order and numbered $1, \dots, n$. The candidates are also put in arbitrary order and assigned numbers $1, \dots, k$. The voters influence each other's preferences via social relationships, which we represent by a simple, connected, undirected graph $G$, such as one given in Figure below. Each node represents a voter; each edge connecting two nodes represents an influencing relationship between the corresponding voters. Two nodes connected by an edge are called "neighbors."
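As a minimal sketch of this setup, the social graph can be held as an adjacency map and checked for the three stated properties (simple, undirected, connected). The helper names `build_graph` and `is_connected` and the five-voter edge list are hypothetical illustrations, not taken from the authors' repository.

```python
from collections import deque

def build_graph(n, edges):
    """Return an adjacency map {voter: set(neighbors)} for voters 1..n.

    Storing each edge in both directions makes the graph undirected;
    using sets and rejecting self-loops keeps it simple.
    """
    adj = {i: set() for i in range(1, n + 1)}
    for u, v in edges:
        if u == v:
            raise ValueError("self-loops are not allowed in a simple graph")
        adj[u].add(v)
        adj[v].add(u)
    return adj

def is_connected(adj):
    """Breadth-first search from an arbitrary voter; True if all are reached."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)

# Hypothetical five-voter network (not the network in the paper's figure).
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1), (2, 5)]
G = build_graph(5, edges)
```

Here `G[2]` lists the neighbors of Voter 2, and `len(G[i])` gives the degree $d_i$ used later when thresholds are drawn.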

Preference matrix
After each iteration of the simulation, a voter will either prefer exactly one candidate over all of the others or have no preference. We record the preferred candidate of each voter at time $t$ as a time-dependent $n \times k$ matrix, $P(t)$. The entries are all 0 or 1. Namely, if Voter $i$ prefers Candidate $\alpha$ over all of the others after the $t$th iteration of the model, then $P_{i\alpha}(t) = 1$. Otherwise $P_{i\alpha}(t) = 0$. Voter $i$ is "undecided" if $P_{i\alpha}(t) = 0$ for all candidates $\alpha = 1, \dots, k$. In Figure , the preference matrix indicates that Voters , and prefer candidate , while voter prefers candidate and voter prefers candidate . The remaining voters are undecided.
The initial preference matrix $P(0)$ begins with all rows being zero, except for the rows of a certain, randomly chosen set of voters called seeds. The proportion of seeds among the voters is recorded as a vector of size $k$, $S = (s_1, \dots, s_k)$. For each $\alpha = 1, \dots, k$, the value $s_\alpha$ represents the proportion of voters in the network that start with a preference for Candidate $\alpha$. The seeds will be randomly chosen from the graph. In Figure , the Seed Vector is given as $S = (0.3, 0.1, 0.1)$. The % initially supporting Candidate were chosen to be Voters , , and ; the % for Candidate was Voter ; and Voter was the sole seed for Candidate .
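The seeding step can be sketched as follows. This is an illustrative reading rather than the authors' implementation: the function name `initial_preferences`, the zero-based indexing, the rounding of `s * n` to a whole number of voters, and the choice of ten voters are all assumptions; only the seed vector $S = (0.3, 0.1, 0.1)$ comes from the text.

```python
import random

def initial_preferences(n, seed_vector, rng):
    """Build P(0): an n-by-k 0/1 matrix with randomly placed seeds.

    seed_vector[a] is the proportion of voters who start out preferring
    candidate a; every other voter starts undecided (an all-zero row).
    """
    k = len(seed_vector)
    P = [[0] * k for _ in range(n)]
    voters = list(range(n))
    rng.shuffle(voters)                  # random, non-overlapping seed choice
    start = 0
    for alpha, s in enumerate(seed_vector):
        count = round(s * n)             # assumption: round to whole voters
        for i in voters[start:start + count]:
            P[i][alpha] = 1
        start += count
    return P

rng = random.Random(0)                   # fixed seed for reproducibility
P0 = initial_preferences(10, [0.3, 0.1, 0.1], rng)
column_totals = [sum(row[a] for row in P0) for a in range(3)]
```

With ten voters, `column_totals` comes out as three, one and one seeds respectively, leaving five undecided rows.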

Neighbor preference matrix
Voters' preferences are determined by those of their neighbors. For each voter, we record the proportion of neighbors who prefer each of the candidates. Specifically, for each $i = 1, \dots, n$ and $\alpha = 1, \dots, k$, we define $N_{i\alpha}(t)$ to be the proportion of the neighbors of Voter $i$ who prefer Candidate $\alpha$ after the $t$th iteration of the simulation.
Figure shows Voter has neighbors , , , and . The Neighbor Preference Matrix N indicates % of Voter 's neighbors support Candidate (Voters , and ), while % support Candidate (Voter ). The rest are undecided. Note also that some voters, such as Voter , are surrounded completely by undecided voters. Hence, Row will consist of all 's.
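A short sketch of how the neighbor preference matrix can be computed from a preference matrix and an adjacency map. The function name `neighbor_preferences`, the zero-based indexing, and the five-voter star network are hypothetical and do not reproduce the paper's figure.

```python
def neighbor_preferences(adj, P):
    """Return N, where N[i][a] is the share of voter i's neighbors
    whose row in P marks candidate a as preferred."""
    n, k = len(P), len(P[0])
    N = [[0.0] * k for _ in range(n)]
    for i in range(n):
        d = len(adj[i])                  # degree of voter i (assumed > 0)
        for a in range(k):
            N[i][a] = sum(P[j][a] for j in adj[i]) / d
    return N

# Hypothetical example: voter 0 has four neighbors (voters 1 to 4).
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
P = [
    [0, 0, 0],   # voter 0: undecided
    [1, 0, 0],   # voter 1: prefers candidate 0
    [1, 0, 0],   # voter 2: prefers candidate 0
    [0, 1, 0],   # voter 3: prefers candidate 1
    [0, 0, 0],   # voter 4: undecided
]
N = neighbor_preferences(adj, P)
```

In this example half of voter 0's neighbors prefer candidate 0 and a quarter prefer candidate 1, while voter 1, whose only neighbor is undecided, has an all-zero row.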

Threshold matrix
Each voter requires a certain proportion of neighbors in order to prefer a given candidate. This proportion is called the threshold level. We put the various threshold levels together as a matrix $T$. Specifically, to prefer Candidate $\alpha$, Voter $i$ needs at least $T_{i\alpha}$ of its total neighbors to prefer Candidate $\alpha$. If multiple thresholds are met, the preference will be given to the candidate with the smallest threshold, or a random pick in the event of a tie.
Note: Either $0 \le T_{i\alpha} \le 1$ or $T_{i\alpha} = \infty$. If $T_{i\alpha} = 0$, then Voter $i$ will always prefer Candidate $\alpha$. If $T_{i\alpha} = 1$, then Voter $i$ will prefer Candidate $\alpha$ only if all neighbors prefer Candidate $\alpha$. If $T_{i\alpha} = \infty$, then Voter $i$ will never prefer Candidate $\alpha$.
The threshold values for the voter network are determined as follows. First, using the proportions given by the Seed Vector, the seeds for each candidate are randomly chosen among the voters. If Voter $i$ is a seed for Candidate $\alpha$, then we set $T_{i\alpha} = 0$ and $T_{i\beta} = \infty$ for all $\beta \neq \alpha$. Second, the threshold values of the nonseeds are randomly determined from a range between $2/d_i$ and , where $d_i$ is the number of neighbors of Voter $i$ in the network. As a consequence, and consistent with the dynamics of complex communication discussed in the literature review, at least two neighbors must support a particular candidate before a node can change preferences. The values of the resulting $n \times k$ matrix $T = (T_{i\alpha})$ never change during the run of the simulation.
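The threshold assignment can be sketched as below. Several details are assumptions made only for illustration: the function name `threshold_matrix`, a uniform draw for non-seeds, an upper bound of 1 on the range (the largest finite threshold the note above permits), and clamping the lower bound to 1 when a voter has fewer than two neighbors.

```python
import math
import random

def threshold_matrix(adj, P0, rng):
    """Build the fixed threshold matrix T from the graph and P(0)."""
    n, k = len(P0), len(P0[0])
    T = []
    for i in range(n):
        if 1 in P0[i]:
            # Seed: threshold 0 for its own candidate, infinity elsewhere.
            alpha = P0[i].index(1)
            row = [math.inf] * k
            row[alpha] = 0.0
        else:
            # Non-seed: at least two supporting neighbors are required.
            low = min(2 / len(adj[i]), 1.0)      # assumption: clamp at 1
            row = [rng.uniform(low, 1.0) for _ in range(k)]  # assumed uniform
        T.append(row)
    return T

# Hypothetical complete graph on five voters, so every degree is 4.
adj = {i: [j for j in range(5) if j != i] for i in range(5)}
P0 = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
T = threshold_matrix(adj, P0, random.Random(1))
```

With degree 4 the lower bound is 2/4, so each non-seed threshold lies between 0.5 and 1, which corresponds to needing at least two of four neighbors.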
In Figure , the threshold levels for Voter are . , . , . for Candidates , and , respectively. Voter preference is assigned based on the lowest threshold that is met. This means that if % of Voter 's neighbors support Candidate , then Voter will support Candidate . However, if the % threshold for Candidate is not met, the next highest threshold of % for Candidate will be checked. Finally, all of Voter 's neighbors need to support Candidate for Voter to support Candidate .
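The selection rule (lowest met threshold wins, with a random pick on ties) can be written compactly. The function name `choose_candidate` and the numeric rows below are hypothetical and are not the values shown in the paper's figure.

```python
import random

def choose_candidate(N_row, T_row, rng):
    """Return the index of the preferred candidate, or None if undecided.

    A candidate qualifies when the neighbor share meets the threshold;
    among qualifying candidates the smallest threshold wins, and ties
    are broken at random.
    """
    met = [a for a in range(len(T_row)) if N_row[a] >= T_row[a]]
    if not met:
        return None
    best = min(T_row[a] for a in met)
    tied = [a for a in met if T_row[a] == best]
    return rng.choice(tied)

rng = random.Random(0)
only_first = choose_candidate([0.5, 0.0, 0.25], [0.4, 0.6, 1.0], rng)
lowest_wins = choose_candidate([0.5, 0.5, 0.0], [0.5, 0.4, 1.0], rng)
undecided = choose_candidate([0.2, 0.2, 0.2], [0.4, 0.6, 1.0], rng)
```

An infinite threshold can never be met by a finite share, and a zero threshold is always met, so seeds fall out of the same rule without special handling.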

Since Voters , and are seeds for Candidate , the corresponding rows in threshold matrix T contain in column , indicating this set of voters does not require any neighbors to convince them to retain their candidate preference, while the value ∞ indicates they will never switch to Candidate or regardless of neighbor preference.

Voter preference update
The simulation is recursive with respect to time $t$. We define the current Neighbor Preference Matrix $N(t)$ by using the neighbor proportions generated from the previous Voter Preference Matrix $P(t-1)$. Then we generate the current Voter Preference Matrix $P(t)$ by comparing the proportions in the current Neighbor Preference Matrix $N(t)$ with the levels given in the Threshold Matrix $T$. The general schematic of the simulation can be visualized as the following functional dependencies.
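One reading of this dependency chain, with $P(t-1)$ determining $N(t)$ and $N(t)$ together with the fixed $T$ determining $P(t)$, is a synchronous update in which every voter's preference is recomputed from the previous matrix. The sketch below assumes that reading; the function name `step` and the five-voter example are hypothetical.

```python
import math
import random

def step(adj, P_prev, T, rng):
    """Compute P(t) from P(t-1): build each row of N(t), then compare with T."""
    n, k = len(P_prev), len(P_prev[0])
    P_next = [[0] * k for _ in range(n)]
    for i in range(n):
        d = len(adj[i])
        # Row i of N(t): neighbor shares taken from the previous preferences.
        N_i = [sum(P_prev[j][a] for j in adj[i]) / d for a in range(k)]
        met = [a for a in range(k) if N_i[a] >= T[i][a]]
        if met:
            best = min(T[i][a] for a in met)
            winner = rng.choice([a for a in met if T[i][a] == best])
            P_next[i][winner] = 1
    return P_next

inf = math.inf
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
T = [
    [0.5, 0.6, 1.0],   # voter 0: non-seed
    [0.0, inf, inf],   # voters 1 and 2: seeds for candidate 0
    [0.0, inf, inf],
    [inf, 0.0, inf],   # voter 3: seed for candidate 1
    [1.0, 1.0, 1.0],   # voter 4: non-seed with a single neighbor
]
P0 = [[0, 0, 0], [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]]
rng = random.Random(0)
P1 = step(adj, P0, T, rng)
P2 = step(adj, P1, T, rng)
```

In this example voter 0 adopts candidate 0 in the first iteration (half of its neighbors meet the 0.5 threshold), and voter 4 follows in the second iteration once its only neighbor has converted.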
In our example (Figure ), the proportion of Voter 's neighbors that prefer Candidate ( %) meets the minimum threshold for Candidate ( %). Since none of Voter 's neighbors prefer Candidate , the % threshold is not met. Similarly, the % that prefer Candidate fail to meet the % threshold. Thus, after the first iteration of the simulation, the initially undecided Voter will set their preference to Candidate . This update is reflected in the voter preference matrix P of Figure .

Convergence of voter preference
The iterations continue until convergence of some sort occurs. After each iteration, we compile the results in a vector $C$ of size $k + 1$, each of the first $k$ entries being the proportion of voters preferring the corresponding candidate and the last entry being the proportion of undecided voters. The simulation ends when the distance between current and previous preference proportions reaches some minimal value.
After the first iteration, % of voters prefer Candidate , while Candidates and both maintain their original %. The remaining % of voters are undecided. The updated voter preference proportion vector is thus ( . , . , . , . ), with the fourth value being the undecided vote.
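The proportion vector and the stopping test can be sketched as follows. The text does not specify the distance measure or the cutoff, so the Euclidean distance and the tolerance `tol` used here are assumptions, as are the function names and the ten-voter example.

```python
import math

def proportions(P):
    """Return C: the share of voters per candidate, then the undecided share."""
    n, k = len(P), len(P[0])
    C = [sum(row[a] for row in P) / n for a in range(k)]
    C.append(sum(1 for row in P if not any(row)) / n)
    return C

def has_converged(C_prev, C_curr, tol=1e-6):
    """Assumed stopping rule: Euclidean distance between successive C vectors."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(C_prev, C_curr)))
    return dist <= tol

# Hypothetical ten-voter state: four voters for candidate 0, one each for
# candidates 1 and 2, and four undecided.
P = [[1, 0, 0]] * 4 + [[0, 1, 0]] + [[0, 0, 1]] + [[0, 0, 0]] * 4
C = proportions(P)
```

A simulation loop would call `proportions` after each update and stop once `has_converged` returns `True` for two successive vectors.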

Initial results
The model was tested with graphs of various size and structure. Figure presents the result of random small world graphs. Small world networks exhibit short distances between nodes and high transitivity or clustering, which many real world networks also display (Watts & Strogatz ). In Figure , the small world graph has size , with each node connected by default to its nearest neighbors, and a % probability of a neighboring edge being rewired to a random node. The seed proportion for candidates and was held constant at % each, while the proportion for candidate ranged from . to . %, with increments of . %.
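A small world graph of the kind described can be generated with a Watts-Strogatz style construction: start from a ring lattice and rewire each lattice edge with a fixed probability. The sketch below is a plain standard-library version; the function name `small_world` and the parameter values (20 nodes, 4 nearest neighbors, rewiring probability 0.1) are illustrative assumptions, not the settings used for the figure.

```python
import random

def small_world(n, k, p, rng):
    """Ring lattice of n nodes, each tied to its k nearest neighbors
    (k even, k < n), with each lattice edge rewired with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for offset in range(1, k // 2 + 1):
            j = (i + offset) % n
            adj[i].add(j)
            adj[j].add(i)
    for offset in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + offset) % n
            if j in adj[i] and rng.random() < p:
                # Pick a new endpoint that is neither i nor already a neighbor.
                options = [v for v in range(n) if v != i and v not in adj[i]]
                if options:
                    new = rng.choice(options)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

graph = small_world(20, 4, 0.1, random.Random(42))
edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
```

Rewiring keeps the number of edges fixed but does not guarantee that the graph stays connected, so a connectivity check may be needed before running the model on a generated graph.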

As Figure indicates, all simulations ended with either a majority of undecided voters or with Candidate , the initially leading candidate, winning. This finding provides preliminary evidence for our first expectation: holding the proportion of True Believers constant and randomizing social network changes, the candidate with the initial lead is more likely to win the invisible primary than other candidates. The seed proportion appears to have a significant role to play in the final outcome of the race. Namely, if the proportion of seeds for Candidate is high enough, then all else equal, Candidate is sure to win the election. On the other hand, if Candidate 's seeds are proportionally low, then Candidate will fare no better than the other candidates and the race will end with mostly undecided voters. According to these results, the bifurcation proportion value for Candidate seems to reside somewhere between . and . . This finding supports our second expectation that as the size of the initial lead decreases, undecided voters are more likely to constitute a plurality of the electorate.
Note that the results in Figure are based on a threshold proportion for non-seed voters that ranges between $2/d_i$ and , where $d_i$ is the degree of Voter $i$ in the network. Thus, a minimum of neighbors must prefer a given candidate, which is consistent with the characteristics of complex communication discussed above. If the minimum proportion is reduced to $1/d_i$, the seed bifurcation for Candidate decreases significantly to approximately . for the same small world graph configuration.

Model extension: True believer proportion
The model described above assumes that all seeds for the candidates are what we call true believers, that is, they start preferring one candidate and will never waver from this position. In order to see if the seed proportion alone accounted for the final result of the election, we adjusted the model so that some of the seeds would not necessarily be true believers.
Figure : True Believer Model.
This model extension differentiates between candidate seeds who are considered true believers and those considered adherents. Both share the same initial candidate preference in voter preference matrix $P$; however, the threshold value for their preferred candidate will differ. A true believer's threshold is set such that their candidate preference will never change, while an adherent's threshold allows a change in candidate preference.
In addition to seed vector $S$, the threshold matrix is initially generated based on the proportions specified in a true believer vector $TB = (tb_1, \dots, tb_k)$. Each $tb_\alpha$ represents the proportion of seed voters $s_\alpha$ that are true believers of Candidate $\alpha$, while $(1 - tb_\alpha)$ represents the proportion of adherents. The voter preference matrix $P(0)$ is generated from the seed vector following the original algorithm. However, we need to adapt how values in threshold matrix $T(0)$ are generated for candidate seeds. If Voter $i$ is a true believer for Candidate $\alpha$, then $T_{i\alpha}(0) = 0$ and all other elements in row $i$ are set to $\infty$ as in the original algorithm. If Voter $i$ is an adherent for Candidate $\alpha$, then a random set of threshold values is generated, with $T_{i\alpha}$ assigned the minimum value in the set. An adherent thus has a threshold > that must be met to retain their initial preference for Candidate $\alpha$.
For example, T B = ( . , . , . ) indicates % of candidate seeds are true believers. However, T B = ( . , . , . ) implies % of candidate seeds are true believers, with the remaining seeds deemed adherents. The example preference matrix P in Figure shows three seeds for candidate (Voters , and ), thus two are randomly chosen as true believers ( , ). The adherent voter is assigned a threshold > for their preferred candidate.
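The split between true believers and adherents can be sketched for the seeds of a single candidate. This is an illustrative reading only: the function name `seed_thresholds`, the rounding of the true believer count, the uniform draw between $2/d_i$ and 1 for adherents, and the value 0.67 are assumptions.

```python
import math
import random

def seed_thresholds(seeds, tb, k, alpha, degree, rng):
    """Return {voter: threshold row} for the seeds of candidate alpha.

    A share tb of the seeds become true believers (0 for alpha, infinity
    elsewhere). The rest are adherents: k thresholds are drawn and the
    smallest one is assigned to the preferred candidate alpha.
    """
    n_tb = round(tb * len(seeds))        # assumption: round to whole voters
    believers = set(rng.sample(seeds, n_tb))
    rows = {}
    for i in seeds:
        if i in believers:
            row = [math.inf] * k
            row[alpha] = 0.0
        else:
            low = min(2 / degree[i], 1.0)
            draws = sorted(rng.uniform(low, 1.0) for _ in range(k))
            row = draws[1:]              # thresholds for the other candidates
            rng.shuffle(row)
            row.insert(alpha, draws[0])  # minimum goes to the preferred one
        rows[i] = row
    return rows

# Hypothetical: three seeds for candidate 0, each with four neighbors.
rows = seed_thresholds([0, 1, 2], 0.67, 3, 0, {0: 4, 1: 4, 2: 4},
                       random.Random(3))
true_believers = [i for i, r in rows.items() if r[0] == 0.0]
```

With three seeds and a share of 0.67, two seeds become true believers and one becomes an adherent whose preferred-candidate threshold is positive and no larger than its other thresholds.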
Figure shows the impact of the true believer proportion on voter preference for Candidate , while candidates and were held constant at . . Note that the rightmost facet with true_believer = . corresponds to Figure  , the default when we assume each seed is a true believer.
We see that, if we lower the proportion of true believers among the seeds, the overall outcome of the campaign gets increasingly hazy, ending often in a majority of undecided voters. This is striking in the rightmost facet, which corresponds to starting the campaign season with a high proportion of Candidate seeds. When all of the seeds are true believers, the campaign ends with an easy victory for Candidate when the seed proportion is sufficiently high. But, when only % of those seeds are true believers, then most of the campaign simulations end with a majority of undecided voters.
These findings support expectation (as a candidate's proportion of true believers increases, he or she is more likely to win the simulated invisible primary), although we only test the expectation for Candidate . They also support expectation (a): as a candidate's proportion of true believers decreases, the proportion of undecideds increases. In some ways, the Democratic primary helps illustrate the evidence in support of expectation . Pollsters do not generally ask respondents if they are true believers, but they do gauge a respondent's level of enthusiasm. As Bump ( ) shows, there was a correlation between enthusiasm and support during the early primaries. Interestingly, it was Hillary Clinton whose supporters were more enthusiastic about their candidate of choice. While this is not conclusive evidence, it helps to shed some light on how increasing a candidate's share of true believers can enhance his/her likelihood of winning.
Similarly, comparing results along the rows shows us the impact of the different proportions of seeds when the true believers are held constant. As might be expected, the results indicate that, if either the seed proportions within the overall population or the true believer proportions within the seeds are too meager, the election will not converge to a solid lead for any of the candidates.
Intriguingly, there is some indication that the seed ratios have more impact than the true believer ratios. The middle facet shows the results for the case where the seed proportion for Candidate is % of the entire voter population but the true believer proportion among these seeds is relatively low (only % of these seeds are true believers). A proportion . × . = . of all voters are therefore true believers and will never waver in their support of Candidate . As the figure shows, voters choose Candidate over % of the time.
We compare this scenario with the results of the rightmost facet (true believer proportion is . ), in which the seed proportion for Candidate is slightly smaller ( . ) but consists entirely of true believers. For this scenario, the overall results are less likely to be positive for Candidate . Very few of the simulations end with a Candidate victory. Thus, we see that a comparatively large population of true believers does not make up for a relatively small group of initial supporters. This is perhaps best illustrated by Ron Paul's candidacy for the Republican Party nomination. A Public Policy Polling survey in December, , just weeks before the Iowa caucuses, asked how committed a respondent was to their top candidate choice. Although Paul placed a distant third behind Mitt Romney and Newt Gingrich, his supporters were the most committed to his candidacy: percent were strongly committed to him, compared to percent of Romney supporters and percent of Gingrich supporters. Ultimately, Paul finished fourth in the popular vote (RealClearPolitics a) and third in the delegate count (RealClearPolitics b).
Figure : Decay.
By default, each voter is weighted equally in computing the neighbor preference matrix N . The final adaptation of the model involves decay in voter influence, which is achieved by modifying voter weight. A voter has a default weight of . Thus, all neighbor preferences are equal in determining whether a given candidate threshold is met. A subset of voters is randomly chosen at each time interval, with each voter in the subset assigned a random weight in the range of . to . . The lower voter weight results in smaller proportions in the neighbor preference matrix, making it more difficult for candidate thresholds to be met.
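A rough sketch of how weight decay could lower the neighbor proportions is given below. It is an illustration under stated assumptions, not the published implementation: the function names, the default weight of 1.0, the reduced-weight range of 0.0 to 0.5, and the choice to divide weighted support by the plain neighbor count are all hypothetical.

```python
import random


def apply_decay(voters, decay_level, low=0.0, high=0.5, rng=None):
    """Return per-voter weights for one time interval.

    Every voter starts at weight 1.0 (assumed default). A random subset of
    size round(decay_level * number of voters) is given a reduced weight
    drawn uniformly from the hypothetical range [low, high].
    """
    rng = rng or random.Random()
    weights = {v: 1.0 for v in voters}
    n_decayed = round(decay_level * len(voters))
    for v in rng.sample(list(voters), n_decayed):
        weights[v] = rng.uniform(low, high)
    return weights


def neighbor_proportion(neighbors, preference, weights, candidate):
    """Weighted share of a voter's neighbors who prefer `candidate`.

    Assumption: weighted support is divided by the plain neighbor count, so
    a reduced weight shrinks the proportion compared with a threshold.
    """
    if not neighbors:
        return 0.0
    support = sum(weights[n] for n in neighbors if preference[n] == candidate)
    return support / len(neighbors)


voters = list(range(6))
preference = {0: "A", 1: "A", 2: "B", 3: None, 4: "A", 5: None}
neighbors_of_3 = [0, 1, 2, 4]  # hypothetical neighborhood of voter 3

no_decay = neighbor_proportion(
    neighbors_of_3, preference, apply_decay(voters, 0.0), "A")
with_decay = neighbor_proportion(
    neighbors_of_3, preference, apply_decay(voters, 1.0, rng=random.Random(0)), "A")
```

Without decay, three of voter 3's four neighbors support candidate A, giving a proportion of 0.75. When every neighbor's weight is reduced, the same neighborhood yields a smaller proportion, so a threshold that was met before may no longer be met.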
Figure shows the results when decay is introduced into the model. Note that the top row in Figure is the same as the plot from Figure , corresponding to no decay. The role of decay is clearly prominent according to these results. In all cases, the presence of decay greatly increases the probability that the election will not converge to a clear winner.
Generally, then, the frontrunner is more likely to win the election when no decay is incorporated. This finding supports expectation , that under conditions of no decay, the frontrunner is more likely to win the simulated invisible primary than under conditions of decay. As decay increases, the likelihood that the frontrunner wins decreases (expectation ). Further, a smaller proportion of true believers under conditions of decay is even more likely to result in a plurality of undecideds than under conditions of no decay. This finding supports expectation (b), that a decrease in the proportion of true believers is associated with an increase in the proportion of undecideds and that the relationship will be more pronounced under conditions of decay.
If such findings can be generalized to most/all campaign information dissemination, there is plenty of room for uncertainty in an invisible primary. This uncertainty can provide an opening for candidates who are not the frontrunner.

Graph variation
As discussed in the literature review, network structure can impact results. To examine the effects of our model in different structures still consistent with elections as involving "equal members of a generic, loose-knit community" (Rolfe , p. ), we used Small World graphs with varying degree and rewiring probability, as well as scale-free Barabási-Albert graphs of varying growth factor, with graphs of size generated per configuration. Figure contains the proportion of simulations that result in a win for Candidate for each graph configuration. The Small World graphs having degree , along with the Barabási-Albert graph (growth factor m = , average degree . ), outperform the graphs with degree , including the Barabási-Albert (m = , degree . ).
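For readers who want to experiment with the two graph families, the sketch below generates them using only the Python standard library. It is a simplified stand-in and not the code used for the reported results: the function names, the graph size of 100, and the specific degree, rewiring, and growth values are assumptions chosen for illustration.

```python
import random


def small_world(n, k, p, rng=None):
    """Watts-Strogatz-style graph: ring lattice of degree k, rewired with probability p."""
    rng = rng or random.Random()
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if rng.random() < p and old in adj[i]:
                # Rewire the edge (i, old) to a node not already adjacent to i.
                choices = [w for w in range(n) if w != i and w not in adj[i]]
                if choices:
                    new = rng.choice(choices)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def barabasi_albert(n, m, rng=None):
    """Preferential attachment: each new node links to m existing nodes."""
    rng = rng or random.Random()
    adj = {i: set() for i in range(n)}
    targets = list(range(m))
    endpoints = []  # each node appears once per incident edge
    for new in range(m, n):
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        endpoints.extend(targets)
        endpoints.extend([new] * m)
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))  # degree-proportional choice
        targets = list(chosen)
    return adj


def mean_degree(adj):
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


sw = small_world(100, 4, 0.1, rng=random.Random(0))   # hypothetical configuration
ba = barabasi_albert(100, 2, rng=random.Random(0))    # hypothetical configuration
```

Rewiring preserves the number of edges, so the Small World graph keeps a mean degree of exactly k, whereas a Barabási-Albert graph with growth factor m has m(n - m) edges and hence a mean degree slightly below 2m. This is why a Barabási-Albert graph is naturally compared with Small World graphs of roughly twice its growth factor in degree.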
While graph configuration has some impact on the outcome, the differences are not as significant as those related to seed and true believer proportion. Figure shows the relative variable importance determined through the construction of a conditional inference tree. Seed and True Believer proportion are the most important predictors of outcome, while graph type and rewiring probability, which affects the mean distance between nodes, are the least important.

Discussion
In this paper, we developed and ran a series of simulations based on the literature about the invisible primary and using a threshold type of network model. Simulations are particularly useful in studying the invisible primary because of the difficulty of testing hypotheses using polling data. Through the simulations, we find three components that are important to a candidate holding the lead prior to the Iowa caucuses. First and intuitively, a sizeable lead in the polls (i.e., seed proportion) at the outset of the race is critical. In our models, the battle is not between the first place and second place candidates. Rather, it is between the first place candidate and the undecideds. Thus, for the candidate doing the best in the polls, it is his or her race to lose. More specifically, we find that when the frontrunner polls above %, it is much more certain that she will maintain her standing going into the Iowa caucus.
The second component that affects the lead candidate's chances is an unwavering base of support (i.e., how true are the true believers). Our results show that as the proportion of true believers decreases for the candidate, his or her chances of being overcome by an undecided electorate increase. This result sheds light on why some frontrunners are able to maintain their lead while others are unable to convince the electorate to continue supporting them. While our initial model showed that gaining % of the lead in the polls is sufficient for the lead candidate to win, that finding was contingent upon very sticky support. As the share of true believers decreases to % of the candidate's support, it is overwhelmingly likely that the electorate will remain undecided.
The third component that is critical to frontrunner success is little to no information decay. When we relax the assumption that nodes (i.e., voters) will automatically communicate their preferences to their neighbors, the frontrunner's chances again diminish. Candidate communications may lapse or voters could become inattentive for a variety of reasons. Even when the lead candidate is polling at % and has an unwavering base of support, decay at a level of . (which we can interpret as neighbors not sending or receiving information % of the time) affects the frontrunner's chances of holding on.
The absence of any of these factors can lead at least a plurality of voters to be undecided, thus creating uncertainty as to who will win the nomination. This uncertainty impacts the race; it can encourage more candidates to enter (e.g., as in the Republican Party in ) or it can lead to elites withholding endorsements. These results suggest that campaigns must do whatever they can to keep their supporters informed and strengthen their allegiance to the candidates.

Conclusion and Future Research
The importance of a polling lead, an unwavering base and full information transmission has been demonstrated in this paper. What we have demonstrated is not how a candidate can be upset by another candidate, but the conditions under which a plurality of the primary electorate will remain undecided, creating favorable conditions for a political upset to occur.
Nonetheless, the results of our model do have empirical referents. Between late August and mid-December of , Donald Trump generally ranged from about -% in the polls. During this time, other candidates such as Ben Carson and Carly Fiorina saw their own support surge. Once Carson's support began dropping, however, Trump surpassed the % level, sustaining his lead until the end of the invisible primary (RealClearPolitics ). This finding suggests that Trump's level of support, coupled with the large field, was significant in cementing his status as the winner of the invisible primary despite his lack of endorsements and fundraising prowess.
Future work should incorporate exogenous shocks into the model. It is our hunch that, under the conditions we have established, a shock could tip the balance from the frontrunner to a different candidate. In the invisible primary, examples of shocks include debate performances, fundraising reports, endorsements or scandals. During the primary phase, this would also include election outcomes in various states.

Model Documentation
The model was programmed using R version . . and the code is available at https://github.com/lseiter/threshold-model.