RecovUS: An Agent-Based Model of Post-Disaster Household Recovery

The housing sector is an important part of every community. It directly affects people, constitutes a major share of the building market, and shapes the community. Meanwhile, the increase of developments in hazard-prone areas along with the intensification of extreme events has amplified the potential for disaster-induced losses. Consequently, housing recovery is of vital importance to the overall restoration of a community. In this regard, recovery models can help with devising data-driven policies that can better identify pre-disaster mitigation needs and post-disaster recovery priorities by predicting the possible outcomes of different plans. Although several recovery models have been proposed, there are still gaps in the understanding of how decisions made by individuals and different entities interact to produce the recovery. Additionally, the integration of spatial aspects of recovery is missing from many models. The current research proposes a spatial model for simulation and prediction of homeowners’ recovery decisions by incorporating recovery drivers that capture interactions of individual, communal, and organizational decisions. RecovUS is a spatial agent-based model for which all the input data can be obtained from publicly available data sources. The model is presented using data on the recovery of Staten Island, New York after Hurricane Sandy in 2012. The results confirm that the combination of internal, interactive, and external drivers of recovery affects households’ decisions and shapes the progress of recovery.


Introduction
Population growth in hazard-prone areas together with the increase in severity of extreme events (Bergholt & Lujala ; Guha-Sapir et al. ; Pravettoni ; Smith ) has raised the potential for disaster losses (Cutter et al. ; Schwartz ). Accordingly, a better understanding of the recovery process is necessary. The recovery process is a continuum of interdependent and mostly concurrent activities during pre-disaster preparedness and post-disaster short-term, intermediate, and long-term recovery. Among the components of this continuum, early-decided policies have a significant effect on the progress of recovery (FEMA ). In this regard, analysis and modeling capabilities could help with capturing the dynamics of recovery and underpinning recovery plans.
Within a community's recovery, housing restoration is of vital importance. Housing is a primary element of people's lives, which influences their well-being by providing a safe and secure place and creating a positive sense of self-worth and empowerment (Bratt ). Furthermore, residential structures constitute the major share of building stock in the United States (Comerio ). In , the number of U.S. houses was about million units (USBC ), and their associated mortgage balance was reported at trillion U.S. dollars in the second quarter of (FRBNY ). Additionally, the residential sector plays a significant role in shaping the built environment. Neighborhood characteristics such as the availability of transportation systems, schools, employment opportunities, commercial establishments, and recreational centers are influenced by households' preferences and demands. These factors make housing a major sector of the U.S. financial and social infrastructure.

Literature Review
Drivers of recovery
The parameters that affect housing recovery can be classified into three general categories concerning their relationship with households, including internal, interactive, and external drivers (Moradi a). Internal drivers are the factors directly related to households, such as household attributes and level of damage caused by a disaster. Different socioeconomic conditions differentiate the resilience of households (Burton ; Moradi et al. ) and contribute to the emergence of dissimilar patterns of recovery. Economic status, for example, affects households' post-disaster recovery. Individuals with greater financial power may apply disaster mitigation more often and consequently be less impacted (Hunter ), while lower-income households are more likely to experience a slower recovery (Peacock et al. ). Level of educational attainment has also been reported to positively influence restoration (Burton ). Further, racial disparity was found as the major cause of the lengthy process of recovery after Hurricane Andrew (Zhang & Peacock ) and Hurricane Katrina (Bullard & Wright ). Age (Henderson et al. ; Sanders et al. ), gender (Nejat et al. ), marital status (Nejat & Ghosh ), and household size (Nejat et al. ; Sadri et al. ) are other attributes that can affect recovery. Another important driver is the damage severity. The effect of the damage can last for several years after a disaster (Hamideh et al. ; Peacock et al. ). A higher relocation ratio has been reported for more-impacted residents (Mayer et al. ; McNeil et al. ; Myers et al. ). Additionally, among households who stay, restoration could take longer for those whose properties have sustained more damage (Sadri et al. ).
Interactive drivers are developed through the interaction of individuals with their community. Social capital, place attachment, and recovery of neighbors are among these drivers. Several studies have demonstrated the role of social capital in post-disaster recovery (Aldrich ; Burton ; Sadri et al. ), as it facilitates the achievement of common goals through mutual communications (Jamali & Nejat ) and provides informal resources for recovery (Airriess et al. ; Aldrich ). Place attachment also affects households' recovery decisions. Residents are connected to their place of living via the resources as well as the sense of identity offered by the neighborhood (Jamali & Nejat ). Sense of place has been reported as a key player in decisions of households against relocation (Binder et al. ; McNeil et al. ). Additionally, households' recovery decisions are influenced by their neighbors. Recovery of neighbors relays a positive message on restoration of the neighborhood and encourages other residents to repair/reconstruct (Nejat & Damnjanovic ; Rust & Killinger ).
External drivers are provided by different public, private, and non-profit organizations. Examples include financial resources and restoration of infrastructure and community assets. Financial aid provided through insurance policies, disaster loans, and public funds enhances the progress of restoration (Nejat & Ghosh ). Distribution of these resources affects the pattern of recovery such that regions with less assistance may experience a higher rate of relocation (Kamel & Loukaitou-Sideris ). Further, infrastructure and community assets, such as transportation systems, commercial features, schools, and healthcare facilities, provide services that are vital to the residents and address their regular and recovery-specific needs (Aghababaei et al. ; Comerio ; Ronan & Johnston ; Xiao et al. ). Additionally, post-disaster functionality of infrastructure and community assets influences households' perception of their neighborhood reestablishment and can impact their decisions in favor of or against repair/reconstruction (Dehghani & Shafieezadeh ; Moradi b; Nazarnia et al. ).

Perceived neighborhood
People differ in how they delineate their neighborhood even though they may live in geographic proximity (Coulton et al. ). Consequently, features they expect from their neighborhood, as well as its boundaries, may be dissimilar (Nejat ; Nejat et al. ). Recognizing these preferences and integrating them into the modeling can provide a more realistic picture of the residents' needs and priorities (Moradi et al. ). Various factors affect the perception of a neighborhood, such as residents' sociodemographic attributes, neighborhood characteristics, and physical elements. Individuals with a longer duration of residence and higher income, education, and engagement in neighborhood activities may perceive a larger neighborhood (Coulton et al. ). Racial similarities can also cause adjacent areas to be included in or excluded from an individual's perceived neighborhood (Campbell et al. ; Krysan ). Further, residents may perceive a smaller neighborhood in high-density and mixed-use areas (Coulton et al. ) and a larger neighborhood in suburban regions (Haney & Knowles ). Physical elements such as streets, parks, and rivers can also affect perceived neighborhood boundaries (Campbell et al. ).
Nejat et al. ( ) developed an index, the Anchors of Social Network Awareness index (ASNA-i), to classify households based on the perception of their neighborhood. The index considers three classes of households, each identified by an index value: the infrastructure-aware class, characterized by its preference for transportation and geographical features; the social-networks-aware class, which opts for friends and families and neighborhoods; and the community-assets-aware class, which prefers community assets and public services and safety. The classification is estimated by a latent class regression model that uses the logarithm of the population density of the county of residence, household income, and householder educational attainment and race as covariates (Nejat et al. ). Perceived neighborhood areas and their relationships with ASNA indexes have also been investigated by Nejat ( ) and Moradi et al. ( ).

Research Methodology
RecovUS was developed at the household level to simulate post-disaster recovery decisions of households residing in their own single-family houses. Each household is represented in the model by an agent located on the polygon centroid of its home. A household agent possesses particular attributes, including characteristics of the householder (e.g., income and ASNA index) as well as information on the house in which it resides (e.g., pre- and post-disaster value and square footage). Based on its specific characteristics, each agent senses the environment and/or other agents, evaluates the conditions, and decides on its recovery. Recovery choices are repair/reconstruction of the damaged house, waiting without repair/reconstruction, or selling the house (and relocating).
RecovUS is founded on the assumption that housing recovery is a function of households' financial conditions and community recovery. It assumes that a household would meet the prerequisites of repair/reconstruction if (1) it has enough financial resources, and (2) its community has recovered adequately ( Figure ).

A household's financial conditions are evaluated by two categories of variables: costs and resources ( Figure ). Costs include repair or reconstruction costs and the rent of another property when the primary house is uninhabitable. Resources comprise the money available to cover the costs of repair/reconstruction and to pay the rent (if necessary). The repair/reconstruction resources include the settlement from the National Flood Insurance (NFI), Housing Assistance provided by the Federal Emergency Management Agency (FEMA-HA), the disaster loan offered by the Small Business Administration (SBA loan), a share of household liquid assets, and the Community Development Block Grant Disaster Recovery (CDBG-DR) fund provided by the Department of Housing and Urban Development (HUD). Furthermore, household income determines the amount of rent the inhabitants can afford.

Community conditions are assessed for each household based on the restoration of specific anchors ( Figure ). ASNA indexes (Nejat et al. ) are estimated to identify the category of anchors important to the recovery decision of each household. Accordingly, households are indexed into three classes, for which the recovery of infrastructure, neighbors, or community assets, respectively, matters most. Furthermore, among anchors of the same category, only those located within a household's perceived neighborhood area are considered important to that household (Moradi et al. ; Nejat ).
Figure shows the steps of evaluating the recovery criteria and predicting the households' decisions. In each time step, the program implements the following procedure:
1. The algorithm starts by disbursing financial aid to the eligible households that have been damaged by the disaster. These resources include NFI settlements, FEMA-HA, SBA loans, a share of liquid assets that households spend on recovery, and CDBG-DR assistance. The resources are disbursed in sequence so that they do not duplicate each other.
2. Next, the model compares each household's available financial resources to the damage cost. If the financial criterion is not satisfied (i.e., the available financial resources are not enough to cover the repair/reconstruction costs), the program evaluates the habitability of the house. If it is habitable, the household is expected to wait with a probability of r1%, but it also has the alternative of selling the property with a probability of (100 - r1)%. If the house is uninhabitable, two additional conditions are evaluated. If the household can afford to pay the rent of another property and a vacant rental unit is also available, the options are waiting or selling (as in the previous case), but if either condition is not met, the only option is selling the house.
3. If the financial criterion is satisfied, the community criterion is evaluated. Based on the ASNA index of a household, restoration of infrastructure, neighbors, or community assets is compared to its desirable threshold (adq_infr, adq_nbr, and adq_cas, respectively). If the perceived community has adequately recovered, the household is expected to decide in favor of repair/reconstruction with a probability of r2%, though it also has the alternative of selling with a probability of (100 - r2)%. However, if community recovery is inadequate, the model proceeds as in the situation in which financial resources were not enough (step 2 above). In other words, habitability, rent affordability, and vacancy are checked, and the household decides to wait or sell the house.
4. If a house is sold, the buyer decides to repair/reconstruct the house with a probability of r0%, or to wait (or sell again) with a probability of (100 - r0)%.

The thresholds r0, r1, r2, adq_infr, adq_nbr, and adq_cas are model parameters, and their values are determined by calibration. The model algorithm is discussed in detail in the Appendix, using the Overview, Design concepts, and Details (ODD) protocol (Grimm et al. , ).
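The per-time-step decision procedure described above can be illustrated with a minimal Python sketch. This is not the RecovUS implementation: the `Household` fields, the function names, and the use of fractions (0 to 1) instead of percentages for r0, r1, and r2 are assumptions made here for illustration, and the community criterion is assumed to be met when the recovered share reaches the threshold.

```python
import random
from dataclasses import dataclass


@dataclass
class Household:
    # All field names are hypothetical; they mirror the criteria in the text.
    resources: float           # money available for repair/reconstruction
    repair_cost: float         # estimated repair/reconstruction cost
    habitable: bool            # can the damaged house still be lived in?
    can_afford_rent: bool      # does income cover rent of another property?
    community_recovery: float  # recovered share (0-1) of the relevant anchors
    adequacy_threshold: float  # adq_infr, adq_nbr, or adq_cas as a fraction


def wait_or_sell(h, r1, vacancy_available, rng):
    """Branch taken when the financial or the community criterion is unmet."""
    can_stay = h.habitable or (h.can_afford_rent and vacancy_available)
    if not can_stay:
        return "sell"  # uninhabitable and no feasible rental: selling only
    return "wait" if rng.random() < r1 else "sell"


def decide(h, r1, r2, vacancy_available, rng):
    """Decision of a damaged owner-occupied household in one time step."""
    if h.resources < h.repair_cost:                   # financial criterion
        return wait_or_sell(h, r1, vacancy_available, rng)
    if h.community_recovery < h.adequacy_threshold:   # community criterion
        return wait_or_sell(h, r1, vacancy_available, rng)
    return "repair" if rng.random() < r2 else "sell"


def buyer_decide(r0, rng):
    """Decision of the buyer of a sold house."""
    return "repair" if rng.random() < r0 else "wait"
```

Passing a seeded `random.Random` instance as `rng` keeps repeated runs reproducible, which matters when predictions are averaged over many repetitions.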
Figure illustrates the linkage of drivers of recovery to the model. Internal drivers relate to both financial and community conditions. Education level, race, and income connect to community conditions by identifying the anchor-based index of a household and its perceived neighborhood area. Physical damage is tied to financial conditions due to its association with repair/reconstruction costs. Also, the level of damage determines whether a home is habitable or the household needs to rent another residence. Furthermore, household income is related to financial conditions through the estimation of the resources available for repair/reconstruction and the ability to pay rent. Interactive drivers are coupled with community conditions through the influence of neighbors' recovery on the recovery decisions of social-networks-aware households. Finally, external drivers relate to both financial and community conditions. The linkage to the former is via the provision of housing financial assistance, and to the latter through the effect of the recovery of infrastructure and community assets on the recovery decisions of infrastructure-aware and community-assets-aware households.

Data
Recovery of Staten Island, New York after Hurricane Sandy was selected as the case study to provide the input data. Sandy was a hurricane/post-tropical cyclone that hit the eastern coast of the United States on October , . It was one of the largest Atlantic tropical storms, extending over a territory with a diameter of , miles and affecting States. The highest storm surge, nine feet, was recorded along the shoreline in Manhattan and Staten Island. Sandy resulted in at least deaths, caused loss of power in . million houses and $ billion in damage, and damaged or destroyed , houses and hundreds of thousands of businesses (GAO ; HRD ; NOAA & NWS ).
The input data for the model is classified into two general categories: non-disaster-related and disaster-related data. The non-disaster-related data includes information that does not change based on a specific disaster, such as household income, educational attainment, race, and housing characteristics. The disaster-related data, on the other hand, includes information that differs based on the hypothesized disaster case, such as damage to houses, housing financial assistance, and damage and restoration of infrastructure and community assets. The input data consists of:
1. Housing attributes, including Staten Island single-family detached homes, their spatial location, level of damage caused by Hurricane Sandy, and restoration status after two years.
2. Household attributes, including household income and ASNA index.
3. Financial resources, including distribution of NFI, FEMA-HA, SBA loan, CDBG-DR, and liquid assets.
4. Households' financial ability to pay rent.
5. Damage to infrastructure and community assets and their restoration progress.
While this section provides a brief description of the data, a more detailed explanation is provided in the Appendix (section Input data). The first category of data is housing attributes. This information was extracted from the New York City tax assessment data (NYC a). Cleaning the data to include only single-family detached homes of Staten Island resulted in , houses. The improvement market value of each house in each year was calculated by subtracting the land market value from the total market value. The improvement value estimates the worth of additions to the land, mainly the structure. The improvement values were discounted back to a common time (August ) using monthly Consumer Price Indexes (CPIs) for housing in New York, Newark, and Jersey City (BLS ) to provide the same basis for comparison. The discounted improvement market (DIMP) values were used to estimate the damage to the properties. If the DIMP value of a house decreased in the first post-Sandy year compared to the pre-Sandy year, the house was assumed to be damaged, and the amount of damage was estimated as the difference between the pre- and post-Sandy DIMP values. Additionally, once the DIMP value in a subsequent year reached or exceeded the pre-Sandy DIMP value, the house was assumed to have been completely restored to its pre-disaster condition. The data were joined to the shapefile of the lots' polygons (NYC b) using ArcGIS Desktop . . (ESRI ) to provide the input file for RecovUS ( Figure ). More information is presented in the Appendix (section Input data: Housing attributes).

Household income and the householder's ASNA index are among the other inputs. Household income is applied to estimate households' liquid assets and renting power and to check the eligibility criteria for receiving financial assistance. Additionally, household income together with the population density of the county of residence, householder educational attainment, and householder race is used to estimate ASNA indexes based on the latent class regression model proposed by Nejat et al. ( ), the details of which are provided in the Appendix. While population density is simply calculated, data on the other three covariates are not generally available at the household level. In the current research, these data were synthetically generated by Iterative Proportional Fitting (IPF) of the census data on household income, educational attainment, and race (USBC a,b,c), as explained in the Appendix. To reduce RecovUS runtime, ASNA indexes were estimated in advance from the covariates and were joined with the household income to the lots' shapefile using ArcGIS. Please see the Appendix (section Input data: Household attributes) for more explanation.
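The damage and restoration rules based on DIMP values, described earlier in this section, can be sketched as follows. This is a minimal illustration rather than the authors' processing script: the function names are hypothetical, and the CPI adjustment is assumed to be the usual ratio of the base-period index to the index of the assessment period.

```python
def discounted_improvement_value(total_value, land_value, cpi_period, cpi_base):
    """Improvement value (total minus land) expressed in base-period dollars.

    Assumes a simple CPI ratio adjustment; the exact discounting used in the
    study is described in its Appendix.
    """
    return (total_value - land_value) * cpi_base / cpi_period


def estimate_damage(pre_dimp, post_dimp):
    """Damage is the drop in DIMP in the first post-disaster year (0 if none)."""
    return max(pre_dimp - post_dimp, 0.0)


def first_restored_year(pre_dimp, later_dimps):
    """Return the first year whose DIMP reaches the pre-disaster DIMP.

    later_dimps maps a year to the DIMP value of that year; None is returned
    if the house never reaches its pre-disaster value in the data.
    """
    for year, value in sorted(later_dimps.items()):
        if value >= pre_dimp:
            return year
    return None
```

A house with a positive `estimate_damage` result would be treated as damaged, and `first_restored_year` gives the point at which it counts as completely restored.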
Another category of input is the data on financial resources, including NFI, FEMA-HA, SBA loan, CDBG-DR, and liquid assets. The National Flood Insurance Program (NFIP) paid , losses (totaling $ . billion) in the states impacted by Hurricane Sandy (III a,b). In the absence of higher-resolution data, % of the houses located within areas at high risk of flooding (FEMA a; Shawnee County ) were assumed to be reimbursed by the NFIP. Additionally, zip-code-level data on reimbursement of FEMA-HA and SBA loan were obtained from FEMA ( b) and SBA ( ), respectively. Similarly, zip-code-level data on the distribution of CDBG-DR funds to single-family houses were obtained from NYC ( c). Furthermore, the census data on households' wealth (USBC ) was used to estimate households' net worth based on their income. Households were assumed to consider a share of their net worth ( -%) as the liquid assets assigned for repair/reconstruction costs. Please see the Appendix (section Input data: Financial resources) for more detail.

The financial ability of households to pay rent is another input. Household financial power regarding rent was assumed to be a fraction of the household income (up to %). Additionally, the amount of rent was estimated using HUD's Fair Market Rent (FMR) for Staten Island (HUD ). Comparison of a household's financial power to the amount of rent determined whether the household could afford to pay the rent. More explanation is provided in the Appendix (section Input data: Rent).
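As an illustration of how these resources could enter the financial criterion, the sketch below applies funding sources in sequence and checks rent affordability. It is only an assumed reading of "in sequence without duplication": each source is taken to cover at most the cost still unmet by the earlier ones. The function names, the ordering passed in, and the `rent_share` parameter are hypothetical and are not taken from RecovUS.

```python
def allocate_resources(repair_cost, offers):
    """Apply funding sources in order without duplicating coverage.

    offers is an ordered list of (source_name, offered_amount) pairs.
    Returns the amount granted per source and the cost left uncovered.
    """
    remaining = repair_cost
    granted = {}
    for source, offered in offers:
        amount = min(max(offered, 0.0), remaining)  # never exceed unmet cost
        granted[source] = amount
        remaining -= amount
    return granted, remaining


def can_afford_rent(annual_income, rent_share, monthly_rent):
    """True if the assumed share of monthly income covers the monthly rent."""
    return annual_income * rent_share / 12.0 >= monthly_rent
```

Under this reading, the financial criterion is satisfied when the uncovered cost returned by `allocate_resources` is zero.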
The last group of input is the data on damage and restoration of infrastructure and community assets. This data was estimated from qualitative reports. A report published by the City of New York (NYC ) describes the important transportation infrastructure and community assets. For example, it mentions that a "transportation asset on the East and South Shores is the SIR, a -mile commuter rail line operated by the Metropolitan Transportation Authority (MTA). . . ". The report then explains what happened in Sandy: "Major damage also occurred at the SIR's operations and maintenance facilities, limiting service in the days after the storm (ultimately, full service was only restored in mid-December)" (NYC ). The authors used these descriptions to subjectively estimate the infrastructure damage every three months after the disaster. For example, it was assumed that the SIR was % non-functional immediately after the hurricane ("Major damage") and was completely restored within the first three months ("full service . . . in mid-December"). The estimated damage was also checked for consistency with other reports (Kaufman & Shaby ), where possible. Since the items in the report were major infrastructures that affected most residents, the average damage to the infrastructure system (rather than to individual infrastructures) was calculated and fed into the model. Damage to the community assets was estimated using a similar approach and joined to their shapefile (NYC b). RecovUS evaluates recovery of community assets based on the geographic location of the assets since they were assumed to mostly serve the local residents. More information is provided in the Appendix (section Input data: Infrastructure and community assets).

Results and Discussion
Model calibration
Six thresholds in the model affect households' decisions: adq_infr, adq_nbr, adq_cas, r0, r1, and r2. The values of these variables are obtained by calibrating the model. The objective of calibration (also called training in the field of machine learning) is to optimize the parameters such that the model has the least (training) error value (i.e., the overall model predictions have the least difference from the empirical data). Herein, the model predictions are recovery decisions of households (repair/reconstruct or not) at the end of the th month, and the empirical data include the recovery status of houses estimated from tax assessment data as described previously. The model was run using the facilities of the Texas Tech High-Performance Computing Center (HPCC ). Calibration started with a broader range of values for the parameters and a smaller number of repetitions and continued by narrowing the range of values and increasing the number of repetitions (Table ). The reason for this configuration was to evaluate the predictions of the model on a wide range of values while accommodating the HPCC runtime limitations. The predictions from each set of values were averaged over the number of repetitions to obtain the average number of repairs/reconstructions. Then, the prediction error, called the training error, was calculated by comparing the average number of repairs/reconstructions predicted by the model with the number of repairs/reconstructions from the empirical data.
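The calibration loop described above (evaluate each candidate parameter set, average the predictions over repetitions, and keep the set with the smallest training error) can be sketched as a plain grid search. This is an illustrative sketch only: `run_model` is a hypothetical stand-in for one stochastic simulation run returning the number of repairs/reconstructions, and the signed relative error used here is an assumed definition of the training error.

```python
import itertools
import random


def calibrate(run_model, grid, observed_repairs, repetitions, seed=0):
    """Grid search over parameter sets.

    grid maps a parameter name to the list of candidate values to try.
    Returns (best_params, training_error), where the error is the signed
    relative difference between the mean prediction and the observed count.
    """
    rng = random.Random(seed)
    names = list(grid)
    best = None
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        runs = [run_model(params, rng) for _ in range(repetitions)]
        mean_repairs = sum(runs) / len(runs)
        error = (mean_repairs - observed_repairs) / observed_repairs
        if best is None or abs(error) < abs(best[1]):
            best = (params, error)
    return best
```

The coarse-to-fine strategy in the text corresponds to calling `calibrate` first with a wide grid and few repetitions, then again with a narrower grid around the best values and more repetitions.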
The value of % obtained for r1 through calibration implies that the probability of waiting (as opposed to selling) for a household whose recovery criteria have not yet been met is %. Similarly, the value of % for r2 means that the probability of repair/reconstruction for a household whose recovery criteria are satisfied is %. The calibrated values indicate that the behavior predicted by the model corresponds to its fundamental hypotheses, meaning that once both criteria are met, households mostly decide to repair/reconstruct (r2 = %). Conversely, when [...]

Table : Calibration settings (columns: Setting, adq_infr, adq_nbr, adq_cas, r0, r1, r2, Repetitions).

[...] suggested that the availability of financial resources affected the decisions of Staten Island residents who had been impacted by Hurricane Sandy in favor of repair/reconstruction. Additionally, recovery of neighbors, infrastructure, and community assets has been reported to positively influence housing reconstruction (Burton ; Comerio ; Rust & Killinger ).
The value of % for r0 entails that the probability of repair/reconstruction by a buyer is about half the probability of waiting. The empirical data also supported this value. The tax assessment data (NYC a) include the names of the properties' owners in each fiscal year. The analysis of the data revealed that about % of the damaged properties whose owners had changed after Sandy (i.e., properties that had been sold) had still not recovered by the second year (equivalent to r0 = . %). A similar ratio has also been reported for the recovery of auctioned houses in Long Island (Polsky ).
Furthermore, the analysis showed low sensitivity of the model to changes in adq_infr and adq_cas. The reason for the former is the rapid recovery of infrastructure within the first few months. Thus, no matter what value was selected for adq_infr, the criterion of community recovery for infrastructure-aware households was satisfied in the initial time steps. Additionally, since community-assets-aware households constitute a small share of the population (about %), a change in the value of adq_cas does not significantly affect the overall output. Finally, the optimized value for adq_nbr implies that at least % of neighbors should restore their houses in order to satisfy the community criterion for social-networks-aware households.
The small training error obtained for RecovUS (-. %) means the model is not underfitting the data, i.e., using the input dataset together with the values obtained for the parameters from calibration leads to a prediction that adequately represents the pattern observed in the empirical data (Goodfellow et al. ). However, measuring the performance of a model against the same dataset used for calibration may result in an optimistic conclusion. The more meaningful approach is to validate the model, i.e., to examine the generalization performance of the model over previously unobserved datasets to ensure that it does not overfit the data (Theodoridis ). A small gap between training error and test error would satisfy this purpose (Goodfellow et al. ). This study applied the k-fold cross-validation method, which is a common technique for estimating a model's test error. The idea is to randomly divide the dataset into k subsamples of roughly equal size, form a training set by deleting one group, and assign the deleted group to the test set. The model is calibrated over a training set and is then applied to the corresponding test set to obtain the prediction error, also known as the test error. The test errors are finally averaged over the number of sets (k) to yield an average estimate of the generalization error (Theodoridis ; Twomey & Smith ; Wold ).
The dataset was divided into four subsamples constituting the four training and test sets (Table and Figure ). Count and Ratio in Table represent the number and percentage of damaged houses in each dataset, respectively. Since RecovUS captures the spatial aspect of recovery by evaluating the recovery of neighbors and community assets, the subsamples are four geographic regions of the dataset (groups of zip codes) with almost the same number of damaged houses, rather than spatially scattered random samples. As depicted in Table , each test set consists of about one-fourth (since k = 4) of the total number of damaged houses.
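The spatial variant of k-fold cross-validation described above, in which each test set is a geographic region rather than a random sample, can be sketched as follows. The record layout (a `"zip"` key per house), the function names, and the `calibrate`/`evaluate` callables are assumptions for illustration; the fold errors are averaged with weights equal to the test-set sizes.

```python
def spatial_folds(houses, regions_by_fold):
    """Yield (train, test) lists; each fold's test set is one group of zip codes.

    houses is a list of dicts with a "zip" key; regions_by_fold is a list of
    sets of zip codes, one set per fold.
    """
    for test_regions in regions_by_fold:
        test = [h for h in houses if h["zip"] in test_regions]
        train = [h for h in houses if h["zip"] not in test_regions]
        yield train, test


def cross_validate(houses, regions_by_fold, calibrate, evaluate):
    """Return the test error averaged over folds, weighted by test-set size.

    calibrate(train) returns calibrated parameters; evaluate(params, test)
    returns the test error of those parameters on the held-out region.
    """
    total_weight = 0
    weighted_sum = 0.0
    for train, test in spatial_folds(houses, regions_by_fold):
        params = calibrate(train)
        weighted_sum += evaluate(params, test) * len(test)
        total_weight += len(test)
    return weighted_sum / total_weight
```

Keeping whole regions together means that neighbors of a test household are held out with it, so the neighbor-recovery criterion is not evaluated on houses seen during calibration.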

Table : Training and test sets for k-fold cross-validation (columns: Dataset; Train Count, Train Ratio (%); Test Count, Test Ratio (%)).

Figure : Training and test sets for k-fold cross-validation.
Using the k-fold cross-validation method, the model was first calibrated using one training set at a time to obtain the optimized values for the parameters. Calibration was performed using a procedure similar to the one explained before: for each set of parameter values, the training error was calculated by comparing the average number of repairs/reconstructions predicted by the model (averaged over the repetitions) with the number of repairs/reconstructions from the empirical data. The set of values associated with the minimum training error was selected as the calibrated values. The calibrated values were again used over the same training dataset, but the model was run , times to calculate and report a more accurate training error (Table ). Then, the calibrated model was run , times over the corresponding test dataset, and the average error, called the test error, was calculated by comparing the number of repairs/reconstructions predicted by the model with the number of repairs/reconstructions from the empirical data. This procedure was repeated over the four folds of the data and the associated errors were calculated. Finally, the model's test error, calculated by averaging the four test errors weighted by the number of samples, was found to be . % (Table ). The small difference between training and test errors ( . %) indicates that the model generalizes well to unobserved data.

Two series of sensitivity analyses were performed to evaluate the relative importance of model variables and assumptions. Herein, relative importance means the extent to which a variable or assumption affects the model output in terms of damaged houses repaired/reconstructed. In the first series of analyses, the six thresholds of the model were set to the calibrated values ( Table ), the value of one variable was changed, and the variation in the predicted progress of recovery was monitored. Six variables, namely r_asna, r_prds, r_hbt, r_vac, insurance penetration, and the recovery rate of infrastructure, were selected to showcase the model sensitivity (these variables are described in the following section, and a detailed explanation is provided in the Appendix).
The first variable was r_asna. As explained in the Appendix, households' ASNA indexes were predicted by a latent class regression model using household attributes. Then, the model assigned the predicted index to a household with a probability of r_asna% and assigned one of the other two indexes with a probability of ( -r_asna)%. Therefore, this variable together with the household attributes a ects the share of the population in each class, which in turn influences how the criterion on the recovery of the community will be satisfied. Figure a illustrates the percentage of repaired/reconstructed houses (averaged on runs) predicted on threemonth intervals for di erent values of r_asna. The default value for the variable was %. The results suggest that the model is not sensitive to the changes in r_asna. However, insensitivity was caused by the input data, not the model structure. First, with the generated household attributes and assuming r_asna = %, about %, %, and % of the households were predicted to be of ASNA index , , and , respectively. Secondly, the rapid recovery of infrastructure caused the criterion on community recovery to be easily satisfied for indexhouseholds in the rd month. Consequently, given the satisfaction of the financial criterion, there would be a good chance of repair/reconstruction for these households who meanwhile constitute a major share of the population. This also a ects recovery decisions of the second largest part of the population (i.e., the index-(social-networks-aware) households). Since the community criterion for index-households is the adequate recovery of their neighbors, and many of the neighbors are of index for which community criterion has already been satisfied, a high chance of repair/reconstruction once again exists for the index-households. Furthermore, although variation in the value of r_ansa changes the proportions of households, the dominant indexes are still and . 
Therefore, the major shares of these classes combined with the fast recovery of infrastructure brought about the insensitivity of the model to changes in r_asna.
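The role of r_asna in assigning ASNA indexes can be illustrated with a minimal sketch. It assumes that r_asna is expressed as a fraction rather than a percentage and that, when the predicted index is not kept, one of the other two classes is chosen with equal probability; the class labels and function names are hypothetical and are not taken from the RecovUS code.

```python
import random

# Hypothetical labels for the three ASNA classes named in the text.
ASNA_CLASSES = ("infrastructure", "social_network", "community_assets")


def assign_asna_index(predicted, r_asna, rng):
    """Keep the predicted class with probability r_asna (a fraction in [0, 1]).

    Otherwise pick one of the other two classes; choosing between them
    uniformly is an assumption of this sketch.
    """
    if rng.random() < r_asna:
        return predicted
    others = [c for c in ASNA_CLASSES if c != predicted]
    return rng.choice(others)


def class_shares(predicted_classes, r_asna, seed=0):
    """Share of households in each class after the random assignment."""
    rng = random.Random(seed)
    assigned = [assign_asna_index(p, r_asna, rng) for p in predicted_classes]
    return {c: assigned.count(c) / len(assigned) for c in ASNA_CLASSES}
```

Calling `class_shares` on the same predicted classes with several values of r_asna shows how the variable shifts the class proportions while the classes that dominate the predictions tend to remain dominant.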

The next variable was r_prds. This variable identifies the random deviation of a household's perceived neighborhood radius from the median radius corresponding to its ASNA index. For example, r_prds = % (default) means that for each household, the index-specific median radius is multiplied by a random draw from the interval ± = [ , ]% to yield the radius of its perceived neighborhood. Figure b suggests that the model is not sensitive to changes in the value of r_prds. Since the community criterion for index- and index- households is evaluated on the recovery of neighbors and community assets that exist in the perceived neighborhood, changes in the radius of perceived neighborhoods alter the number of neighbors and assets considered and consequently affect the satisfaction of the criterion. However, the major share of index- households as neighbors of index- households and the small share of index- households caused the model not to show sensitivity to r_prds. Therefore, once again, this observation was produced by the input data and cannot be interpreted in general as the insensitivity of the model to r_prds.
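The perceived-neighborhood radius described above can be sketched as follows. The sketch assumes that r_prds is a fraction, that the multiplier is drawn uniformly from [1 - r_prds, 1 + r_prds], and that houses are points in a planar coordinate system; the function names are hypothetical.

```python
import random


def perceived_radius(median_radius, r_prds, rng):
    """Scale the class-specific median radius by a random factor.

    The factor is drawn uniformly from [1 - r_prds, 1 + r_prds];
    the uniform distribution is an assumption of this sketch.
    """
    factor = rng.uniform(1.0 - r_prds, 1.0 + r_prds)
    return median_radius * factor


def neighbors_within(home, houses, radius):
    """Return the houses (other than home) lying within radius of home."""
    hx, hy = home
    return [
        (x, y)
        for (x, y) in houses
        if (x, y) != home and (x - hx) ** 2 + (y - hy) ** 2 <= radius ** 2
    ]
```

A larger radius can only add houses to the perceived neighborhood, which is why r_prds changes the set of neighbors on which the community criterion is evaluated.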

Another variable was r_hbt. This variable identifies the threshold for the level of damage above which a house is assumed to be uninhabitable. The value of r_hbt impacts the model output in two ways. First, it affects the amount of financial assistance that eligible households may receive from FEMA, as FEMA-HA is paid up to the cost of returning the house to the habitability level. Second, it determines whether households whose recovery criteria have not been satisfied can wait in their own house or must rent another residence (and, if they cannot, sell the house). Figure c shows the sensitivity to r_hbt. The solid red line with triangular markers illustrates the output for the default value of r_hbt = %. The results show that the model is sensitive to this variable such that smaller values speed up the progress of repair/reconstruction. The behavior corresponds to the output expected from the model algorithm. When r_hbt increases, the amount of FEMA-HA reimbursement decreases, which in turn causes the financial criterion not to be met for a higher number of households. Additionally, an increase in the value of this variable means that more households whose recovery criteria have not been met can wait in their own houses. Both consequences contributed to the overall decline in the progress of repair/reconstruction. Figure c further suggests that the model is almost insensitive to changes in r_hbt for values greater than %. This happened due to the level of damage estimated from tax assessment data. Based on this estimation, about % of damaged houses sustained a damage level greater than %, and only % were damaged by more than %. This share, however, was greater for lower levels of damage such that %, %, and % of houses sustained damage greater than %, %, and %, respectively. Therefore, since few houses experienced damage greater than %, r_hbt above % did not affect the model output.
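The two roles of r_hbt can be sketched in a few lines. This is only an illustration under stated assumptions: damage and r_hbt are fractions of the house value, repair cost is taken to be proportional to the damage ratio, and `cap` is a hypothetical upper limit on the assistance; none of these names or simplifications come from the RecovUS code.

```python
def is_habitable(damage_ratio, r_hbt):
    """A house is treated as uninhabitable when damage exceeds r_hbt."""
    return damage_ratio <= r_hbt


def fema_ha_amount(damage_ratio, r_hbt, house_value, cap):
    """Assistance covering only the repairs needed to reach habitability.

    The linear cost model (damage ratio times house value) and the
    `cap` argument are assumptions of this sketch.
    """
    excess_damage = max(0.0, damage_ratio - r_hbt)
    return min(cap, excess_damage * house_value)
```

Under these assumptions, raising r_hbt both lowers the reimbursable amount and lets more households wait in their own homes, which matches the direction of the effect reported for Figure c.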
Sensitivity analysis was also performed on r_vac. This variable estimates the probability that a vacant rental unit is available. If recovery criteria are not satisfied, the home is uninhabitable, and the household can afford to rent another residence, there is a probability of r_vac% that a vacant rental property, and consequently the option of waiting, would be available to the household. The results show that smaller values of the variable speed up the progress of repair/reconstruction. The change is less for smaller values of r_vac but increases with its growth such that changing r_vac from % to % decreases the percentage of repair/reconstruction in the th month by about %, but changing the variable from % to % decreases the percentage by about %. This decrease is a result of the model's assumption about the decisions of households whose recovery criteria have not been met, whose homes are uninhabitable, and who can afford to rent another residence. In such a case, most households prefer to wait (r_1 = %) rather than sell their property. Therefore, an increase in the value of r_vac means that more households will be able to wait in the hope of better conditions, which in turn results in a decrease in the overall percentage of repair/reconstruction that could have happened after a sale. Conversely, with a smaller r_vac, such households cannot find a place to rent and must sell their residence. A higher share of sold houses means that buyers, some of whom may decide to repair/reconstruct (r_0 = %), will have a higher share in the overall recovery. Although, based on the model assumptions, a decrease in the availability of rental units would result in more repair/reconstruction, the extra amount is caused by new owners (buyers) at the cost of relocating the current households. 
Relocation is commonly deemed an unfavorable social phenomenon, as relocators may encounter problems in adapting to their new living environment, developing new social ties, and finding jobs which can trigger psychological burdens (Goenjian et al. ; Jamali et al. ; Najarian et al. ; Riad & Norris ; Uscher-Pines ). Accordingly, motivating individuals to withstand deficiencies and rebuild instead of relocation is a near-consensus position among policymakers and researchers (Berke et al. ; Birkland ; Jamali et al. ; Kumar & Havey ). Therefore, the availability of rental units in the post-disaster setting is desirable, as it could help with reducing relocations and increasing households' resilience and coping capacity (Cutter et al. ; Moradi et al. ).
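The wait-or-sell logic behind the sensitivity to r_vac can be sketched as below. The sketch assumes that r_vac and r_1 are fractions, that a household in a habitable home can always wait, and that a household able to wait chooses to wait with probability r_1 and sells otherwise; the function names are hypothetical, and the buyer's subsequent decision (r_0) is not modeled here.

```python
import random


def unmet_criteria_decision(habitable, can_afford_rent, r_vac, r_1, rng):
    """Decision of a household whose recovery criteria are not satisfied."""
    if habitable:
        can_wait = True  # the household may stay in its own house
    else:
        # Waiting requires an affordable rental and a vacant unit.
        can_wait = can_afford_rent and rng.random() < r_vac
    if not can_wait:
        return "sell"
    return "wait" if rng.random() < r_1 else "sell"


def sell_share(r_vac, r_1, n=10000, seed=0):
    """Share of uninhabitable, rent-capable households that end up selling."""
    rng = random.Random(seed)
    decisions = [
        unmet_criteria_decision(False, True, r_vac, r_1, rng) for _ in range(n)
    ]
    return decisions.count("sell") / n
```

Comparing `sell_share` for a low and a high r_vac reproduces the qualitative pattern discussed above: fewer vacant units push more of these households to sell, and any additional repair/reconstruction then comes from buyers rather than from the original residents.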
The next sensitivity analysis was implemented on the penetration of flood insurance. Based on the model algorithm, NFI settlements were the first type of financial resource assigned for repair/reconstruction. As described previously, it was assumed that % of households residing in high-risk flood zones benefited from flood insurance. Sensitivity analysis was performed to examine how changes in the default assumption impact the model outcome. For this purpose, it was first assumed that only households in high-risk zones may have insurance, and the penetration rate was varied from % to %. The results, as depicted in Figure e, suggest that the overall outcome is not sensitive to variation in the insurance penetration rate. Again, this observation is rooted in the input data on houses, of which , houses were estimated to be damaged by Hurricane Sandy. However, only a small number of them were located within the zones with a high risk of flooding ( , of total and , of the damaged houses). Consequently, considering the small share of damaged houses that potentially may have insurance (less than %), changes in the value of the insurance penetration rate would not significantly alter the overall outcome. Therefore, in the second round of analyzing the sensitivity to the insurance penetration rate, eligible zones were expanded to the whole island. It was assumed that households in all zones may have insurance (regardless of the level of flood hazard), and the penetration rate was varied from % to % (Figure f). The outcome shows that the model is sensitive to changes in the insurance penetration rate when it is applied to a sufficient number of houses. The insensitivity of the model when only high-risk zones are considered and its sensitivity once the region is expanded are consistent with reports that question the accuracy and sufficiency of -year flood maps as the boundaries of high-risk zones, since a major share of losses occurs outside these boundaries (Brody et al. ; Highfield et al. ). 
Further, the model predicts that expanding the region to the whole island enhances the progress of repair/reconstruction even when a small penetration rate of % is applied. This outcome is also supported by the literature on the significant role of insurance in housing recovery and household resilience to disasters (Moradi et al. ; Nejat & Ghosh ; Tobin ). However, the model predicts that the improvement in the amount of repair/reconstruction due to an increase in flood insurance penetration is more significant in the first few months and is diluted over time. For example, while the difference in the percentage of repair/reconstruction for penetration rates of % and % is about % in the th month, it is lowered to % in the th month. This means that while a higher penetration accelerates the progress of repair/reconstruction and speeds up the return to normalcy, it might not significantly change the long-term outcome. Meanwhile, considering the controversial aspects of the National Flood Insurance Program such as encouraging improper land development, mitigation failures, and actuarial unsoundness (Burby ; Kunreuther ; McMillan ), the decision to increase insurance penetration requires more consideration.
Finally, the sensitivity to different regimes of infrastructure recovery was evaluated. As described before, it was estimated that infrastructure largely recovered in the first three months ( . %) and completely recovered by the end of the th month. To examine the influence of different recovery regimes, the three-month rate of infrastructure recovery was varied from % to % (Figure ), and the predicted progress in housing repair/reconstruction was compared (Figure g). Figure g shows four patterns for housing repair/reconstruction based on different values of the infrastructure recovery rate: a pattern for an infrastructure recovery rate of %, a pattern for %, a pattern for % and %, and a pattern for % to %. The outcome suggests that quicker infrastructure recovery results in an enhancement in the progress of housing repair/reconstruction in months -. This improvement is caused by the satisfaction of the community criterion for index- households and its ripple effect on the recovery of index- households, which together constitute about % of the households. This observation is aligned with the literature that reports the positive effect of infrastructure functionality on housing repair/reconstruction (Arup ; Burton ; Comerio ; Miles & Chang ; Moradi et al. ).
The sensitivity analyses presented above rely on a local technique in which variables are changed one at a time. When a model includes nonlinearities and interactions, a global sensitivity analysis is also necessary since local methods do not adequately represent its sensitivity (Saltelli et al. ). In this research, an ablation study was performed to evaluate the global sensitivity of the model. An ablation study examines a model by removing its building blocks and observing the effect on the model output (Hessel et al. ; Meyes et al. ; Sheikholeslami ). As described previously, RecovUS is based on two fundamental assumptions stating that financial conditions and community recovery are the prerequisites for housing repair/reconstruction. Each of these assumptions includes several blocks; financial conditions consist of five blocks (NFI, FEMA-HA, SBA loan, liquid assets, and CDBG-DR), and community recovery includes three blocks (recovery of infrastructure, neighbors, and community assets). To examine the effect of each block on the overall repair/reconstruction of households, the model was run with that block excluded from the algorithm. For example, the effect of liquid assets on the percentage of repair/reconstruction was evaluated by removing the block of liquid assets. In other words, it was assumed that households would not spend any share of their liquid assets on repair/reconstruction (Figure a); the model was run times, and the results were averaged. Additionally, the effect of each fundamental assumption was evaluated by removing all of its constituent blocks. Figure b schematically shows removing the fundamental assumption of community recovery as an example. The analysis results are illustrated in Figure .
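The ablation procedure described above can be outlined with a short sketch. It assumes a callable `run_model(removed_blocks, seed)` that returns the final share of repaired/reconstructed houses; that interface, the scenario names, and the `toy_model` stand-in are hypothetical and only illustrate the bookkeeping, not the RecovUS implementation.

```python
from statistics import mean

FINANCIAL_BLOCKS = ("NFI", "FEMA-HA", "SBA loan", "liquid assets", "CDBG-DR")
COMMUNITY_BLOCKS = ("infrastructure", "neighbors", "community assets")


def ablation_study(run_model, n_runs):
    """Average the model output over n_runs for each ablation scenario."""
    scenarios = {"base": frozenset()}
    # Remove one building block at a time.
    for block in FINANCIAL_BLOCKS + COMMUNITY_BLOCKS:
        scenarios["without " + block] = frozenset([block])
    # Remove a whole fundamental assumption (all of its blocks).
    scenarios["without financial conditions"] = frozenset(FINANCIAL_BLOCKS)
    scenarios["without community recovery"] = frozenset(COMMUNITY_BLOCKS)
    return {
        name: mean(run_model(removed, seed) for seed in range(n_runs))
        for name, removed in scenarios.items()
    }


def toy_model(removed_blocks, seed):
    """Toy stand-in for a simulation run; not the RecovUS model."""
    return 1.0 - 0.1 * len(removed_blocks)
```

For instance, `ablation_study(toy_model, n_runs=5)` returns one averaged value per scenario: the base case, eight single-block removals, and the two whole-assumption removals.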
Four distinctive patterns are observed in Figure . The highest ratio of repair/reconstruction is achieved in the base case in which none of the blocks are removed. Removing NFI, FEMA-HA, SBA loan, CDBG-DR assistance, or recovery of community assets does not change this pattern significantly. With the input data and assumptions described before, the shares of NFI, FEMA-HA, SBA loan, and CDBG-DR assistance in the total financial resources available to households for repair/reconstruction were about %, %, %, and %, respectively. Therefore, none of these resources could individually affect the overall outcome much. However, although it might be argued that these resources were not shown to greatly influence the recovery of the community, it should be noted that each of them is important to its eligible population. For example, CDBG-DR assistance is mainly intended to help with the recovery of lower-income households. Therefore, although it constituted only % of the total resources and only slightly influenced the global pattern of recovery, it helped with the recovery of a particular portion of the population who could not afford the cost of repair/reconstruction without this type of assistance. The insensitivity of the model to the recovery of community assets also resulted from the small share of community-assets-aware households (less than %) studied in this research. In the current study, recovery of community assets did not help much with the overall recovery of households; however, community assets are still important to the recovery of other communities where a higher share of households perceive them as the most important anchors of their neighborhood.
On the other hand, removing the block of liquid assets, the assumption of financial conditions, or the assumption of community recovery results in the largest decline in the overall ratio of repair/reconstruction. Liquid assets constituted more than two-thirds of the financial resources. Consequently, removing this resource affected the overall outcome like the case in which financial conditions were not satisfied at all. Interestingly, removing the assumption of community recovery as a recovery criterion impacted the overall repair/reconstruction of households similar to removing the assumption of satisfaction of financial conditions. This result suggests that the two fundamental assumptions of this study (i.e., availability of financial resources and restoration of perceived community) are equally important to homeowners' recovery decisions.
Two other curves exist between these two extremes: removing the block of recovery of neighbors and removing the block of recovery of infrastructure. By removing the block of recovery of neighbors, social-network-aware households will perceive that their neighborhood has not recovered at all. Removing the block of recovery of infrastructure has a similar meaning for the infrastructure-aware households. Figure shows that the effect of infrastructure restoration on the overall recovery of households is greater than that of the neighbors' recovery. This is because, first, the number of infrastructure-aware households in this study was twice the number of social-network-aware households (about two-thirds and one-third of the households, respectively). Second, the non-functionality of infrastructure not only directly impacts the repair/reconstruction decisions of infrastructure-aware households but also indirectly affects the recovery decisions of social-network-aware households, since most of their neighbors are infrastructure-aware. Therefore, recovery of infrastructure had a ripple effect on the recovery of households.
The experiments explained above helped with addressing the research objectives. The results showed that internal, interactive, and external drivers of recovery affected households' recovery decisions. As described before (Figure ), the model incorporated three categories of drivers: internal (household income, education, race, and physical damage), interactive (recovery of perceived neighbors), and external (financial assistance, recovery of infrastructure, and recovery of community assets). Household income, education, and race are the internal drivers that estimate a household's ASNA index. This index classifies households based on their perception of the community, which in turn identifies the share of recovery of infrastructure, neighbors, and community assets in households' decisions. Most households were estimated to be infrastructure-aware (about %). The second major group was social-networks-aware households (about %), for which recovery of neighbors mattered most. The last class (i.e., community-assets-aware households) constituted a minor share of households (less than %). Accordingly, recovery of infrastructure followed by recovery of neighbors had the highest impact on the recovery of households. It is worth noting that recovery of infrastructure has a ripple effect since it directly impacts infrastructure-aware households and indirectly affects social-networks-aware households (through their infrastructure-aware neighbors), which together constitute about % of the households. This relationship is observed in Figure g, as higher rates of infrastructure recovery are associated with accelerated progress in households' repairs/reconstructions. This positive relationship has been reported in the literature as described before (Comerio ; Nejat & Damnjanovic ; Xiao et al. ). Damage to houses is also an important driver, as it identifies whether a household can afford the cost of repair/reconstruction. 
More severe damage requires more financial resources and is often associated with lower odds of repair/reconstruction (Mayer et al. ; McNeil et al. ). Additionally, the impact of financial resources is reflected in Figure c and f. Increasing the amount of financial aid (e.g., increasing FEMA-HA reimbursement through decreasing r_hbt) and expanding the aid to accommodate more households (e.g., increasing the insurance penetration rate) enhance the progress of repair/reconstruction. This observation is also supported by the literature (Kamel & Loukaitou-Sideris ; Nejat & Ghosh ; Tobin ). Therefore, seven out of the eight drivers of recovery employed in the model effectively impacted the recovery of households. Recovery of community assets had a relatively small share due to the small number of households indexed as community-assets-aware. Additionally, the results showed that both fundamental assumptions of the study (i.e., availability of financial resources and recovery of perceived community) are important to homeowners' recovery decisions and affect the overall pattern of repair/reconstruction to a similar degree (Figure ). Among the constituents of the first assumption, liquid assets had the highest impact due to their major share in financial resources. Among the elements of the second assumption, restoration of infrastructure followed by recovery of neighbors played the key role because most households were indexed as infrastructure-aware and social-network-aware.

Conclusions
This research aims to develop a model that could integrate the effects of financial resources and communal aspects of recovery by capturing spatial interactions of households with their perceived neighborhood. While several models have been proposed for the simulation of recovery, the major contribution of this research is to develop a model that simulates housing decisions through the mindset of insiders. In RecovUS, not only do factors such as level of damage, financial resources, and affordability and availability of rental properties affect households' decisions in favor of repair/reconstruction, waiting, or selling, but households' perception of their neighborhood and its restoration also plays a critical role. The model separates this perception for different residents such that heterogeneous households prefer community features dissimilar in terms of type and distance. The output from the model confirms that internal, interactive, and external drivers of recovery effectively play a role in households' decision-making and influence the progress of repair/reconstruction. Therefore, RecovUS contributes to the field of disaster recovery modeling by weighting the communal aspects of recovery and calling out the necessity of including households' interactions with their perceived neighborhood for achieving a more realistic simulation and prediction.
Like all studies, the current research was associated with limitations. Addressing these limitations could be future lines of study. A challenge was providing individual-level data on damage and restoration of properties. Although this information was estimated from tax assessment data, the approach comes with a limitation: properties are appraised within a fiscal year, not on a single date. Therefore, based on the time that a disaster impacts a region and the date on which a property is appraised, the value may be related to the status of a house in to days after the disaster. This influences the estimation of both damage and repair/reconstruction. Although discounting all values back to a base date helped with alleviating market inflation and improving the comparability of the values, the estimation is still affected by the recovery stage at which a building has been appraised. Moreover, the value of a building can be affected by a variety of causes other than physical damage, such as market dynamics, depreciation, and upgrades. However, in the spatiotemporal vicinity of a disaster (like the case studied in this research), the price changes can be plausibly assumed to be caused by the physical damage and the restoration progress, though there still might be inaccuracies. Damage could be estimated using other methods and applied to evaluate the expected capability of the model in simulating different patterns.
Also, the model does not impose any time limit on the duration of renting. Households that can afford the rent and can find a vacant unit may wait up to the last run of the program ( months). However, although temporary housing may last weeks to months (Peacock et al. ), it is not necessarily via renting a place. For at least the first few months, many higher-income households may decide to stay at hotels and motels, while lower-income households may stay with their friends and families (Morrow & Peacock ; Peacock et al. ). However, the model assumes they will rent. Additionally, the model does not include programs that facilitate temporary housing by providing cash rental assistance, mobile homes, etc. These subjects can be accommodated in future research.
Furthermore, although RecovUS neither underfits nor overfits the data, the generalization of the model to other disaster scenarios needs additional consideration. For example, the spatial pattern of damage in a hurricane or earthquake is different from that in a tornado. While hurricanes typically affect a vast area, tornadoes usually have a local spatial extent and impact a limited region on and around their path. In addition to the type of disaster, other factors such as characteristics of households, types and timing of financial resources, and recovery of infrastructure and community assets can be significantly different. When input data is much different, the parameters should be optimized by recalibrating the model, while some submodules may even require modification. Such data can help with evaluating the generalizability of the model to other types of disasters, different socioeconomic structures, various distributions of financial resources, and different patterns in the restoration of infrastructure and community assets.
The model development was also influenced by technical constraints. Due to the software and HPCC limitations, the number of model parameters was restricted to reduce the calibration duration. Consequently, RecovUS was calibrated on six variables. In the absence of this limitation, more parameters could have been employed to accommodate different conditions. For example, the probability of selecting the wait option, conditioned on whether a home is habitable or not, could be represented by two different parameters, rather than by a single parameter (r_1) as in the current version. Moreover, RecovUS was calibrated by minimizing the disagreement between the predicted and observed outcomes (i.e., recovery decisions). While percent agreement is a commonly used method, calibrating the model using other measures such as the kappa statistic could decrease the potential of chance agreement. Additionally, aggregating the household-level results into a larger geographical unit (e.g., block group) can provide additional insights into the spatial nature of recovery decisions.
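The difference between percent agreement and the kappa statistic can be made concrete with a small sketch. It uses the standard definition of Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the marginal frequencies; the function names and decision labels are illustrative and not part of the RecovUS calibration code.

```python
from collections import Counter


def percent_agreement(observed, predicted):
    """Fraction of households whose predicted decision matches the observed one."""
    matches = sum(o == p for o, p in zip(observed, predicted))
    return matches / len(observed)


def cohens_kappa(observed, predicted):
    """Agreement corrected for the agreement expected by chance."""
    n = len(observed)
    p_o = percent_agreement(observed, predicted)
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    # Chance agreement from the marginal frequencies of each decision.
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when only one category occurs
    return (p_o - p_e) / (1.0 - p_e)
```

As an example, if eight of ten households are observed to repair and the model predicts "repair" for all ten, the percent agreement is 0.8 while kappa is 0, because the same agreement would be expected by chance alone.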
Another important subject is the possible loss of income due to the collapse of the local economy. In the current study, household-level income was estimated from the census data and was used to estimate households' ASNA indexes, rent power, net worth, and eligibility for SBA loan and CDBG-DR assistance. Although changes in household income are expected to have been reflected in the American Community Survey estimates, a post-disaster loss of income might still affect the income-related values. The potential effect of income variability can be integrated into future versions of the model.
Finally, the current research examined the recovery decisions of homeowners residing in their primary single-family detached houses. More research is required to study the recovery of other types of occupancy and housing characteristics. Recovery of renters, secondary homes, multifamily properties, etc. are important subjects that require more study. Therefore, expanding the model to include other aspects of recovery can be the subject of another study.

Despite the limitations, RecovUS provides a platform to spatially model houses and community assets to examine the impacts of financial aid and restoration of the community on the recovery decisions of households with heterogeneous characteristics. RecovUS benefits from calibration and a modular structure. If assumptions or input data are to be included that are significantly different from the current research, the model can be recalibrated to optimize its parameters or, if required, submodules can be easily modified to address the new conditions without impacting the model's integrity. For example, this study assumed that households whose recovery criteria have not been met may wait for the whole runtime (i.e., months) if their home is habitable or they are able to rent another place. Other options, such as staying with friends and families rather than renting, staying with significant others in the first few months and then renting another place, or substituting the wait/sell parameter with time-sensitive parameters such that the weight of the sell option increases over time, are among the possible modifications that could be easily incorporated into the model. New submodules can also be added (e.g., to simulate business recovery) while the integrity of the model is preserved.
Another feature of RecovUS is that the required input can be obtained from free and publicly available data sources. The idea behind this design was to provide a model that could offer a perspective on the situation with a minimum amount of time and cost. Additionally, if more specific and accurate data is provided, the model could be easily fine-tuned. Therefore, RecovUS can help with data-driven planning by providing a tool for predicting the impacts of different resource allocation and reconstruction scenarios to outline policies that could better address social, economic, and political concerns while accommodating the various needs of different stakeholders.

Model Documentation
The model has been developed in the environment of NetLogo . .