Building Spatial Models

To this point in Spatial Modelling (I or II), we’ve primarily dealt with regression models about spatial data; we have not really examined how spatial structures or relationships themselves can be modelled. In the Regression module, we did talk a little about spatial fixed effects, which allow for a different mean/baseline in each area.

library(sf)
library(mosaic)
library(spdep)
bristol = sf::read_sf('./data/bristol-imd.shp')

Check Yourself: Spatial Joins

Let’s say we need to have the ward ID for each LSOA. Right now, the LSOAs don’t have information on their wards built into the dataframe. If we have Bristol ward boundaries in the following geojson file:

Read in the data on ward boundaries:

wards = sf::read_sf('./data/wards.geojson')

What steps are necessary to join up the LSOAs that are within each ward? One workflow you’ve learned already might be:

Projection

First, we need to ensure that our data is in the same map projection.

To reproject the wards dataframe into the same projection as your bristol data, we use the sf::st_transform method. I find it easiest to set projections based on the exact projection string from the data we’re targeting:

wards = sf::st_transform(wards, sf::st_crs(bristol))

After we do the reprojection, we can check whether the data is in exactly the same projection by:

sf::st_crs(wards) == sf::st_crs(bristol)
## [1] TRUE

and, we can check if everything “looks right” by making a map:

plot(bristol['imd_score'], reset=FALSE, lwd=.1)
plot(sf::st_geometry(wards), border='white', lwd=2, add=TRUE)

Spatial Join

Finally, we need to do a spatial join to link the ward identifier up to the LSOAs. This is a multi-step operation, so let’s walk through it.

  1. If we want the ward that each LSOA is inside of, we might try:

bristol_join = sf::st_join(bristol, wards, join=sf::st_within)

Do this, then plot(bristol_join['ward_id']). Why does that not look like you might expect?

  2. A clever person might object, and think this might work:

bristol_join = sf::st_join(wards, bristol, join=sf::st_contains_properly, left=FALSE)

Why does that not work how you expect?

  3. The way I’d recommend doing this is first getting the centroids of the lower-level units:

bristol_centroids = sf::st_centroid(bristol)
## Warning in st_centroid.sf(bristol): st_centroid assumes attributes are
## constant over geometries of x

Then, we can use st_within to assign each LSOA to the ward that contains that LSOA’s centroid!

bristol_centroid_join = sf::st_join(bristol_centroids, wards, join=sf::st_within)

Now, to get the ward_id back alongside our original polygons, you can do a dplyr::left_join based on the LSOA code. First, we should only keep the columns we need: LSOA11CD, name, and ward_id. And, since we don’t want to have multiple geometry columns hanging around, we need to drop the geometry from bristol_centroid_join:

clean_ward_centroid = sf::st_drop_geometry(bristol_centroid_join)
clean_ward_centroid = clean_ward_centroid[c('LSOA11CD', 'name', 'ward_id')]
names(clean_ward_centroid) = c('LSOA11CD', 'ward_name','ward_id')

Now, our new clean_ward_centroid is a dataframe with three columns: an LSOA identifier matched up to the ward_id and ward_name of the ward that contained the LSOA’s centroid. Now, we want to do a join of this data back into our original Bristol LSOA data. A left_join takes two dataframes, x and y, and matches every row of x to a row in y, based on a “key” column that is shared between the two frames. So, for instance, looking in detail at a few rows of bristol:

## Simple feature collection with 10 features and 1 field
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 351942.9 ymin: 173775 xmax: 360652.9 ymax: 177452.7
## epsg (SRID):    27700
## proj4string:    +proj=tmerc +lat_0=49 +lon_0=-2 +k=0.9996012717 +x_0=400000 +y_0=-100000 +ellps=airy +towgs84=446.448,-125.157,542.06,0.15,0.247,0.842,-20.489 +units=m +no_defs
## # A tibble: 10 x 2
##    LSOA11CD                                                        geometry
##    <chr>                                                 <MULTIPOLYGON [m]>
##  1 E01014485 (((360256.3 174422.7, 360255 174422, 360251 174422, 360224 17…
##  2 E01014486 (((359676 174114.7, 359661.4 174127.8, 359661.7 174128.7, 359…
##  3 E01014487 (((359036.6 174653.1, 359035.2 174668.6, 359034.5 174676, 359…
##  4 E01014488 (((359632.6 174806.4, 359634.2 174807.4, 359648.1 174816, 359…
##  5 E01014489 (((359120.6 174097.5, 359120.3 174098.5, 359119.6 174102.1, 3…
##  6 E01014491 (((359467.6 175130.1, 359466.3 175131.7, 359461.6 175137.5, 3…
##  7 E01014492 (((353215.9 177024.6, 353210.2 177011.1, 353209.3 177009.1, 3…
##  8 E01014493 (((352337 177024, 352336.3 177025.2, 352331 177033.5, 352330 …
##  9 E01014494 (((352702 176779, 352703.2 176782.1, 352704.6 176785.9, 35271…
## 10 E01014495 (((353137.9 176866.4, 353138.4 176867, 353138.6 176867.3, 353…

and a few rows of our clean_ward_centroid:

## # A tibble: 10 x 3
##    LSOA11CD  ward_name                   ward_id  
##    <chr>     <chr>                       <chr>    
##  1 E01014485 Ashley                      E05010885
##  2 E01014486 Ashley                      E05010885
##  3 E01014487 Ashley                      E05010885
##  4 E01014488 Ashley                      E05010885
##  5 E01014489 Ashley                      E05010885
##  6 E01014491 Ashley                      E05010885
##  7 E01014492 Avonmouth & Lawrence Weston E05010886
##  8 E01014493 Avonmouth & Lawrence Weston E05010886
##  9 E01014494 Avonmouth & Lawrence Weston E05010886
## 10 E01014495 Avonmouth & Lawrence Weston E05010886

We want to stick the 'ward_id' from clean_ward_centroid in each row of bristol where LSOA11CD is the same in both dataframes. This would match up each LSOA with its ward_id.

To do this, we’ll use dplyr joins. dplyr has a bunch of different kinds of joins, but they all have a similar structure: they take two dataframes (x, y), and a “key” column whose values we can use to match up each row of x with rows in y. And, all columns in x and y are in the final output.

  • An inner_join(x,y,by='key') is a join between two tables x and y using a column (specified with the by option). The inner part means that only rows of x and y that have a matching value in the by column are kept; all rows in x or y with no match are dropped from the result.
  • A left_join(x,y,by='key') keeps all rows in x, filling in NA (missing) values for the columns of y wherever a row of x has no match in y.
  • Likewise, the right_join(x,y,by='key') keeps all rows in y, putting missing values wherever a row of y has no match in x.
  • A full_join(x,y,by='key') (dplyr’s “outer” join) uses all rows of both x and y, with missing values on the y side when a row of x has no match in y, and missing values on the x side when a row of y has no match in x.
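
To make these behaviors concrete, here is a minimal sketch using two made-up toy tables (the key values and columns are invented purely for illustration):

library(dplyr)

# Two toy tables sharing a 'key' column
x = data.frame(key = c('a', 'b', 'c'), x_val = 1:3)
y = data.frame(key = c('b', 'c', 'd'), y_val = 4:6)

dplyr::inner_join(x, y, by='key')  # rows for 'b' and 'c' only
dplyr::left_join(x, y, by='key')   # all of x; 'a' gets NA for y_val
dplyr::right_join(x, y, by='key')  # all of y; 'd' gets NA for x_val
dplyr::full_join(x, y, by='key')   # every key; NA on whichever side has no match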

So, for our purposes, we want to match the ward_id inside clean_ward_centroid back up to the LSOAs in bristol. And, we want to see if our join failed to match up any wards to LSOAs. This means we want either a left or right join, since we want to keep values from bristol that might not have been matched successfully in our spatial join. Thus, I’d recommend:

bristol = dplyr::left_join(bristol, clean_ward_centroid, by='LSOA11CD')
# The equivalent Right Join would be flipped from the left join:
# bristol = dplyr::right_join(clean_ward_centroid, bristol, by='LSOA11CD')

Now, to check everything’s done:

plot(bristol['ward_id'], reset=F)
plot(sf::st_geometry(sf::st_centroid(bristol)), add=TRUE, col='black', pch=20)
## Warning in st_centroid.sf(bristol): st_centroid assumes attributes are
## constant over geometries of x
plot(sf::st_geometry(wards), add=T, border='white',lwd=2)

Ooh! Look at that weird one! Let’s zoom in:

plot(bristol['ward_id'], reset=F, 
     xlim=c(357750, 359500), ylim=c(175500, 175700))
plot(sf::st_geometry(sf::st_centroid(bristol)),
     add=TRUE, col='black', pch=20, 
     xlim=c(357750, 359500), ylim=c(175500, 175700))
plot(sf::st_geometry(wards), add=T, border='white',lwd=5, 
     xlim=c(357750, 359500), ylim=c(175500, 175700))

Oof, despite the fact that this is “correct”, this still yields results that we might consider to be counterintuitive. It’s always important to visualize the results of your joins when you do them, to make sure they look like what you expect.

Spatial Effects

We’ve talked before about structural relationships between features over geographies. Called “spatial autocorrelation,” this describes how an outcome depends on itself over a given geography: values at one site are related to the values at nearby sites. So, for instance, we’ve spent a long time looking at or thinking about something that looks like this:

dev.off() # you may need to run this after making complicated maps!!
## null device 
##           1
library(spdep)
# neighbours: pairs of LSOAs whose centroids lie within 2000m of one another
nearby = spdep::dnearneigh(sf::st_centroid(bristol), 0, 2000)
nearby = spdep::nb2listw(nearby)
# spatial lag: the local average of living_env around each LSOA
bristol['nearby_living_env'] = spdep::lag.listw(nearby, bristol$living_env)
plot(nearby_living_env ~ living_env, data=bristol, pch=20)
abline(lm(nearby_living_env ~ living_env, data=bristol), col='orangered', lwd=2)
abline(v=mean(bristol$living_env), col='slategrey', lty='dashed', lwd=2)
abline(h=mean(bristol$nearby_living_env), col='slategrey', lty='dashed', lwd=2)

But, most of our interest in social science comes from building models where some data \(X\) allows us to predict or explain some outcome, \(y\). How can we use geographical structure to account for this?

Four Common Spatial Effects

In our last lecture, we talked about four distinct kinds of spatial effects:

  1. spatial heterogeneity effect that models differing baselines (in terms of the mean or variance) between geographical groups, such as regions or neighborhoods.
  2. spatial dependence effect that models the fundamental relationships between outcomes, so that nearby outcomes are more similar than distant ones, or so that clusters in prediction errors can be modeled.
  3. spatial nonstationarity effect that allows the relationship between the predictor and the observed value to change over geography.
  4. spatial context effect that models the effect of an observation’s surrounding conditions on the response at that site.

Exercise: Thinking about spatial effects

Take the linear model with a single effect \(\beta\) and mean \(\alpha\).

\[ y = \alpha + x\beta + \epsilon\]

  1. What part (i.e. terms, like \(y\), \(x\), \(\beta\), etc.) of the equation above does spatial heterogeneity relate to? (2 terms)
  2. What part of the equation above does spatial dependence relate to? (2 terms)
  3. What part of the equation does spatial nonstationarity relate to? (1 term)
  4. What part of the equation does spatial context relate to? (2 terms)

Spatial Heterogeneity

Let’s consider a really simple model relating housing quality and crime:

model = lm(crime ~ housing, data=bristol)

This model predicts crime prevalence in an LSOA using the housing deprivation of that LSOA. There are a few rudimentary theories we could posit about this structural relationship. For instance, low quality housing may be easier to break into or vandalize, making it more likely to be targeted by criminals.1

\[ \begin{split} crime = & \ \ \ \ \ baseline \ \ crime \\& + (housing\_deprivation) \bullet (extra\ \ crime\ \ per\ \ housing\_deprivation) \\&+ unexplained \ \ crime \end{split}\]

Spelled more tersely, this is:

\[ y_i = \alpha + x_i\beta + \epsilon_i\]

where \(y_i\) is crime in an LSOA, \(x_i\) is housing deprivation in that LSOA, and \(\epsilon_i\) is the error in predicting crime using housing deprivation in that LSOA.

One way we can figure out if our model might need to include spatial effects is to look at the model’s errors, or residuals: \[ e_i = y_i - \hat{y}_i = y_i - (\hat \alpha + x_i\hat{\beta})\]

So, let’s extract the residuals at LSOA level:

bristol['residual'] = residuals(model)

Exercise: Understanding Spatial Heterogeneity

  1. Make boxplots of the model residuals by ward.2 Visually,
  • is there a difference in the location of the boxes/median residual by ward?
  • is there a difference in the width of the boxes/variability of the residuals by ward?
  2. Fit a spatial fixed effect model of the form: \[ y_i = \alpha_{j[i]} + x_i\beta + \epsilon_i\] This spatial fixed effect model predicts crime at site \(i\) as a function of housing deprivation at site \(i\), but lets the group baseline for crime, \(\alpha_{j[i]}\), be used instead3 of the overall full-map baseline for crime, \(\alpha\). To do this in R, you have to use a -1 to avoid fitting a global intercept and just fit each ward mean:
fe_model = lm(crime ~ -1 + housing + ward_name, data=bristol)

Interpreting the ward_name effect as the baseline response in that ward, name two areas where the baseline crime is significantly higher than zero, and two areas where it is significantly lower than zero. (One way to inspect the ward baselines is sketched after this list.)

  3. From the results above, do you think that incorporating spatial fixed effects may help resolve spatial heterogeneity in means? Would you prefer to use this model? Why or why not?
  4. The lmtest::bptest we discussed in the regression module can be used to check for spatial heterogeneity in variance, too. Using lmtest::bptest(model, ~ward_name, data=bristol), test whether or not there are significant differences in the variability of prediction errors between different neighborhoods.
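
As a hint for inspecting the ward baselines above, one way (not the only way) is to filter the coefficient table from summary(); a minimal sketch, assuming the fe_model fit earlier:

# The coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coefs = summary(fe_model)$coefficients
# Keep only the ward baseline terms, whose names start with 'ward_name'
ward_effects = coefs[grepl('^ward_name', rownames(coefs)), ]
# Wards whose baselines are significantly above (or below) zero at the 5% level
ward_effects[ward_effects[, 'Pr(>|t|)'] < .05 & ward_effects[, 'Estimate'] > 0, ]
ward_effects[ward_effects[, 'Pr(>|t|)'] < .05 & ward_effects[, 'Estimate'] < 0, ]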

Challenge: Location & Scale

  1. Thinking about the spatial fixed effect model and the results from our bptest, does the fe_model exhibit spatial heterogeneity in its prediction error? Run the bptest again to check. Thinking about distribution means and variances, why should you not be surprised that the spatial fixed effect model has this result from the bptest?
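
As a hint, the call has the same shape as the one in the previous exercise, just aimed at the fixed effect model; a sketch, assuming the fe_model from above:

# Test whether residual variance differs across wards for the fixed effect model
lmtest::bptest(fe_model, ~ ward_name, data=bristol)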

Spatial Dependence

Next, we’ll move to spatially-dependent models. These are models that involve some amount of feedback or structural interaction between the outcome at one site (say, \(y_i\)) and the outcomes \(y_j\) at nearby sites. Usually, these kinds of models are expressed in terms of the feedback between sites. They’re used to model spillovers, mainly in analysis of social systems. One very common model that is used to express these kinds of feedbacks is called the simultaneous autoregressive model. It’s simultaneous because each site \(i\) is modeled as if it were affecting other sites \(j\) all in the same time period. And, it’s autoregressive because parts of \(y\) will be regressed on itself. Mathematically, the model is spelled: \[ y_i = \rho \sum_j^N w_{ij}y_j + x_i\beta + \epsilon_i\] for some outcome at site \(i\). Again, recall that \(\sum_j^N w_{ij}y_j\) should be thought of as the local average around \(y_i\), or the spatial lag of \(y_i\). Thus, this model uses the local average of \(y\) around site \(i\) to predict \(y_i\) itself.

This is a somewhat confusing model, given the fact that \(y\) appears on both sides of the equation.4 We won’t delve into its interpretation here. But, given our nearby spatial weighting idea, this models the crime at site \(i\) as a function of the crime within \(2000\) meters and the housing deprivation at that site. In truth, we always end up removing \(y_j\) from the “wrong” side of the equation to isolate \(y_i\) on the left-hand side alone, but this requires some matrix mathematics. At a conceptual level, crime in the area around site \(i\) is itself affected by the crime at site \(i\); this is exactly what generates a feedback. However, because we have to isolate the effect of each \(y_i\) on the other \(y_j\) in order to estimate the model, it is only the spatial structure of housing deprivation that ends up mattering: we have to get all of the \(y\) on one side of the equation in order to estimate the model.

We’ll fit one of these models now, and maybe take a look at the estimates of spillover from this model.

Exercise: Understanding Spatial Dependence

  1. Make a map of the model residuals from the previous question.
  • In plain language, describe the map.
  • Where does the model overpredict crime values from the structural relationship between crime and housing deprivation? That is, is there some place in the map where our prediction of crime made using info about housing deprivation is way higher than the actual crime?
  • Where does the map show that the model underpredicts crime?
  2. From your visual inspection, do you think that the residuals are spatially-autocorrelated? Why or why not?
  3. Verify whether the residuals are autocorrelated using the spdep::lm.morantest function, using the nearby weights we made earlier. (A sketch of this call appears after this list.)
  4. We will fit a simultaneous autoregressive model, sometimes called a SAR model, using the following function:
sar_model = spdep::lagsarlm(crime ~ housing, nearby, data=bristol)
  • Look at summary(sar_model); think of rho as an analogue of Moran’s \(I\), which can vary from close to \(-1\) up to \(1\) (kind of like a correlation coefficient). Thinking about \(\rho\) in this way, is there evidence of significant feedback/spatial dependence between our responses at sites in Bristol?
  • In the equation above, when \(\rho\) is positive, it means that the local average \(\sum_j^N w_{ij}y_j\) has a positive impact on \(y_i\). When \(\rho\) is negative, it means the local average has a negative impact on \(y_i\). In our model, does our estimate of \(\rho\) provide evidence of positive spatial dependence or negative spatial dependence?
  • What happened to \(\beta\), our effect of housing deprivation on crime? Is it significantly changed, or only slightly changed?5
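
For question 3, the Moran test on the residuals might look like the following sketch, reusing the model and nearby objects built earlier:

# Test the regression residuals for spatial autocorrelation,
# using the 2km distance-band weights
spdep::lm.morantest(model, nearby)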

Spatial Nonstationarity

Spatial nonstationarity involves changes in the effect of \(X\) on \(y\) over geography. This could mean that in some areas an increase in \(X\) is related to an increase in \(y\), and in other areas the reverse holds. Spatial nonstationarity is a particular problem for geographical analysis, because it is difficult to disentangle from measures of spatial dependence and measures of context. As such, models that examine spatial nonstationarity face many conceptual and practical challenges, such as:

  • Where do things change? In what regions is the effect stable, and where does it become unstable?
  • How does the effect change? Is there a clear theory for what we’re looking for, or are we simply looking for the existence of instability/effect change?

In a very limited sense, spatial nonstationarity is easy to examine: if (and only if) we have a known or received geography over which we believe the effect will change. One simple way we could do that right now is to examine whether different areas of Bristol have different relationships between living environment and housing quality. As such, let’s create a binary indicator variable that splits Bristol’s wards into those north of and south of the river.

south_of_river = c('Bedminster', 'Bishopsworth', 'Brislington East', 'Brislington West',
                   'Filwood', 'Hartcliffe & Withywood', 'Hengrove & Whitchurch Park',
                   'Knowle', 'Southville', 'Stockwood', 'Windmill Hill')
bristol['is_south'] = bristol$ward_name %in% south_of_river

Exercise: Spatial Nonstationarity

  1. First, fit a model of the relationship between the living environment and housing quality in Bristol. What is the relationship between housing deprivation and living environment/green space deprivation in Bristol?
  2. The following plot shows living environment deprivation predicted by housing deprivation. In addition, I’ve colored the points depending on which side of the river the locality falls on. Finally, I’ve plotted the regression you estimated in the last step above.
ggplot(bristol, aes(x=housing, y=living_env, color=is_south)) +
    geom_point() +
    geom_smooth(aes(x=housing, y=living_env), method='lm', inherit.aes=F, color='black')

  • Visually, do you see a difference in the two colors of point clouds?
  • If you had to bet, is there a difference in the two areas’ baselines (i.e. intercepts)?6
  • Is there a difference in the two areas’ gradients (i.e. slopes/\(\beta\))?
  3. Now, I’ve altered the plot above to show you the lines of best fit for both groups, separately:
ggplot(bristol, aes(x=housing, y=living_env, color=is_south)) +
    geom_point() +
    geom_smooth(method='lm')

Run this code for yourself. Visually, did you win your bets from the previous question?

  4. Fit two separate models of living_env ~ housing, one for each side of the river, and examine the results7 (see the sketch after this list).
  • Are the slopes significantly different?8
  • Are the two intercepts significantly different?
  5. Given your exercises above, do you think there is spatial nonstationarity in the structure of the relationship between housing and living environment deprivation in Bristol? At a human level, what is the relationship between housing quality and living environment quality in the north? In the south? Overall?
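
For question 4, one way of fitting the two separate models might look like this sketch (the object names north_model and south_model are just illustrative):

# Fit the same structural model on each side of the river separately
north_model = lm(living_env ~ housing, data=subset(bristol, !is_south))
south_model = lm(living_env ~ housing, data=subset(bristol, is_south))
summary(north_model)
summary(south_model)
# confint() gives confidence intervals for comparing the two sets of estimates
confint(north_model)
confint(south_model)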

Challenge: Interactions

  1. Interaction effects (which we’ve discussed a little bit) can be used to construct this kind of model comparison. Using the interaction effect model living_env ~ housing*is_south (a sketch of fitting it appears after this list):
  • Write the equation for what an interaction effect model like the form above would look like.9
  • Plugging in the values from your estimate of that model, can you substitute in \(0\) for is_south and get one of the models above? How about substituting in \(1\)?
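
Fitting the interaction model itself is a one-liner; a minimal sketch (int_model is just an illustrative name):

# R expands housing*is_south into housing + is_south + housing:is_south
int_model = lm(living_env ~ housing * is_south, data=bristol)
summary(int_model)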

Spatial Context

The last kind of effect that is common is a spatial contextual effect. This captures the effect of surroundings on a model outcome. Practically speaking, this is often done using the local average of a predictor variable as an extra predictor itself: \[ y_i = \alpha + x_i\beta_1 + \left(\sum_j^N w_{ij}x_j \right)\beta_2 + \epsilon_i\] This model (called the SLX model, from Halleck Vega & Elhorst (2015)) is one of a few different kinds of contextual effect models out there. The SLX model is interesting because it provides a distinct effect for the site (\(\beta_1\)) and the situation (\(\beta_2\)) that can be interpreted independently of one another. Unlike the models considered in spatial dependence, the SLX model is mathematically much simpler to estimate, interpret, and discuss.

Exercise: Spatial Context

We’ll examine the SLX model for a very simple regression that relates crime and housing.

  1. Create the local average of housing (using the “touches” spatial relationship)10, and save it into your bristol dataframe as a new column called touches_housing. (One way this construction might look is sketched after this list.)
  2. Going back to our crime & housing model, what’s the \(R^2\) of that model?
  3. Fit a model crime ~ housing + touches_housing.
  • What’s the new \(R^2\) of this model?
  • What is the impact of the surrounding LSOAs’ housing deprivation on crime values? If the surrounding housing deprivation goes up, what happens to crime incidence?
  • Does this change your interpretation of the relationship between crime and housing deprivation itself?
  4. Now, try fitting a model predicting housing deprivation from income deprivation: housing ~ income. Arguably, income is what largely drives the process of choosing housing, both in a city-wide sense and in a local sense.
  • What is the \(R^2\) of that model?
  • What are the effect estimates of that model?
  5. Now, build a column of the local average incomes using the touches relation, saved as touches_income. Then, fit a model of housing ~ income + touches_income.
  • Does the income deprivation of the LSOAs that touch an LSOA have an impact on housing deprivation in that LSOA?
  6. Finally, build a new local average using the nearby relation we built before, which is the local average within 2km; save it as nearby_income.
  • Make a map of the local average built from the touches relation, and the average built from the nearby relation. How do the two maps differ?
  • Then, build a regression housing ~ income + nearby_income.
  • What is the difference between this model and the touches model? Is the context measured in this new model significant? Does the \(R^2\) improve between this nearby model and the model that does not incorporate spatial context? From a larger conceptual point, why might one context matter differently than another?
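
As a starting point for question 1, the touches construction might look like the following sketch, mirroring how we built the nearby weights earlier (and following the hint in footnote 10):

# 'touches' neighbours: LSOA polygons that share any boundary point
touches = spdep::poly2nb(bristol)
touches = spdep::nb2listw(touches)
# Local average of housing deprivation among touching LSOAs
bristol['touches_housing'] = spdep::lag.listw(touches, bristol$housing)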

Conclusion

This is the last lecture in the Topics in Advanced Spatial Modelling module. Together, this presents the main concepts behind nearly any kind of geographical/statistical regression analysis you’ll see in more advanced spatial modelling.11 For instance, multilevel models are kind of like our spatial heterogeneity & spatial nonstationarity exercises, but combined together in a single model. Geographically-weighted regression is kind of like the contextual, heterogeneity, and nonstationarity exercises blended together. Much of spatial econometrics deals with autoregressive models that explicitly have spillovers/interactions between adjacent units, as well as dealing with contextual effects and data-driven heterogeneity discovery.

For a long time, quantitative geography was mainly interested in the existence of heterogeneity (e.g. Hartshornian regional geography focusing on place-differences). Then, arguably after the Whittle (1955) model and later from Tobler (1970)’s analysis of Detroit, analytical/quantitative geography became intensely interested in dependence. Alongside this, and with Casetti’s original work on the “expansion” method (Casetti, 1972), geographers began to take nonstationarity seriously, culminating in a lot of work on the subject now. Contextual effects in geographical/spatial models really began in earnest in the late 80s (e.g. Raudenbush & Bryk (1986)) and (as such) are one of the more recent contributions to the conceptual overview of the way geography is modeled. Which spatial effects quantitative geographers will focus on in the future remains to be seen.

Footnotes


  1. But, of course, we’re leaving out really strongly significant elements that a model such as this should consider, like income or legacies of oppression/social domination.

  2. Using boxplot(x ~ category, data=dataset) will give you boxplots of the variable x stratified by the category category for the observations in dataset.

  3. Think of this like you do in R. If \(j\) were a vector containing the group in which each of the sites \(i\) fell, then \(j[i]\) would be \(i\)’s group. Then, \(\alpha_{j[i]}\) could be thought of as the \(\alpha\) corresponding to the group for observation \(i\).

  4. We can restate this in something called reduced form, where \(y\) only shows up on one side of the equation, if we use a little matrix algebra. We start with an expression of the same model above, recognizing that the restatement of \(\sum_j^N w_{ij}y_j\) can be re-done in terms of matrix multiplication between the matrix \(\mathbf{W}\) and vector \(y\): \[ y = \rho \mathbf{W}y + x\beta + \epsilon\] This model can no longer be thought of in terms of individual sites; instead, it models the whole map at once, as a single entity. This is because each site outcome is related to every other site outcome, since \(y\) is on both sides of the equation. Using a little more algebra, we can take a few steps: \[\begin{align} y &= \rho \mathbf{W} y + x\beta + \epsilon \\ y - \rho \mathbf{W} y &= x\beta + \epsilon \end{align}\] Now, we can “factor out” that \(y\). But, when we do factoring in matrix algebra, we usually have to replace \(y\) with an \(I\), the identity matrix… This is just a trick to ensure that the shapes of our final matrix product are “correct,” but we still get \(y\) at the end of the day, since \(I y = y\). \[ (I - \rho \mathbf{W}) y = x\beta + \epsilon \] Finally, we have to “divide” out that \((I - \rho \mathbf{W})\) term to get \(y\) by itself. But, we can’t really just divide by a matrix; instead we have to multiply both sides by the matrix inverse of what we want to get rid of: \[ (I - \rho \mathbf{W})^{-1}(I - \rho \mathbf{W}) y = (I - \rho \mathbf{W})^{-1}(x\beta + \epsilon)\] Since a matrix times its inverse cancels into \(I\) (just like \(\frac{x}{x} = 1\)), we get: \[ y = (I - \rho \mathbf{W})^{-1}(x\beta + \epsilon)\] You do not need to know this for the exam.

  5. It’s important to note that the \(\beta\) effect we’ve just estimated in this SAR model is not actually the change in \(y\) given a unit change in \(x\). This is because it matters where the “unit change in \(x\)” occurs! Some sites will have stronger spillovers than others, and so will exert a stronger impact on \(y\) in these models.

  6. One simple way you can think about this is to just eyeball a line of best fit for one color of points. Alternatively, think about how you’d shift the black line to better fit for one of the colors than the other. How would you have to tilt the line up or down (slope) shift it left or right (intercept)?

  7. You’ll probably want to use the subset() function to split the data into two halves.

  8. Remember: if two estimates’ confidence intervals overlap, the true value of the parameter might just be in that interval, and the two estimates are actually uncertain measurements of the same value.

  9. For example, an interaction effect model of y ~ x*is_red would be written as: \[ y = \alpha + is\_red \cdot \gamma + x\beta_1 + (x \cdot is\_red)\beta_2 + \epsilon \] where \(\gamma\) is the shift in the baseline for the is_red group.

  10. remember, in spdep, this is the spdep::poly2nb function. And, don’t forget that you have to convert the nb to a listw using something in spdep

  11. save for maybe agent-based/geosimulation-style models.