---
title: "Mesa High Vignette"
author: "Benjamin W. Campbell"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
bibliography: fergm.bib
vignette: >
  %\VignetteIndexEntry{Mesa High Vignette}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
## Vignette Info and Introduction
This vignette uses the faux.mesa.high network from the ergm package [@hunter2008ergm] to provide an example of the basic functionality of the fergm package [@box2018modeling].  In particular, it walks through the estimation of a well-fitting Exponential Random Graph (ERGM) model on the simulated "Mesa High" network.  Once estimated, the corresponding Frailty Exponential Random Graph (FERGM) model is estimated and compared to the comparable ERGM.  The documentation for each individual function includes chunks of code taken from this vignette.  

It is worth noting that many of these functions are built upon the `rstan` package [@guo2016rstan].  Given that the output of the `fergm` function is a `stanfit` object, one may use much of the built-in functionality of the `rstan` package should they be familiar with the package.  

This vignette proceeds by walking through an example step-by-step to illustrate a representative workflow for using the FERGM package.  

*Note: The code presented here is not built given the runtime for the FERGM.  As such, the code presented is purely illustrative.  It will successfully execute, however.* 

## Importing Mesa High Network 

The following chunk of code imports the Mesa High network generously provided in the `ergm` package [@hunter2008ergm].   

```{r, eval = FALSE}
# Load statnet which contains the ergm package and the faux.mesa.high network.
library(statnet)

# Set seed for replication
set.seed(1)

# Load faux.mea.high daa
data("faux.mesa.high")

# Rename the object
mesa <- faux.mesa.high
```

## ERGM and FERGM Estimation
The second step, once the data is imported, is to estimate a well-fitting ERGM and its FERGM counterpart.  The `fergm` function takes at least two arguments, the network object to have an FERGM fit on and a character string formula containing undirected `ergm-terms`. A variety of other function arguments can be specified to override defaults, including the seed, the number of chains, number of warmup iterations, total number of iterations, and the number of cores used.  The `fergm` function returns a named list of two objects: `fergm$stan.dta` returns the data object handed to Stan and `fergm$stan.fit` is the `stanfit` object returned.  The following chunk of code handles this.  

```{r, echo = TRUE, eval = FALSE}
# ERGM fit
ergm.fit <- ergm(mesa ~ edges +
                   nodematch('Sex') +
                   nodematch('Grade', diff = FALSE) +
                   nodematch('Race', diff = FALSE) +
                   gwesp(decay = 0.2, fixed = TRUE) +
                   altkstar(lambda = 0.6, fixed = TRUE))

# FERGM fit
library(fergm)
form <- c("edges + nodematch('Sex') + nodematch('Grade', diff = FALSE) +
        nodematch('Race', diff = FALSE) + gwesp(decay = 0.2, fixed = TRUE) + 
        altkstar(lambda = 0.6, fixed = TRUE)")

fergm.fit <- fergm(net = mesa, form = form, chains = 1)

```

## Summarizing FERGM Output
While the `fergm` package does not contain a built in `summary` function for the FERGM, there are several means to summarize `fergm` output. One way to do so cleanly is to use the built-in `clean_summary` function.  This function takes two objects: the output of the `fergm` function and a vector of custom character names for each coefficient.  If a vector of custom chaaracter names is not provided then the function inherits the formula used in the `fergm` call, which holds true for other functions including the `custom_var_names` argument.  

In addition, we provide a built-in function to create coefficient plots: `coef_plot`.  This function takes either an `fergm` object to plot the FERGM coefficients or both an `fergm` and `ergm` object to plot and compare the coefficients.  

We also include a built-in function to plot the densities for each coefficient of interest, `coef_posterior_density`. The code is as follows:

```{r, echo = TRUE, eval = FALSE}
# Conventional rstan approach to extracting posterior summary
stan.smry <- summary(fergm.fit$stan.fit)$summary
beta_df <- stan.smry[grep("beta", rownames(stan.smry)),]
est <- round(beta_df[,c(1,4,8)], 3)
est # in order of "form"

# fergm built-in function to summarize posteior
est <- clean_summary(fergm.fit)
est <- clean_summary(fergm.fit, 
                     custom_var_names = c("Edges", "Sex Homophily",
                                          "GradeHomophily", "Race Homophily",
                                          "GWESP", "Alternating K-Stars"))
est
# Compare substantive implications via coef plot, these are with 95% credible intervals
coef_plot(fergm.fit = fergm.fit)

coef_plot(fergm.fit = fergm.fit, 
          custom_var_names =  c("Edges", "Sex Homophily", "Grade Homophily", 
                                "Race Homophily", "GWESP", "Alternating K-Stars"))
coef_plot(fergm.fit = fergm.fit, 
          ergm.fit = ergm.fit, 
          custom_var_names =  c("Edges", "Sex Homophily", "Grade Homophily", 
                                "Race Homophily", "GWESP", "Alternating K-Stars"))


# You can also look at the density of particular variables using the following:

densities <- coef_posterior_density(fergm.fit = fergm.fit)

densities <- coef_posterior_density(fergm.fit = fergm.fit, 
                                    custom_var_names = c("Edges", "Sex Homophily", 
                                                         "Grade Homophily", "Race Homophily", 
                                                         "GWESP", "Alternating K-Stars"))
densities[[1]]
densities[[2]]
```

There are also a series of `rstan` functions to summarize posterior distributions, and we would refer the reader to the `rstan` manual should they be interested in learning their alternatives.

## FERGM Diagnostics
To visually check whether there is evidence that the chains used to estimate the FERGM have converged, traceplots may be used.  While `rstan` has a built-in traceplot function that could easily be used, changes may be desired to produce publication-quality graphics.  As such, building upon the `rstan::traceplot()` function, we provide a cleaner alternative.  The code is as follows:

```{r, echo = TRUE, eval = FALSE}
# Use rstan functions to assess whether chains have evidence of converging
trace <- rstan::traceplot(fergm.fit$stan.fit, pars = "beta")
trace

# We have our own version that includes variable names and tidies it up a bit
fergm_beta_traceplot(fergm.fit)

fergm_beta_traceplot(fergm.fit,
                     custom_var_names =  c("Edges", "Sex Homophily", 
                                           "Grade Homophily", "Race Homophily", 
                                           "GWESP", "Alternating K-Stars"))
```

Additional diagnostics are available using the `rstan` package, and we would refer the reader to their well-written manual for further diagnostics or details on their built-in diagnostic plots.  

## Compare ERGM and FERGM Fit
One might be interested in the relative difference in predictive performance between an ERGM and an FERGM.  While the `coef_plot()` function offers an opportunity to compare the coefficients of these two models, it does not provide a means to assess the relative fit of each model. Relative fit is examined through simulating a number of networks based upon the model results and examining the average number of correctly predicted ties across all simulated networks.  This routine is described by the manuscript presenting the model [@box2018modeling].  **Note, specifying an object named net prior to estimating the compare_predictions function is necessary to produce a valid network on the left-hand side of the simulate method for ERGMs.**  The code to perform this routine is as follows:

```{r, echo = TRUE, eval = FALSE}
# Use fergm built in compare_predictions function to compare predictions of ERGM and FERGM
net <- ergm.fit$network
predict_out <- compare_predictions(ergm.fit = ergm.fit, fergm.fit = fergm.fit, replications = 100)

# Use the built in compare_predictions_plot function to examine the densities of correctly predicted
  # ties from the compare_predictions simulations
compare_predictions_plot(predict_out)

# We can also conduct a KS test to determine if the FERGM fit it statistically disginguishable 
  # from the ERGM fit
compare_predictions_test(predict_out)
```

# References