Commit ff53b6e

committed
Almost done
1 parent 2fc9e9a commit ff53b6e

17 files changed, +2204 -120 lines changed

.gitignore

+3
@@ -2,3 +2,6 @@
22
.Rhistory
33
.RData
44
.Ruserdata
5+
slides_*
6+
*.log
7+
*.aux

bibliography.bib

+7-1
@@ -168,4 +168,10 @@ @inproceedings{Kim2017
168168
publisher = {ACM},
169169
address = {New York, NY, USA},
170170
keywords = {collective intelligence, online collaboration, online games, team performance, virtual teams},
171-
}
171+
}
172+
173+
@article{sampson1969novitiate,
174+
title={A novitiate in a period of change: An experimental and case study of social relationships.},
175+
author={Sampson, Samuel F},
176+
year={1969}
177+
}

ergmitos-header.tex

+9-8
@@ -15,14 +15,14 @@
1515
% Objects
1616
\newcommand{\params}{\theta}
1717
\newcommand{\Params}{\Theta}
18-
\newcommand{\Graph}{\mathbf{G}}
19-
\newcommand{\graph}{\mathbf{g}}
20-
\newcommand{\GRAPH}{\mathcal{G}}
18+
\newcommand{\Graph}{\mathbf{Y}}
19+
\newcommand{\graph}{\mathbf{y}}
20+
\newcommand{\GRAPH}{\mathcal{Y}}
2121
\newcommand{\Adjmat}{\mathbf{A}}
2222
\newcommand{\adjmat}{\mathbf{a}}
23-
\newcommand{\DEPVAR}{\mathcal{Y}}
24-
\newcommand{\Depvar}{Y}
25-
\newcommand{\depvar}{y}
23+
\newcommand{\DEPVAR}{\mathcal{Z}}
24+
\newcommand{\Depvar}{Z}
25+
\newcommand{\depvar}{z}
2626

2727
\newcommand{\INDEPVAR}{\mathcal{X}}
2828
\newcommand{\Indepvar}{\mathbf{X}}
@@ -53,8 +53,9 @@
5353
\definecolor{USCGold}{HTML}{FFCC00}
5454
\definecolor{USCGray}{HTML}{CCCCCC}
5555

56-
% To use the function \sout
57-
\usepackage{ulem}
56+
57+
\usepackage{ulem} % To use the function \sout
5858
\usepackage{tabularx, booktabs}
59+
\usepackage{tcolorbox} % Fancy looking boxes
5960

6061
% \bibliography{bibliography.bib}
184 KB

fig/g1.pdf

1.57 KB
Binary file not shown.

fig/g2.pdf

1.58 KB
Binary file not shown.

fig/g3.pdf

1.74 KB
Binary file not shown.

fig/github.png

6.4 KB

fig/parts-of-ergm.pdf

32.8 KB
Binary file not shown.

fig/parts-of-ergm.svg

+1,443

fig/scared.pdf

200 KB
Binary file not shown.

fig/scared.png

80.6 KB

fig/simply-not.jpg

62.7 KB

fig/twitter.png

4.61 KB

slides.Rmd

+217-33
@@ -17,6 +17,7 @@ aspectratio: 169
1717
nocite: |
1818
@Csardi2015, @knitr, @rmarkdown, @R, @Handcock2006, @Wasserman1996
1919
bibliography: bibliography.bib
20+
handout: true
2021
---
2122

2223
```{r setup, include=FALSE}
@@ -27,7 +28,8 @@ knitr::knit_hooks$set(smallsize = function(before, options, envir) {
2728
"\n\\normalsize\n\n"
2829
}
2930
})
30-
knitr::opts_chunk$set(echo = TRUE, smallsize=TRUE)
31+
knitr::opts_chunk$set(echo = TRUE, smallsize=TRUE, out.width = ".6\\linewidth",
32+
fig.align = "center")
3133
```
3234

3335
## Social networks
@@ -38,6 +40,16 @@ knitr::opts_chunk$set(echo = TRUE, smallsize=TRUE)
3840
\caption{Friendship network of a UK university faculty. Source: \textbf{igraphdata} R package (Csardi, 2015). Figure drawn using the R package \textbf{netplot} (yours truly, https://github.com/usccana/netplot)}
3941
\end{figure}
4042

43+
## What drives \sout{\color{USCCardinal}social} networks?
44+
45+
If \color{gray}\textit{[blank]}\color{black}{} asks you to predict a network\pause
46+
47+
\Huge What kind of model?\pause
48+
49+
\Huge What features would you include?\pause
50+
51+
\normalsize
52+
4153
## Exponential Family Random Graph Models (ERGMs)
4254

4355
Why are you and I \color{gray}\textit{[blank]} \color{black}? (friends, collaborators, etc.)
@@ -50,52 +62,225 @@ knitr::include_graphics("fig/friendly-terms.pdf")
5062

5163
## ERGMs from scratch
5264

53-
We need to build a probability function...\pause
65+
We need to build a probability function for \includegraphics[width=.05\linewidth]{fig/g1.pdf}...\pause
5466

55-
- First, we will focus on counts: "How many edges?", "How many homophilic ties?".
56-
We will call them "sufficient statistics"
57-
58-
$\# edges, \#homophilic\;ties, \dots$
59-
\pause
67+
\begin{centering}
6068

61-
- As we do in \sout{life} statistics, let's assume it is an additive model (we add stuff up), in a weighted fashion (i.e. we have model parameters!)
62-
63-
$\theta_{1} \times \#edges + \theta_{2} \times \#homophilic\;ties + \dots$
64-
\pause
69+
\def\tbw{.6\linewidth}
6570

66-
- And since we like things to be positive... we just exponentiate it!
71+
\note{First, we will focus on counts: "How many edges?", "How many homophilic ties?".
72+
We will call them "sufficient statistics"}
73+
\note{As we do in \sout{life} statistics, let's assume it is an additive model (we add stuff up), in a weighted fashion (i.e. we have model parameters!)}
74+
\note{And since we like things to be positive... we just exponentiate it!}
75+
\note{Finally, as probabilities should add up to 1, we will divide the thing by the sum of all possible cases (the "normalizing constant")}
6776

77+
\begin{tcolorbox}[width=\tbw]
78+
$\# edges, \#homophilic\;ties, \dots$
79+
\end{tcolorbox}\pause
80+
\begin{tcolorbox}[width=\tbw]
81+
$\theta_{1} \times \#edges + \theta_{2} \times \#homophilic\;ties + \dots$
82+
\end{tcolorbox}\pause
83+
\begin{tcolorbox}[width=\tbw]
6884
$\exp{\theta_{1} \times \#edges + \theta_{2} \times \#homophilic\;ties + \dots}$
69-
\pause
70-
71-
- Finally, as probabilities should add up to 1, we will divide the thing by the sum of all possible cases (the "normalizing constant")
85+
\end{tcolorbox}\pause
86+
\begin{tcolorbox}[width=\tbw]
87+
$\frac{\exp{\theta_{1} \times \#edges + \theta_{2} \times \#homophilic\;ties + \dots}}{\sum \exp{\dots}}$
88+
\end{tcolorbox}\pause
7289

73-
$\frac{\exp{\theta_{1} \times \#edges + \theta_{2} \times \#homophilic\;ties + \dots}}{\sum \exp{\dots}}$\pause
90+
\end{centering}
7491

7592
You got yourself an ERGM!
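
As a rough sketch of the four steps above (not part of the original slides), the probability function can be built by brute force for a tiny directed network. The three-node size, the edges-only statistic, and the value of `theta` are illustrative assumptions:

```r
# All directed graphs on n = 3 nodes: each of the n * (n - 1) = 6 dyads is 0 or 1
n      <- 3
dyads  <- n * (n - 1)
graphs <- as.matrix(expand.grid(rep(list(0:1), dyads)))  # one row per graph

# Step 1: a sufficient statistic (here only the edge count)
edges <- rowSums(graphs)

# Step 2: weight it with a parameter (hypothetical value)
theta <- -0.5

# Step 3: exponentiate so everything is positive
kernel <- exp(theta * edges)

# Step 4: divide by the sum over all possible graphs (the normalizing constant)
probs <- kernel / sum(kernel)

sum(probs)                          # probabilities add up to 1
expected_edges <- sum(probs * edges)
```

With edges as the only statistic, every tie behaves like an independent coin flip with probability `plogis(theta)`, so `expected_edges` should match `dyads * plogis(theta)`. That gives a quick sanity check on the construction.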
7693

7794
## ERGMs... the \textit{lingua franca} of SNA
7895

79-
- Seeks to answer the question: \emph{What local social structures gave origin to a given observed graph?}\pause
96+
<!-- - Seeks to answer the question: \emph{What local social structures gave origin to a given observed graph?}\pause -->
8097

81-
- The model is centered around a vector of \textbf{sufficient statistics} $\sufstats{}$, and is operationalized as:
82-
83-
\begin{equation}
84-
\Prcond{\Graph = \graph}{\params, \Indepvar} = \frac{%
85-
\exp{\transpose{\params}\sufstats{\graph, \Indepvar}}%
86-
}{
87-
\kappa\left(\params, \Indepvar\right)
88-
},\quad\forall \graph\in\GRAPH\label{eq:ergm}
89-
\end{equation}
90-
91-
Where $\kappa\left(\params, \Indepvar\right)$ is the normalizing constant and equals $\sum_{\graph'\in\GRAPH}\exp{\transpose{\params}\sufstats{\graph', \Indepvar}}$. \pause
98+
<!-- - The model is centered around a vector of \textbf{sufficient statistics} $\sufstats{}$, and is operationalized as: -->
99+
100+
\begin{figure}
101+
\centering
102+
\includegraphics[width = .8\linewidth]{fig/parts-of-ergm.pdf}
103+
\end{figure}
104+
105+
106+
------
107+
108+
\centering
109+
110+
There is one problem with this model ...
111+
112+
\includegraphics[width = .5\linewidth]{fig/parts-of-ergm.pdf}\pause
113+
114+
\large because of \color[HTML]{af0000}$\GRAPH$\color{black},
115+
the \color[HTML]{5726e7} \textbf{normalizing constant}\color{black}{} is \linebreak[4] a summation of $2^{n(n-1)}$ terms \includegraphics[width=.05\linewidth]{fig/scared.png}!\normalsize\pause
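
To get a feel for how fast that sum grows, here is a quick back-of-the-envelope computation. The helper name `n_graphs` is made up for illustration and assumes directed graphs without self-loops:

```r
# Number of directed graphs without self-loops on n nodes: 2^(n * (n - 1))
n_graphs <- function(n) 2^(n * (n - 1))

sizes <- c(3, 4, 5, 6)
terms <- n_graphs(sizes)
data.frame(n = sizes, terms = terms)
```

Going from three to six nodes already takes the sum from 64 terms to more than a billion.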
116+
117+
-----
118+
119+
To solve this, rather than computing this function directly, estimation is done by approximating ratios of likelihood functions (TL;DR: we use simulations).
120+
121+
\begin{figure}
122+
\includegraphics[width=.6\linewidth]{fig/simply-not.jpg}
123+
\end{figure}
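
The slides do not spell out how the simulations work. One common approach, assumed here purely for illustration, is a Metropolis-Hastings sampler that toggles one tie at a time. The sketch below uses an edges-only model with a hypothetical `theta`; note that the normalizing constant never has to be computed because it cancels in the acceptance ratio:

```r
set.seed(1)
n     <- 5
theta <- -1                       # hypothetical edges parameter
y     <- matrix(0L, n, n)         # start from the empty directed graph
steps <- 20000
dens  <- numeric(steps)           # density of the graph at each step

for (s in seq_len(steps)) {
  # Propose toggling one off-diagonal dyad chosen at random
  ij <- sample.int(n, 2)          # two distinct nodes
  i  <- ij[1]
  j  <- ij[2]
  delta <- if (y[i, j] == 1L) -1 else 1   # change in the edge count

  # Accept with probability min(1, exp(theta * delta))
  if (log(runif(1)) < theta * delta) y[i, j] <- 1L - y[i, j]

  dens[s] <- sum(y) / (n * (n - 1))
}

# After discarding a burn-in, the average density should be near plogis(theta)
sim_density <- mean(dens[-(1:1000)])
```

For this toy model the exact answer is known (`plogis(theta)`), so the simulated density can be compared against it. Models with terms such as triangles have no such closed form, which is where simulation earns its keep.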
124+
125+
<!-- \begin{equation} -->
126+
<!-- \Prcond{\Graph = \graph}{\params, \Indepvar} = \frac{% -->
127+
<!-- \exp{\transpose{\params}\sufstats{\graph, \Indepvar}}% -->
128+
<!-- }{ -->
129+
<!-- \sum_{\graph'\in\GRAPH}\exp{\transpose{\params}\sufstats{\graph', \Indepvar}} -->
130+
<!-- },\quad\forall \graph\in\GRAPH\label{eq:ergm} -->
131+
<!-- \end{equation} -->
92132

93-
- The set of sufficient statistics reflects social and psychological mechanisms that are hypothesized to drive the network structure. Figure \autoref{fig:ergm-structs} shows some examples of values in $\sufstats{}$.\pause
133+
<!-- Where $\kappa\left(\params, \Indepvar\right)$ is the normalizing constant and equals $\sum_{\graph'\in\GRAPH}\exp{\transpose{\params}\sufstats{\graph', \Indepvar}}$. \pause -->
94134

95-
- In the case of directed networks, $\GRAPH$ has $2^{n(n-1)}$ terms.\pause
135+
<!-- - The set of sufficient statistics reflects social and psychological mechanisms that are hypothesized to drive the network structure. Figure \autoref{fig:ergm-structs} shows some examples of values in $\sufstats{}$.\pause -->
136+
137+
<!-- - In the case of directed networks, $\GRAPH$ has $2^{n(n-1)}$ terms.\pause -->
138+
139+
<!-- - See Wasserman, Pattison, Robins, Snijders, Handcock, Butts, and others. -->
140+
141+
142+
## Let's get going
143+
144+
We will use the famous monk data from @sampson1969novitiate.
145+
146+
```{r ergm-monks, message=FALSE}
147+
library(ergm)
148+
data(samplk, package="ergm")
149+
150+
# A glimpse into a network object (from the network package loaded with ergm)
151+
samplk1
152+
```
153+
154+
---
155+
156+
```{r ergm-vis, warning=FALSE, message=FALSE, out.width=".5\\linewidth"}
157+
library(sna) # Tools for SNA
158+
set.seed(1) # Graph layout is usually random-driven
159+
gplot(samplk1)
160+
```
161+
162+
\pause Let's add some color and other features
163+
164+
---
165+
166+
```{r ergm-vis-cont, out.width=".5\\linewidth"}
167+
set.seed(1)
168+
cols <- viridisLite::magma(4)[as.factor((samplk1 %v% "group"))]
169+
gplot(samplk1, vertex.cex = degree(samplk1)/4, vertex.col = cols, edge.col = "gray")
170+
```
171+
96172

97-
- See Wasserman, Pattison, Robins, Snijders, Handcock, Butts, and others.
173+
## A simple ergm model
98174

175+
```{r ergm-mcmc1, message=TRUE, warning=FALSE}
176+
# Estimating the model
177+
ans <- ergm(
178+
samplk1 ~ edges + nodematch("group") + ttriad,
179+
control = control.ergm(seed = 112)
180+
)
181+
182+
```
183+
184+
----
185+
186+
```{r ergm1-summary}
187+
summary(ans)
188+
```
189+
190+
The usual next steps are adding or removing terms, checking convergence, and checking goodness-of-fit.
191+
192+
----
193+
194+
Now it's time for small networks!
195+
196+
## `ergmito` example
197+
198+
```{r loading-fivenets, cache=TRUE}
199+
library(ergmito)
200+
data(fivenets, package = "ergmito")
201+
```
202+
203+
```{r plotfivenets, warning=FALSE, message=FALSE, echo=FALSE, fig.width=6, fig.height=3, out.width='.5\\linewidth', fig.align='center', cache=TRUE}
204+
library(sna)
205+
library(network)
206+
op <- par(mfrow = c(2, 3), mai=rep(0, 4), oma = rep(0, 4))
207+
USCCARDINAL <- rgb(153, 0, 0, maxColorValue = 255)
208+
ans <- lapply(fivenets, function(f) {
209+
gplot(
210+
f,
211+
vertex.cex = 2,
212+
vertex.col = c("white", USCCARDINAL)[
213+
get.vertex.attribute(f, "female") + 1
214+
]
215+
)
216+
})
217+
plot.new()
218+
plot.window(xlim = c(0, 1), ylim = c(0, 1))
219+
legend("center", fill = c("white", USCCARDINAL), legend = c("Male", "Female"), cex=1, bty="n")
220+
par(op)
221+
```
222+
223+
----
224+
225+
```{r fivenets-1, cache=TRUE}
226+
# Looking at one of the five networks
227+
fivenets[[1]]
228+
```
229+
230+
\pause How can we fit an ERGMito to these five networks?
231+
232+
## `ergmito` example (cont'd)
233+
234+
The same as you would do with the `ergm` package:
235+
236+
```{r fit-fivenets, cache=TRUE}
237+
(model1 <- ergmito(fivenets ~ edges + nodematch("female")))
238+
```
239+
240+
```{r fit-fivenets-print, results='asis', echo=FALSE, cache=TRUE}
241+
texreg::texreg(model1)
242+
# cat(gsub("#", "\\#", unclass(out), fixed=TRUE))
243+
```
244+
245+
---
246+
247+
248+
```{r gof-fivenets, cache=TRUE}
249+
(gof1 <- gof_ergmito(model1))
250+
```
251+
252+
253+
---
254+
255+
```{r gof-fivenets-print, cache=TRUE, out.width=".7\\linewidth", fig.align='center', fig.width=8, fig.height=6}
256+
plot(gof1)
257+
```
258+
259+
260+
261+
## Thanks!
262+
263+
\begin{centering}
264+
\includegraphics[width = .1\linewidth]{usc.pdf}
265+
266+
\large \textbf{\textcolor{USCCardinal}{George G. Vega Yon}}
267+
268+
\normalsize Let's chat!
269+
270+
\href{mailto:vegayon@usc.edu}{vegayon@usc.edu}
271+
272+
\href{https://ggvy.cl}{https://ggvy.cl}
273+
274+
\includegraphics[width=.02\linewidth]{github.png}\href{https://github.com/gvegayon}{@gvegayon}
275+
276+
\includegraphics[width=.02\linewidth]{twitter.png}\href{https://twitter.com/gvegayon}{@gvegayon}
277+
278+
\end{centering}
279+
280+
281+
---
282+
283+
\appendix
99284

100285
## Structures
101286

@@ -114,5 +299,4 @@ You got yourself an ERGM!
114299
\caption{\label{fig:ergm-structs}Besides the common edge count statistic (number of ties in a graph), ERGMs allow measuring other, more complex structures that can be captured as sufficient statistics. }
115300
\end{figure}
116301

117-
118-
## References
302+
## References {.allowframebreaks}

slides.pdf

379 KB
Binary file not shown.
