Once the package is installed, you can load the library
+
+```{r}
+library(fasthplus)
+```
+
+
# Introduction
This vignette serves as an introductory example on how to utilize the `fasthplus` package.
-We introduce $H_{+}$, a simple modification of $G_{+}$ as first introduced by [@williams1971comparison], re-written in [@rohlf1974methods], and implemented in `R` in the `clusterCrit`pacakge[@desgraupes2018package].
+We introduce $H_{+}$, a simple modification of $G_{+}$, which was first introduced by [@williams1971comparison], rewritten in [@rohlf1974methods], and implemented in `R` in the `clusterCrit` package [@desgraupes2018package].
Both $G_{+}$ and $H_{+}$ are quantifications of disconcordance, which can be thought of as the fitness of a contingency table generated using two label sets for the same observations.
The primary user function is `hpe` (H Plus Estimator), a quick means to estimate $H_{+}$ using two arbitrary vectors $A,B$, or a dissimilarity matrix $D$ and set of labels $L$.
$G_{+}$ and $H_{+}$ are estimators of the same theoretical disconcordance parameter -- $H_{+}$ more explicitly estimates this parameter.
-$H_{+}$ can be thought of as the product of two parameters $\gamma_{A},\gamma_{B}$, which lends itself to a simple interpretation for $H_{+}$: $\gamma_{A}\times100\%$ of $a\inA$ are strictly greater than $\gamma_{B}\times100\%$ of $b\inB$.
+$H_{+}$ can be thought of as the product of two parameters $\gamma_{A},\gamma_{B}$, which lends itself to a simple interpretation for $H_{+}$: $\gamma_{A}\times100\%$ of $a \in A$ are strictly greater than $\gamma_{B}\times100\%$ of $b \in B$.
For further exploration of disconcordance, these estimators ($G_{+}$ and $H_{+}$), as well as their theoretical properties, please see `RKIV PREPRINT HERE`.
@@ -49,11 +63,11 @@ The user can specify $p$, and `hpe` guarantees accuracy within $\pm \frac{1}{p-1
We provide two equivalent algorithms for this estimation process, with the further benefit that our algorithms yield a range of reasonable values for $\gamma_{A},\gamma_{B}$.
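
The percentile idea behind this kind of estimation can be sketched in base R. The snippet below is a minimal illustration under stated assumptions, not the package's implementation: it summarizes each vector by `p` evenly spaced percentiles and reports the fraction of percentile pairs in which the $A$ percentile exceeds the $B$ percentile. The helper name `percentile_estimate` and the simulated inputs are hypothetical.

```r
# Hypothetical helper: approximate the fraction of pairs (a, b) with a > b
# by comparing p evenly spaced percentiles of A against those of B.
# This illustrates the idea only; it is not the code used by hpe().
percentile_estimate <- function(A, B, p = 101) {
  probs <- seq(0, 1, length.out = p)
  qA <- quantile(A, probs = probs, names = FALSE)
  qB <- quantile(B, probs = probs, names = FALSE)
  # p x p logical grid; the mean is the share of percentile pairs with qA > qB
  mean(outer(qA, qB, ">"))
}

set.seed(1)                      # arbitrary seed, for reproducibility only
A <- rnorm(2000, mean = 0.5)
B <- rnorm(2000, mean = -0.5)

approx_est <- percentile_estimate(A, B, p = 101)
# Exact fraction over all length(A) * length(B) pairs, for comparison
exact_est <- mean(outer(A, B, ">"))
c(approximate = approx_est, exact = exact_est)
```

Working on `p` percentiles rather than on all pairs keeps the comparison grid at `p * p` entries regardless of how long the input vectors are, which is the appeal of this style of approximation.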
-##Formulation examples
+## Formulation examples
We provide two simulated examples of `hpe` usage.
-###$A,B$ formulation
-This formulation seeks to quantify the answer a simple question: for two sets $A,B$ how often can we expect that $a>b,a\inA,b\inB$?
+### $A,B$ formulation
+This formulation seeks to quantify the answer to a simple question: for two sets $A,B$, how often can we expect that $a>b$ for $a \in A, b \in B$?
We simulate $A$ and $B$ as $n=10000$ draws from univariate normal distributions with unit variance and slightly different means ($\mu_{A}=0.5,\mu_{B}=-0.5$).
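
For intuition, the quantity in question can be computed exactly with base R for this simulation. The sketch below draws the two samples as described (the seed and variable names are arbitrary choices, not taken from the vignette) and counts, for every $a$, how many $b$ are strictly smaller. Under these assumptions the result should be close to $\Phi(1/\sqrt{2}) \approx 0.76$, since $a-b$ is normal with mean $1$ and variance $2$.

```r
set.seed(42)               # arbitrary seed, for reproducibility only
n <- 10000
a <- rnorm(n, mean = 0.5, sd = 1)
b <- rnorm(n, mean = -0.5, sd = 1)

# For each a, count the b values strictly less than it.
# With left.open = TRUE, findInterval() returns the number of sorted
# b values that are strictly smaller than each a.
n_less <- findInterval(a, sort(b), left.open = TRUE)

# Fraction of all n * n pairs (a, b) with a > b
prop_greater <- sum(n_less) / (n * n)

# Theoretical value for these two normal distributions
theory <- pnorm(1 / sqrt(2))
c(empirical = prop_greater, theoretical = theory)
```

Sorting one vector and counting with `findInterval` avoids materializing the full `n * n` comparison matrix, which would hold $10^8$ entries here.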
We can apply the $A,B$ formulation to a dissimilarity matrix $D$ and set of cluster labels $L$.
$L$ can be used to generate a binary adjacency matrix that tells us (for every unique pair of observations) whether two observations belong to the same group.
This adjacency matrix (more specifically, its upper-triangular elements) can then be used to classify every unique dissimilarity $d\in D$ as corresponding to a pair within the same cluster or not.
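
The classification step described above can be sketched in base R on simulated data. The data, the use of Euclidean distance via `dist`, and all variable names are illustrative assumptions; none of this is taken from the package internals.

```r
set.seed(7)                    # arbitrary seed, for reproducibility only
# Two hypothetical clusters of 20 points each in two dimensions
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 3), ncol = 2))
L <- rep(c(1, 2), each = 20)

D <- as.matrix(dist(x))        # dissimilarity matrix (Euclidean)
adj <- outer(L, L, "==")       # TRUE when two observations share a label

ut <- upper.tri(D)             # one entry per unique pair of observations
d_within  <- D[ut][adj[ut]]    # dissimilarities within the same cluster
d_between <- D[ut][!adj[ut]]   # dissimilarities across clusters

# Fraction of (within, between) pairs where the within-cluster
# dissimilarity is strictly greater than the between-cluster one
mean(outer(d_within, d_between, ">"))
```

The two resulting vectors play the roles of $A$ and $B$ in the earlier formulation: a small fraction means within-cluster dissimilarities are rarely larger than between-cluster ones.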