Corrections and updates related to points 5 and 6 of reviewer Kaija Gahm

RonaldVisser · RonaldVisser · commit d19a1b8ecfca · 2024-03-29T17:29:15.000+01:00
diff --git a/NEWS.md b/NEWS.md
@@ -12,13 +12,14 @@
 
 ### Bug Fixes
 
--   corrected bug in cyto_clean_styles()
--   corrected error in wuchwerte(). It is not using the anos1 dataset from dplR anymore
--   corrected error in cyto_create_gn_style and cyto_create_cpm_style
+-   corrected bug in `cyto_clean_styles()`
+-   corrected error in `wuchwerte()`. It no longer uses the `anos1` dataset from dplR
+-   corrected error in `cyto_create_gn_style` and `cyto_create_cpm_style`
+-   updated `cyto_create_cpm_style` because it was not working properly.
 
 ### Deprecated and defunct
 
--   dev-folder removed, since this was not needed (created by biocthis)
+-   dev-folder removed, since this was not needed (created by `biocthis`)
 
 ### Documentation fixes
 
@@ -29,7 +30,7 @@
 -   added Vignette on Cytoscape use
 -   moved information for using big datasett to seperate vignette
 -   updated README: added more installion instructions
--   corrected examples in cyto_create_gn_style and cyto_create_cpm_style
+-   corrected examples in `cyto_create_gn_style` and `cyto_create_cpm_style`
 
 ### Continous integration
 
@@ -40,9 +41,9 @@
 ### Minor improvements
 
 -   replaced igraph::graph.data.frame() with igraph::graph_from_data_frame(), since the former is deprecated in igraph 2.0.0
--   replaced igraph::is.simple with igraph::is_simple
--   replaced igraph::decompose.graph with igraph::decompose
--   correction to calls to functions grDevices::colorRampPalette and stats::pnorm
+-   replaced `igraph::is.simple` with `igraph::is_simple`
+-   replaced `igraph::decompose.graph` with `igraph::decompose`
+-   correction to calls to functions `grDevices::colorRampPalette` and `stats::pnorm`
 
 ### Bug Fixes
 
diff --git a/R/cyto_create_cpm_style.R b/R/cyto_create_cpm_style.R
@@ -33,6 +33,9 @@ cyto_create_cpm_style <- function(graph_input, k = 3, com_k = NULL, style_name =
   if (!igraph::is.igraph(graph_input)) {
     stop(paste0("Please use an igraph object as input. The current object is an ", class(graph_input), "."))
   }
+  if (is.null(com_k)) {
+    stop("Please provide a data frame (com_k) with the communities for the given clique size")
+  }
   if (is.numeric(k)) {
     if (style_name == "auto") {
       style_name <- paste0(substitute(graph_input), "_CPM(k=", k, ")")
@@ -43,17 +46,21 @@ cyto_create_cpm_style <- function(graph_input, k = 3, com_k = NULL, style_name =
     RCy3::copyVisualStyle("WhiteNodesLabel", style_name)
     # com_k <- clique_community_names(graph_input, k)
     com_count <- length(unique(com_k$com_name))
+    com_k_spread <- com_k %>%
+      dplyr::count(node, com_name) %>%
+      tidyr::spread(com_name, n)
+    RCy3::loadTableData(com_k_spread, data.key.column = "node")
     if (com_count == 1) {
       # RCy3::setNodeCustomPieChart does not work with a single column and therefore the nodes are coloured based on the single community
       RCy3::setNodeColorMapping(unique(com_k$com_name),
         table.column.values = 1,
         colors = RColorBrewer::brewer.pal(12, "Paired")[1],
+        mapping.type = "d",
         style.name = style_name
       )
     } else {
-      getPalette <- grDevices::colorRampPalette(RColorBrewer::brewer.pal(12, "Paired"))
       RCy3::setNodeCustomPieChart(unique(com_k$com_name),
-        colors = getPalette(com_count),
+        colors = RColorBrewer::brewer.pal(com_count, "Paired"),
         style.name = style_name
       )
     }
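The reshape added in this hunk turns the long community table into one row per node with one column per community before it is passed to `RCy3::loadTableData()`. A minimal sketch on toy data, assuming only that `com_k` has the columns `node` and `com_name` as used above; the node and community names are made up for illustration and may not match what the package produces:

```r
library(dplyr)
library(tidyr)

# Hypothetical long table: one row per (node, community) membership
com_k <- data.frame(
  node = c("A", "B", "C", "C", "D"),
  com_name = c("CPM_K3_1", "CPM_K3_1", "CPM_K3_1", "CPM_K3_2", "CPM_K3_2"),
  stringsAsFactors = FALSE
)

# Same reshape as in the hunk: count memberships, then spread to wide format
com_k_spread <- com_k %>%
  dplyr::count(node, com_name) %>%
  tidyr::spread(com_name, n)

# One row per node; node C belongs to both communities, so both of its
# community columns hold 1, while missing memberships are NA
print(com_k_spread)
```

Each community column can then serve as one slice of a pie chart per node, which is why a single-community result needs the separate colour-mapping branch.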
diff --git a/R/sim_table.R b/R/sim_table.R
@@ -8,8 +8,8 @@
 #'           min_overlap=50,
 #'           last_digit_radius=FALSE)
 #'
-#' @param trs1 Rwl object with first tree-ring series to be compared with trs2
-#' @param trs2 Optional second rwl object with second tree-ring series to be compared with trs1. Use this is you have to datasets that you want to compare.
+#' @param trs1 Rwl object with first tree-ring series to be compared with trs2. An rwl object is a data.frame with series of tree-ring widths as columns and years as rows. Such an object is created or read with the dplR package.
+#' @param trs2 Optional second rwl object with second tree-ring series to be compared with trs1. Use this if you have two different datasets that you want to compare. Otherwise all series in trs1 are compared pairwise.
 #' @param min_overlap If the overlap of the compared series is longer or equal than this minimal value, the similarities will be calculated for the comparison
 #' @param last_digit_radius Set this to TRUE if the last digit of a series name is the radius of the tree-ring series
 #' @returns The resulting list includes the names of the compared series, overlap, correlation (both with and without Hollstein-transformation), t-value based on these correlations, SGC, SSGC and the related probability of exceedence.
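The layout described for `trs1` can be illustrated with a small hand-built table. This is only a sketch with random toy values and made-up series names (`SER01`, `SER02`); in practice such objects would normally be read or created with dplR rather than assembled by hand:

```r
# Toy ring-width data: rows are calendar years, columns are series
years <- 1901:1960
set.seed(1)
trs <- data.frame(
  SER01 = round(runif(length(years), 0.5, 2.5), 2),
  SER02 = round(runif(length(years), 0.5, 2.5), 2),
  row.names = as.character(years)
)

# Years outside the span of a series are stored as NA
trs$SER02[1:5] <- NA

# dplR marks such data frames with the additional class "rwl"
class(trs) <- c("rwl", "data.frame")

# With dendroNetwork installed, the documented call would then be (not run):
# sim <- dendroNetwork::sim_table(trs, min_overlap = 50, last_digit_radius = FALSE)
```

The two toy series overlap for 55 years, so they would pass the default `min_overlap` of 50 shown in the usage block.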
diff --git a/README.Rmd b/README.Rmd
@@ -20,7 +20,9 @@ knitr::opts_chunk$set(
 )
 ```
 
-dendroNetwork is a package to create dendrochronological networks for gaining insight into provenance or other patterns based on the statistical relations between tree ring curves. The code and the functions are based on several published papers [@visser2021a; @visser2021b; @visser2022]
+dendroNetwork is a package to create dendrochronological networks for gaining insight into provenance or other patterns based on the statistical relations between tree ring curves. The code and the functions are based on several published papers [@visser2021a; @visser2021b; @visser2022].
+
+The package is written for dendrochronologists and assumes general knowledge of the discipline and its jargon. An excellent introduction to using R in dendrochronology is available at <https://opendendro.org/r/>. The basics of dendrochronology can be found in handbooks [@speer2010; @cook1990] or on <https://www.dendrohub.com/>.
 
 ## Installation
 
@@ -102,29 +104,7 @@ After creating the network in R, it is possible to visualize the network using C
 
 ## Usage for large datasets
 
-When using larger datasets calculating the table with similarities can take a lot of time, but finding communities even more. It is therefore recommended to use of parallel computing for Clique Percolation: `clique_community_names_par(network, k=3, n_core = 6)`. This reduces the amount of time significantly.
-
-The workflow is similar as above, but with minor changes:
-
-1.  load network
-
-2.  compute similarities
-
-3.  find the maximum clique size: `igraph::clique_num(network)`
-
-4.  detect communities for each clique size separately:
-
-    -   `com_cpm_k3 <- clique_community_names_par(network, k=3, n_core = 6)`.
-
-    -   `com_cpm_k4 <- clique_community_names_par(network, k=4, n_core = 6)`.
-
-    -   and so on until the maximum clique size
-
-5.  merge these into a single `data frame` by `com_cpm_all <- rbind(com_cpm_k3,com_cpm_k4, com_cpm_k5,... )`
-
-6.  create table for use in cytoscape with all communities: `com_cpm_all <- com_cpm_all %>% dplyr::count(node, com_name) %>% tidyr::spread(com_name, n)`
-
-7.  Continue with the visualisation in Cytoscape, see the previous [section on visualization in Cytoscape](#visualization_cytoscape)
+When using larger datasets of tree-ring series, calculating the table with similarities can take a lot of time, and finding communities even more. It is therefore recommended to use parallel computing for Clique Percolation: `clique_community_names_par(network, k=3, n_core = 4)`. This reduces the computation time significantly. For most datasets `clique_community_names()` is sufficiently fast, and for smaller datasets `clique_community_names_par()` can even be slower due to the overhead of parallelisation. Therefore, use the function `clique_community_names()` first and switch to `clique_community_names_par()` only if it is very slow. See the separate [vignette](https://ronaldvisser.github.io/dendroNetwork/articles/large_datasets_communities.html) for details.
 
 ## Citation
 
diff --git a/README.md b/README.md
@@ -20,7 +20,14 @@ dendroNetwork is a package to create dendrochronological networks for
 gaining insight into provenance or other patterns based on the
 statistical relations between tree ring curves. The code and the
 functions are based on several published papers (Visser 2021b, 2021a;
-Visser and Vorst 2022)
+Visser and Vorst 2022).
+
+The package is written for dendrochronologists and assumes general
+knowledge of the discipline and its jargon. An excellent introduction
+to using R in dendrochronology is available at
+<https://opendendro.org/r/>. The basics of dendrochronology can be found
+in handbooks (Cook and Kairiukstis 1990; Speer 2010) or on
+<https://www.dendrohub.com/>.
 
 ## Installation
 
@@ -120,36 +127,19 @@ with the Girvan-Newman communities visualized.</figcaption>
 
 ## Usage for large datasets
 
-When using larger datasets calculating the table with similarities can
-take a lot of time, but finding communities even more. It is therefore
-recommended to use of parallel computing for Clique Percolation:
-`clique_community_names_par(network, k=3, n_core = 6)`. This reduces the
-amount of time significantly.
-
-The workflow is similar as above, but with minor changes:
-
-1.  load network
-
-2.  compute similarities
-
-3.  find the maximum clique size: `igraph::clique_num(network)`
-
-4.  detect communities for each clique size separately:
-
-    - `com_cpm_k3 <- clique_community_names_par(network, k=3, n_core = 6)`.
-
-    - `com_cpm_k4 <- clique_community_names_par(network, k=4, n_core = 6)`.
-
-    - and so on until the maximum clique size
-
-5.  merge these into a single `data frame` by
-    `com_cpm_all <- rbind(com_cpm_k3,com_cpm_k4, com_cpm_k5,... )`
-
-6.  create table for use in cytoscape with all communities:
-    `com_cpm_all <- com_cpm_all %>% dplyr::count(node, com_name) %>% tidyr::spread(com_name, n)`
-
-7.  Continue with the visualisation in Cytoscape, see the previous
-    [section on visualization in Cytoscape](#visualization_cytoscape)
+When using larger datasets of tree-ring series, calculating the table
+with similarities can take a lot of time, and finding communities even
+more. It is therefore recommended to use parallel computing for
+Clique Percolation:
+`clique_community_names_par(network, k=3, n_core = 4)`. This reduces the
+computation time significantly. For most datasets
+`clique_community_names()` is sufficiently fast, and for smaller datasets
+`clique_community_names_par()` can even be slower due to the overhead of
+parallelisation. Therefore, use the function `clique_community_names()`
+first and switch to `clique_community_names_par()` only if it is very
+slow. See the separate
+[vignette](https://ronaldvisser.github.io/dendroNetwork/articles/large_datasets_communities.html)
+for details.
 
 ## Citation
 
@@ -196,6 +186,14 @@ optimized and also outputs the number of overlapping rings. Source code:
 <div id="refs" class="references csl-bib-body hanging-indent"
 line-spacing="2">
 
+<div id="ref-cook1990" class="csl-entry">
+
+Cook, ER and Kairiukstis, LA. 1990. *Methods of dendrochronology.
+Applications in the environmental sciences*. Dordrecht: Kluwer Academic
+Publishers.
+
+</div>
+
 <div id="ref-girvan2002" class="csl-entry">
 
 Girvan, M and Newman, MEJ. 2002 Community structure in social and
@@ -233,6 +231,13 @@ https://doi.org/[10.1101/gr.1239303](https://doi.org/10.1101/gr.1239303).
 
 </div>
 
+<div id="ref-speer2010" class="csl-entry">
+
+Speer, JH. 2010. *Fundamentals of tree ring research*. Tucson:
+University of Arizona Press.
+
+</div>
+
 <div id="ref-visser2021b" class="csl-entry">
 
 Visser, RM. 2021a Dendrochronological Provenance Patterns. Network
diff --git a/man/dendroNetwork-package.Rd b/man/dendroNetwork-package.Rd
diff --git a/man/figures/README-network_hollstein_1980-1.png b/man/figures/README-network_hollstein_1980-1.png
diff --git a/man/figures/README-network_hollstein_1980-2.png b/man/figures/README-network_hollstein_1980-2.png
diff --git a/man/sim_table.Rd b/man/sim_table.Rd
diff --git a/references.bib b/references.bib
@@ -99,3 +99,22 @@ @article{shannon2003
 	doi = {10.1101/gr.1239303},
 	url = {http://genome.cshlp.org/content/13/11/2498.abstract}
 }
+
+@book{speer2010,
+	title = {Fundamentals of Tree Ring Research},
+	author = {Speer, James H.},
+	year = {2010},
+	month = {05},
+	date = {2010-05-01},
+	publisher = {University of Arizona Press},
+	address = {Tucson}
+}
+
+@book{cook1990,
+	title = {Methods of Dendrochronology. Applications in the Environmental Sciences},
+	author = {Cook, E. R. and Kairiukstis, L. A.},
+	year = {1990},
+	date = {1990},
+	publisher = {Kluwer Academic Publishers},
+	address = {Dordrecht}
+}
diff --git a/vignettes/dendroNetwork.Rmd b/vignettes/dendroNetwork.Rmd
@@ -4,12 +4,14 @@ output: rmarkdown::html_vignette
 bibliography: references.bib
 csl: journal-of-computer-applications-in-archaeology.csl
 vignette: >
-  %\VignetteIndexEntry{dendroNetwork_use}
+  %\VignetteIndexEntry{dendroNetwork}
   %\VignetteEngine{knitr::rmarkdown}
   %\VignetteEncoding{UTF-8}
 ---
 
-dendroNetwork is a package to create dendrochronological networks for gaining insight into provenance or other patterns based on the statistical relations between tree ring curves. The code and the functions are based on several published papers [@visser2022; @visser2021; @visser2021]
+dendroNetwork is a package to create dendrochronological networks for gaining insight into provenance or other patterns based on the statistical relations between tree ring curves. The code and the functions are based on several published papers [@visser2022; @visser2021; @visser2021].
+
+The package is written for dendrochronologists and assumes general knowledge of the discipline and its jargon. An excellent introduction to using R in dendrochronology is available at <https://opendendro.org/r/>. The basics of dendrochronology can be found in handbooks [@speer2010; @cook1990] or on <https://www.dendrohub.com/>.
 
 ## Usage {#usage}
 
@@ -75,8 +77,6 @@ After creating the network in R, it is possible to visualize the network using C
 
 A more complete description of using Cytoscape with this package can be found here: `vignette("large_datasets_communities")`
 
-
-
 ## Citation
 
 If you use this software, please cite this using:
diff --git a/vignettes/large_datasets_communities.Rmd b/vignettes/large_datasets_communities.Rmd
@@ -18,11 +18,11 @@ knitr::opts_chunk$set(
 library(dendroNetwork)
 ```
 
-## Community detection in large datasets
+## Community detection in very large datasets
 
-When using larger datasets of tree-ring series, calculating the table with similarities can take a lot of time, but finding communities even more. It is therefore recommended to use of parallel computing for Clique Percolation: `clique_community_names_par(network, k=3, n_core = 6)`. This reduces the amount of time significantly.
+When using larger datasets of tree-ring series, calculating the table with similarities can take a lot of time, and finding communities even more. It is therefore recommended to use parallel computing for Clique Percolation: `clique_community_names_par(network, k=3, n_core = 4)`. This reduces the computation time significantly. For most datasets `clique_community_names()` is sufficiently fast, and for smaller datasets `clique_community_names_par()` can even be slower due to the overhead of parallelisation. Therefore, use the function `clique_community_names()` first and switch to `clique_community_names_par()` only if it is very slow.
 
-The workflow is similar as described in the `vignette("dendronetwork")`, but with minor changes:
+The workflow is similar to the one described in `vignette("dendroNetwork")`, with minor changes:
 
 1.  load network
 
@@ -42,4 +42,4 @@ The workflow is similar as described in the `vignette("dendronetwork")`, but wit
 
 6.  create table for use in cytoscape with all communities: `com_cpm_all <- com_cpm_all %>% dplyr::count(node, com_name) %>% tidyr::spread(com_name, n)`
 
-7.  Continue with the visualisation in Cytoscape, see the relevant section in the `vignette("dendronetwork")`
+7.  Continue with the visualisation in Cytoscape; see the relevant section in `vignette("dendroNetwork")`
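Steps 4 and 5 of the workflow above (one run per clique size, then `rbind`) can be written as a loop instead of one line per `k`. A rough sketch, assuming the community finder returns a data frame with the columns `node` and `com_name`; `collect_communities()` and `fake_finder()` are hypothetical helpers introduced here for illustration and are not part of dendroNetwork:

```r
# Run a community finder for every clique size from 3 up to k_max and
# stack the per-k results into one long data frame (steps 4 and 5)
collect_communities <- function(network, k_max, finder) {
  stopifnot(k_max >= 3)
  per_k <- lapply(3:k_max, function(k) finder(network, k))
  do.call(rbind, per_k)
}

# With dendroNetwork and igraph installed, the call could look like this
# (not run here):
# com_cpm_all <- collect_communities(
#   network,
#   k_max  = igraph::clique_num(network),
#   finder = function(g, k) clique_community_names_par(g, k = k, n_core = 4)
# )

# Stand-in finder with made-up output, so the sketch runs without the package
fake_finder <- function(network, k) {
  data.frame(
    node = c("A", "B"),
    com_name = paste0("k", k, "_1"),
    stringsAsFactors = FALSE
  )
}

com_cpm_all <- collect_communities(NULL, k_max = 5, finder = fake_finder)
print(com_cpm_all)
```

Passing the finder as an argument keeps the loop identical whether `clique_community_names()` or `clique_community_names_par()` is used, which fits the advice to start with the non-parallel function.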
diff --git a/vignettes/references.bib b/vignettes/references.bib
@@ -99,3 +99,22 @@ @article{palla2005
 	doi = {10.1038/nature03607},
 	url = {http://dx.doi.org/10.1038/nature03607}
 }
+
+@book{speer2010,
+	title = {Fundamentals of Tree Ring Research},
+	author = {Speer, James H.},
+	year = {2010},
+	month = {05},
+	date = {2010-05-01},
+	publisher = {University of Arizona Press},
+	address = {Tucson}
+}
+
+@book{cook1990,
+	title = {Methods of Dendrochronology. Applications in the Environmental Sciences},
+	author = {Cook, E. R. and Kairiukstis, L. A.},
+	year = {1990},
+	date = {1990},
+	publisher = {Kluwer Academic Publishers},
+	address = {Dordrecht}
+}