Remove the dependency on the readxl package #270

Merged 2 commits on Sep 27, 2024
1 change: 0 additions & 1 deletion DESCRIPTION
@@ -52,7 +52,6 @@ Imports:
methods,
minpack.lm (>= 1.2-4),
mclust (>= 6.1),
readxl (>= 1.4.3),
Rcpp (>= 1.0.12),
shape (>= 1.4.6),
parallel,
16 changes: 15 additions & 1 deletion NEWS.Rmd
@@ -10,6 +10,13 @@ header-includes:

## New functions

## Breaking changes
* We have dropped our dependency on the `readxl` package: functions
`analyse_baSAR()` and `use_DRAC()` now do not accept XLS files anymore but
CSV files instead (#237, fixed in #270). CSV files can be easily generated
from XLS files by Excel or similar applications, or by reading them with
`readxl::read_excel()` and saving them with `write.csv()`.

## Removed functions and deprecations
* Function `Analyse_SAR.OSLdata()` is now officially deprecated,
`analyse_SAR.CWOSL()` should be used instead (#216, fixed in #264).
@@ -19,6 +26,10 @@ header-includes:

## Bugfixes

### `analyse_baSAR()`
* Argument `XLS_file` has been replaced by `CSV_file` and, as mentioned
above, the function now only accepts CSV files as input (#237, fixed in #270).

### `analyse_pIRIRSequence()`
* The function crashed with a object merge error if run in a loop because of a `merge_RLum()` error. This
was caused by a regression while implementing the `n_N` calculation in `plot_GrowthCurve()`. Potentially
@@ -51,6 +62,10 @@ fixed in #263).
* Add support for `lphi` and `ltheta` light direction arguments for `plot.type = "persp"`.
* Fix the reason for the unclear warning `In col.unique == col : longer object length is not a multiple of shorter object length`

### `use_DRAC()`
* Support for DRAC v1.1 XLS/XLSX files has been dropped, users should use
CSV files according to the DRAC v1.2 CSV template.

### `write_R2BIN()`
* Recently, non-ASCII characters in comments or file names became more common and that led to crashes during the
file export. To avoid this now all non-ASCII characters are replaced by `_` before writing them to the BIN/BINX files.
@@ -63,4 +78,3 @@ file export. To avoid this now all non-ASCII characters are replaced by `_` befo
terminal with messages if call (internally) in particular circumstances. Now we maintain a
stack of function names, so that at any time we can report correctly the name of the
function where an error or a warning is thrown (#254, fixed in #256).
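
The NEWS entry in this file suggests reading an XLS file with `readxl::read_excel()` and saving it with `write.csv()`. A minimal sketch of that one-off conversion follows, assuming `readxl` is still installed on the user's machine. The helper `xls_to_csv()` and its arguments are hypothetical names used for illustration only and are not part of the package.

```r
# Hypothetical helper (not part of the package): convert one sheet of an
# XLS/XLSX file into a CSV file that can then be passed to the functions
# changed in this PR.
xls_to_csv <- function(xls_path, csv_path = NULL, sheet = 1) {
  # readxl is only needed for the conversion itself
  if (!requireNamespace("readxl", quietly = TRUE))
    stop("package 'readxl' is needed for the one-off conversion")

  # by default, write the CSV next to the original file
  if (is.null(csv_path))
    csv_path <- sub("\\.xlsx?$", ".csv", xls_path, ignore.case = TRUE)

  # read the selected sheet and write it out without row names
  data <- as.data.frame(readxl::read_excel(xls_path, sheet = sheet))
  write.csv(data, file = csv_path, row.names = FALSE)

  invisible(csv_path)
}
```

Only one sheet is converted per call, so a workbook with several sheets would need one call (and one output file) per sheet.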

85 changes: 36 additions & 49 deletions R/analyse_baSAR.R
@@ -42,26 +42,26 @@
#' if only a file name and/or a path is provided. In both cases it will become the data that can be
#' used for the analysis.
#'
#' `[XLS_file = NULL]`
#' `[CSV_file = NULL]`
#'
#' If no XLS file (or data frame with the same format) is provided the functions runs an automatic process that
#' consists of the following steps:
#' If no CSV file (or data frame with the same format) is provided, the
#' function runs an automatic process that consists of the following steps:
#'
#' 1. Select all valid aliquots using the function [verify_SingleGrainData]
#' 2. Calculate `Lx/Tx` values using the function [calc_OSLLxTxRatio]
#' 3. Calculate De values using the function [plot_GrowthCurve]
#'
#' These proceeded data are subsequently used in for the Bayesian analysis
#'
#' `[XLS_file != NULL]`
#' `[CSV_file != NULL]`
#'
#' If an XLS-file is provided or a `data.frame` providing similar information the pre-processing
#' steps consists of the following steps:
#' If a CSV file is provided (or a `data.frame` containing similar information)
#' the pre-processing phase consists of the following steps:
#'
#' 1. Calculate `Lx/Tx` values using the function [calc_OSLLxTxRatio]
#' 2. Calculate De values using the function [plot_GrowthCurve]
#'
#' Means, the XLS file should contain a selection of the BIN-file names and the aliquots selected
#' The CSV file should contain a selection of the BIN-file names and the aliquots selected
#' for the further analysis. This allows a manual selection of input data, as the automatic selection
#' by [verify_SingleGrainData] might be not totally sufficient.
#'
@@ -142,10 +142,7 @@
#' \tabular{llll}{
#' **Supported argument** \tab **Corresponding function** \tab **Default** \tab **Short description **\cr
#' `threshold` \tab [verify_SingleGrainData] \tab `30` \tab change rejection threshold for curve selection \cr
#' `sheet` \tab [readxl::read_excel] \tab `1` \tab select XLS-sheet for import\cr
#' `col_names` \tab [readxl::read_excel] \tab `TRUE` \tab first row in XLS-file is header\cr
#' `col_types` \tab [readxl::read_excel] \tab `NULL` \tab limit import to specific columns\cr
#' `skip` \tab [readxl::read_excel] \tab `0` \tab number of rows to be skipped during import\cr
#' `skip` \tab [data.table::fread] \tab `0` \tab number of rows to be skipped during import\cr
#' `n.records` \tab [read_BIN2R] \tab `NULL` \tab limit records during BIN-file import\cr
#' `duplicated.rm` \tab [read_BIN2R] \tab `TRUE` \tab remove duplicated records in the BIN-file\cr
#' `pattern` \tab [read_BIN2R] \tab `TRUE` \tab select BIN-file by name pattern\cr
@@ -167,15 +164,15 @@
#' providing a file connection. Mixing of both types is not allowed. If an [RLum.Results-class]
#' is provided the function directly starts with the Bayesian Analysis (see details)
#'
#' @param XLS_file [character] (*optional*):
#' XLS_file with data for the analysis. This file must contain 3 columns:
#' @param CSV_file [character] (*optional*):
#' CSV_file with data for the analysis. This file must contain 3 columns:
#' the name of the file, the disc position and the grain position
#' (the last being 0 for multi-grain measurements).\cr
#' Alternatively a `data.frame` of similar structure can be provided.
#'
#' @param aliquot_range [numeric] (*optional*):
#' allows to limit the range of the aliquots used for the analysis.
#' This argument has only an effect if the argument `XLS_file` is used or
#' This argument has only an effect if the argument `CSV_file` is used or
#' the input is the previous output (i.e. is [RLum.Results-class]). In this case the
#' new selection will add the aliquots to the removed aliquots table.
#'
@@ -271,7 +268,7 @@
#' enables or disables verbose mode
#'
#' @param ... parameters that can be passed to the function [calc_OSLLxTxRatio]
#' (almost full support), [readxl::read_excel] (full support), [read_BIN2R] (`n.records`,
#' (almost full support), [data.table::fread] (`skip`), [read_BIN2R] (`n.records`,
#' `position`, `duplicated.rm`), see details.
#'
#'
@@ -327,7 +324,7 @@
#' The underlying Bayesian model based on a contribution by Combès et al., 2015.
#'
#' @seealso [read_BIN2R], [calc_OSLLxTxRatio], [plot_GrowthCurve],
#' [readxl::read_excel], [verify_SingleGrainData],
#' [data.table::fread], [verify_SingleGrainData],
#' [rjags::jags.model], [rjags::coda.samples], [boxplot.default]
#'
#'
@@ -392,11 +389,11 @@
#' print(results)
#'
#'
#' ##XLS_file template
#' ##CSV_file template
#' ##copy and paste this the code below in the terminal
#' ##you can further use the function write.csv() to export the example
#'
#' XLS_file <-
#' CSV_file <-
#' structure(
#' list(
#' BIN_FILE = NA_character_,
@@ -413,7 +410,7 @@
#' @export
analyse_baSAR <- function(
object,
XLS_file = NULL,
CSV_file = NULL,
aliquot_range = NULL,
source_doserate = NULL,
signal.integral,
@@ -809,10 +806,7 @@ analyse_baSAR <- function(
##calc_OSLLxTxRatio()
background.count.distribution = "non-poisson",

##readxl::read_excel()
sheet = 1,
col_names = TRUE,
col_types = NULL,
## data.table::fread()
skip = 0,

##read_BIN2R()
@@ -1251,12 +1245,11 @@ analyse_baSAR <- function(
}
}

# Read EXCEL sheet ----------------------------------------------------------------------------
if(is.null(XLS_file)){
# Read CSV file -----------------------------------------------------------
if (is.null(CSV_file)) {
##select aliquots giving light only, this function accepts also a list as input
if(verbose){
cat("\n[analyse_baSAR()] No XLS-file provided, running automatic grain selection ...\n")

cat("\n[analyse_baSAR()] 'CSV_file' was not provided, running automatic grain selection ...\n")
}

for (k in 1:length(fileBIN.list)) {
@@ -1312,32 +1305,26 @@ analyse_baSAR <- function(
}
rm(k)

} else if (is(XLS_file, "data.frame") || is(XLS_file, "character")) {
##load file if we have an XLS file
if (is(XLS_file, "character")) {
} else if (is.data.frame(CSV_file) || is.character(CSV_file)) {
##load file if we have a filename
if (is.character(CSV_file)) {
##test for valid file
if(!file.exists(XLS_file)){
.throw_error("XLS_file does not exist")
if(!file.exists(CSV_file)){
.throw_error("'CSV_file' does not exist")
}

##import Excel sheet
datalu <- as.data.frame(readxl::read_excel(
path = XLS_file,
sheet = additional_arguments$sheet,
col_names = additional_arguments$col_names,
col_types = additional_arguments$col_types,
skip = additional_arguments$skip,
progress = FALSE,
), stringsAsFactors = FALSE)
## import CSV file
datalu <- data.table::fread(CSV_file, data.table = FALSE,
skip = additional_arguments$skip)

###check whether data format is somehow odd, check only the first three columns
if (ncol(datalu) < 3) {
.throw_error("The XLS_file requires at least 3 columns for ",
.throw_error("'CSV_file' requires at least 3 columns for ",
"'BIN_file', 'DISC' and 'GRAIN'")
}
if(!all(grepl(colnames(datalu), pattern = " ")[1:3])){
.throw_error("One of the first 3 columns in your XLS_file has no ",
"header. Your XLS_file requires at least 3 columns for ",
.throw_error("One of the first 3 columns in 'CSV_file' has no ",
"header. Your CSV file requires at least 3 columns for ",
"'BIN_file', 'DISC' and 'GRAIN'")
}

@@ -1346,11 +1333,11 @@ analyse_baSAR <- function(

} else{

datalu <- XLS_file
datalu <- CSV_file

##check number of number of columns in data.frame
if(ncol(datalu) < 3){
.throw_error("The data.frame provided via 'XLS_file' must have ",
.throw_error("The data.frame provided via 'CSV_file' must have ",
"at least 3 columns (see manual)")
}

@@ -1407,12 +1394,12 @@ analyse_baSAR <- function(
##if k is NULL it means it was not set so far, so there was
##no corresponding BIN-file found
if(is.null(k)){
.throw_error("BIN-file names in XLS_file do not match the loaded ",
"BIN-files")
.throw_error("The BIN-file names provided via 'CSV_file' do not match ",
"the loaded BIN-files")
}

} else{
.throw_error("Input type for 'XLS_file' not supported")
.throw_error("Input type for 'CSV_file' not supported")
}


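
The documentation changed in this file says that `CSV_file` must provide three columns: the BIN-file name, the disc position and the grain position (0 for multi-grain measurements). A minimal sketch of how such a file could be prepared follows. The column names come from the template shown in the diff, while the file names and positions are made-up values. The read-back uses base `read.csv()` only to keep the sketch free of dependencies; the function itself imports the file with `data.table::fread()`.

```r
# Made-up aliquot selection: file names and positions are illustrative only
selection <- data.frame(
  BIN_FILE = c("sample_A", "sample_A", "sample_B"),
  DISC = c(1L, 2L, 1L),
  GRAIN = c(0L, 0L, 0L),  # 0 marks multi-grain measurements
  stringsAsFactors = FALSE
)

# Write the selection without row names, so that the first three columns
# are exactly the ones the function expects
csv_path <- tempfile(fileext = ".csv")
write.csv(selection, file = csv_path, row.names = FALSE)

# Read the file back and repeat the column-count check applied on import
check <- read.csv(csv_path, stringsAsFactors = FALSE)
stopifnot(ncol(check) >= 3)
```

The resulting path (or the `selection` data frame itself) is the kind of value the `CSV_file` argument is documented to accept.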
39 changes: 22 additions & 17 deletions R/internals_Thermochronometry.R
@@ -1,17 +1,19 @@
#'@title Import Thermochronometry Data
#'
#'@description Import Excel Data from Thermochronometry Experiments into R.
#'This function is an adaption of the script `STAGE1, ExcelToStructure` by
#'Benny Guralnik, 2014
#' @description
#' Import data from thermochronometry experiments into R.
#' This function is an adaption of the script `STAGE1, ExcelToStructure` by
#' Benny Guralnik, 2014, modified to accept CSV files with the same structure
#' as the original Excel files.
#'
#'@param file [character] (**required**): path to XLS file; alternatively a [list] created
#'@param file [character] (**required**): path to a CSV file; alternatively a
#' [vector] of paths
#'
#'@param output_type [character] (*with default*): defines the output for the function,
#'which can be either `"RLum.Results"` (the default) or a plain R list (`"list"`)
#'
#'@author Sebastian Kreutzer, Institute of Geography, Heidelberg University (Germany)
#'
#'@seealso [readxl::read_excel]
#'
#'@returns Depending on the setting of `output_type` it will be either a plain R [list]
#'or an [RLum.Results-class] object with the following structure data elements
@@ -50,18 +52,21 @@
# Import ------------------------------------------------------------------
## preset records
records <- file[1]

if (inherits(file, "character")) {
## get number of sheets in the file
sheets <- readxl::excel_sheets(file)

## import data from all sheets ... separate header and body
tmp_records <- lapply(sheets, function(x) {
header <- readxl::read_excel(file, sheet = x, .name_repair = "unique_quiet", n_max = 3)
body <- readxl::read_excel(file, sheet = x, .name_repair = "unique_quiet", skip = 3)
list(as.data.frame(header), as.data.frame(body))
})
names(tmp_records) <- sheets

if (grepl("xlsx?", tools::file_ext(file[1]), ignore.case = TRUE)) {
.throw_error("XLS/XLSX format is not supported, use CSV instead")
}

## import data from all files ... separate header and body
tmp_records <- lapply(file, function(x) {
if (!file.exists(x))
.throw_error("File does not exist")
header <- data.table::fread(x, nrows = 3, select = c(1:5))
body <- data.table::fread(x, skip = 3, header = TRUE)
list(as.data.frame(header), as.data.frame(body))
})
names(tmp_records) <- basename(tools::file_path_sans_ext(file))

## compile records
records <- lapply(tmp_records, function(x){
@@ -84,7 +89,7 @@
## assign originator to this list
attr(records, "originator") <- ".import_ThermochronometryData "

}#end XLSX import
} # end CSV import

## if input is a list check what is coming in
if(!inherits(records, "list") ||
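
The new import code in this file reads each CSV file twice: once for the first three rows (the header block) and once for everything after them (the body, which has its own column names). A rough base-R sketch of the same two-pass idea follows. The file layout and the column names `TIME` and `SIGNAL` are invented for illustration and do not reproduce the real thermochronometry format, and `read.csv()` stands in for the `data.table::fread()` calls used in the PR.

```r
# Build a small stand-in file: three metadata rows, then a table with
# its own column names (layout invented for illustration)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "meta1,10",
  "meta2,20",
  "meta3,30",
  "TIME,SIGNAL",
  "1,100",
  "2,90"
), csv_path)

# First pass: only the three metadata rows, which have no column names
header <- read.csv(csv_path, header = FALSE, nrows = 3)

# Second pass: skip the metadata rows and treat the next row as column names
body <- read.csv(csv_path, skip = 3, header = TRUE)

# Same shape as in the PR: a list with the header and the body
record <- list(header, body)
```

Reading the file in two passes keeps the metadata rows from forcing their types or column count onto the measurement table.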
36 changes: 14 additions & 22 deletions R/use_DRAC.R
@@ -1,13 +1,12 @@
#' Use DRAC to calculate dose rate data
#'
#' The function provides an interface from R to DRAC. An R-object or a
#' pre-formatted XLS/XLSX file is passed to the DRAC website and the
#' results are re-imported into R.
#'
#' CSV file is passed to the DRAC website and results are re-imported into R.
#'
#' @param file [character] (**required**):
#' spreadsheet to be passed to the DRAC website for calculation. Can also be a
#' DRAC template object obtained from `template_DRAC()`.
#' name of a CSV file (formatted according to the DRAC v1.2 CSV template) to
#' be sent to the DRAC website for calculation. It can also be a DRAC template
#' object obtained from [template_DRAC()].
#'
#' @param name [character] (*with default*):
#' Optional user name submitted to DRAC. If omitted, a random name will be generated
@@ -133,29 +132,22 @@ use_DRAC <- function(

}

# Import data ---------------------------------------------------------------------------------
if (tools::file_ext(file) == "xls" || tools::file_ext(file) == "xlsx") {
.throw_error("XLS/XLSX format no longer supported, use CSV instead")
}

## Import data ----------------------------------------------------------

## Import and skip the first rows and remove NA lines and the 2 row, as this row contains
## only meta data

## DRAC v1.1 - XLS sheet
##check if is the original DRAC table
if (tools::file_ext(file) == "xls" || tools::file_ext(file) == "xlsx") {
if (readxl::excel_sheets(file)[1] != "DRAC_1.1_input")
stop("[use_DRAC()] It looks like that you are not using the original DRAC v1.1 XLSX template. This is currently not supported!", call. = FALSE)

warning("\n[use_DRAC()] The current DRAC version is 1.2, but you provided the v1.1 excel input template.",
"\nPlease transfer your data to the new CSV template introduced with DRAC v1.2.", call. = FALSE)
input.raw <- na.omit(as.data.frame(readxl::read_excel(path = file, sheet = 1, skip = 5)))[-1, ]
}

## DRAC v1.2 - CSV sheet
if (tools::file_ext(file) == "csv") {
if (read.csv(file, nrows = 1, header = FALSE)[1] != "DRAC v.1.2 Inputs")
stop("[use_DRAC()] It looks like that you are not using the original DRAC v1.2 CSV template. This is currently not supported!", call. = FALSE)
if (read.csv(file, nrows = 1, header = FALSE)[1] != "DRAC v.1.2 Inputs")
.throw_error("It looks like that you are not using the original ",
"DRAC v1.2 CSV template, this is currently not supported")

input.raw <- read.csv(file, skip = 8, check.names = FALSE, header = TRUE, stringsAsFactors = FALSE)[-1, ]
}
input.raw <- read.csv(file, skip = 8, check.names = FALSE, header = TRUE,
stringsAsFactors = FALSE)[-1, ]

} else if (inherits(file, "DRAC.list")) {
input.raw <- as.data.frame(file)
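
The new guard in `use_DRAC()` rejects files whose extension is exactly `xls` or `xlsx`. As a sketch only, and not what this PR implements, a case-insensitive variant of that extension check could look as follows. The helper name `is_excel_file()` is hypothetical.

```r
# Hypothetical helper (not part of the package): TRUE if the file name
# carries an Excel extension, regardless of letter case
is_excel_file <- function(file) {
  tolower(tools::file_ext(file)) %in% c("xls", "xlsx")
}

is_excel_file("input.XLSX")
is_excel_file("input.csv")
```

A check of this kind would also catch upper-case extensions, which the comparison in the diff would let through to the CSV import step.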