| Title: | Utility Functions for Production R Code |
|---|---|
| Description: | A suite of utility functions providing functionality commonly needed for production level projects such as logging, error handling, cache management and date-time parsing. Functions for date-time parsing and formatting require that time zones be specified explicitly, avoiding a common source of error when working with environmental time series. |
| Authors: | Jonathan Callahan [aut, cre], Eli Grosman [ctb], Spencer Pease [ctb], Thomas Bergamaschi [ctb] |
| Maintainer: | Jonathan Callahan <[email protected]> |
| License: | GPL-3 |
| Version: | 0.6.2 |
| Built: | 2026-06-04 06:39:34 UTC |
| Source: | https://github.com/mazamascience/mazamacoreutils |
Internal session state used to store API keys for web services.
A named list of character strings.
Users can set API keys with setAPIKey(). Keys are remembered for the
duration of the R session and can be retrieved with getAPIKey().
This provides a small abstraction layer for dependent packages so that data access functions can test for and retrieve provider-specific API keys with generic code.
getAPIKey(), setAPIKey(), showAPIKeys()
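The session-state pattern described above can be sketched in base R with a private environment. This is a minimal illustration, not the package's actual implementation; the names `keyStore`, `set_key()` and `get_key()` and the provider name are hypothetical.

```r
# Hypothetical session-scoped key store (illustrative only)
keyStore <- new.env(parent = emptyenv())

get_key <- function(provider = NULL) {
  # With no provider, return every stored provider/key pair as a named list
  if (is.null(provider)) return(as.list(keyStore))
  if (exists(provider, envir = keyStore, inherits = FALSE)) {
    get(provider, envir = keyStore, inherits = FALSE)
  } else {
    NULL
  }
}

set_key <- function(provider, key) {
  previous <- get_key(provider)
  assign(provider, key, envir = keyStore)
  invisible(previous)  # hand back the previous value, if any
}

set_key("provider_a", "abc")
previous <- set_key("provider_a", "xyz")
current <- get_key("provider_a")
```

Because the environment lives only in memory, nothing persists once the R session ends, which matches the behavior described above.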
Create a location ID for each longitude/latitude pair using a geohash.
    createLocationID(
      longitude = NULL,
      latitude = NULL,
      precision = 10,
      algorithm = c("geohash", "digest"),
      invalidID = as.character(NA)
    )
longitude |
Vector of longitudes in decimal degrees east. |
latitude |
Vector of latitudes in decimal degrees north. |
precision |
Precision used when encoding geohashes. |
algorithm |
Encoding algorithm to use. Only |
invalidID |
Identifier to use for invalid locations. This can be a
character string or |
Each location ID identifies a single geohash grid cell, so all locations that
fall within the same cell share the same ID. The precision argument determines
the size of the grid cell. At the equator, approximate grid cell widths are:

| precision | maximum grid cell width |
|---|---|
| 5 | ~ 4.9 km |
| 6 | ~ 1.2 km |
| 7 | ~ 153 m |
| 8 | ~ 38 m |
| 9 | ~ 4.8 m |
| 10 | ~ 1.2 m |

Invalid locations are assigned the value specified by invalidID, typically
NA.
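To make the relationship between precision and cell size concrete, here is a minimal base R sketch of the standard public geohash encoding: longitude and latitude intervals are halved alternately, and every five bits become one base-32 character. The function name `geohash_encode()` is hypothetical, the sketch handles a single valid coordinate pair only, and it is not the code path used by createLocationID().

```r
# Hypothetical scalar geohash encoder (illustrative only)
geohash_encode <- function(longitude, latitude, precision = 10) {
  alphabet <- strsplit("0123456789bcdefghjkmnpqrstuvwxyz", "")[[1]]
  lonRange <- c(-180, 180)
  latRange <- c(-90, 90)
  bits <- integer(precision * 5)
  for (i in seq_along(bits)) {
    if (i %% 2 == 1) {
      # Odd-numbered bits refine the longitude interval
      mid <- mean(lonRange)
      if (longitude >= mid) {
        bits[i] <- 1L
        lonRange[1] <- mid
      } else {
        lonRange[2] <- mid
      }
    } else {
      # Even-numbered bits refine the latitude interval
      mid <- mean(latRange)
      if (latitude >= mid) {
        bits[i] <- 1L
        latRange[1] <- mid
      } else {
        latRange[2] <- mid
      }
    }
  }
  # Each column of this matrix holds the five bits of one character
  groups <- matrix(bits, nrow = 5)
  index <- as.vector(c(16, 8, 4, 2, 1) %*% groups) + 1
  paste(alphabet[index], collapse = "")
}

coarse <- geohash_encode(0, 0, precision = 5)
fine <- geohash_encode(0, 0, precision = 7)
```

Each additional character adds five more halvings, which is why higher precision values correspond to smaller grid cells. A longer hash always begins with the shorter hash for the same location.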
Character vector of location IDs.
https://michaelchirico.github.io/geohashTools/index.html
    longitude <- c(-122.5, 0, NA, -122.5, -122.5)
    latitude <- c(47.5, 0, 47.5, NA, 47.5)
    createLocationID(longitude, latitude)
    createLocationID(longitude, latitude, precision = 7)
    createLocationID(longitude, latitude, invalidID = "bad")
Create a logical mask identifying valid longitude/latitude pairs.
    createLocationMask(
      longitude = NULL,
      latitude = NULL,
      lonRange = c(-180, 180),
      latRange = c(-90, 90),
      removeZeroZero = TRUE
    )
longitude |
Vector of longitudes in decimal degrees east. |
latitude |
Vector of latitudes in decimal degrees north. |
lonRange |
Range of valid longitudes. |
latRange |
Range of valid latitudes. |
removeZeroZero |
Logical specifying whether the coordinate pair
|
The returned logical vector contains TRUE for valid locations and FALSE
for invalid locations. This is useful for filtering data frames to retain
only records with valid geographic coordinates.
Longitude and latitude values are considered valid when they:
fall within lonRange and latRange
are not missing
are not located at (0, 0) when removeZeroZero = TRUE
The lonRange and latRange arguments can be used to restrict valid
locations to a rectangular geographic region.
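The validity rules listed above can be written as a short vectorized expression. The following is a sketch in base R under the assumption that range endpoints are inclusive; `location_mask()` is a hypothetical stand-in, not the package function.

```r
# Hypothetical re-statement of the validity rules (illustrative only)
location_mask <- function(longitude, latitude,
                          lonRange = c(-180, 180),
                          latRange = c(-90, 90),
                          removeZeroZero = TRUE) {
  # Missing values are invalid; FALSE & NA evaluates to FALSE, so no NA leaks out
  mask <- !is.na(longitude) & !is.na(latitude) &
    longitude >= lonRange[1] & longitude <= lonRange[2] &
    latitude >= latRange[1] & latitude <= latRange[2]
  if (removeZeroZero) {
    mask <- mask & !(longitude == 0 & latitude == 0)
  }
  mask
}

mask <- location_mask(
  longitude = c(-120, NA, -120, -220, -120, 0),
  latitude = c(45, 45, NA, 45, 100, 0)
)
```

Only the first pair passes here: the others fail because of a missing value, an out-of-range coordinate, or the (0, 0) rule.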
Logical vector identifying valid locations.
    createLocationMask(
      longitude = c(-120, NA, -120, -220, -120, 0),
      latitude = c(45, 45, NA, 45, 100, 0)
    )

    createLocationMask(
      longitude = -120:-90,
      latitude = 20:50,
      lonRange = c(-110, -100),
      latRange = c(30, 40)
    )
Create a two-element POSIXct vector representing a date/time range in a
specified timezone.
    dateRange(
      startdate = NULL,
      enddate = NULL,
      timezone = NULL,
      unit = "sec",
      ceilingStart = FALSE,
      ceilingEnd = FALSE,
      days = 7
    )
startdate |
Desired start datetime. |
enddate |
Desired end datetime. |
timezone |
Olson timezone used to interpret incoming dates. |
unit |
Temporal precision used for the returned end-of-range value.
One of |
ceilingStart |
Logical specifying whether to round |
ceilingEnd |
Logical specifying whether to include the entirety of the final day. |
days |
Number of days to include when either |
The returned range is ordered from earliest to latest. The first element represents the beginning of the requested date range and the second element represents the end of the requested date range at the requested temporal precision.
By default, the returned end time is one unit before the beginning of
enddate. For example:
    dateRange(20190101, 20190102, timezone = "UTC")
    [1] "2019-01-01 00:00:00 UTC"
    [2] "2019-01-01 23:59:59 UTC"
Setting ceilingEnd = TRUE includes the entirety of enddate:
    dateRange(
      20190101, 20190101,
      timezone = "UTC",
      ceilingEnd = TRUE
    )
    [1] "2019-01-01 00:00:00 UTC"
    [2] "2019-01-01 23:59:59 UTC"
The ceilingEnd argument addresses ambiguity in phrases such as
"August 1-8". With ceilingEnd = FALSE (default), the range extends
through the end of August 7, stopping at the midnight boundary where August 8
begins. With ceilingEnd = TRUE, the range
extends through the end of August 8.
Input dates are parsed with parseDatetime() using the specified
timezone.
Two-element POSIXct vector ordered from earliest to latest.
If either startdate or enddate is missing, the missing boundary is
calculated using days.
If both are missing, enddate defaults to the current day in timezone
and startdate is calculated as enddate - days.
The returned end time is adjusted to the last representable value within the requested unit:
unit = "day": End time is midnight at the start of the final day.
unit = "hour": End time is 23:00:00.
unit = "min": End time is 23:59:00.
unit = "sec": End time is 23:59:59.
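The end-time adjustment amounts to stepping back one unit from the midnight boundary that closes the range. Here is a small base R sketch of that arithmetic. It assumes UTC so that every day is exactly 86400 seconds long; `range_end()` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: step back one unit from a midnight boundary (UTC only)
range_end <- function(boundary, unit = c("sec", "min", "hour", "day")) {
  unit <- match.arg(unit)
  secondsPerUnit <- c(sec = 1, min = 60, hour = 3600, day = 86400)
  boundary - secondsPerUnit[[unit]]
}

# The boundary is the midnight that begins 2019-01-02
boundary <- as.POSIXct("2019-01-02 00:00:00", tz = "UTC")
units <- c("sec", "min", "hour", "day")
ends <- vapply(
  units,
  function(u) format(range_end(boundary, u), "%Y-%m-%d %H:%M:%S"),
  character(1)
)
```

In a time zone that observes daylight saving time, subtracting a fixed 86400 seconds for `unit = "day"` would not always land on local midnight, so a real implementation needs calendar-aware arithmetic there.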
When startdate or enddate are already POSIXct values, they are first
converted to timezone with lubridate::with_tz() without changing the
represented instant in time.
When parameters conflict, the following rules apply:
If both startdate and enddate are supplied, days is ignored.
If startdate is missing, ceilingStart is ignored.
If enddate is missing, ceilingEnd is ignored.
    dateRange("2019-01-08", timezone = "UTC")
    dateRange("2019-01-08", unit = "min", timezone = "UTC")
    dateRange("2019-01-08", unit = "hour", timezone = "UTC")
    dateRange("2019-01-08", unit = "day", timezone = "UTC")
    dateRange("2019-01-08", "2019-01-11", timezone = "UTC")

    dateRange(
      enddate = 20190112,
      days = 3,
      unit = "day",
      timezone = "America/Los_Angeles"
    )
Create a sequence of local-midnight POSIXct datetimes in a specified
timezone.
    dateSequence(
      startdate = NULL,
      enddate = NULL,
      timezone = NULL,
      ceilingEnd = FALSE
    )
startdate |
Desired start datetime. |
enddate |
Desired end datetime. |
timezone |
Olson timezone used to interpret incoming dates. |
ceilingEnd |
Logical specifying whether to include the end of the final day. |
The returned sequence begins at midnight local time on startdate and ends
at midnight local time on enddate, i.e. the beginning of enddate.
The ceilingEnd argument addresses ambiguity in date ranges such as
"August 1-8". With ceilingEnd = FALSE (default), the sequence ends at
the beginning of August 8. With ceilingEnd = TRUE, the sequence
includes the entirety of August 8, ending at the midnight that begins August 9.
Input dates are parsed with parseDatetime() using the specified
timezone. Any hour-minute-second information is removed after parsing.
A vector of POSIXct datetimes at local midnight.
When startdate or enddate are already POSIXct values, they are first
converted to timezone with lubridate::with_tz() without changing the
represented instant in time. They are then floored to local midnight.
This function preserves local clock-time midnight boundaries across daylight
saving time transitions. This differs from seq.POSIXt(..., by = "day"), which
advances by fixed 24-hour intervals and can drift away from midnight local
time when daylight saving time begins or ends.
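The drift is easy to reproduce with base R alone. The sketch below steps across the end of daylight saving time in Los Angeles (3 November 2019), once in fixed 24-hour steps and once in clock-time steps. It calls `seq()` on POSIXct values, where `by = "day"` adds 86400 seconds and `by = "DSTday"` keeps the same clock time. It does not call dateSequence().

```r
start <- as.POSIXct("2019-11-01 00:00:00", tz = "America/Los_Angeles")

# Fixed 86400-second steps ignore the clock change on 2019-11-03
fixedSteps <- seq(start, by = "day", length.out = 5)

# "DSTday" steps keep the same local clock time each day
clockSteps <- seq(start, by = "DSTday", length.out = 5)

fixedHours <- format(fixedSteps, "%H:%M")
clockHours <- format(clockSteps, "%H:%M")
```

After the transition, the fixed-interval sequence falls at 23:00 local time on the previous day, while the clock-time sequence stays at 00:00.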
    dateSequence(
      "2019-11-01", "2019-11-08",
      timezone = "America/Los_Angeles"
    )

    dateSequence(
      "2019-11-01", "2019-11-07",
      timezone = "America/Los_Angeles",
      ceilingEnd = TRUE
    )

    # Observe daylight savings handling
    datetime <- dateSequence(
      "2019-11-01", "2019-11-08",
      timezone = "America/Los_Angeles"
    )
    datetime
    lubridate::with_tz(datetime, "UTC")

    # POSIXct inputs preserve the represented instant before flooring
    jst <- dateSequence(
      20190307, 20190315,
      timezone = "Asia/Tokyo"
    )
    jst
    dateSequence(
      jst[1], jst[7],
      timezone = "UTC"
    )
Return the API key associated with a web service provider.
    getAPIKey(provider = NULL)
provider |
Web service provider. |
If provider = NULL, all currently stored API keys are returned.
API key string, NULL, or a named list of all provider/key pairs.
APIKeys, setAPIKey(), showAPIKeys()
Parse an HTML page and return all <a href="...">...</a> links as a data
frame.
    html_getLinks(url = NULL, relative = TRUE)
    html_getLinkNames(url = NULL)
    html_getLinkUrls(url = NULL, relative = TRUE)
url |
URL or local file path of an HTML page. |
relative |
Logical specifying whether to return relative URLs. If
|
The returned data frame contains the human-readable link text in linkName
and the href value in linkUrl. This is useful for extracting links from
index pages, including web-accessible directories that list downloadable
files.
Wrapper functions html_getLinkNames() and html_getLinkUrls() return the
corresponding columns as character vectors.
A tibble with linkName and linkUrl columns.
html_getLinkNames() returns a character vector of link names.
html_getLinkUrls() returns a character vector of link URLs.
    ## Not run:
    # If you want to download lots of USCensus shapefiles
    url <- "https://www2.census.gov/geo/tiger/GENZ2019/shp/"
    browseURL(url)

    dataLinks <- html_getLinks(url)
    dataLinks <- dataLinks %>%
      dplyr::filter(stringr::str_detect(linkName, "us_county"))
    head(dataLinks, 10)

    html_getLinkNames(url)
    html_getLinkUrls(url, relative = FALSE)
    ## End(Not run)
Parse an HTML page and return all <table> elements as a list of data
frames.
    html_getTables(url = NULL, header = NA)
    html_getTable(url = NULL, header = NA, index = 1)
url |
URL or local file path of an HTML page. |
header |
Logical specifying whether the first row should be used as
column names. If |
index |
Index identifying which table to return. |
The url argument may be either a remote URL or a local file path. Tables are
parsed with rvest::html_table(). To extract a single table, use
html_getTable().
List of data frames, one for each HTML table.
A single data frame containing the requested HTML table.
    ## Not run:
    url <- "https://en.wikipedia.org/wiki/List_of_tz_database_time_zones"
    tables <- html_getTables(url)
    firstTable <- tables[[1]]
    head(firstTable)
    nrow(firstTable)
    ## End(Not run)
Create a standard set of MazamaCoreUtils log files.
    initializeLogging(logDir = NULL, filePrefix = "", createDir = TRUE)
logDir |
Directory in which to write log files. |
filePrefix |
Character string prepended to log file names. |
createDir |
Logical specifying whether to create |
This convenience function creates or validates a log directory, archives any
existing standard log files by appending a UTC timestamp, and then initializes
logging with logger.setup().
Standard log files include:
TRACE.log DEBUG.log INFO.log WARN.log ERROR.log
When filePrefix is supplied, it is prepended to each log file name.
No return value. Called for side effects.
Parse R source code and identify calls to selected functions that are missing required named arguments.
    lintFunctionArgs_file(filePath = NULL, rules = NULL, fullPath = FALSE)
    lintFunctionArgs_dir(dirPath = "./R", rules = NULL, fullPath = FALSE)
filePath |
Path to a single R source file. |
rules |
Named list of linting rules. Each list name is a function name and each value is a character vector of required named arguments. |
fullPath |
Logical specifying whether returned file paths should be
absolute paths. If |
dirPath |
Path to a directory containing R source files. |
Rules are supplied as a named list where each name is a function to check and each value is a character vector of required argument names. A function call passes when all required arguments are supplied by name.
This linter only checks whether arguments are named in the call. It does not evaluate code, inspect argument values, or detect unnamed positional arguments.
A tibble describing matching function calls, with columns:
Source file path or file name.
Line number where the function call begins.
Column number where the function call begins.
Name of the function being checked.
List column containing named arguments used in the call.
Logical indicating whether all required named arguments were supplied.
This linter only detects named arguments. For example, foo(x = bar, "baz")
is treated as specifying the named argument x, but the value bar and the
unnamed argument "baz" are not inspected.
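The idea of checking only the argument names used in a call can be sketched with base R's parser. The example below walks the parsed expression tree and records the named arguments of every call to a chosen function. `named_args_in_calls()` is a hypothetical helper, not the package's linter; it reports no line or column positions and does not handle calls that contain empty arguments such as `x[1, ]`.

```r
# Hypothetical helper: collect the named arguments used in calls to fnName
named_args_in_calls <- function(code, fnName) {
  results <- list()
  walk <- function(node) {
    if (!is.call(node)) return(invisible(NULL))
    if (identical(node[[1]], as.name(fnName))) {
      argNames <- names(node)
      # names() is NULL when no argument in the call is named
      argNames <- if (is.null(argNames)) character(0) else argNames[-1]
      results[[length(results) + 1]] <<- argNames[nzchar(argNames)]
    }
    # Recurse so that nested calls are also inspected
    for (child in as.list(node)) walk(child)
  }
  for (expr in parse(text = code, keep.source = FALSE)) walk(expr)
  results
}

code <- 'foo(x = bar, "baz")
result <- foo(1, y = 2, z = 3)'
found <- named_args_in_calls(code, "foo")

# A rule passes when every required argument name appears in the call
required <- "x"
passes <- vapply(found, function(args) all(required %in% args), logical(1))
```

The code is parsed but never evaluated, so `bar` does not need to exist. That is also why argument values and unnamed positional arguments go unchecked.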
    ## Not run:
    rules <- list(
      fn_one = "x",
      fn_two = c("foo", "bar")
    )

    lintFunctionArgs_file(
      filePath = "local_test/timezone_lint_test_script.R",
      rules = rules
    )

    lintFunctionArgs_dir(
      dirPath = "./R",
      rules = MazamaCoreUtils::timezoneLintRules
    )
    ## End(Not run)
Load a pre-generated R binary data file from either a local directory or a remote URL.
    loadDataFile(
      filename = NULL,
      dataUrl = NULL,
      dataDir = NULL,
      priority = c("dataDir", "dataUrl")
    )
filename |
Name of the |
dataUrl |
Remote URL directory containing data files. |
dataDir |
Local directory containing data files. |
priority |
First data source to try when both |
This function is intended for use by package-level *_load() helper
functions. It allows locally cached data files to be used when available,
avoiding unnecessary internet access.
If both dataDir and dataUrl are provided, priority determines which
source is tried first. If loading from the first source fails, the other
source is used as a fallback.
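The priority-then-fallback behavior can be sketched in base R as an ordered list of loaders tried in turn. This is an illustration under stated assumptions, not the package's implementation: `load_with_fallback()` is a hypothetical name, remote files are fetched with `utils::download.file()`, and the usage example substitutes a local `file://` URL for a web server so that it runs offline on a Unix-like system.

```r
# Hypothetical helper: try the preferred source first, then the other one
load_with_fallback <- function(filename, dataDir = NULL, dataUrl = NULL,
                               priority = c("dataDir", "dataUrl")) {
  priority <- match.arg(priority)
  read_rda <- function(path) {
    env <- new.env()
    objectName <- load(path, envir = env)
    get(objectName[1], envir = env)
  }
  loaders <- list(
    dataDir = function() read_rda(file.path(dataDir, filename)),
    dataUrl = function() {
      tmp <- tempfile(fileext = ".rda")
      on.exit(unlink(tmp))
      utils::download.file(paste0(dataUrl, "/", filename), tmp,
                           mode = "wb", quiet = TRUE)
      read_rda(tmp)
    }
  )
  available <- c(dataDir = !is.null(dataDir), dataUrl = !is.null(dataUrl))
  sources <- c(priority, setdiff(names(loaders), priority))
  sources <- sources[available[sources]]
  for (src in sources) {
    # A failed attempt yields NULL and the loop moves on to the next source
    result <- suppressWarnings(
      tryCatch(loaders[[src]](), error = function(e) NULL)
    )
    if (!is.null(result)) return(result)
  }
  stop("Could not load ", filename, " from any source")
}

# Usage: the local directory does not exist, so the "remote" source is used
remoteDir <- tempfile("remote")
dir.create(remoteDir)
states <- data.frame(code = c("WA", "OR"))
save(states, file = file.path(remoteDir, "states.rda"))

loaded <- load_with_fallback(
  "states.rda",
  dataDir = file.path(remoteDir, "missing"),
  dataUrl = paste0("file://", remoteDir),
  priority = "dataDir"
)
```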
Object loaded from the .rda file.
    ## Not run:
    filename <- "USCensusStates_02.rda"
    dataDir <- "~/Data/Spatial"
    dataUrl <- "http://data.mazamascience.com/MazamaSpatialUtils/Spatial_0.8"

    # Load local file
    USCensusStates <- loadDataFile(filename, dataDir = dataDir)

    # Load remote file
    USCensusStates <- loadDataFile(filename, dataUrl = dataUrl)

    # Load local file with remote file as backup
    USCensusStates <- loadDataFile(
      filename,
      dataDir = dataDir,
      dataUrl = dataUrl,
      priority = "dataDir"
    )

    # Load remote file with local file as backup
    USCensusStates <- loadDataFile(
      filename,
      dataDir = dataDir,
      dataUrl = dataUrl,
      priority = "dataUrl"
    )
    ## End(Not run)
Emit a DEBUG level log message.
    logger.debug(msg, ...)
msg |
Message with optional format strings. |
... |
Additional arguments passed to |
Logging must first be initialized with logger.setup().
No return value. Called for side effects.
Emit an ERROR level log message.
    logger.error(msg, ...)
msg |
Message with optional format strings. |
... |
Additional arguments passed to |
Logging must first be initialized with logger.setup().
No return value. Called for side effects.
Emit a FATAL level log message.
    logger.fatal(msg, ...)
msg |
Message with optional format strings. |
... |
Additional arguments passed to |
Logging must first be initialized with logger.setup().
No return value. Called for side effects.
Emit an INFO level log message.
    logger.info(msg, ...)
msg |
Message with optional format strings. |
... |
Additional arguments passed to |
Logging must first be initialized with logger.setup().
No return value. Called for side effects.
Determine whether logger.setup() has already been called.
    logger.isInitialized()
This function is useful in package code that conditionally emits log statements only when logging has been configured.
Logical scalar indicating whether logging has been initialized.
    ## Not run:
    logger.isInitialized()
    logger.setup()
    logger.isInitialized()
    ## End(Not run)
Set the minimum log level displayed in the console.
    logger.setLevel(level)
level |
Logging threshold level. |
By default, only FATAL messages are displayed in the console. This
function allows users to display additional log messages interactively.
Available log levels are:
TRACE DEBUG INFO WARN ERROR FATAL
No return value. Called for side effects.
All functionality is implemented with the excellent logger package.
    ## Not run:
    # Enable console logging
    logger.setup()

    # Show DEBUG and higher messages in the console
    logger.setLevel(DEBUG)
    ## End(Not run)
Configure level-specific log files using the package logging API.
    logger.setup(
      traceLog = NULL,
      debugLog = NULL,
      infoLog = NULL,
      warnLog = NULL,
      errorLog = NULL,
      fatalLog = NULL
    )
traceLog |
File path receiving |
debugLog |
File path receiving |
infoLog |
File path receiving |
warnLog |
File path receiving |
errorLog |
File path receiving |
fatalLog |
File path receiving |
Logging is built on top of the logger package while retaining the historical MazamaCoreUtils logging interface.
Separate log files can be created for different log levels so that, for
example, an errorLog contains only ERROR and FATAL messages while a
debugLog contains DEBUG messages as well as all higher-severity messages.
Any log file argument left as NULL is disabled and no file will be created
for that level.
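The threshold behavior described above can be pictured as a simple comparison: each configured file has a minimum level and receives every message at or above it. The base R sketch below uses illustrative numeric ranks and a hypothetical `files_receiving()` helper. It does not reflect the values of the package's level constants or the internals of the logger package.

```r
# Illustrative severity ranks (assumed ordering, lowest to highest)
levelRank <- c(TRACE = 1, DEBUG = 2, INFO = 3, WARN = 4, ERROR = 5, FATAL = 6)

# Hypothetical helper: which configured files receive a message at this level?
files_receiving <- function(messageLevel, logFiles) {
  thresholds <- levelRank[names(logFiles)]
  unname(unlist(logFiles[thresholds <= levelRank[[messageLevel]]]))
}

# Three files configured; levels without a file are simply absent
logFiles <- list(DEBUG = "debug.log", INFO = "info.log", ERROR = "error.log")

warnTargets <- files_receiving("WARN", logFiles)
fatalTargets <- files_receiving("FATAL", logFiles)
traceTargets <- files_receiving("TRACE", logFiles)
```

With this configuration a WARN message reaches the debug and info files but not the error file, and a TRACE message is written nowhere.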
After initialization, logging statements can be generated with:
logger.trace(), logger.debug(), logger.info(),
logger.warn(), logger.error(), and logger.fatal().
Log messages are formatted with:
LEVEL [YYYY-MM-DD HH:MM:SS UTC] message
Console logging is enabled by default only for FATAL messages. Use
logger.setLevel() to display additional log messages in the console.
No return value. Called for side effects.
All functionality is implemented with the excellent logger package.
logger.trace(), logger.debug(), logger.info(),
logger.warn(), logger.error(), logger.fatal()
    ## Not run:
    # Create three log files
    logger.setup(
      debugLog = "debug.log",
      infoLog = "info.log",
      errorLog = "error.log"
    )

    # Generate log messages
    logger.trace("trace statement #%d", 1)
    logger.debug("debug statement")
    logger.info("info statement %s %s", "with", "arguments")
    logger.warn("warn statement: %s", "about to try something risky")

    result <- try(1 / "a", silent = TRUE)
    logger.error("error message: %s", geterrmessage())
    logger.fatal("fatal statement: %s", "THE END")

    cat(readLines("debug.log"), sep = "\n")
    cat(readLines("info.log"), sep = "\n")
    cat(readLines("error.log"), sep = "\n")
    ## End(Not run)
Emit a TRACE level log message.
    logger.trace(msg, ...)
msg |
Message with optional format strings. |
... |
Additional arguments passed to |
Logging must first be initialized with logger.setup().
No return value. Called for side effects.
Emit a WARN level log message.
    logger.warn(msg, ...)
msg |
Message with optional format strings. |
... |
Additional arguments passed to |
Logging must first be initialized with logger.setup().
No return value. Called for side effects.
Logging level constants used by the MazamaCoreUtils logging API.
    FATAL
An object of class integer of length 1.
Available log levels include:
FATAL ERROR WARN INFO DEBUG TRACE
These constants are retained for backwards compatibility with the original MazamaCoreUtils logging system.
Remove old or excess files from a cache directory.
    manageCache(
      cacheDir = NULL,
      extensions = c("html", "json", "pdf", "png"),
      maxCacheSize = 100,
      sortBy = "atime",
      maxFileAge = NULL
    )
cacheDir |
Location of cache directory. |
extensions |
Vector of file extensions eligible for removal. |
maxCacheSize |
Maximum cache size in megabytes. |
sortBy |
Timestamp used to order files for size-based removal. One of
|
maxFileAge |
Maximum file age in days. Files with modification times older than this value are removed regardless of cache size. Fractional days are allowed. |
Files are eligible for removal when their extension matches extensions.
Matching is case-sensitive and extensions may be supplied with or without a
leading dot.
Files can be removed for two reasons:
files older than maxFileAge days are removed first
if the remaining cache exceeds maxCacheSize, additional files are
removed until the cache is under the requested size
When removing files to satisfy maxCacheSize, files are ordered by the
timestamp specified by sortBy.
Timestamp meanings are:
atime: File access time, updated when a file is opened.
ctime: File change time, updated when file metadata changes.
mtime: File modification time, updated when file contents change.
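The size-based step can be sketched as: sort files by the chosen timestamp, then remove from the front of the list until the remaining total fits. The base R example below works on a plain data frame instead of a real directory. `select_for_removal()` is a hypothetical helper, and both the oldest-first ordering and the 1 MB = 1e6 bytes conversion are assumptions made for illustration.

```r
# Hypothetical helper: choose files to delete so the cache fits maxCacheSizeMB
select_for_removal <- function(files, maxCacheSizeMB, sortBy = "atime") {
  files <- files[order(files[[sortBy]]), ]        # oldest timestamp first
  excess <- sum(files$size) - maxCacheSizeMB * 1e6
  if (excess <= 0) return(character(0))
  # Remove the smallest leading run of files that frees at least `excess` bytes
  nRemove <- which(cumsum(files$size) >= excess)[1]
  files$path[seq_len(nRemove)]
}

# Four 400 kB files with made-up access times (larger means more recent)
files <- data.frame(
  path = c("a.csv", "b.csv", "c.csv", "d.csv"),
  size = rep(400000, 4),
  atime = c(4, 1, 3, 2),
  stringsAsFactors = FALSE
)

removed <- select_for_removal(files, maxCacheSizeMB = 1)
```

The four files total 1.6 MB, so the two least recently accessed files are selected, which leaves 0.8 MB in the cache.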
Invisibly returns the number of files removed.
    CACHE_DIR <- tempdir()

    write.csv(matrix(1, 400, 500), file = file.path(CACHE_DIR, "m1.csv"))
    write.csv(matrix(2, 400, 500), file = file.path(CACHE_DIR, "m2.csv"))
    write.csv(matrix(3, 400, 500), file = file.path(CACHE_DIR, "m3.csv"))
    write.csv(matrix(4, 400, 500), file = file.path(CACHE_DIR, "m4.csv"))

    for (file in list.files(CACHE_DIR, pattern = "\\.csv$", full.names = TRUE)) {
      print(file.info(file)[, c("size", "mtime")])
    }

    # Remove files based on access time until the cache is under 1 MB
    manageCache(
      CACHE_DIR,
      extensions = "csv",
      maxCacheSize = 1,
      sortBy = "atime"
    )

    for (file in list.files(CACHE_DIR, pattern = "\\.csv$", full.names = TRUE)) {
      print(file.info(file)[, c("size", "mtime")])
    }
Convenience wrappers around devtools::check() for package checking at
different levels of thoroughness.
    check(pkg = ".")
    check_fast(pkg = ".")
    check_faster(pkg = ".")
    check_fastest(pkg = ".")
    check_slow(pkg = ".")
    check_slower(pkg = ".")
    check_slowest(pkg = ".")
pkg |
Package location passed to |
These functions make it easy to run quick checks during active development and more thorough checks before merging or releasing package changes.
The functions are ordered from most thorough to fastest:
check_slowest(): Builds the manual, runs donttest and dontrun examples, and uses --use-gct.
check_slower(): Builds the manual and runs donttest and dontrun examples.
check_slow(): Builds the manual and runs donttest examples.
check(): Standard development check without building the manual or running donttest examples.
check_fast(): Skips vignette building and ignores vignettes during checking.
check_faster(): Skips vignette building, ignores vignettes, and skips examples.
check_fastest(): Skips vignette building, ignores vignettes, skips examples, and skips tests.
Invisibly returns the result from devtools::check().
Convert character, numeric, integer, or POSIXct datetimes to POSIXct.
    parseDatetime(
      datetime = NULL,
      timezone = NULL,
      expectAll = FALSE,
      isJulian = FALSE,
      quiet = TRUE
    )
datetime |
Vector of character, numeric, integer, or |
timezone |
Olson timezone used to interpret incoming datetimes. |
expectAll |
Logical value specifying whether to stop if any non-missing input values fail to parse. |
isJulian |
Logical value specifying whether |
quiet |
Logical value passed to |
This function accepts a variety of compact date/time formats commonly used in
Mazama Science packages, including Y, Ym, Ymd, YmdH, YmdHM, and
YmdHMS. Inputs may be mixed within the same vector.
Examples of equivalent inputs include:
20181012130900 "2018-10-12-13-09-00" "2018 Oct. 12 13:09:00"
All incoming datetimes are interpreted in the specified timezone. If
datetime is already POSIXct, it is converted to the requested timezone
with lubridate::with_tz().
If a character datetime includes signed offset information, such as
"-07:00", that offset is used by lubridate::parse_date_time() when
determining the equivalent instant.
A POSIXct vector.
Within Mazama Science packages, datetimes not already in POSIXct format are
often represented as compact decimal values with no separators, such as
20181012 or 20181012130900, either as numbers or strings.
parseDatetime() is a wrapper around lubridate::parse_date_time() that
defines the datetime formats supported by MazamaCoreUtils.
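To illustrate how compact Y[mdHMS] values can be interpreted in an explicit time zone, here is a base R sketch that strips separators, pads missing fields with their earliest value, and parses the result with `strptime()`. `parse_compact()` is a hypothetical helper and is much simpler than parseDatetime(): it ignores month names, UTC offsets and Julian days.

```r
# Hypothetical helper: parse Y, Ym, Ymd, YmdH, YmdHM or YmdHMS in a given timezone
parse_compact <- function(datetime, timezone) {
  # Large numbers such as 20180807181215 must not print in scientific notation
  if (is.numeric(datetime)) datetime <- sprintf("%.0f", datetime)
  digits <- gsub("[^0-9]", "", datetime)
  n <- nchar(digits)
  valid <- !is.na(digits) & n %in% c(4, 6, 8, 10, 12, 14)
  # Missing fields default to January, day 1, 00:00:00
  template <- "00000101000000"
  padded <- rep(NA_character_, length(digits))
  padded[valid] <- paste0(digits[valid], substring(template, n[valid] + 1, 14))
  as.POSIXct(strptime(padded, "%Y%m%d%H%M%S", tz = timezone))
}

parsed <- parse_compact(
  c("2018", "201808", "2018-08-07 18:12:15", "181016"),
  timezone = "UTC"
)
parsedText <- format(parsed, "%Y-%m-%d %H:%M:%S")
```

The last input has six digits but no valid month, so it yields NA instead of an error. That corresponds to the `expectAll = FALSE` behavior described above.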
    # All Y[mdHMS] formats are accepted
    parseDatetime(2018, timezone = "America/Los_Angeles")
    parseDatetime(201808, timezone = "America/Los_Angeles")
    parseDatetime(20180807, timezone = "America/Los_Angeles")
    parseDatetime(2018080718, timezone = "America/Los_Angeles")
    parseDatetime(201808071812, timezone = "America/Los_Angeles")
    parseDatetime(20180807181215, timezone = "America/Los_Angeles")
    parseDatetime("2018-08-07 18:12:15", timezone = "America/Los_Angeles")
    parseDatetime("2018-08-07 18:12:15-07:00", timezone = "UTC")

    # Julian days are accepted
    parseDatetime(
      2018219181215,
      timezone = "America/Los_Angeles",
      isJulian = TRUE
    )

    # Mixed vector inputs are accepted
    parseDatetime(
      c("2018-10-24 12:00", "201810311200", "2018-11-07 12:00"),
      timezone = "America/New_York"
    )

    badInput <- c("20181013", NA, "20181015", "181016", "10172018")

    # Return NA for dates that cannot be parsed
    parseDatetime(badInput, timezone = "UTC", expectAll = FALSE)

    ## Not run:
    # Fail if any non-missing dates cannot be parsed
    parseDatetime(badInput, timezone = "UTC", expectAll = TRUE)
    ## End(Not run)
Set the API key associated with a web service provider.
    setAPIKey(provider = NULL, key = NULL)
provider |
Web service provider. |
key |
API key. |
API keys are stored in package session state and are remembered only for the duration of the current R session.
Invisibly returns the previous value of the API key.
Returns default when target is NULL; otherwise returns target
unchanged.
setIfNull(target, default, enforcedType = NULL)
target |
Object to test for NULL. |
default |
Object to return when target is NULL. |
enforcedType |
Optional character string specifying the suffix of an as.*() function
used to coerce the returned value. If NULL, no coercion is applied. |
This is useful for assigning default values to optional arguments while preserving any user-supplied value exactly as provided.
Optionally, enforcedType may be used to coerce the returned value to a
specific type. This coercion is applied after the NULL check and affects
both target and default.
The value of target if it is not NULL; otherwise default.
If enforcedType is specified, the returned value is coerced using the
corresponding as.*() function.

    setIfNull(NULL, "foo")
    setIfNull(10, 0)
    setIfNull("15", 0)

    # User-supplied values are returned unchanged
    setIfNull("15", 0)
    setIfNull("mean", 0)
    setIfNull(mean, 0)

    # Optional type enforcement
    setIfNull("15", 0, enforcedType = "double")
    setIfNull(NULL, "15", enforcedType = "integer")

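The NULL check followed by optional coercion can be sketched in a few lines of base R. The function below, `setIfNullSketch()`, is a hypothetical stand-in written for illustration. It assumes that the coercion function is looked up by pasting `"as."` onto `enforcedType`, which may differ from the package's actual code.

```r
# Illustrative sketch only; setIfNullSketch() is a hypothetical name.
setIfNullSketch <- function(target, default, enforcedType = NULL) {
  # Step 1: choose between the user-supplied value and the default
  value <- if (is.null(target)) default else target

  # Step 2: optionally coerce with the matching as.*() function
  if (!is.null(enforcedType)) {
    coerce <- match.fun(paste0("as.", enforcedType))
    value <- coerce(value)
  }

  value
}

a <- setIfNullSketch("15", 0)                               # no coercion
b <- setIfNullSketch("15", 0, enforcedType = "double")      # coerced target
c <- setIfNullSketch(NULL, "15", enforcedType = "integer")  # coerced default
```

Applying the coercion after the NULL check is what allows it to affect both a supplied `target` and the fallback `default`.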
Print all currently set API keys.
showAPIKeys()
No return value. Called for side effects.
NULL
Convenience function for validating that an object is not NULL.
stopIfNull(target, msg = NULL)
target |
Object to test. |
msg |
Optional error message to display if target is NULL. |
If target is not NULL, it is returned invisibly. If target is
NULL, the function stops with either a default or user-supplied
error message.
This function is especially useful for validating required function arguments or for guarding intermediate results in pipelines.
Invisibly returns target when it is not NULL.

    # Return input invisibly if not NULL
    x <- stopIfNull(5)
    print(x)

    # Useful in pipelines
    y <- 1:10
    y_mean <- y %>% stopIfNull() %>% mean()

    ## Not run: 
    # Trigger the default error message
    testVar <- NULL
    stopIfNull(testVar)

    # Trigger a custom error message
    stopIfNull(testVar, msg = "This is NULL")

    # Make a failing pipeline
    z <- NULL
    z_mean <- z %>% stopIfNull("This has failed.") %>% mean()
    ## End(Not run)

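For the argument-validation use case mentioned above, the following base R sketch shows a guard of this kind placed at the top of a function. Both `stopIfNullSketch()` and `readSite()` are hypothetical names, and the default message text is an assumption rather than the package's wording.

```r
# Illustrative sketch only; not the package's implementation.
stopIfNullSketch <- function(target, msg = NULL) {
  if (is.null(target)) {
    # Assumed default message; the real default may be worded differently
    stop(if (is.null(msg)) "argument must not be NULL" else msg, call. = FALSE)
  }
  invisible(target)
}

# Hypothetical function with a required argument
readSite <- function(siteID = NULL) {
  stopIfNullSketch(siteID, msg = "siteID is required")
  paste("reading", siteID)
}

ok <- readSite("site_01")
failure <- tryCatch(readSite(), error = function(e) conditionMessage(e))
```

Placing the guard first means the function fails with a clear message before any work is attempted.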
Generate a consistent error message from the result of a try() block.
stopOnError( result, err_msg = "", prefix = "", maxLength = 500, truncatedLength = 120, call. = FALSE )
result |
Return value from a try() block. |
err_msg |
Optional custom error message. |
prefix |
Optional text to prepend to the error message. |
maxLength |
Maximum allowed error message length before truncation. |
truncatedLength |
Length of the truncated error message. |
call. |
Logical indicating whether the call should be included in the
error message. Passed to stop(). |
This function is intended for production code where potentially fragile
operations are wrapped in try(..., silent = TRUE). If result inherits
from "try-error", a cleaned and optionally customized error message is
generated and passed to stop().
If result is not a "try-error", the function returns NULL.
Returns NULL if result is not a "try-error"; otherwise stops with an
error.
If logging has been initialized, the final error message is logged with
logger.error() before calling stop().

    ## Not run: 
    myFunc <- function(x) {
      log(x)
    }

    result <- try({
      myFunc("ten")
    }, silent = TRUE)
    stopOnError(result)

    try({
      myFunc("ten")
    }, silent = TRUE) %>%
      stopOnError(err_msg = "Unable to process user input")

    try({
      myFunc("ten")
    }, silent = TRUE) %>%
      stopOnError(
        prefix = "USER_INPUT_ERROR",
        maxLength = 40,
        truncatedLength = 32
      )
    ## End(Not run)

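The general pattern described above (wrap a fragile call in `try()`, test for `"try-error"`, then re-raise a cleaned message) can be sketched in base R as follows. `stopOnErrorSketch()` is a hypothetical name, and the truncation and prefix handling shown here are assumptions about reasonable behavior, not a description of the package's exact rules. Logging is omitted.

```r
# Illustrative sketch only; not the package's implementation.
stopOnErrorSketch <- function(result, err_msg = "", prefix = "",
                              maxLength = 500, truncatedLength = 120) {
  # Anything that is not a "try-error" passes through
  if (!inherits(result, "try-error")) {
    return(NULL)
  }

  # Prefer the custom message, otherwise use the original condition message
  msg <- if (nzchar(err_msg)) {
    err_msg
  } else {
    conditionMessage(attr(result, "condition"))
  }

  # Assumed truncation rule: shorten overly long messages
  if (nchar(msg) > maxLength) {
    msg <- substr(msg, 1, truncatedLength)
  }

  if (nzchar(prefix)) {
    msg <- paste(prefix, msg)
  }

  stop(msg, call. = FALSE)
}

result <- try(log("ten"), silent = TRUE)
caught <- tryCatch(
  stopOnErrorSketch(result, prefix = "USER_INPUT_ERROR"),
  error = function(e) conditionMessage(e)
)
passed <- stopOnErrorSketch(try(log(10), silent = TRUE))
```

Here `caught` holds the prefixed error message, while `passed` is `NULL` because the wrapped call succeeded.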
Create an ordered two-element POSIXct time range from start and end
datetime values.
timeRange( starttime = NULL, endtime = NULL, timezone = NULL, unit = "sec", ceilingStart = FALSE, ceilingEnd = FALSE )
starttime |
Desired start datetime. |
endtime |
Desired end datetime. |
timezone |
Olson timezone used to interpret incoming datetimes. |
unit |
Unit used for rounding. Passed to lubridate::floor_date() or lubridate::ceiling_date(). |
ceilingStart |
Logical specifying whether to round the start time up instead of down. |
ceilingEnd |
Logical specifying whether to round the end time up instead of down. |
Input values are converted with parseDatetime() using the required
timezone argument. The resulting start and end times are sorted so the
earlier time is always returned first.
By default, both times are rounded down with lubridate::floor_date() using
the requested unit. Set ceilingStart = TRUE or ceilingEnd = TRUE to
round either endpoint up with lubridate::ceiling_date() instead.
Two-element POSIXct vector ordered from earliest to latest.
When starttime or endtime are already POSIXct values, they are first
converted to timezone with lubridate::with_tz() without changing the
represented instant in time.

    timeRange(
      starttime = "2019-01-08 10:12:15",
      endtime = 20190109102030,
      timezone = "UTC"
    )

    timeRange(
      starttime = "2019-01-08 10:12:15",
      endtime = "2019-01-09 10:20:30",
      timezone = "UTC",
      unit = "hour"
    )

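The sort-then-round behavior described above can be approximated in base R without lubridate. The sketch below is deliberately simplified: `timeRangeSketch()` is a hypothetical name, the rounding unit is given in seconds (`unitSeconds`) rather than as a unit string, and rounding epoch seconds only lines up with clock hours in UTC and other whole-hour-offset zones.

```r
# Simplified base R sketch; not the package's implementation.
timeRangeSketch <- function(starttime, endtime, timezone,
                            unitSeconds = 3600,
                            ceilingStart = FALSE, ceilingEnd = FALSE) {
  # Parse both endpoints in the requested time zone and order them
  times <- sort(c(
    as.POSIXct(starttime, tz = timezone),
    as.POSIXct(endtime, tz = timezone)
  ))
  secs <- as.numeric(times)

  # Round down by default, or up when requested
  roundTo <- function(x, up) {
    f <- if (up) ceiling else floor
    f(x / unitSeconds) * unitSeconds
  }
  rounded <- c(roundTo(secs[1], ceilingStart), roundTo(secs[2], ceilingEnd))

  as.POSIXct(rounded, origin = "1970-01-01", tz = timezone)
}

# Endpoints are supplied in reverse order on purpose
tr <- timeRangeSketch("2019-01-09 10:20:30", "2019-01-08 10:12:15", "UTC")
trUp <- timeRangeSketch(
  "2019-01-08 10:12:15", "2019-01-09 10:20:30", "UTC",
  ceilingEnd = TRUE
)
format(tr, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Sorting before rounding means callers do not need to know which endpoint is earlier.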
Convert datetimes to compact character timestamps suitable for file names, identifiers, labels, and other reproducible text output.
timeStamp(datetime = NULL, timezone = NULL, unit = "sec", style = "ymdhms")
datetime |
Vector of character, integer, or POSIXct datetimes. |
timezone |
Olson timezone used to interpret incoming datetimes. |
unit |
Temporal precision of the generated timestamp. |
style |
Output timestamp style. |
Input values are converted with parseDatetime() using the required
timezone argument. When datetime = NULL, the current UTC time is used
and timezone defaults to "UTC".
The unit argument controls the precision of the output timestamp. The
style argument controls the output format.
Supported unit values are:
"year", "month", "day", "hour", "min", "sec", "msec"
Supported style values are:

| style | description |
|---|---|
| "ymdhms" | compact calendar time |
| "ymdThms" | compact calendar time with "T" separator |
| "julian" | year and Julian day |
| "clock" | ISO-like clock time |

For style = "julian" and unit = "month", the timestamp uses the Julian
day associated with the beginning of the month.
Character vector of timestamps.
When datetime is already a POSIXct value, it is first converted to
timezone with lubridate::with_tz() without changing the represented
instant in time.

    datetime <- parseDatetime("2019-01-08 12:30:15", timezone = "UTC")

    timeStamp()

    timeStamp(datetime, "UTC", unit = "year")
    timeStamp(datetime, "UTC", unit = "month")
    timeStamp(datetime, "UTC", unit = "month", style = "julian")
    timeStamp(datetime, "UTC", unit = "day")
    timeStamp(datetime, "UTC", unit = "day", style = "julian")
    timeStamp(datetime, "UTC", unit = "hour")
    timeStamp(datetime, "UTC", unit = "min")
    timeStamp(datetime, "UTC", unit = "sec")
    timeStamp(datetime, "UTC", unit = "sec", style = "ymdThms")
    timeStamp(datetime, "UTC", unit = "sec", style = "julian")
    timeStamp(datetime, "UTC", unit = "sec", style = "clock")
    timeStamp(datetime, "America/Los_Angeles", unit = "sec", style = "clock")
    timeStamp(datetime, "America/Los_Angeles", unit = "msec", style = "clock")

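To illustrate how a unit can control timestamp precision, the base R sketch below maps each unit to a `format()` pattern. The patterns and the names `unitFormats` and `timeStampSketch()` are assumptions chosen for illustration; the exact strings produced by `timeStamp()` for each style may differ.

```r
# Assumed unit-to-format mapping for a compact calendar style.
unitFormats <- c(
  year  = "%Y",
  month = "%Y%m",
  day   = "%Y%m%d",
  hour  = "%Y%m%d%H",
  min   = "%Y%m%d%H%M",
  sec   = "%Y%m%d%H%M%S"
)

# Illustrative sketch only; not the package's implementation.
timeStampSketch <- function(datetime, timezone, unit = "sec") {
  format(datetime, format = unitFormats[[unit]], tz = timezone)
}

datetime <- as.POSIXct("2019-01-08 12:30:15", tz = "UTC")

dayStamp <- timeStampSketch(datetime, "UTC", unit = "day")
localHour <- timeStampSketch(datetime, "America/Los_Angeles", unit = "hour")

# Year plus day of year, one possible reading of a "julian" style
julianDay <- format(datetime, format = "%Y%j", tz = "UTC")
```

The same instant yields different digits in different time zones, which is why the time zone has to be stated explicitly.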
Rules used by lintFunctionArgs_file() and lintFunctionArgs_dir() to find
date/time function calls that should explicitly specify timezone arguments.
timezoneLintRules
A named list of function/argument pairs.
Each list name is a function to check. Each value is the required named timezone-related argument for that function.
Entries with "DEPRECATED" are used to flag functions that should generally
be avoided in package code because they depend on the local system clock or
timezone.
str(timezoneLintRules)
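As a rough picture of how function/argument rules of this shape can be applied, the sketch below checks whether a quoted call names its required timezone argument. The rule list `rulesSketch` and the helper `missingTimezoneArg()` are hypothetical, and the real lint functions work on files and directories rather than single quoted calls.

```r
# Hypothetical rules; the contents of the real timezoneLintRules may differ.
rulesSketch <- list(
  parseDatetime = "timezone",
  as.POSIXct = "tz"
)

# Returns TRUE when a call to a listed function omits its required argument.
missingTimezoneArg <- function(call, rules) {
  fn <- deparse(call[[1]])          # function name as text
  required <- rules[[fn]]
  if (is.null(required)) {
    return(FALSE)                   # function is not covered by the rules
  }
  !(required %in% names(as.list(call)))
}

flagged <- missingTimezoneArg(quote(as.POSIXct("2019-01-08")), rulesSketch)
clean <- missingTimezoneArg(
  quote(as.POSIXct("2019-01-08", tz = "UTC")),
  rulesSketch
)
ignored <- missingTimezoneArg(quote(mean(1:10)), rulesSketch)
```

This check only recognizes arguments passed by name, which matches the idea that timezone arguments should be stated explicitly.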
Validate a single longitude/latitude pair to ensure both values are numeric scalars and fall within valid geographic bounds.
validateLonLat(longitude = NULL, latitude = NULL)
longitude |
Single longitude in decimal degrees east. |
latitude |
Single latitude in decimal degrees north. |
Longitudes must fall between -180 and 180 degrees and latitudes must fall between -90 and 90 degrees. If validation fails, an error is generated.
Invisibly returns TRUE if validation succeeds.

    validateLonLat(-122.5, 47.5)

    ## Not run: 
    validateLonLat(-200, 47.5)
    validateLonLat(-122.5, NA)
    ## End(Not run)

Validate longitude and latitude vectors to ensure they are numeric, have matching lengths, and contain values within valid geographic bounds.
validateLonsLats(longitude = NULL, latitude = NULL, na.rm = FALSE)
longitude |
Vector of longitudes in decimal degrees east. |
latitude |
Vector of latitudes in decimal degrees north. |
na.rm |
Logical specifying whether to remove NA values before validation. |
Longitudes must fall between -180 and 180 degrees and latitudes must fall between -90 and 90 degrees. If validation fails, an error is generated.
Invisibly returns TRUE if validation succeeds.

    longitude <- c(-122.5, -122.4)
    latitude <- c(47.5, 47.6)

    validateLonsLats(longitude, latitude)

    # Remove missing values before validation
    validateLonsLats(
      c(-122.5, NA),
      c(47.5, NA),
      na.rm = TRUE
    )

    ## Not run: 
    validateLonsLats(c(-200, 0), c(45, 46))
    ## End(Not run)

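The checks described above (numeric type, matching lengths, geographic bounds) can be sketched in base R as follows. `validateLonsLatsSketch()` is a hypothetical name, the error messages are invented for illustration, and treating remaining `NA` values as an error when `na.rm = FALSE` is an assumption about the intended behavior.

```r
# Illustrative sketch only; not the package's implementation.
validateLonsLatsSketch <- function(longitude, latitude, na.rm = FALSE) {
  if (!is.numeric(longitude) || !is.numeric(latitude)) {
    stop("longitude and latitude must be numeric", call. = FALSE)
  }
  if (length(longitude) != length(latitude)) {
    stop("longitude and latitude must have the same length", call. = FALSE)
  }

  if (na.rm) {
    # Drop any pair in which either coordinate is missing
    keep <- !is.na(longitude) & !is.na(latitude)
    longitude <- longitude[keep]
    latitude <- latitude[keep]
  }

  # Assumption: remaining missing values are treated as invalid
  if (anyNA(longitude) || anyNA(latitude)) {
    stop("missing values found in longitude or latitude", call. = FALSE)
  }
  if (any(longitude < -180 | longitude > 180)) {
    stop("longitude must be between -180 and 180", call. = FALSE)
  }
  if (any(latitude < -90 | latitude > 90)) {
    stop("latitude must be between -90 and 90", call. = FALSE)
  }

  invisible(TRUE)
}

valid <- validateLonsLatsSketch(c(-122.5, NA), c(47.5, NA), na.rm = TRUE)
outOfRange <- tryCatch(
  validateLonsLatsSketch(c(-200, 0), c(45, 46)),
  error = function(e) conditionMessage(e)
)
```

Returning `TRUE` invisibly lets the function be called purely for its side effect of stopping on bad input.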