Introduction to zentracloud

Purpose

The package is designed to act as a direct access point to the ZENTRA Cloud API. With a valid token, data for a chosen period can be directly loaded into R. Further, the data is saved into a cache, so that repeated queries for the same time period are performed much faster.

IMPORTANT

The package is currently not well suited for downloading large amounts of data. The ZENTRA Cloud API is limited to 2000 readings per minute, so requests made via the package functions are throttled. For long time periods we therefore recommend continuing to use the ZENTRA Cloud web interface.

Usage

Installation

# install from GitLab
url = "https://gitlab.com/meter-group-inc/pubpackages/zentracloud"
remotes::install_git(url = url)
# load package
library(zentracloud)
#> Cache directory does not exist. Will be created at '/github/home/.cache/R/zentracloud' on first download.

There might be some start-up messages, which will be explained later on.

Token

A valid token for the API is a prerequisite for using the package. Tokens can be generated in the ZENTRA Cloud web interface. There, go to the menu item API in the sidebar. If a valid token exists, it will show up and can be copied. If not, there is an option to add a new key, which will generate a token.

For use in the functions, the token has to be set as an option for the duration of the R session. To reload the setting for every session, the option can be written for example into the .Rprofile.

To set the token for the session use function setZentracloudOptions(). Its arguments are the token, as well as the corresponding domain and any of the other three options that can be set for cache management (details below).

The domain has to be set so that the package knows which server to query. To find out which options exist, see the help page of setZentracloudOptions(). If you are unsure which of the domains your token is valid for, check the URL of your ZENTRA Cloud web interface.

If the URL starts with zentracloud.com use default, if it starts with aroya.zentracloud.com use aroya and so on.

# set token as option
setZentracloudOptions(token = "<your_token>", domain = "<corresponding_domain>")

# set token in .Rprofile
# open profile
usethis::edit_r_profile()

# add token and domain
options("ZENTRACLOUD_TOKEN" = "<your_token>")
options("ZENTRACLOUD_DOMAIN" = "<corresponding_domain>")
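To avoid hard-coding the token in the .Rprofile, it could also be read from an environment variable. The following is a minimal sketch using base R only. It assumes the token was stored under the name ZENTRACLOUD_TOKEN (for example in .Renviron); that variable name and the domain "default" are choices made here for illustration.

```r
# Read the token from an environment variable (the variable name is an assumption)
token <- Sys.getenv("ZENTRACLOUD_TOKEN", unset = "")

if (nzchar(token)) {
  # Same option names as in the .Rprofile example above
  options("ZENTRACLOUD_TOKEN" = token)
  options("ZENTRACLOUD_DOMAIN" = "default")
} else {
  message("No token found in environment variable ZENTRACLOUD_TOKEN.")
}

# Check whether the option is set, without printing the token itself
token_is_set <- !is.null(getOption("ZENTRACLOUD_TOKEN"))
```

This keeps the secret out of files that might be shared or committed to version control.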

Cache

The other options that can be set for this package all concern the cache. The most important one sets the cache directory; the others set the maximum allowed size and file age.

For these, default values are set upon loading the package, if the options were not predefined otherwise. The defaults are:

  • ZENTRACLOUD_CACHE_MAX_SIZE: 500 kB
  • ZENTRACLOUD_CACHE_MAX_AGE: 7 days

For the directory the default changes depending on the operating system. For instance, the default for Linux is:

  • ZENTRACLOUD_CACHE_DIR: ~/.cache/R/zentracloud

The path is determined using this function:

tools::R_user_dir("zentracloud", which = "cache")
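As a small illustration of how this default is resolved: tools::R_user_dir() (available from R 4.0.0) consults the environment variable R_USER_CACHE_DIR before falling back to an OS-specific location. The sketch below temporarily overrides that variable; the directory name my_cache is only a stand-in.

```r
# Default cache path on the current system
default_dir <- tools::R_user_dir("zentracloud", which = "cache")

# Temporarily point R's user cache base to another location
old <- Sys.getenv("R_USER_CACHE_DIR", unset = NA)
Sys.setenv(R_USER_CACHE_DIR = file.path(tempdir(), "my_cache"))
custom_dir <- tools::R_user_dir("zentracloud", which = "cache")

# Restore the previous state of the environment variable
if (is.na(old)) {
  Sys.unsetenv("R_USER_CACHE_DIR")
} else {
  Sys.setenv(R_USER_CACHE_DIR = old)
}

print(default_dir)
print(custom_dir)
```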

As with the token, these options can also be changed using setZentracloudOptions(), or set more permanently in the .Rprofile.

To see all currently set options, use getZentracloudOptions():

getZentracloudOptions()
#> <zentracloudOptions>
#>   ZENTRACLOUD_CACHE_DIR     : /home/<user>/.cache/R/zentracloud
#>   ZENTRACLOUD_CACHE_MAX_AGE : 7
#>   ZENTRACLOUD_CACHE_MAX_SIZE: 500
#>   ZENTRACLOUD_DOMAIN        : zentracloud.com
#>   ZENTRACLOUD_TOKEN         : <-- hidden -->

If the cache directory is filled upon loading the package, some checks will run automatically:

  • If files older than the maximum allowed age are found, they are deleted, and a message reports how many files were removed.

  • If afterwards the size of the cache directory still surpasses the maximum allowed size, a warning will be printed. Then it is up to the user to delete or move further files.
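The logic of these two checks can be sketched in base R. This is not the package's implementation, only an illustration of the idea on a throwaway directory with dummy files; the variable names and the treatment of the size limit as kB are assumptions.

```r
# Stand-in cache directory with two dummy files (not real zentracloud data)
cache_dir <- file.path(tempdir(), "cache_demo")
dir.create(cache_dir, showWarnings = FALSE)
old_file <- file.path(cache_dir, "old.parquet")
new_file <- file.path(cache_dir, "new.parquet")
writeLines("dummy", old_file)
writeLines("dummy", new_file)
# Pretend the first file was last modified 10 days ago
Sys.setFileTime(old_file, Sys.time() - 10 * 24 * 60 * 60)

max_age_days <- 7  # mirrors ZENTRACLOUD_CACHE_MAX_AGE
max_size_kb <- 500 # mirrors ZENTRACLOUD_CACHE_MAX_SIZE (assumed to be kB)

files <- list.files(cache_dir, recursive = TRUE, full.names = TRUE)
age_days <- as.numeric(difftime(Sys.time(), file.mtime(files), units = "days"))

# Step 1: delete files older than the maximum age and report the count
too_old <- files[age_days > max_age_days]
n_deleted <- sum(file.remove(too_old))
message(n_deleted, " file(s) older than ", max_age_days, " days deleted.")

# Step 2: warn if the remaining files still exceed the size limit
remaining <- list.files(cache_dir, recursive = TRUE, full.names = TRUE)
size_kb <- sum(file.size(remaining)) / 1000
if (size_kb > max_size_kb) {
  warning("Cache exceeds the maximum allowed size.")
}
```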

To manually clear the cache of files older than a certain age, use the function clearCache(). If the argument cache_dir is not provided, the function reads the directory from the options. Any path can be set, as long as the cached files follow the structure that getReadings() creates automatically (described later). The argument file_age must be an integer, so it has to be written with the L suffix (e.g. 5L). If it is not provided, the default value stored in the options is used.

clearCache(file_age = 5L)

To load everything that is currently in your cache use readCache(). This will return a nested list with the data sorted by device and sensor. If argument cache_dir is not provided, the function will use the cache directory set in the options.

cached_data = readCache()

Data

To access the API and request the data use function getReadings(). Some notes on the arguments of the function:

  • Arguments that need to be provided are the device serial number, as well as start and end datetime of the period of interest.

  • Start and end time need to be provided in the format “YYYY-MM-DD hh:mm:ss” and have to be given in the logger time zone!

  • If force_api = TRUE, the cache is bypassed and the query goes straight to the API. The results are still written to the cache.

  • If ignore_cache = TRUE, the function internally uses a temporary directory as cache during processing. No data is written to the cache directory set in the options. Be aware, though, that no data is read from the cache either, so the run time may increase.
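Since start and end time must be given in the logger time zone, it can help to convert them explicitly. The following base R sketch assumes an analyst working in Europe/Berlin and a logger located in America/Los_Angeles; both time zones are assumptions chosen for illustration.

```r
# Start of the period as seen from the analyst's own time zone (assumed)
local_start <- as.POSIXct("2022-06-01 09:00:00", tz = "Europe/Berlin")

# Express the same instant in the logger time zone (assumed),
# using the format that getReadings() expects
start_time <- format(local_start, "%Y-%m-%d %H:%M:%S", tz = "America/Los_Angeles")
start_time
#> [1] "2022-06-01 00:00:00"
```

The resulting string could then be passed as start_time.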

When the function runs, it first checks whether the queried data (or parts of it) are already in the cache. If so, it loads them from there; if not, it accesses the ZENTRA Cloud API and requests the data.

The maximum download is 2000 entries at once. That means for periods longer than around 20 days (in case of a measurement interval of 15 minutes), the response is paginated, meaning that the data has to be downloaded in chunks. Between the different chunks a downtime of 60 seconds has to be observed. As such, requesting larger amounts of data takes a while.
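The arithmetic behind these numbers can be reproduced in a few lines. This is a rough estimate that assumes one reading per measurement interval and counts only the pauses between chunks, not the download time itself; the 90-day period is an arbitrary example.

```r
max_readings <- 2000 # readings per request, as stated above
interval_min <- 15   # measurement interval in minutes
pause_sec <- 60      # pause between chunks in seconds

# Longest period that fits into a single request
days_per_request <- max_readings * interval_min / (60 * 24)
round(days_per_request, 1)
#> [1] 20.8

# Example: a 90-day period
period_days <- 90
n_readings <- period_days * 24 * 60 / interval_min # 8640 readings
n_chunks <- ceiling(n_readings / max_readings)     # 5 chunks
min_wait_sec <- (n_chunks - 1) * pause_sec         # 240 seconds of pauses
```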

The chunks are written to the cache separately to avoid memory shortages. The data is written as .parquet files, a highly efficient format with regard to both storage space and read/write speed. (More info on the format can be found in a short blog post and on the arrow GitHub page.)

Within the cache, a directory is created for the queried device, inside which the data is written partitioned by sensor, year and month. The example query below thus creates a directory tree for device 06-01185 that is partitioned in this way.

This is the structure that clearCache() needs to work reliably.

setZentracloudOptions(
  token = Sys.getenv("ZENTRACLOUD_TOKEN")
  , domain = "default"
)

zentra_data = getReadings(
  device_sn = "06-01185"
  , start_time = "2022-06-01 00:00:00"
  , end_time = "2022-06-14 23:59:00"
  , force_api = FALSE
  , ignore_cache = FALSE
)
str(zentra_data, max.level = 3, give.attr = FALSE)
#> List of 6
#>  $ ATM-410003090_port6: tibble [20 × 48] (S3: tbl_df/tbl/data.frame)
#>   ..$ timestamp_utc                         : int [1:20] 1654066800 1654067700 1654068600 1654069500 1654070400 1654071300 1654072200 1654073100 1654074000 1654074900 ...
#>   ..$ tz_offset                             : int [1:20] -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 ...
#>   ..$ datetime                              : chr [1:20] "2022-06-01 00:00:00-07:00" "2022-06-01 00:15:00-07:00" "2022-06-01 00:30:00-07:00" "2022-06-01 00:45:00-07:00" ...
#>   ..$ rh_sensor_temp.value                  : num [1:20] 12 11.8 11.4 11.3 11.2 11.2 11.2 11 10.9 10.9 ...
#>   ..$ rh_sensor_temp.error_flag             : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ rh_sensor_temp.error_description      : chr [1:20] NA NA NA NA ...
#>   ..$ precipitation.value                   : num [1:20] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ precipitation.error_flag              : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ precipitation.error_description       : chr [1:20] NA NA NA NA ...
#>   ..$ solar_radiation.value                 : num [1:20] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ solar_radiation.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ solar_radiation.error_description     : chr [1:20] NA NA NA NA ...
#>   ..$ x_axis_level.value                    : num [1:20] -0.6 -0.7 -0.7 -0.7 -0.7 -0.7 -0.7 -0.6 -0.7 -0.6 ...
#>   ..$ x_axis_level.error_flag               : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ x_axis_level.error_description        : chr [1:20] NA NA NA NA ...
#>   ..$ wind_speed.value                      : num [1:20] 2.93 1.86 2.03 2.24 2.08 2.27 1.63 1.61 1.69 2.63 ...
#>   ..$ wind_speed.error_flag                 : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ wind_speed.error_description          : chr [1:20] NA NA NA NA ...
#>   ..$ lightning_distance.value              : num [1:20] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ lightning_distance.error_flag         : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ lightning_distance.error_description  : chr [1:20] NA NA NA NA ...
#>   ..$ wind_direction.value                  : num [1:20] 105 114 107 106 98 99 101 102 95 96 ...
#>   ..$ wind_direction.error_flag             : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ wind_direction.error_description      : chr [1:20] NA NA NA NA ...
#>   ..$ y_axis_level.value                    : num [1:20] 0.8 0.8 0.8 0.8 0.8 0.9 0.9 0.8 0.8 0.8 ...
#>   ..$ y_axis_level.error_flag               : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ y_axis_level.error_description        : chr [1:20] NA NA NA NA ...
#>   ..$ vpd.value                             : num [1:20] 0.39 0.35 0.36 0.35 0.35 0.35 0.31 0.31 0.3 0.34 ...
#>   ..$ vpd.error_flag                        : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ vpd.error_description                 : chr [1:20] "" "" "" "" ...
#>   ..$ max_precip_rate.value                 : num [1:20] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ max_precip_rate.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ max_precip_rate.error_description     : chr [1:20] NA NA NA NA ...
#>   ..$ lightning_activity.value              : num [1:20] 0 0 0 0 0 0 0 0 0 0 ...
#>   ..$ lightning_activity.error_flag         : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ lightning_activity.error_description  : chr [1:20] NA NA NA NA ...
#>   ..$ atmospheric_pressure.value            : num [1:20] 93 93 93 93 93 ...
#>   ..$ atmospheric_pressure.error_flag       : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ atmospheric_pressure.error_description: chr [1:20] NA NA NA NA ...
#>   ..$ vapor_pressure.value                  : num [1:20] 1.02 1.02 1 1.01 1.01 ...
#>   ..$ vapor_pressure.error_flag             : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ vapor_pressure.error_description      : chr [1:20] NA NA NA NA ...
#>   ..$ air_temperature.value                 : num [1:20] 12.1 11.6 11.6 11.5 11.5 11.5 11.2 11.2 11.1 11.5 ...
#>   ..$ air_temperature.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ air_temperature.error_description     : chr [1:20] NA NA NA NA ...
#>   ..$ gust_speed.value                      : num [1:20] 5 2.96 3.34 4 4.08 4.75 2.75 3.14 3.96 5.45 ...
#>   ..$ gust_speed.error_flag                 : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ gust_speed.error_description          : chr [1:20] NA NA NA NA ...
#>  $ T12-0000248_port5  : tibble [20 × 12] (S3: tbl_df/tbl/data.frame)
#>   ..$ timestamp_utc                          : int [1:20] 1654066800 1654067700 1654068600 1654069500 1654070400 1654071300 1654072200 1654073100 1654074000 1654074900 ...
#>   ..$ tz_offset                              : int [1:20] -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 ...
#>   ..$ datetime                               : chr [1:20] "2022-06-01 00:00:00-07:00" "2022-06-01 00:15:00-07:00" "2022-06-01 00:30:00-07:00" "2022-06-01 00:45:00-07:00" ...
#>   ..$ saturation_extract_ec.value            : num [1:20] 2.1 2.09 2.1 2.1 2.1 ...
#>   ..$ saturation_extract_ec.error_flag       : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ saturation_extract_ec.error_description: chr [1:20] NA NA NA NA ...
#>   ..$ water_content.value                    : num [1:20] 0.361 0.361 0.361 0.361 0.361 0.361 0.361 0.361 0.361 0.361 ...
#>   ..$ water_content.error_flag               : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ water_content.error_description        : chr [1:20] NA NA NA NA ...
#>   ..$ soil_temperature.value                 : num [1:20] 10.1 10.1 10.1 10.1 10.1 10.1 10.1 10.1 10.1 10.1 ...
#>   ..$ soil_temperature.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ soil_temperature.error_description     : chr [1:20] NA NA NA NA ...
#>  $ T12-0000495_port4  : tibble [20 × 12] (S3: tbl_df/tbl/data.frame)
#>   ..$ timestamp_utc                          : int [1:20] 1654066800 1654067700 1654068600 1654069500 1654070400 1654071300 1654072200 1654073100 1654074000 1654074900 ...
#>   ..$ tz_offset                              : int [1:20] -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 ...
#>   ..$ datetime                               : chr [1:20] "2022-06-01 00:00:00-07:00" "2022-06-01 00:15:00-07:00" "2022-06-01 00:30:00-07:00" "2022-06-01 00:45:00-07:00" ...
#>   ..$ soil_temperature.value                 : num [1:20] 10.7 10.7 10.7 10.7 10.7 10.7 10.7 10.7 10.7 10.7 ...
#>   ..$ soil_temperature.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ soil_temperature.error_description     : chr [1:20] NA NA NA NA ...
#>   ..$ water_content.value                    : num [1:20] 0.348 0.348 0.348 0.348 0.348 0.348 0.348 0.348 0.348 0.348 ...
#>   ..$ water_content.error_flag               : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ water_content.error_description        : chr [1:20] NA NA NA NA ...
#>   ..$ saturation_extract_ec.value            : num [1:20] 1.86 1.86 1.86 1.86 1.86 ...
#>   ..$ saturation_extract_ec.error_flag       : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ saturation_extract_ec.error_description: chr [1:20] NA NA NA NA ...
#>  $ T12-0000499_port2  : tibble [20 × 12] (S3: tbl_df/tbl/data.frame)
#>   ..$ timestamp_utc                          : int [1:20] 1654066800 1654067700 1654068600 1654069500 1654070400 1654071300 1654072200 1654073100 1654074000 1654074900 ...
#>   ..$ tz_offset                              : int [1:20] -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 ...
#>   ..$ datetime                               : chr [1:20] "2022-06-01 00:00:00-07:00" "2022-06-01 00:15:00-07:00" "2022-06-01 00:30:00-07:00" "2022-06-01 00:45:00-07:00" ...
#>   ..$ saturation_extract_ec.value            : num [1:20] 1.47 1.47 1.47 1.47 1.47 ...
#>   ..$ saturation_extract_ec.error_flag       : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ saturation_extract_ec.error_description: chr [1:20] NA NA NA NA ...
#>   ..$ soil_temperature.value                 : num [1:20] 12.2 12.2 12.2 12.2 12.2 12.2 12.3 12.3 12.3 12.3 ...
#>   ..$ soil_temperature.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ soil_temperature.error_description     : chr [1:20] NA NA NA NA ...
#>   ..$ water_content.value                    : num [1:20] 0.334 0.334 0.334 0.334 0.334 0.334 0.334 0.334 0.334 0.334 ...
#>   ..$ water_content.error_flag               : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ water_content.error_description        : chr [1:20] NA NA NA NA ...
#>  $ T12-0000500_port1  : tibble [20 × 12] (S3: tbl_df/tbl/data.frame)
#>   ..$ timestamp_utc                          : int [1:20] 1654066800 1654067700 1654068600 1654069500 1654070400 1654071300 1654072200 1654073100 1654074000 1654074900 ...
#>   ..$ tz_offset                              : int [1:20] -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 ...
#>   ..$ datetime                               : chr [1:20] "2022-06-01 00:00:00-07:00" "2022-06-01 00:15:00-07:00" "2022-06-01 00:30:00-07:00" "2022-06-01 00:45:00-07:00" ...
#>   ..$ saturation_extract_ec.value            : num [1:20] 0.977 0.977 0.977 0.972 0.969 0.968 0.97 0.971 0.971 0.976 ...
#>   ..$ saturation_extract_ec.error_flag       : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ saturation_extract_ec.error_description: chr [1:20] NA NA NA NA ...
#>   ..$ soil_temperature.value                 : num [1:20] 13.8 13.8 13.8 13.8 13.8 13.8 13.8 13.8 13.7 13.7 ...
#>   ..$ soil_temperature.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ soil_temperature.error_description     : chr [1:20] NA NA NA NA ...
#>   ..$ water_content.value                    : num [1:20] 0.291 0.291 0.291 0.291 0.291 0.291 0.291 0.291 0.291 0.291 ...
#>   ..$ water_content.error_flag               : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ water_content.error_description        : chr [1:20] NA NA NA NA ...
#>  $ T12-0000504_port3  : tibble [20 × 12] (S3: tbl_df/tbl/data.frame)
#>   ..$ timestamp_utc                          : int [1:20] 1654066800 1654067700 1654068600 1654069500 1654070400 1654071300 1654072200 1654073100 1654074000 1654074900 ...
#>   ..$ tz_offset                              : int [1:20] -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 -25200 ...
#>   ..$ datetime                               : chr [1:20] "2022-06-01 00:00:00-07:00" "2022-06-01 00:15:00-07:00" "2022-06-01 00:30:00-07:00" "2022-06-01 00:45:00-07:00" ...
#>   ..$ saturation_extract_ec.value            : num [1:20] 1.18 1.19 1.19 1.19 1.19 ...
#>   ..$ saturation_extract_ec.error_flag       : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ saturation_extract_ec.error_description: chr [1:20] NA NA NA NA ...
#>   ..$ soil_temperature.value                 : num [1:20] 11.5 11.5 11.5 11.5 11.5 11.5 11.6 11.6 11.6 11.6 ...
#>   ..$ soil_temperature.error_flag            : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ soil_temperature.error_description     : chr [1:20] NA NA NA NA ...
#>   ..$ water_content.value                    : num [1:20] 0.328 0.328 0.328 0.328 0.328 0.328 0.328 0.328 0.328 0.328 ...
#>   ..$ water_content.error_flag               : logi [1:20] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>   ..$ water_content.error_description        : chr [1:20] NA NA NA NA ...

The data that is returned in zentra_data is a list with entries for all the sensors connected to the queried device. Each entry in turn contains a data.frame with columns for the date & time specifications and for each variable measured, as well as the corresponding error flags and descriptions.
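The columns timestamp_utc and tz_offset are plain integers (seconds), so they usually need converting before analysis. A minimal base R sketch, using the first three values from the output above instead of a live API call:

```r
# First three timestamps and offsets copied from the example output above
timestamp_utc <- c(1654066800L, 1654067700L, 1654068600L)
tz_offset <- c(-25200L, -25200L, -25200L)

# Absolute point in time (UTC)
time_utc <- as.POSIXct(timestamp_utc, origin = "1970-01-01", tz = "UTC")
format(time_utc, "%Y-%m-%d %H:%M:%S")
#> [1] "2022-06-01 07:00:00" "2022-06-01 07:15:00" "2022-06-01 07:30:00"

# Logger wall-clock time: shift by the offset, keeping UTC only as a label
time_logger <- as.POSIXct(timestamp_utc + tz_offset, origin = "1970-01-01", tz = "UTC")
format(time_logger, "%Y-%m-%d %H:%M:%S")
#> [1] "2022-06-01 00:00:00" "2022-06-01 00:15:00" "2022-06-01 00:30:00"
```

The second result matches the datetime column, which carries the same wall-clock time together with the -07:00 offset.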

Attached as attributes to the value columns are the corresponding unit and measurement precision. To access the attributes do the following:


# gives back the names of the list items, i.e. the sensor names
attributes(zentra_data)
#> $names
#> [1] "ATM-410003090_port6" "T12-0000248_port5"   "T12-0000495_port4"  
#> [4] "T12-0000499_port2"   "T12-0000500_port1"   "T12-0000504_port3"

# querying the attributes of single columns gives back the measurement unit 
# and precision
vars = c("saturation_extract_ec.value", "soil_temperature.value", "water_content.value")
vars_attr = sapply(
  vars
  , \(v) {attributes(zentra_data$`T12-0000248_port5`[[v]])}
  , simplify = FALSE
  )
str(vars_attr)
#> List of 3
#>  $ saturation_extract_ec.value:List of 2
#>   ..$ unit     : chr " mS/cm"
#>   ..$ precision: chr "3"
#>  $ soil_temperature.value     :List of 2
#>   ..$ unit     : chr " °C"
#>   ..$ precision: chr "1"
#>  $ water_content.value        :List of 2
#>   ..$ unit     : chr " m³/m³"
#>   ..$ precision: chr "3"
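One possible use of these attributes is to format values for display. The sketch below works on a mock column built with structure(), with values and attributes copied from the output above, so it runs without API access. Note that precision is stored as a character string and the unit carries a leading space.

```r
# Mock column mimicking saturation_extract_ec.value from the output above
ec <- structure(c(2.1, 2.09), unit = " mS/cm", precision = "3")

# precision is a character string, so convert it before use
digits <- as.integer(attr(ec, "precision"))

# Fixed number of decimals plus the unit (which already starts with a space)
labelled <- paste0(
  formatC(as.numeric(ec), format = "f", digits = digits)
  , attr(ec, "unit")
)
labelled
#> [1] "2.100 mS/cm" "2.090 mS/cm"
```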

NOTE:

Variable names in the returned zentra_data are taken from the API response. This should ensure compatibility with data downloaded via other methods (e.g. as csv from ZENTRA Cloud). We suggest making the names syntactically valid before continuing with data analysis (e.g. with base::make.names() or janitor::clean_names()).
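For instance, the sensor names used as list names contain hyphens, which is why they have to be quoted with backticks when accessed via $. A short sketch with base::make.names(), applied to two of the names from the example output:

```r
sensor_names <- c("ATM-410003090_port6", "T12-0000248_port5")

# Invalid characters such as "-" are replaced by "."
valid_names <- make.names(sensor_names)
valid_names
#> [1] "ATM.410003090_port6" "T12.0000248_port5"
```

With a real result, the same call could be applied as names(zentra_data) = make.names(names(zentra_data)).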

Settings

getReadings() queries the device settings internally, which is necessary to handle the timestamps accurately. The function used for this, queryDeviceSettings(), can also be called on its own.

As this function also accesses the API, a token and device serial number are necessary.

From the settings, information such as measurement intervals, location and time settings can be read. The returned object is a nested list.

set = queryDeviceSettings(
  device_sn = "06-01185"
)

# this function allows a quick view into the data structure of the list:
listviewer::jsonedit(set)