How to Use RStudio with NSSP-ESSENCE APIs

Introduction

RStudio lets you access ESSENCE data more securely, create your own R Markdown reports and Shiny applications, and do exploratory analyses not possible within ESSENCE. In this R Markdown document, we will provide an overview of the ESSENCE application programming interfaces (APIs) and explain how to access them through RStudio. We will also list the APIs available in ESSENCE and show basic examples of R code and packages so that you can start using RStudio.

An API is a structured and consistent way for one machine to exchange information with another machine. ESSENCE has APIs that allow you to programmatically access and further manipulate your data from outside the system. You may select the CSV format to export data, and in some instances, JSON is also supported. More information about the APIs can be found in ESSENCE under “More,” then “User Guide,” and then “API Documentation.” You may write the API URL syntax on your own after reading the documentation, or you can let ESSENCE create the API URL for you by clicking the “API URL” button on an ESSENCE page after completing a query.
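To make the anatomy of an API URL concrete, the following sketch splits one into its endpoint and its query parameters using base R. The URL is the time series example used later in this guide; the object names (parts, endpoint, pairs, params) are illustrative only and are not part of ESSENCE or Rnssp.

```r
# Time series API URL used later in this guide
url <- paste0(
  "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries?",
  "endDate=31Dec2022&medicalGrouping=injury&percentParam=noPercent&",
  "geographySystem=hospitaldhhsregion&datasource=va_hospdreg&",
  "detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&",
  "medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries"
)

# Everything before "?" identifies the endpoint; everything after is the query
parts <- strsplit(url, "?", fixed = TRUE)[[1]]
endpoint <- basename(parts[1])

# The query is a series of key=value pairs separated by "&"
pairs <- strsplit(strsplit(parts[2], "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
params <- data.frame(
  parameter = vapply(pairs, `[`, character(1), 1),
  value     = vapply(pairs, `[`, character(1), 2),
  stringsAsFactors = FALSE
)

endpoint
params
```

Reading a URL that ESSENCE generated in this tabular form is often the quickest way to see which parameters a query uses before editing them by hand.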

Ten ESSENCE APIs are highlighted in this guide:

Time series data table
Time series PNG image
Table builder results
Data details (line level)
Summary stats on the number of unique facilities or regions in your query results
Alert list detection tables
Facility-level data quality metrics
ESSENCE query fields
CCDD category table
Time series data table with stratified, historical alerts

RStudio: Securely Saving AMC Credentials

The first step is to create an R script or R Markdown file and load the necessary packages. Some packages might already be in your system library. If not, you can install them yourself by clicking the Packages tab in the bottom-right quadrant of the RStudio interface and then clicking Install. Here are the packages to load to get started:

library(Rnssp)
library(tidyverse)
library(httr)
library(jsonlite)
library(keyring)
library(lubridate)

To extract data from ESSENCE, you start by passing authentication information from your RStudio session to ESSENCE so that it knows what data you are allowed to see. It is bad practice to explicitly include your username and password in your code. We provide two options that handle this for you: the Rnssp package or the keyring package. Whenever possible, we encourage users to adopt the more secure method presented in Rnssp (option 1 below).

Option 1: `Rnssp` (preferred)

In June 2021, the Rnssp library was installed on the instance of RStudio Workbench hosted on the BioSense Platform. If you are using a local instance of RStudio, you can install the development version of Rnssp from GitHub by running devtools::install_github("cdcgov/Rnssp") in your console. The Rnssp GitHub repository can be accessed at https://github.com/CDCgov/Rnssp, with additional documentation and vignettes located at https://cdcgov.github.io/Rnssp/.

Rnssp provides functionality to securely save credentials and interact with ESSENCE APIs. When you run the following code, a pop-up will appear in RStudio for you to enter your AMC username and password. This will create a user profile object of the class Credentials, designed with the R6 object system that integrates classic object-oriented programming concepts into R. So that your credentials are available when an R Markdown document is rendered, save the myProfile object as an .rda or .rds file to your home directory. This only needs to be done once. The following code chunk is presented for demonstrative purposes only and does not need to be included in your actual R Markdown code.

library(Rnssp)

myProfile <- Credentials$new(
  username = askme("Enter your username: "),
  password = askme()
)

# rda option-----------------------------------------------------------------------------------------------------
save(myProfile, file = "~/myProfile.rda") 

# rds option-----------------------------------------------------------------------------------------------------
saveRDS(myProfile, "~/myProfile.rds")

myProfile can then be loaded by including either of the following lines of code in your introductory code chunks (just use one option):

# rda option-----------------------------------------------------------------------------------------------------
load("~/myProfile.rda") 

# rds option-----------------------------------------------------------------------------------------------------
myProfile <- readRDS("~/myProfile.rds")
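If a report is rendered on a machine where the profile has not been saved yet, readRDS() fails with a fairly generic error. The following is a minimal sketch of a guard around the .rds option; the helper name load_profile() is hypothetical and is not part of Rnssp.

```r
# Hypothetical helper: load a saved profile, with a clear error if it is missing
load_profile <- function(path = "~/myProfile.rds") {
  path <- path.expand(path)
  if (!file.exists(path)) {
    stop(
      "No saved profile found at ", path,
      "; create one with Credentials$new() and saveRDS() first."
    )
  }
  readRDS(path)
}

# Usage (assumes the .rds file was saved to your home directory):
# myProfile <- load_profile()
```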

Note that your username and password are fully encrypted in your user profile and are not visible when viewing or inspecting the object:

myProfile

## <NSSPCredentials>
##   Public:
##     clone: function (deep = FALSE) 
##     get_api_data: function (url, fromCSV = FALSE, ...) 
##     get_api_response: function (url) 
##     get_api_tsgraph: function (url) 
##     initialize: function (username, password) 
##   Private:
##     ..__: NSSPContainer, R6
##     ..password: NSSPContainer, R6
##     ..username: NSSPContainer, R6

The myProfile object comes with the following methods:

$get_api_response(): Retrieves requested information specified in the API URL from ESSENCE
$get_api_data(): Extracts the content (data) from the API response and parses into an R data frame
$get_api_tsgraph(): Retrieves an ESSENCE time series graph and saves as a PNG to a temporary directory

Option 2: `keyring`

Prior to the development of the Rnssp package, the keyring library was used to authenticate AMC credentials to NSSP-ESSENCE when pulling data via the APIs. While using keyring is more secure than explicitly referencing your credentials from a source file, it is less secure and less efficient than the Rnssp method. In 2021, the keyring library was updated and now requires users to specify that credentials should be saved to hidden background environment variables. NSSP applied a patch in RStudio Workbench in January 2022 so that keyring can be used as it was previously.

keyring will save your AMC credentials to background environment variables that persist for the duration of an individual R session. When you run the following line of code (with your username entered in the username quotes), a pop-up will appear in RStudio where you will need to enter your password. Note: You only need to save your credentials once per session.

key_set(service = "essence", username = "msheppardoa01")

After you enter your password in the pop-up window, be sure to “comment out” this line of code by adding a hash mark before it as shown below:

# key_set(service = "essence", username = "msheppardoa01")

In each example that follows, you will see a common pattern emerge: First, the URL must be defined as an object in your RStudio session; then the API response needs to be retrieved and its content extracted to an R data frame, using either the Rnssp $get_api_data() method or GET() from httr, before further analysis. When you use this approach, R Markdown provides an easily reproducible workflow for integrating report text with code that reads in data, manipulates data as needed, and produces analyses and visualizations that can be handed off to colleagues without documenting manual actions (point and click, etc.).

Time Series Data Table

In this example, we will show you how to pull a time series data table from ESSENCE into RStudio. Note: All examples use the limited details data sources available in NSSP-ESSENCE. In the code below, the first object created is the URL for the ESSENCE API endpoint of interest, which for this example is a national trend of the injury syndrome. The second object, api_response, authenticates your credentials so that ESSENCE knows you are permitted to access these data and then retrieves the requested data if authentication is successful. The next two objects process JSON-formatted data into an R data frame. By default, time series APIs pull in JSON-formatted data. The fromJSON() function converts these data to a list of length 2, where the second element of the list is a data frame that can be extracted with the pluck() function from the purrr library. Lastly, the glimpse() function from dplyr provides a quick sense of every variable in the data frame.

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries?endDate=31Dec2022&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries"

# Rnssp option---------------------------------------------------------------------------------------------------
api_response <- myProfile$get_api_response(url)
api_response_json <- content(api_response, as = "text")

## No encoding supplied: defaulting to UTF-8.

api_data <- fromJSON(api_response_json) %>%
  pluck("timeSeriesData")

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")

## No encoding supplied: defaulting to UTF-8.

api_data <- fromJSON(api_response_json) %>%
  pluck("timeSeriesData")

glimpse(api_data)

## Rows: 90
## Columns: 8
## $ date     <chr> "2022-10-03", "2022-10-04", "2022-10-05", "2022-10-06", "2022…
## $ count    <dbl> 51779, 49719, 49118, 49686, 49990, 46789, 45608, 51096, 51375…
## $ expected <chr> "51952.7", "50393.255", "48662.281", "48473.657", "48828.893"…
## $ levels   <chr> "0.576", "0.769", "0.305", "0.091", "0.095", "0.543", "0.351"…
## $ colorID  <int> 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ color    <chr> "blue", "blue", "blue", "blue", "blue", "blue", "blue", "red"…
## $ altText  <chr> "Data: Date: 03Oct22, Level: 0.576, Count: 51779, Expected: 5…
## $ details  <chr> "/nssp_essence/api/dataDetails?percentParam=noPercent&datasou…
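Note that in the output above, date, expected, and levels arrive as character columns, so they usually need converting before plotting or computing on them. The following is a minimal sketch using base R; the small data frame is a stand-in built from the first three rows of the glimpse() output rather than a live pull, and api_data_clean is an illustrative name.

```r
# Stand-in for the first three rows of api_data, copied from the glimpse() output
api_data <- data.frame(
  date     = c("2022-10-03", "2022-10-04", "2022-10-05"),
  count    = c(51779, 49719, 49118),
  expected = c("51952.7", "50393.255", "48662.281"),
  levels   = c("0.576", "0.769", "0.305"),
  color    = c("blue", "blue", "blue"),
  stringsAsFactors = FALSE
)

# Convert the character columns to Date and numeric types
api_data_clean <- transform(
  api_data,
  date     = as.Date(date),
  expected = as.numeric(expected),
  levels   = as.numeric(levels)
)

str(api_data_clean)
```

The same conversions can be written with mutate() from dplyr if you prefer to stay within the tidyverse.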

IMPORTANT!

Alternatively, with Rnssp we can pull these data with two lines of code by using the $get_api_data() method, which implicitly pulls and extracts data using the steps outlined in the example above. Occasionally there are scenarios in which retrieving the API response first is beneficial, for example, when the API response status code needs to be inspected because data are not pulling as expected. In general, we recommend the more direct approach of using the $get_api_data() method shown below and recommend $get_api_response() only for debugging purposes. The remaining Rnssp examples in this guide do not use $get_api_response().

api_data <- myProfile$get_api_data(url) %>%
  pluck("timeSeriesData")

## No encoding supplied: defaulting to UTF-8.
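When a pull does not behave as expected, inspecting the response first usually narrows down the cause. The sketch below uses status_code() from httr on the object returned by $get_api_response(). The helper name describe_status() and its messages are hypothetical, and the interpretation of each code follows general HTTP conventions rather than ESSENCE-specific documentation.

```r
library(httr)

# Hypothetical helper: turn an API response's status code into a short hint
describe_status <- function(api_response) {
  code <- status_code(api_response)
  if (code == 200) {
    "OK: request succeeded"
  } else if (code %in% c(401, 403)) {
    "Authentication or permission problem: check your saved credentials"
  } else if (code >= 500) {
    "Server-side error: try again later or with a smaller query"
  } else {
    paste("Unexpected status code:", code)
  }
}

# Usage (requires a valid profile, so it is left commented out here):
# api_response <- myProfile$get_api_response(url)
# describe_status(api_response)
```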

Time Series Graph from ESSENCE

This example shows how to retrieve the ESSENCE graph instead of the underlying data for the graph, as shown previously. Here, we use the same national injury syndrome trend as before, but notice that the URL now includes “…api/timeSeries/graph?…”. You can add a title or axis labels by adding other parameters to the URL: “&graphTitle=Injury%20Syndrome&xAxisLabel=Date&yAxisLabel=Count”.

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=31Dec2022&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count"

# Rnssp option---------------------------------------------------------------------------------------------------
api_png <- myProfile$get_api_tsgraph(url)

knitr::include_graphics(api_png$tsgraph)

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])),
                    write_disk("timeseries1.png", overwrite = TRUE))

knitr::include_graphics("timeseries1.png")
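Titles and labels must be URL-encoded (for example, spaces become %20) before they are appended to the graph URL. The following sketch does this with URLencode() from base R; the helper name add_graph_labels() is hypothetical, while the parameter names graphTitle, xAxisLabel, and yAxisLabel are the ones used in the URL above.

```r
# Hypothetical helper: append an encoded title and axis labels to a graph URL
add_graph_labels <- function(url, title, x_label, y_label) {
  enc <- function(x) utils::URLencode(x, reserved = TRUE)
  paste0(
    url,
    "&graphTitle=", enc(title),
    "&xAxisLabel=", enc(x_label),
    "&yAxisLabel=", enc(y_label)
  )
}

# Time series graph URL from this example, without any label parameters
graph_url <- paste0(
  "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?",
  "endDate=31Dec2022&medicalGrouping=injury&percentParam=noPercent&",
  "geographySystem=hospitaldhhsregion&datasource=va_hospdreg&",
  "detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&",
  "medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries"
)

labeled_url <- add_graph_labels(
  graph_url,
  title   = "National - Injury Syndrome Daily Counts",
  x_label = "Date",
  y_label = "Count"
)

labeled_url
```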

Table Builder Results

With the table builder API, ESSENCE does the heavy lifting by summarizing your query and presenting results in tabular format where you can define rows, nested rows, and column variables for output. If the query is supported in the ESSENCE query manager, you should be able to use table builder to summarize the output, which is usually more efficient than manipulating large amounts of line-level data yourself. In this example, we use the CDC Opioid Overdose v3 Chief Complaint Discharge Diagnosis (CCDD) category and create a table of counts per month by U.S. Department of Health & Human Services (HHS) region. Visits are limited to emergency department visits by selecting Has been Emergency = “Yes”. Output formats for table builder results are CSV and JSON. The CSV option will pull in data that matches the tabular format seen in the ESSENCE interface, whereas the JSON option will pull in data that is transformed to a long, pivoted format. The latter is recommended for circumventing initial data transformations into a long format and is compatible with functions/libraries based on tidyverse principles.

Example - CSV

When the CSV option is used, you will need to specify this by using the fromCSV argument as shown below. By default, $get_api_data() assumes that the JSON option was selected.

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022&ccddCategory=cdc%20opioid%20overdose%20v3&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=nodetectordetector&startDate=03Oct2022&ageNCHS=11-14&ageNCHS=15-24&ageNCHS=25-34&ageNCHS=35-44&ageNCHS=45-54&ageNCHS=55-64&ageNCHS=65-74&ageNCHS=75-84&ageNCHS=85-1000&ageNCHS=unknown&timeResolution=monthly&hasBeenE=1&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TableBuilder&rowFields=timeResolution&rowFields=geographyhospitaldhhsregion&columnField=ageNCHS"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)

## Rows: 33 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (2): timeResolution, geographyhospitaldhhsregion
## dbl (10): 11-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+, Unknown
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

# content() has no `by` argument; request the raw bytes and let read_csv() parse them
api_response_csv <- content(api_response, as = "raw")
api_data <- read_csv(api_response_csv)

## Rows: 33 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (2): timeResolution, geographyhospitaldhhsregion
## dbl (10): 11-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+, Unknown
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

glimpse(api_data)

## Rows: 33
## Columns: 12
## $ timeResolution              <chr> "2022-10", "2022-10", "2022-10", "2022-10"…
## $ geographyhospitaldhhsregion <chr> "OTHER_REGION", "Region 1", "Region 10", "…
## $ `11-14`                     <dbl> 0, 1, 4, 2, 3, 5, 5, 5, 2, 1, 4, 0, 3, 4, …
## $ `15-24`                     <dbl> 0, 108, 132, 89, 128, 476, 246, 160, 64, 7…
## $ `25-34`                     <dbl> 0, 407, 375, 382, 551, 1551, 821, 319, 68,…
## $ `35-44`                     <dbl> 0, 471, 281, 402, 502, 1532, 804, 322, 45,…
## $ `45-54`                     <dbl> 0, 311, 172, 349, 375, 973, 625, 181, 27, …
## $ `55-64`                     <dbl> 0, 279, 182, 395, 420, 918, 692, 146, 29, …
## $ `65-74`                     <dbl> 0, 126, 132, 168, 180, 540, 357, 111, 20, …
## $ `75-84`                     <dbl> 0, 18, 57, 29, 41, 232, 67, 48, 6, 27, 45,…
## $ `85+`                       <dbl> 0, 8, 11, 13, 17, 81, 20, 14, 4, 4, 17, 0,…
## $ Unknown                     <dbl> 0, 39, 13, 28, 50, 49, 27, 7, 1, 2, 5, 0, …

The following example demonstrates the necessary data transformations to achieve the long format that is output from the JSON option by default.

api_data_long <- api_data %>%
  pivot_longer(cols = -c(timeResolution, geographyhospitaldhhsregion), names_to = "ageNCHS", values_to = "count") %>%
  filter(geographyhospitaldhhsregion != "OTHER_REGION")

api_data_long

## # A tibble: 300 × 4
##    timeResolution geographyhospitaldhhsregion ageNCHS count
##    <chr>          <chr>                       <chr>   <dbl>
##  1 2022-10        Region 1                    11-14       1
##  2 2022-10        Region 1                    15-24     108
##  3 2022-10        Region 1                    25-34     407
##  4 2022-10        Region 1                    35-44     471
##  5 2022-10        Region 1                    45-54     311
##  6 2022-10        Region 1                    55-64     279
##  7 2022-10        Region 1                    65-74     126
##  8 2022-10        Region 1                    75-84      18
##  9 2022-10        Region 1                    85+         8
## 10 2022-10        Region 1                    Unknown    39
## # … with 290 more rows

Example - JSON

If the JSON option is selected, the only difference from the API URL for the CSV option is that the string “/csv” is omitted after “api/tableBuilder”.

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder?endDate=31Dec2022&ccddCategory=cdc%20opioid%20overdose%20v3&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=nodetectordetector&startDate=03Oct2022&ageNCHS=11-14&ageNCHS=15-24&ageNCHS=25-34&ageNCHS=35-44&ageNCHS=45-54&ageNCHS=55-64&ageNCHS=65-74&ageNCHS=75-84&ageNCHS=85-1000&ageNCHS=unknown&timeResolution=monthly&hasBeenE=1&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TableBuilder&rowFields=timeResolution&rowFields=geographyhospitaldhhsregion&columnField=ageNCHS"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url)

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
  filter(geographyhospitaldhhsregion != "OTHER_REGION")

glimpse(api_data)

## Rows: 300
## Columns: 4
## $ timeResolution              <chr> "2022-10", "2022-10", "2022-10", "2022-10"…
## $ geographyhospitaldhhsregion <chr> "Region 1", "Region 1", "Region 1", "Regio…
## $ ageNCHS                     <chr> "11-14", "15-24", "25-34", "35-44", "45-54…
## $ count                       <dbl> 1, 108, 407, 471, 311, 279, 126, 18, 8, 39…

You may have noticed that the table builder in the ESSENCE user interface is limited to creating tables of up to 30,000 cells. A table this large should suffice in most cases. Keep in mind, however, that the table builder API does not impose a limit on the output table size. You can also write the API URL yourself instead of having ESSENCE create it for you with the “API URLs” button. Familiarize yourself with the structure of the API URL, and then add parameters that follow that structure. To help with this, look at some examples where ESSENCE has created API URLs for you using the available buttons.
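As one way to write an API URL by hand, the sketch below assembles the table builder CSV URL from this section out of a named list of parameters, repeating a key once per value where a parameter takes several values (ageNCHS, rowFields). The helper name build_essence_url() is hypothetical; the parameter names and values are copied from the example URL, not from the API documentation.

```r
# Hypothetical helper: assemble an ESSENCE API URL from a named list of
# parameters; a vector of values becomes one key=value pair per value
build_essence_url <- function(endpoint, params) {
  base <- "https://essence.syndromicsurveillance.org/nssp_essence/api/"
  encode <- function(x) {
    vapply(as.character(x), utils::URLencode, character(1),
           reserved = TRUE, USE.NAMES = FALSE)
  }
  pairs <- unlist(lapply(names(params), function(key) {
    paste0(key, "=", encode(params[[key]]))
  }))
  paste0(base, endpoint, "?", paste(pairs, collapse = "&"))
}

url <- build_essence_url(
  endpoint = "tableBuilder/csv",
  params = list(
    endDate               = "31Dec2022",
    ccddCategory          = "cdc opioid overdose v3",
    percentParam          = "noPercent",
    geographySystem       = "hospitaldhhsregion",
    datasource            = "va_hospdreg",
    detector              = "nodetectordetector",
    startDate             = "03Oct2022",
    ageNCHS               = c("11-14", "15-24", "25-34", "35-44", "45-54",
                              "55-64", "65-74", "75-84", "85-1000", "unknown"),
    timeResolution        = "monthly",
    hasBeenE              = 1,
    medicalGroupingSystem = "essencesyndromes",
    userId                = 455,
    aqtTarget             = "TableBuilder",
    rowFields             = c("timeResolution", "geographyhospitaldhhsregion"),
    columnField           = "ageNCHS"
  )
)

url
```

Keeping the parameters in a list makes it easier to change a date range or add an age group without editing a long string, and passing "tableBuilder" as the endpoint yields the JSON form of the same query.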

Data Details (line-level limited details)

Sometimes you just need line-level data, and this example describes how to extract those data from ESSENCE. This method can create a very large data set quickly. As with any query that has the potential to create large data sets, first test the query on a small amount of data. Consider making multiple calls over smaller time ranges and then combining the separate data frames to create a final data set. With this API, you can specify the variables to include and whether you want a data set with raw or reference values. Downloads with reference values take longer to stream into RStudio because ESSENCE must create those reference values. Output formats for data details are CSV and JSON.

Example 1 - CSV

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails/csv?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=30Dec2022&endDate=31Dec2022"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)

## Rows: 5005 Columns: 29
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (20): Date, Category_flat, SubCategory_flat, Patient_Class, HospitalDHH...
## dbl   (8): HasBeenE, HasBeenI, HasBeenO, DDAvailable, DDInformative, CCAvail...
## dttm  (1): FirstDateTimeAdded
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

# content() has no `by` argument; request the raw bytes and let read_csv() parse them
api_data <- content(api_response, as = "raw") %>%
  read_csv()

## Rows: 5005 Columns: 29
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (20): Date, Category_flat, SubCategory_flat, Patient_Class, HospitalDHH...
## dbl   (8): HasBeenE, HasBeenI, HasBeenO, DDAvailable, DDInformative, CCAvail...
## dttm  (1): FirstDateTimeAdded
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

glimpse(api_data)

## Rows: 5,005
## Columns: 29
## $ Date                       <chr> "12/31/2022", "12/31/2022", "12/31/2022", "…
## $ Category_flat              <chr> ";Injury;", ";Neuro;Injury;", ";Injury;", "…
## $ SubCategory_flat           <chr> ";CutOrPierce;", ";Seizure;Fall;", ";CutOrP…
## $ Patient_Class              <chr> "E", "E", "E", "E", "E", "E", "E", "E", "E"…
## $ HospitalDHHSRegion         <chr> "Region I", "Region I", "Region I", "Region…
## $ dhhsregion                 <chr> "Region I", "Region I", "Region I", "Region…
## $ AgeGroup                   <chr> "18-44", "18-44", "05-17", "65-1000", "65-1…
## $ Sex                        <chr> "F", "F", "M", "F", "F", "F", "F", "M", "F"…
## $ Race_flat                  <chr> ";2106-3;", ";2106-3;", ";2106-3;", ";2106-…
## $ Ethnicity_flat             <chr> ";2186-5;", ";2186-5;", ";2186-5;", ";2186-…
## $ DispositionCategory        <chr> "DISCHARGED", "none", "none", "none", "none…
## $ ICD_CCSR_Desc_Flat         <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_CCSR_Flat              <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Desc_Flat      <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Flat           <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Desc_Flat      <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Flat           <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ AdmissionTypeCategory      <chr> "NR", "NR", "NR", "NR", "NR", "NR", "NR", "…
## $ HasBeenE                   <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ HasBeenI                   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ HasBeenO                   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ DDAvailable                <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1…
## $ DDInformative              <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1…
## $ CCAvailable                <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ CCInformative              <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ FirstDateTimeAdded         <dttm> 2022-12-31 23:50:29, 2022-12-31 15:11:34, …
## $ HasBeenAdmitted            <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ CRace_CEth_Combined_Broad  <chr> "White and non-Hispanic", "White and non-Hi…
## $ CRace_CEth_Combined_Narrow <chr> "White and non-Hispanic", "White and non-Hi…
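Several of the fields above (those ending in _flat) hold multiple values in one semicolon-delimited string, such as ";Neuro;Injury;". The following is a minimal base R sketch for splitting them; the helper name split_flat() is hypothetical, and the input values are copied from the Category_flat column in the glimpse() output.

```r
# Values copied from the Category_flat column in the glimpse() output
category_flat <- c(";Injury;", ";Neuro;Injury;", ";Injury;")

# Hypothetical helper: split each delimited string and drop the empty pieces
split_flat <- function(x) {
  lapply(strsplit(x, ";", fixed = TRUE), function(parts) parts[nzchar(parts)])
}

categories <- split_flat(category_flat)

# Flag visits that carry a given category, regardless of its position
has_neuro <- vapply(categories, function(cats) "Neuro" %in% cats, logical(1))

categories
has_neuro
```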

Example 2 - CSV limited to specified fields

To reduce the file size, you can request only the variables you need by adding parameters to the URL, for example, “&field=Date&field=HospitalDHHSRegion&field=AgeGroup&field=Sex” as shown below. In this example, only the date, HHS region, age group, and patient sex fields are included. Limiting a pull to a subset of desired fields is much more efficient when pulling line-level data from the full details data sources, which include 201 fields as of January 2022.

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails/csv?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=30Dec2022&endDate=31Dec2022&field=Date&field=HospitalDHHSRegion&field=AgeGroup&field=Sex"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)

## Rows: 5005 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Date, HospitalDHHSRegion, AgeGroup, Sex
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

# content() has no `by` argument; request the raw bytes and let read_csv() parse them
api_data <- content(api_response, as = "raw") %>%
  read_csv()

## Rows: 5005 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Date, HospitalDHHSRegion, AgeGroup, Sex
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

glimpse(api_data)

## Rows: 5,005
## Columns: 4
## $ Date               <chr> "12/31/2022", "12/31/2022", "12/31/2022", "12/31/20…
## $ HospitalDHHSRegion <chr> "Region I", "Region I", "Region I", "Region I", "Re…
## $ AgeGroup           <chr> "00-04", "18-44", "18-44", "18-44", "45-64", "05-17…
## $ Sex                <chr> "M", "U", "F", "M", "F", "F", "U", "F", "M", "U", "…
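Rather than typing each field parameter by hand, the field list can be kept as a character vector and appended to the base URL. This is a small sketch under that assumption; the helper name add_fields() is hypothetical, and the base URL is the one from Example 1.

```r
# Hypothetical helper: append one "&field=" parameter per requested column
add_fields <- function(url, fields) {
  paste0(url, paste0("&field=", fields, collapse = ""))
}

# Data details CSV URL from Example 1
url <- paste0(
  "https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails/csv?",
  "medicalGrouping=injury&geography=region%20i&percentParam=noPercent&",
  "geographySystem=hospitaldhhsregion&datasource=va_hospdreg&",
  "detector=probrepswitch&timeResolution=daily&",
  "medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&",
  "startDate=30Dec2022&endDate=31Dec2022"
)

url_limited <- add_fields(url, c("Date", "HospitalDHHSRegion", "AgeGroup", "Sex"))

url_limited
```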

Example 3 - JSON

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=30Dec2022&endDate=31Dec2022"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url)

## No encoding supplied: defaulting to UTF-8.

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")

## No encoding supplied: defaulting to UTF-8.

api_data <- fromJSON(api_response_json) %>%
  pluck("dataDetails")

glimpse(api_data)

## Rows: 5,005
## Columns: 29
## $ Date                       <chr> "12/31/2022", "12/31/2022", "12/31/2022", "…
## $ Category_flat              <chr> ";Injury;", ";Neuro;Injury;", ";Injury;", "…
## $ SubCategory_flat           <chr> ";CutOrPierce;", ";Seizure;Fall;", ";CutOrP…
## $ Patient_Class              <chr> "E", "E", "E", "E", "E", "E", "E", "E", "E"…
## $ HospitalDHHSRegion         <chr> "Region I", "Region I", "Region I", "Region…
## $ dhhsregion                 <chr> "Region I", "Region I", "Region I", "Region…
## $ AgeGroup                   <chr> "18-44", "18-44", "05-17", "65-1000", "65-1…
## $ Sex                        <chr> "F", "F", "M", "F", "F", "F", "F", "M", "F"…
## $ Race_flat                  <chr> ";2106-3;", ";2106-3;", ";2106-3;", ";2106-…
## $ Ethnicity_flat             <chr> ";2186-5;", ";2186-5;", ";2186-5;", ";2186-…
## $ DispositionCategory        <chr> "DISCHARGED", "none", "none", "none", "none…
## $ ICD_CCSR_Desc_Flat         <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_CCSR_Flat              <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Desc_Flat      <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Flat           <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Desc_Flat      <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Flat           <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ AdmissionTypeCategory      <chr> "NR", "NR", "NR", "NR", "NR", "NR", "NR", "…
## $ HasBeenE                   <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1"…
## $ HasBeenI                   <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ HasBeenO                   <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ DDAvailable                <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ DDInformative              <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ CCAvailable                <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1"…
## $ CCInformative              <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1"…
## $ FirstDateTimeAdded         <chr> "2022-12-31 23:50:29.557", "2022-12-31 15:1…
## $ HasBeenAdmitted            <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ CRace_CEth_Combined_Broad  <chr> "White and non-Hispanic", "White and non-Hi…
## $ CRace_CEth_Combined_Narrow <chr> "White and non-Hispanic", "White and non-Hi…

Example 4 - Time Chunked Pulls with JSON Option

If you are pulling line-level data details over a longer range of time (e.g., months or a year), it is best to split your API data pulls into time chunks to reduce the strain put on the ESSENCE system. Please keep in mind that heavy data pulls can affect all ESSENCE users. The following example is meant for demonstrative purposes and only pulls data for 4 days, one day per API call. It can be used as a template to pull data for longer periods of time.

api_data <- data.frame(
  date = seq.Date(from = as.Date("2022-12-28"), to = as.Date("2022-12-31"), by = "1 day")
) %>%
  mutate(
    date = format(date, "%d%b%Y"),
    url = paste0("https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=", date, "&endDate=", date),
    chunk = row_number()
  ) %>%
  nest(data = -chunk) %>%
  mutate(
    pull = map(.x = data, .f = function (.x) {
      
      myProfile$get_api_data(.x$url) %>%
        pluck("dataDetails")
      
    })
  ) %>%
  select(-data) %>%
  unnest(pull)

## No encoding supplied: defaulting to UTF-8.
## No encoding supplied: defaulting to UTF-8.
## No encoding supplied: defaulting to UTF-8.
## No encoding supplied: defaulting to UTF-8.

nrow(api_data)

## [1] 10216
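
To adapt the template to a longer period, the chunk boundaries can be generated first and then inserted into the startDate and endDate parameters. The sketch below shows one way to do this in base R. The helper name make_date_chunks() and the 31-day chunk size are illustrative assumptions and are not part of ESSENCE or Rnssp.

```r
# Hypothetical helper (not part of Rnssp): split a date range into
# consecutive chunks of at most `days_per_chunk` days.
make_date_chunks <- function(start, end, days_per_chunk = 7) {
  start <- as.Date(start)
  end <- as.Date(end)
  chunk_starts <- seq.Date(from = start, to = end, by = paste(days_per_chunk, "days"))
  # Each chunk ends the day before the next one starts, capped at `end`
  chunk_ends <- pmin(chunk_starts + days_per_chunk - 1, end)
  data.frame(
    chunk = seq_along(chunk_starts),
    start = chunk_starts,
    end = chunk_ends,
    # ddMMMyyyy strings as used in the API URLs above
    # (month abbreviations depend on the R session locale)
    start_param = format(chunk_starts, "%d%b%Y"),
    end_param = format(chunk_ends, "%d%b%Y")
  )
}

chunks <- make_date_chunks("2022-10-01", "2022-12-31", days_per_chunk = 31)
chunks
```

Each row's start_param and end_param values could then be pasted into the startDate and endDate parameters in place of the single date used in the template.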

Summary Stats

The Summary Stats ESSENCE API counts the regions (or “counties”) and facilities in your query at whatever time resolution you define (daily, weekly, monthly, quarterly, or yearly). Unlike the other APIs, it is only available on full details data sources (the only data sources that expose this level of information). This API is particularly useful for understanding the number of hospitals with results for your query.

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/summaryData?endDate=31Dec2022&medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hosp&detector=probrepswitch&startDate=28Dec2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
  pluck("summaryData")

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
  pluck("summaryData")

glimpse(api_data)

## Rows: 4
## Columns: 5
## $ date          <chr> "28Dec22", "29Dec22", "30Dec22", "31Dec22"
## $ HospitalState <dbl> 6, 6, 6, 6
## $ State         <dbl> 26, 24, 26, 20
## $ Region        <dbl> 129, 124, 131, 115
## $ Hospital      <dbl> 211, 208, 213, 201
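
Once the summary table is in R, it can be summarized like any other data frame. The sketch below rebuilds three columns from the output above by hand, so it runs without an ESSENCE connection, and then computes the average and range of reporting hospitals per day. The object names are illustrative, and the date parsing assumes an English locale for month abbreviations.

```r
# Values copied from the glimpse() output above
summary_stats <- data.frame(
  date = c("28Dec22", "29Dec22", "30Dec22", "31Dec22"),
  Region = c(129, 124, 131, 115),
  Hospital = c(211, 208, 213, 201)
)

# Parse the ddMMMyy strings into Date objects
# (matching %b assumes English month abbreviations)
summary_stats$date <- as.Date(summary_stats$date, format = "%d%b%y")

# Average and range of hospitals with query results per day
mean_hospitals <- mean(summary_stats$Hospital)
range_hospitals <- range(summary_stats$Hospital)

mean_hospitals
range_hospitals
```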

Alert List Detection Table

The Alert List API gives you programmatic access to the alert list tables in the ESSENCE user interface. Temporal alert tables based on syndrome definitions are available at the region (or, “county”) and county Federal Information Processing Standards (FIPS) code approximation levels.

The results in these tables are updated a few times daily. Alert lists based on patient location include:

Patient Region - Syndrome: “Region/Syndrome”
Patient FIPS Approximation - Syndrome: “FIPS/Syndrome”
Patient Region - Subsyndrome: “Region/SubSyndrome”
Patient FIPS Approximation - Subsyndrome: “FIPS/SubSyndrome”
Patient Region - CCDD: “Region/CCDD”
Patient FIPS Approximation - CCDD: “FIPS/CCDD”

Alert lists based on facility location include:

Facility Region - Syndrome: “Hospital/Syndrome”
Facility FIPS Approximation - Syndrome: “FacilityFIPS/Syndrome”
Facility FIPS Approximation - Subsyndrome: “FacilityFIPS/SubSyndrome”
Facility FIPS Approximation - CCDD: “FacilityFIPS/CCDD”

Additionally, term-based alerts became available in ESSENCE2 as of July 2021. Examples of pulling word alerts using the Alert List API are presented in the New Functionality section. Most alert lists retain alerts only from the last 30 days. If you request historical data outside this time range, an empty (NULL) data frame will be returned. Note: All users have access to patient and hospital region-level alerts. You will only be able to drill down to line-level data details for alerts occurring in regions for which you have data access.
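
Because a request outside the retained window can come back empty, it may help to check the result before processing it further. The sketch below uses a hypothetical helper, has_alerts(), and assumes that an empty result arrives as NULL or as an empty list rather than as a data frame.

```r
# Hypothetical helper: TRUE only when an API result is a data frame with rows
has_alerts <- function(x) {
  is.data.frame(x) && nrow(x) > 0
}

has_alerts(NULL)                                   # empty pull
has_alerts(list())                                 # empty pull
has_alerts(data.frame(region = "GA_Lamar",
                      syndrome = "Injury"))        # usable result
```

A check like this can guard calls such as glimpse() or filter() in a report that is knit on a schedule.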

Patient Region

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/regionSyndromeAlerts?end_date=31Dec2022&start_date=28Dec2022"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
  pluck("regionSyndromeAlerts")

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
  pluck("regionSyndromeAlerts")

glimpse(api_data)

## Rows: 34,094
## Columns: 12
## $ date                <chr> "2022-12-28", "2022-12-29", "2022-12-30", "2022-12…
## $ datasource          <chr> "va_er", "va_er", "va_er", "va_er", "va_er", "va_e…
## $ age                 <chr> "all", "all", "all", "05-17", "18-44", "18-44", "1…
## $ sex                 <chr> "all", "all", "all", "all", "all", "all", "all", "…
## $ detector            <chr> "probrepswitch", "probrepswitch", "probrepswitch",…
## $ level               <dbl> 1.446976e-02, 4.342886e-02, 2.659571e-02, 4.398468…
## $ count               <int> 8, 4, 8, 2, 2, 2, 3, 3, 3, 2, 3, 3, 2, 3, 1, 1, 1,…
## $ expected            <dbl> 3.00000000, 1.60714286, 3.50000000, 0.53571429, 0.…
## $ region              <chr> "GA_Lamar", "GA_Lamar", "GA_Lamar", "GA_Lanier", "…
## $ syndrome            <chr> "Injury", "Neuro", "Resp", "Resp", "GI", "Neuro", …
## $ timeResolution      <chr> "daily", "daily", "daily", "daily", "daily", "dail…
## $ `observed/expected` <dbl> 2.666667, 2.488889, 2.285714, 3.733333, 2.074074, …
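
In the rows shown above, the observed/expected column matches count divided by expected. The sketch below checks this for the first three rows, which are retyped by hand so the example runs offline, and then sorts by the ratio. This is an observation from the displayed values rather than a documented definition, and the object names are illustrative.

```r
# First three rows of the output above
alerts <- data.frame(
  region = c("GA_Lamar", "GA_Lamar", "GA_Lamar"),
  syndrome = c("Injury", "Neuro", "Resp"),
  count = c(8, 4, 8),
  expected = c(3.00000000, 1.60714286, 3.50000000),
  reported_ratio = c(2.666667, 2.488889, 2.285714)
)

# Recompute the ratio from the count and expected columns
alerts$ratio <- alerts$count / alerts$expected

# Largest excess over the expected value first
alerts <- alerts[order(-alerts$ratio), ]
alerts
```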

Hospital Region

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/hospitalSyndromeAlerts?end_date=31Dec2022&start_date=28Dec2022"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
  pluck("hospitalSyndromeAlerts")

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
  pluck("hospitalSyndromeAlerts")

glimpse(api_data)

## Rows: 55,162
## Columns: 13
## $ date             <chr> "2022-12-30", "2022-12-31", "2022-12-30", "2022-12-30…
## $ datasource       <chr> "va_hosp", "va_hosp", "va_hosp", "va_hosp", "va_hosp"…
## $ age              <chr> "all", "00-04", "18-44", "45-64", "45-64", "65-1000",…
## $ sex              <chr> "all", "all", "all", "all", "all", "all", "all", "all…
## $ detector         <chr> "probrepswitch", "probrepswitch", "probrepswitch", "p…
## $ level            <dbl> 0.0283797715, 0.0138202309, 0.0183616856, 0.011742648…
## $ count            <int> 1, 2, 6, 5, 3, 3, 2, 1, 1, 20, 8, 9, 18, 19, 2, 9, 2,…
## $ expected         <dbl> 0.1071429, 0.2857143, 2.3928571, 1.9285714, 1.8928571…
## $ hospitalName     <chr> "MA-Brockton Hospital", "MA-Cape Cod Falmouth Hospita…
## $ regionOfHospital <chr> "MA_Plymouth", "MA_Barnstable", "MA_Barnstable", "MA_…
## $ syndrome         <chr> "Hemr_Ill", "Injury", "Resp", "Injury", "Injury", "Fe…
## $ hospital         <chr> "6136", "6139", "6139", "6139", "6139", "6139", "6139…
## $ timeResolution   <chr> "daily", "daily", "daily", "daily", "daily", "daily",…

New Functionality

ESSENCE Data Quality Tables (Added 2022)

Tables containing facility-level data quality metrics are located under the Data Quality tab in the ESSENCE user interface. The coefficient of variation (CoV) measures the overall volatility of weekly facility volume over a specified time period (e.g., 1 year to date or 2 years to date). The CoV is calculated by dividing the standard deviation of weekly encounters by their mean over this period and multiplying by 100 to report it as a percentage. Small CoV values (e.g., < 35%) represent facilities with consistent overall volume and reporting, while high CoV values represent facilities with volatile weekly volume, which is potentially attributable to facility onboarding during the specified time period or to data dropouts caused by data quality or reporting issues. As of December 2022, NSSP-ESSENCE includes the following data quality CoV filters:

Data Quality CoV Current Year to Date
Data Quality CoV Last Year to Date
Data Quality CoV Last 2 Years to Date
Data Quality CoV Last 3 Years to Date
Data Quality CoV Last 4 Years to Date
Data Quality CoV Last 5 Years to Date

There are 2 options for each of these filters: 1) general CoV filters for all encounters; 2) HasBeenE filters for emergency department encounters. Users can select operators associated with the defined CoV threshold within the ESSENCE user interface. These operators include “Equal”, “Does Not Equal”, “Less Than”, “Less Than or Equal”, “Greater Than”, “Greater Than or Equal”, “Between”, or “Is Null”. The “Less Than” or “Less Than or Equal” operators are most commonly used with CoV thresholds between 30% and 40%.
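
To make the CoV calculation and threshold concrete, the sketch below computes it for two made-up facilities and applies a “Less Than or Equal” 40% filter. The weekly counts are invented for illustration, and the use of the sample standard deviation (R's sd()) is an assumption; ESSENCE may compute the statistic differently.

```r
# Made-up weekly encounter counts for two hypothetical facilities
weekly_visits <- data.frame(
  facility = rep(c("Facility A", "Facility B"), each = 6),
  visits = c(100, 104, 98, 101, 97, 100,   # steady reporting
             10, 120, 0, 95, 5, 130)       # volatile reporting
)

# CoV = standard deviation / mean * 100, computed per facility
cov_by_facility <- aggregate(
  visits ~ facility,
  data = weekly_visits,
  FUN = function(x) 100 * sd(x) / mean(x)
)
names(cov_by_facility)[2] <- "cov_percent"

# Equivalent of a "Less Than or Equal" filter with a 40% threshold
cov_by_facility$passes_threshold <- cov_by_facility$cov_percent <= 40
cov_by_facility
```

In this example, the steady facility has a CoV of about 2.4% and passes the filter, while the volatile facility has a CoV above 100% and is excluded.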

In 2022, a new API was added to allow ESSENCE users to pull CoV statistics for facilities reporting data to NSSP-ESSENCE. This API can pull information for all facilities or be limited to facilities within a specific site. The first API pull below retrieves data for all facilities, while the second limits the results to facilities in Florida by specifying the appropriate siteid value at the end of the API URL.

# All NSSP-ESSENCE facilities------------------------------------------------------------------------------------
api_url1 <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/dataQuality/COVWklyAvgData"

cov_data1 <- myProfile$get_api_data(api_url1)

# Limited to facilities in a specific site-----------------------------------------------------------------------
api_url2 <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/dataQuality/COVWklyAvgData?siteid=884"

cov_data2 <- myProfile$get_api_data(api_url2)

dim(cov_data1)

## [1] 7881  172

These data include 172 total fields. In addition to CoV information, the data include facility-level metrics for informativeness of discharge diagnosis and chief complaint, total volume, and date of last update. The HTML table below displays all variables and their category for use.

Pulling Query Field Values (Added November 2021)

Users can now pull values for available query fields in ESSENCE2 via the APIs. As an example, you may want to pull the list of all current CCDD categories, syndromes, and subsyndromes. The Syndrome Subsyndrome CCDD Combined Category query field in ESSENCE2 provides a complete list of all existing syndrome definitions. The following API URL can be used as a template to pull values from other query fields. Metadata will be pulled in JSON format.

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/QueryWizard/ParamMetaData.json?action=getParameterMetaData&datasourceId=va_er&paramName=combinedCategory&parentParamValue=&parentParamName=&filterParamValList="

syndromes <- myProfile$get_api_data(url) %>%
  pluck("valueDisplayFields")

## No encoding supplied: defaulting to UTF-8.

head(syndromes, 10)

##                                           valueField
## 1  a_ccdd_air quality-related respiratory illness v1
## 2                      a_ccdd_all traffic related v1
## 3                      a_ccdd_all traffic related v2
## 4            a_ccdd_cdc acute flaccid myelitis dd v1
## 5                    a_ccdd_cdc acute hepatitis c v1
## 6         a_ccdd_cdc afm broad v1-limit to pediatric
## 7        a_ccdd_cdc afm narrow v1-limit to pediatric
## 8                              a_ccdd_cdc alcohol v1
## 9                             a_ccdd_cdc all drug v1
## 10                            a_ccdd_cdc all drug v2
##                                       displayField
## 1  CCDD Air Quality-related Respiratory Illness v1
## 2                      CCDD All Traffic Related v1
## 3                      CCDD All Traffic Related v2
## 4            CCDD CDC Acute Flaccid Myelitis DD v1
## 5                    CCDD CDC Acute Hepatitis C v1
## 6         CCDD CDC AFM Broad v1-Limit to Pediatric
## 7        CCDD CDC AFM Narrow v1-Limit to Pediatric
## 8                              CCDD CDC Alcohol v1
## 9                             CCDD CDC All Drug v1
## 10                            CCDD CDC All Drug v2
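
The returned table can be searched like any data frame to find the valueField strings needed in an API URL. The sketch below retypes three rows from the output above, so it runs offline, and filters them with a case-insensitive search on displayField. The search term is illustrative.

```r
# Three rows copied from the output above
syndromes <- data.frame(
  valueField = c("a_ccdd_cdc alcohol v1",
                 "a_ccdd_cdc all drug v1",
                 "a_ccdd_cdc all drug v2"),
  displayField = c("CCDD CDC Alcohol v1",
                   "CCDD CDC All Drug v1",
                   "CCDD CDC All Drug v2")
)

# Keep categories whose display name mentions "all drug" (any case)
all_drug <- syndromes[grepl("all drug", syndromes$displayField, ignore.case = TRUE), ]
all_drug
```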

Pulling CCDD Category Table (Added 2022)

Alternatively, you may want to pull in an up-to-date list of all existing ESSENCE CCDD categories and their underlying queries. This is useful for providing additional context in category-specific reports or summaries.

url <- "https://essence.syndromicsurveillance.org/nssp_essence/servlet/SyndromeDefinitionsServlet_CCDD?action=getCCDDTerms"

ccdd_queries <- myProfile$get_api_data(url) %>%
  pluck("categories")

ccdd_queries %>%
  filter(category == "CDC Influenza DD v1")

##   updateInfo termId     groupName dateCreated    notes lastUpdate
## 1 No Updates     63 Uncategorized  2020-03-06 No Notes 2020-03-06
##               description
## 1 Description to be added
##                                                                                                                                                                                                                                                                       definition
## 1 Search CC and DD field: (,^[;/ ]J09^,or,^[;/ ]J10^,or,^[;/ ]J11^,or,^[;/ ]487.[018][;/ ]^,or,^[;/ ]487[018][;/ ]^,or,^[;/ ]488.[018][19][;/ ]^,or,^[;/ ]488[018][19][;/ ]^,or,^[;/ ]442696006[;/ ]^,or,^[;/ ]442438000[;/ ]^,or,^[;/ ]6142004[;/ ]^,or,^[;/ ]195878008[;/ ]^,)
##   isAdmin            category fieldsSearched
## 1    TRUE CDC Influenza DD v1      Undefined

Word Alerts (Added July 2021)

Term-based alerts are available in ESSENCE2. This algorithm was developed to identify unusual distributions of chief complaint terms of interest without dependence on syndrome definitions. Technical details of the word alert algorithm can be found at https://essence2.syndromicsurveillance.org/nssp_essence/usersguide/algorithms/TermBasedAlerts.jsp. You can pull word alerts from ESSENCE2 via the Alert List API. Term alert API URLs need to be defined in RStudio and are not populated in the ESSENCE2 user interface. This API requires the following parameters:

geo_system: region, facility, or regionFacility
start_date: start date in ddMMMyyyy format (e.g., 27Dec2022)
end_date: end date in ddMMMyyyy format (e.g., 31Dec2022)

Optional query parameters include:

filters: regions or facilities to limit to (defaults to all if not specified)
show_stop_words: Boolean indicator of whether standard stop words should be shown rather than removed (false by default)
show_syndromic_words: Boolean indicator of whether or not to show syndromic terms used in syndrome definition queries (false by default)
show_ignored_words: Boolean indicator of whether ignored words should be shown rather than removed (false by default)

NSSP staff with admin access will be able to add and manage additional requested stop words in a list of “ignored” terms. This list will be modified over time to optimize the identification of informative terms and word pairs.

An example of an API URL only including required parameters is shown below. Fields in the corresponding alert table include term ID, date of the alert, data source ID and name, geography system, region (or “county”), anomalous term or word pair, detector algorithm, alert color ID (red = 3), frequency of occurrence, and expected values based on the 30-day baseline used. Note: Term-based alerts are only available for data from the last 7 days.

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/wordAlerts?geo_system=region&start_date=31Dec2022&end_date=31Dec2022"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
  pluck("termAlerts")

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
  pluck("termAlerts")

glimpse(api_data)

##  list()

This second example demonstrates how to form the URL with optional parameters; in this example, the parameters filter to Broward County, FL, and include syndromic terms.

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/wordAlerts?geo_system=region&start_date=27Dec2022&end_date=31Dec2022&filters=fl_broward&show_syndromic_words=true"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
  pluck("termAlerts")

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
  pluck("termAlerts")

glimpse(api_data)

##  list()
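
Because word alert URLs must be written by hand, a small helper can reduce typing errors. The sketch below assembles the URL from the required parameters and two of the optional ones described above. The function build_word_alert_url() is hypothetical and not part of Rnssp, and it only handles a single filters value.

```r
# Hypothetical helper (not part of Rnssp): assemble a word alert API URL
build_word_alert_url <- function(geo_system, start_date, end_date,
                                 filters = NULL, show_syndromic_words = NULL) {
  base_url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/wordAlerts"
  params <- list(
    geo_system = geo_system,
    start_date = start_date,
    end_date = end_date,
    filters = filters,
    # The example URLs use lowercase "true"/"false"
    show_syndromic_words = if (!is.null(show_syndromic_words)) {
      tolower(as.character(show_syndromic_words))
    }
  )
  # Drop optional parameters that were not supplied
  params <- Filter(Negate(is.null), params)
  query <- paste0(names(params), "=", unlist(params), collapse = "&")
  paste0(base_url, "?", query)
}

url <- build_word_alert_url(
  geo_system = "region",
  start_date = "27Dec2022",
  end_date = "31Dec2022",
  filters = "fl_broward",
  show_syndromic_words = TRUE
)
url
```

With these arguments, the helper reproduces the URL used in the second example above.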

Time Series Data Table with Stratified Alerts (Added August 2020)

Before this update, there was not an efficient way of pulling in daily stratified alerts. This update gave users the ability to pull historic alerts across stratifications in a long, tabular format. This functionality is only available from ESSENCE2: https://essence2.syndromicsurveillance.org/. For example, a user could choose multiple CCDD categories, Has been Emergency = “Yes”, CCDD Category for “As Percent Parameter”, and within the time series interface choose a geography level such as Hospital HHS Region for “Across Graphs Stratification” and CCDD Category for “Within Graph Stratification”. After selecting these configurations, you will need to select the “Update” button in the user interface to generate the stratified time series and API URL for the corresponding data table. Additionally, if “As Percent Parameter” is specified, the data pulled into RStudio will contain p-values and alert indicators specific to both counts and percentages. Indicators specific to counts are specified with “_dataCount” tags in the column names.

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?endDate=31Dec2022&ccddCategory=cdc%20pneumonia%20ccdd%20v1&ccddCategory=cdc%20coronavirus-dd%20v1&ccddCategory=cli%20cc%20with%20cli%20dd%20and%20coronavirus%20dd%20v2&percentParam=ccddCategory&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&hasBeenE=1&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TimeSeries&stratVal=ccddCategory&multiStratVal=geography&graphOnly=true&numSeries=3&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=January&stratVal=ccddCategory&multiStratVal=geography&graphOnly=true&numSeries=3&graphOptions=multipleSmall&seriesPerYear=false&startMonth=January&nonZeroComposite=false"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
  pluck("timeSeriesData")

## No encoding supplied: defaulting to UTF-8.

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_json <- content(api_response, as = "text")

## No encoding supplied: defaulting to UTF-8.

api_data <- fromJSON(api_response_json) %>%
  pluck("timeSeriesData")

glimpse(api_data)

## Rows: 2,700
## Columns: 21
## $ date                       <chr> "2022-10-03", "2022-10-04", "2022-10-05", "…
## $ count                      <dbl> 1.944354, 1.757632, 1.726513, 1.864693, 1.8…
## $ expected                   <chr> "1.445", "1.46", "1.479", "1.726", "1.675",…
## $ levels                     <chr> "0.122", "0.138", "0.167", "0.407", "0.383"…
## $ colorID                    <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ color                      <chr> "blue", "blue", "blue", "blue", "blue", "bl…
## $ altText                    <chr> "Data: Date: 03Oct22, Level: 0.122, Count: …
## $ details                    <chr> "/nssp_essence/servlet/DataDetailsServlet?g…
## $ graphType                  <chr> "percent", "percent", "percent", "percent",…
## $ dataCount                  <dbl> 420, 361, 351, 398, 394, 323, 348, 387, 372…
## $ expected_dataCount         <dbl> 385.4749, 359.1769, 338.4023, 348.0518, 329…
## $ levels_dataCount           <dbl> 0.085922784, 0.470764209, 0.300411004, 0.02…
## $ colorID_dataCount          <int> 1, 1, 1, 2, 3, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2…
## $ color_dataCount            <chr> "blue", "blue", "blue", "yellow", "red", "b…
## $ allCount                   <dbl> 21601, 20539, 20330, 21344, 21283, 19174, 1…
## $ lineLabel                  <chr> "CDC Pneumonia CCDD v1 - Region 1", "CDC Pn…
## $ title                      <chr> "CDC Pneumonia CCDD v1 - Region 1", "CDC Pn…
## $ ccddCategory_id            <chr> "CDC Pneumonia CCDD v1", "CDC Pneumonia CCD…
## $ ccddCategory_display       <chr> "CDC Pneumonia CCDD v1", "CDC Pneumonia CCD…
## $ hospitaldhhsregion_id      <chr> "Region I", "Region I", "Region I", "Region…
## $ hospitaldhhsregion_display <chr> "Region 1", "Region 1", "Region 1", "Region…

names(api_data)

##  [1] "date"                       "count"                     
##  [3] "expected"                   "levels"                    
##  [5] "colorID"                    "color"                     
##  [7] "altText"                    "details"                   
##  [9] "graphType"                  "dataCount"                 
## [11] "expected_dataCount"         "levels_dataCount"          
## [13] "colorID_dataCount"          "color_dataCount"           
## [15] "allCount"                   "lineLabel"                 
## [17] "title"                      "ccddCategory_id"           
## [19] "ccddCategory_display"       "hospitaldhhsregion_id"     
## [21] "hospitaldhhsregion_display"

A benefit of pulling the data and alerts in this manner is that you can create customized figures with ggplot or plotly that can be incorporated into static or interactive R Markdown reports:

hhs_region_data <- api_data %>%
  select(
    date, 
    hhs_region = hospitaldhhsregion_display, 
    ccdd_category = ccddCategory_display, 
    percent = count, 
    color
  ) %>%
  mutate(
    date = as.Date(date),
    hhs_region = factor(hhs_region, levels = paste("Region", 1:10))
  ) %>%
  filter(ccdd_category == "CLI CC with CLI DD and Coronavirus DD v2") %>%
  arrange(date, hhs_region) 

date_range <- paste(format(min(hhs_region_data$date), "%B %d, %Y"), "to", format(max(hhs_region_data$date), "%B %d, %Y"))

ggplot(hhs_region_data, aes(x = date, y = percent)) + 
  geom_line(linewidth = 0.7, color = "#046C9A") + 
  geom_point(data = subset(hhs_region_data, color == "red"), color = "red", size = 0.5) +
  geom_point(data = subset(hhs_region_data, color == "yellow"), color = "yellow", size = 0.5) + 
  theme_bw() + 
  labs(
    title = paste("CLI v2 by HHS Region:", date_range),  
    x = "Date", 
    y = "Percent of ED Visits"
  ) + 
  facet_wrap(facets = ~hhs_region, nrow = 2, scales = "free_x") + 
  scale_y_continuous(
    limits = c(0, NA), 
    expand = expansion(c(0, 0.1))
  ) +
  theme(
    strip.background = element_blank(), 
    strip.text = element_text(face = "bold"),
    panel.spacing = unit(1, "lines")
  )
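
Beyond plotting, the alert color columns can be tabulated to see how many flagged days each stratum had. The sketch below uses a small mock data frame with column names taken from the output above; the rows themselves are invented for illustration. It also converts expected to numeric, since the output above shows that column arriving as character.

```r
# Mock rows using column names from the API output above (values are invented)
ts_data <- data.frame(
  hospitaldhhsregion_display = c("Region 1", "Region 1", "Region 1",
                                 "Region 2", "Region 2", "Region 2"),
  color = c("blue", "yellow", "red", "blue", "blue", "red"),
  expected = c("1.445", "1.46", "1.479", "1.726", "1.675", "1.5"),
  stringsAsFactors = FALSE
)

# `expected` is returned as character; convert it before doing arithmetic
ts_data$expected <- as.numeric(ts_data$expected)

# Cross-tabulate alert colors by HHS region
alert_counts <- table(ts_data$hospitaldhhsregion_display, ts_data$color)
alert_counts
```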

Capability of Stratifying by FIPS Code (Added July 2020)

ESSENCE2 includes facility county FIPS and patient county FIPS as available query fields (both are technically approximations since ESSENCE regions are populated by ZIP codes). Users may choose FIPS codes as a row (or column) field in the table builder. The following example assumes that a user has chosen the Facility Location (Full Details) data source, the CDC Coronavirus-DD v1, CDC Pneumonia CCDD v1, and Coronavirus-like illness (CLI) CC with CLI DD and Coronavirus DD v2 CCDD categories, “Yes” for “Has Been Emergency”, “CC and DD Category” for “As Percent Query”, and all counties within their state for Facility County FIPS Approximation. By selecting Date and Facility County FIPS Approximation for row fields and CC and DD Category for column fields, the API URL generated in the user interface will have the following structure:

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022&facilityfips=...&percentParam=ccddCategory&datasource=va_hosp&startDate=03Oct2022&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TableBuilder&ccddCategory=cdc%20coronavirus-dd%20v1&ccddCategory=cdc%20pneumonia%20ccdd%20v1&ccddCategory=cli%20cc%20with%20cli%20dd%20and%20coronavirus%20dd%20v2&geographySystem=hospital&detector=nodetectordetector&timeResolution=daily&hasBeenE=1&rowFields=timeResolution&rowFields=facilityfips&columnField=ccddCategory"

After endDate is specified, all facility FIPS codes are defined with the following syntax: “&facilityfips=fipscode1&facilityfips=fipscode2&…&facilityfips=fipscodeN&”. Note: Currently, users need to manually insert “&refValues=false” after specifying facilityfips as a row field in order to pull in the actual codes instead of the county names. Define this URL and API pull as follows:

url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022&facilityfips=...&percentParam=ccddCategory&datasource=va_hosp&startDate=03Oct2022&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TableBuilder&ccddCategory=cdc%20coronavirus-dd%20v1&ccddCategory=cdc%20pneumonia%20ccdd%20v1&ccddCategory=cli%20cc%20with%20cli%20dd%20and%20coronavirus%20dd%20v2&geographySystem=hospital&detector=nodetectordetector&timeResolution=daily&hasBeenE=1&rowFields=timeResolution&rowFields=facilityfips&refValues=false&columnField=ccddCategory"

# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])))

api_response_csv <- content(api_response, as = "text")
api_data <- read_csv(api_response_csv)
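
The repeated facilityfips parameters can be generated from a vector of codes instead of being typed by hand. The sketch below shows the idea with three Florida county FIPS codes chosen purely for illustration. The URL is deliberately incomplete: only the beginning is shown, and the remaining query parameters from the example above would still need to be appended.

```r
# Illustrative FIPS codes only; substitute the codes for your own counties
fips_codes <- c("12011", "12086", "12099")

# One "&facilityfips=" parameter per code, joined with no separator
fips_params <- paste0("&facilityfips=", fips_codes, collapse = "")

# Beginning of a table builder URL (remaining parameters omitted here)
url_start <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022"
url_partial <- paste0(url_start, fips_params)
url_partial
```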

Tips and Tricks

ESSENCE API URL Length

Occasionally, the API URL you created for the ESSENCE query will be very, VERY long. For example, you might have reason to explicitly include all facilities, all counties, or all ZIP codes for a site. In this situation, the character length of your URL might be too long to assign to the url <- object as shown in the preceding examples. When this occurs, you can split the URL into two or more strings when creating objects, and then pull them together later. In the code chunk shown below, we start with one long URL (use your imagination here) and break it into two pieces. Then, to create the object URL, we join them using the paste0() function. Unlike the paste() function, paste0() does not separate the combined character strings with a space. This final URL can be passed to ESSENCE along with your credentials to return your results. As a rule of thumb, do not exceed 20 lines of characters per URL.

url1 <- "https://essence.syndromicsurveillance.org/nssp_essence/api/very_very_very_long_URL_very_long_use_your_imagination_here...."
url2 <- "still_going_even_longer_here..........."
url <- paste0(url1, url2)

# Resulting URL
print(url)

## [1] "https://essence.syndromicsurveillance.org/nssp_essence/api/very_very_very_long_URL_very_long_use_your_imagination_here....still_going_even_longer_here..........."

Dynamically Set Start and End Dates

For reports run weekly or daily, you can automate the start and end dates rather than change the dates in the API URL manually before knitting. There are multiple ways to do this. You may split the URL into three pieces, similar to how the URL in the previous example was split. Or you may use str_extract() and str_replace() to substitute the appropriate dates. For example, if the report is based on the most recent 90 days, the start and end dates can be determined automatically by using base R’s Sys.Date() and format() to ensure appropriate date formatting. If today is January 3, 2023, format(Sys.Date(), "%d%b%Y") gives the end date, 03Jan2023, while format(Sys.Date() - 89, "%d%b%Y") gives the start date of the most recent 90-day period, 06Oct2022. You can insert the dates by splitting the URL into three pieces and then pasting them back together:

end_date <- format(Sys.Date(), "%d%b%Y")
start_date <- format(Sys.Date() - 89, "%d%b%Y")

url1 <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?"
url2 <- paste0("endDate=", end_date, "&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&")
url3 <- paste0("startDate=", start_date, "&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count")
url <- paste0(url1, url2, url3)

# Rnssp option---------------------------------------------------------------------------------------------------
api_png <- myProfile$get_api_tsgraph(url)

knitr::include_graphics(api_png$tsgraph)

# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url, 
                    authenticate(key_list("essence")[1,2], 
                                 key_get("essence", 
                                 key_list("essence")[1,2])),
                    write_disk("timeseries3.png", overwrite = TRUE))

knitr::include_graphics("timeseries3.png")

Alternatively, you can keep the original start and end dates in the URL string and have the code extract the old dates and replace them with new ones. The resulting URL will be the same.

url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=30Jun2021&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=1Jun2021&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count"
  
# Pull out the "endDate=...&" parameter, then extract the date value from it
end_date_old <- regmatches(url, regexpr("endDate=.+?&", url))
end_date_old <- str_extract(end_date_old, "[0-9]{1,2}[A-Za-z]{3}[0-9]{2,4}")
end_date_new <- format(Sys.Date(), "%d%b%Y")

# Do the same for the "startDate=...&" parameter
start_date_old <- regmatches(url, regexpr("startDate=.+?&", url))
start_date_old <- str_extract(start_date_old, "[0-9]{1,2}[A-Za-z]{3}[0-9]{2,4}")
start_date_new <- format(Sys.Date() - 89, "%d%b%Y")

# Swap the old dates for the new ones
url <- str_replace(url, end_date_old, end_date_new)
url <- str_replace(url, start_date_old, start_date_new)

url

## [1] "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=17Jan2023&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=20Oct2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count"
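
A variation on the extract-and-replace approach is to replace a parameter's value by its name, which avoids looking up the old date first. The sketch below uses only base R. The helper set_url_param() is hypothetical, and it assumes the parameter name contains only letters and digits and that the parameter is already present in the URL. The URL is a shortened form of the one above, used only for illustration.

```r
# Hypothetical helper: overwrite the value of one query parameter in a URL
set_url_param <- function(url, name, value) {
  # The lookbehind anchors on "?name=" or "&name=" so that, for example,
  # "endDate" is not confused with another parameter ending in the same text;
  # "[^&]*" then matches the old value up to the next "&" or the end of the URL
  pattern <- paste0("(?<=[?&]", name, "=)[^&]*")
  sub(pattern, value, url, perl = TRUE)
}

short_url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=30Jun2021&detector=probrepswitch&startDate=1Jun2021&timeResolution=daily"

new_url <- set_url_param(short_url, "endDate", "17Jan2023")
new_url <- set_url_param(new_url, "startDate", "20Oct2022")
new_url
```

Because the match is tied to the parameter name, the same call works whether the old date was written as 1Jun2021 or 01Jun2021.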

API URL Session Attribute IDs

ESSENCE works within web standards. Users sometimes run into problems when their configured queries contain so many parameters that the generated URLs exceed 2,000 characters. To work around this limitation on HTTP GET (and its interaction with servers and browsers), ESSENCE will automatically create an alias for long URLs called a session attribute ID.

The session attribute ID appears as sessionAttributeID in the API URL and is a normal URL parameter set to a fixed value that corresponds to a single saved URL string.

There are several scenarios in which ESSENCE will always generate session attribute IDs:

When a user enters a large number of query filters in the Query Wizard;
When filters contain large amounts of text; or
When ESSENCE (e.g., in myESSENCE) needs to construct a URL, but the user-added parameters exceed 2,000 characters.

In each scenario, ESSENCE will receive the URL or parameters by means other than HTTP GET, enabling it to receive the excess characters and properly convert the URL for future use and convenience (e.g., to paste into the web browser address bar). The process is automatic and requires no user intervention, aside from acknowledging that the system generated the URL. Once ESSENCE creates the session attribute ID, it will last indefinitely.
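
If you build or edit API URLs in R, it can be useful to check whether a URL is near the 2,000-character mark or already relies on a session attribute ID. The following is a minimal sketch in base R. The helper names and the filterValue parameter are made up for illustration and are not ESSENCE parameters.

```r
# Hypothetical helpers for inspecting an API URL before sending it
url_too_long <- function(url, limit = 2000) nchar(url) > limit
has_session_attr_id <- function(url) grepl("[?&]sessionAttributeID=", url)

short_url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries?startDate=1Jun2021&endDate=30Jun2021"

# Simulate a query with many filters by appending made-up parameter values
long_url <- paste0(short_url, "&",
                   paste0("filterValue=", seq_len(300), collapse = "&"))

alias_url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?sessionAttributeID=2362_03_01_2023_19_48_46_5189&graphOnly=true"

url_too_long(short_url)         # FALSE
url_too_long(long_url)          # TRUE
has_session_attr_id(alias_url)  # TRUE
```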

To modify parameters like start and end dates, you can add parameter specifications to the end of the API URL that contains the session attribute ID. The example below displays an ESSENCE-generated URL for a time series query in which all CDC-developed CCDD categories were selected, with across-graph stratification applied to these categories. In the first example below, the week start and end date parameters referenced by the session attribute ID are December 4, 2022 and January 7, 2023, respectively. The second example demonstrates how you could update the start date to be November 27, 2022, and the end date to be December 24, 2022. For weekly time series, the end date in the API URL should be the Saturday ending date of an MMWR week; however, the last week in the data will be represented by the Sunday starting date of the corresponding week.

# ESSENCE-generated API URL with session attribute ID------------------------------------------------------------
attr_id_url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?sessionAttributeID=2362_03_01_2023_19_48_46_5189&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=1&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&startMonth=1&nonZeroComposite=false"

api_data <- myProfile$get_api_data(attr_id_url) %>%
  pluck("timeSeriesData")

range(as.Date(api_data$date))

## [1] "2022-10-02" "2023-01-01"

# Updated API URL with custom start and end dates----------------------------------------------------------------
attr_id_url_updated <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?sessionAttributeID=2362_03_01_2023_19_48_46_5189&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=1&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&startMonth=1&nonZeroComposite=false&startDate=27Nov2022&endDate=24Dec2022"

api_data_updated <- myProfile$get_api_data(attr_id_url_updated) %>%
  pluck("timeSeriesData")

range(as.Date(api_data_updated$date))

## [1] "2022-11-27" "2022-12-18"
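
The Sunday and Saturday convention for weekly time series can be checked in R before the dates are added to the URL. The sketch below uses the custom dates from the updated URL above, written as ISO dates so that parsing does not depend on locale. The helper names are hypothetical.

```r
# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday, regardless of locale
is_sunday   <- function(d) as.POSIXlt(d)$wday == 0
is_saturday <- function(d) as.POSIXlt(d)$wday == 6

wk_start <- as.Date("2022-11-27")  # startDate=27Nov2022
wk_end   <- as.Date("2022-12-24")  # endDate=24Dec2022

is_sunday(wk_start)    # TRUE
is_saturday(wk_end)    # TRUE

# The last week in the returned data is labeled by its Sunday starting date
last_week_label <- wk_end - 6
last_week_label        # "2022-12-18"
```

This is why the range printed above ends on 2022-12-18 even though the end date in the URL is December 24, 2022.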

Alternatively, you can update the start and end dates by using the paste0() function (similar to using this function to dynamically set dates in an API URL). In this scenario, the date parameters are pasted onto the end of the URL. Because this is a time series API with a weekly time resolution, the start and end dates in this example are set to pull data from the most recent 4 complete MMWR weeks: the end date is the Saturday before the most recent Sunday, and the start date is the Sunday 4 weeks before that Sunday.

end_date <- format(floor_date(Sys.Date(), unit = "1 week") - 1, "%d%b%Y")
start_date <- format(floor_date(Sys.Date(), unit = "1 week") - 7 * 4, "%d%b%Y")

attr_id_url2 <- paste0("https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?sessionAttributeID=2362_03_01_2023_19_48_46_5189&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=1&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&startMonth=1&nonZeroComposite=false&startDate=", start_date, "&endDate=", end_date)

api_data <- myProfile$get_api_data(attr_id_url2) %>%
  pluck("timeSeriesData")

range(as.Date(api_data$date))

## [1] "2022-12-18" "2023-01-08"
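
The same week arithmetic can be written in base R, with the reference date passed in as an argument so that the result is easy to test. This is a minimal sketch, and the helper recent_complete_weeks() is hypothetical. It assumes weeks run Sunday through Saturday, as described above.

```r
# Hypothetical helper: date range covering the most recent n complete
# Sunday-to-Saturday weeks before `today`
recent_complete_weeks <- function(n_weeks, today = Sys.Date()) {
  # Sunday on or before `today` (wday: 0 = Sunday, ..., 6 = Saturday)
  this_sunday <- today - as.POSIXlt(today)$wday
  list(start = this_sunday - 7 * n_weeks,  # Sunday that opens the window
       end   = this_sunday - 1)            # Saturday that closes it
}

# Fixed reference date (a Tuesday) so the result is reproducible
rng <- recent_complete_weeks(4, today = as.Date("2023-01-17"))
rng$start  # "2022-12-18"
rng$end    # "2023-01-14"
```

With n_weeks = 4 and this reference date, the helper returns the same start date as the range printed above, and the end date is the Saturday that closes the week labeled 2023-01-08.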

Additional Resources

The preceding examples will help you pull data into your RStudio environment, but the next steps are really up to you. If you are unfamiliar with R and RStudio, here are some open-source resources to move your analysis forward.