RStudio lets you access ESSENCE data more securely, create your own R Markdown reports and Shiny applications, and do exploratory analyses not possible within ESSENCE. In this R Markdown document, we will provide an overview of the ESSENCE application programming interfaces (APIs) and explain how to access them through RStudio. We will also list the APIs available in ESSENCE and show basic examples of R code and packages so that you can start using RStudio.
An API is a structured and consistent way for one machine to exchange information with another machine. ESSENCE has APIs that allow you to programmatically access and further manipulate your data from outside the system. You may select the CSV format to export data, and in some instances, JSON is also supported. More information about the APIs can be found in ESSENCE under “More,” then “User Guide,” and then “API Documentation.” You may write API URL syntax on your own after reading the documentation, or you can let ESSENCE create the API URL by clicking the “API URL” button on an ESSENCE page after completing a query.
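To make the structure of an API URL concrete, here is a minimal sketch that splits a URL into its endpoint and its query parameters using only base R. The helper name split_api_url is hypothetical (it is not part of ESSENCE or Rnssp), and the URL is a shortened version of the time series URL used later in this guide.

```r
# Hypothetical helper: split an ESSENCE API URL into endpoint and parameters
split_api_url <- function(url) {
  parts <- strsplit(url, "?", fixed = TRUE)[[1]]
  endpoint <- parts[1]
  # Each "name=value" pair is separated by "&"
  pairs <- strsplit(parts[2], "&", fixed = TRUE)[[1]]
  kv <- strsplit(pairs, "=", fixed = TRUE)
  params <- data.frame(
    name  = vapply(kv, `[`, character(1), 1),
    value = vapply(kv, function(p) utils::URLdecode(p[2]), character(1)),
    stringsAsFactors = FALSE
  )
  list(endpoint = endpoint, params = params)
}

# Shortened for illustration; a real query carries more parameters
url <- paste0(
  "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries",
  "?endDate=31Dec2022&medicalGrouping=injury&startDate=03Oct2022",
  "&timeResolution=daily"
)

parsed <- split_api_url(url)
parsed$endpoint
parsed$params
```

The parameters are kept in a data frame rather than a named list because some parameters, such as ageNCHS in the table builder examples, can appear more than once in a single URL.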
Ten ESSENCE APIs are highlighted in this guide:
Time series data table
Time series PNG image
Table builder results
Data details (line level)
Summary stats on the number of unique facilities or regions in your query results
Alert list detection tables
Facility-level data quality metrics
ESSENCE query fields
CCDD category table
Time series data table with stratified, historical alerts
The first step is to create an R script or R Markdown file and load the necessary packages. Some packages might already be in your system library. If not, you can install them yourself by clicking the Packages tab in the bottom-right pane of the RStudio interface and then clicking Install. Here are the packages to load for getting started:
library(Rnssp)
library(tidyverse)
library(httr)
library(jsonlite)
library(keyring)
library(lubridate)
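Before loading the packages, it can help to see which ones are missing. The following is a minimal sketch, not part of the original workflow, that only reports what is missing and installs nothing. Note that, as described later in this guide, Rnssp is installed from GitHub rather than through the Packages tab.

```r
# Packages loaded in this guide
pkgs <- c("Rnssp", "tidyverse", "httr", "jsonlite", "keyring", "lubridate")

# requireNamespace() returns FALSE (without attaching anything) when a
# package is not installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```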
To extract data from ESSENCE, you start by passing authentication information from your RStudio session to ESSENCE so that it knows what data you are allowed to see. It is bad practice to explicitly include your username and password in your code. We provide two options for handling credentials: the Rnssp package or the keyring package. Whenever possible, we encourage users to adopt the more secure methodology presented in Rnssp (option 1 below).
Rnssp (preferred)
In June 2021, the Rnssp library was installed on the instance of RStudio Workbench hosted on the BioSense Platform. If you are using a local instance of RStudio, you can install the development version of Rnssp from GitHub by running devtools::install_github("cdcgov/Rnssp") in your console. The Rnssp GitHub repository can be accessed at https://github.com/CDCgov/Rnssp, with additional documentation and vignettes located at https://cdcgov.github.io/Rnssp/.
Rnssp provides functionality to securely save credentials and interact with ESSENCE APIs. When you run the following code, a pop-up will appear in RStudio for you to enter your AMC username and password. This will create a user profile object of the class Credentials, designed with the R6 object system that integrates classic object-oriented programming concepts into R. To render an R Markdown document, save the myProfile object as an .rda or .rds file to your home directory. This only needs to be done once. The following code chunk is presented for demonstrative purposes only and does not need to be included in your actual R Markdown code.
library(Rnssp)
myProfile <- Credentials$new(
username = askme("Enter your username: "),
password = askme()
)
# rda option-----------------------------------------------------------------------------------------------------
save(myProfile, file = "~/myProfile.rda")
# rds option-----------------------------------------------------------------------------------------------------
saveRDS(myProfile, "~/myProfile.rds")
myProfile can then be loaded by including either of the following lines of code in your introductory code chunks (just use one option):
# rda option-----------------------------------------------------------------------------------------------------
load("~/myProfile.rda")
# rds option-----------------------------------------------------------------------------------------------------
myProfile <- readRDS("~/myProfile.rds")
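If the saved file does not exist yet, readRDS() stops with an error that can be confusing in the middle of a report render. The following is a minimal sketch of a guard around the .rds option. The helper name load_profile is hypothetical and is not part of Rnssp.

```r
# Hypothetical helper: read a saved profile, or return NULL if it is missing
load_profile <- function(path = "~/myProfile.rds") {
  path <- path.expand(path)
  if (!file.exists(path)) {
    message("No saved profile at ", path, "; create and save one first.")
    return(NULL)
  }
  readRDS(path)
}

myProfile <- load_profile()
```

A NULL result signals that the one-time credential step above still needs to be run.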
Note that your username and password are fully encrypted in your user profile and are not visible when viewing or inspecting the object:
myProfile
## <NSSPCredentials>
## Public:
## clone: function (deep = FALSE)
## get_api_data: function (url, fromCSV = FALSE, ...)
## get_api_response: function (url)
## get_api_tsgraph: function (url)
## initialize: function (username, password)
## Private:
## ..__: NSSPContainer, R6
## ..password: NSSPContainer, R6
## ..username: NSSPContainer, R6
The myProfile object comes with the following methods:
$get_api_response(): Retrieves requested information specified in the API URL from ESSENCE
$get_api_data(): Extracts the content (data) from the API response and parses it into an R data frame
$get_api_tsgraph(): Retrieves an ESSENCE time series graph and saves it as a PNG to a temporary directory
keyring
Prior to the development of the Rnssp package, the keyring library was used to authenticate AMC credentials to NSSP-ESSENCE when pulling data via the APIs. While using keyring is more secure than explicitly referencing your credentials from a source file, it is less secure and less efficient than the Rnssp method. In 2021, the keyring library was updated and now requires users to specify that credentials should be saved to hidden background environment variables. NSSP applied a patch in RStudio Workbench in January 2022 so that keyring can be used as it was previously.
keyring will save your AMC credentials to background environment variables that will persist for the duration of an individual R session. When you run the following line of code (with your username entered in the username quotes), a pop-up will appear in RStudio where you will need to enter your password. Note: You only need to save your credentials once per session.
key_set(service = "essence", username = "msheppardoa01")
After you enter your password in the pop-up window, be sure to “comment out” this line of code by adding a hash mark before it as shown below:
# key_set(service = "essence", username = "msheppardoa01")
In each example that follows, you will see a common pattern emerge: First, the URL must be defined as an object in your RStudio session; then the API response needs to be retrieved and its content extracted to an R data frame using either the Rnssp $get_api_data() method or GET from httr before further analysis. When you use this approach, R Markdown provides an easily reproducible workflow for integrating report text with code that reads in data, manipulates data as needed, and produces analyses and visualizations in a way that can be handed off to colleagues without documenting manual actions (point/click, etc.).
In this example, we will show you how to pull a time series data table from ESSENCE into RStudio. Note: All examples use the limited details data sources available in NSSP-ESSENCE. In the code below, the first object created is the URL for the ESSENCE API endpoint of interest, which for this example is a national trend of the injury syndrome. The second object, api_response, authenticates your credentials so that ESSENCE knows you are permitted to access these data and then retrieves the requested data if authentication is successful. The next two objects process JSON-formatted data into an R data frame. By default, time series APIs pull in JSON-formatted data. The fromJSON() function converts these data to a list of length 2, where the second element of the list is a data frame that can be extracted with the pluck() function from the purrr library. Lastly, the glimpse() function from dplyr provides a quick sense of every variable in the data frame.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries?endDate=31Dec2022&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries"
# Rnssp option---------------------------------------------------------------------------------------------------
api_response <- myProfile$get_api_response(url)
api_response_json <- content(api_response, as = "text")
## No encoding supplied: defaulting to UTF-8.
api_data <- fromJSON(api_response_json) %>%
pluck("timeSeriesData")
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
## No encoding supplied: defaulting to UTF-8.
api_data <- fromJSON(api_response_json) %>%
pluck("timeSeriesData")
glimpse(api_data)
## Rows: 90
## Columns: 8
## $ date <chr> "2022-10-03", "2022-10-04", "2022-10-05", "2022-10-06", "2022…
## $ count <dbl> 51779, 49719, 49118, 49686, 49990, 46789, 45608, 51096, 51375…
## $ expected <chr> "51952.7", "50393.255", "48662.281", "48473.657", "48828.893"…
## $ levels <chr> "0.576", "0.769", "0.305", "0.091", "0.095", "0.543", "0.351"…
## $ colorID <int> 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ color <chr> "blue", "blue", "blue", "blue", "blue", "blue", "blue", "red"…
## $ altText <chr> "Data: Date: 03Oct22, Level: 0.576, Count: 51779, Expected: 5…
## $ details <chr> "/nssp_essence/api/dataDetails?percentParam=noPercent&datasou…
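The output above shows that date, expected, and levels arrive as character columns. The following is a minimal sketch of converting them before analysis. It uses a small stand-in data frame copied from the first rows of the output so that it runs without an ESSENCE connection, and base R so that no extra packages are needed (mutate() from dplyr would work equally well). The excess column is our own illustration and is not returned by ESSENCE.

```r
# Small stand-in for api_data, copied from the first rows of the output above
api_data <- data.frame(
  date     = c("2022-10-03", "2022-10-04", "2022-10-05", "2022-10-06"),
  count    = c(51779, 49719, 49118, 49686),
  expected = c("51952.7", "50393.255", "48662.281", "48473.657"),
  levels   = c("0.576", "0.769", "0.305", "0.091"),
  color    = c("blue", "blue", "blue", "blue"),
  stringsAsFactors = FALSE
)

ts_data <- transform(
  api_data,
  date     = as.Date(date),        # ISO dates parse without a format string
  expected = as.numeric(expected),
  levels   = as.numeric(levels)
)

# Illustrative only: difference between observed and expected counts
ts_data$excess <- ts_data$count - ts_data$expected

str(ts_data)
```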
IMPORTANT!
Alternatively, with Rnssp we can pull these data with two lines of code by using the $get_api_data() method, which implicitly pulls and extracts data using the steps outlined in the example above. Occasionally there are scenarios in which retrieving the API response first is beneficial, for example, when the API response status code needs to be inspected because data are not pulling as expected. In general, we recommend the more direct approach of using the $get_api_data() method shown below and only recommend using $get_api_response() for debugging purposes. The remaining Rnssp examples in this guide do not use $get_api_response().
api_data <- myProfile$get_api_data(url) %>%
pluck("timeSeriesData")
## No encoding supplied: defaulting to UTF-8.
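When debugging with $get_api_response(), the status code of the response is usually the first thing to check. The following is a minimal sketch of a helper that turns a status code into a short hint. The helper name describe_status is hypothetical, and the messages are generic HTTP interpretations rather than anything specific to ESSENCE.

```r
# Hypothetical helper: turn an HTTP status code into a short diagnostic message
describe_status <- function(status) {
  if (status >= 200 && status < 300) {
    "Success: the request was accepted."
  } else if (status == 401) {
    "Unauthorized: check the credentials in your profile."
  } else if (status == 403) {
    "Forbidden: your account may not have access to the requested data."
  } else if (status == 404) {
    "Not found: check the endpoint portion of the URL."
  } else if (status >= 500) {
    "Server error: the problem is on the server side; try again later."
  } else {
    paste("Unexpected status code:", status)
  }
}

# With a real response: describe_status(httr::status_code(api_response))
describe_status(401)
```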
This example shows how to retrieve the ESSENCE graph itself rather than the underlying data shown previously. Here, we use the same national injury syndrome trend as before, but notice that the URL now includes “api/timeSeries/graph?”. You can add a title or axis labels by adding other parameters to the URL, for example “&graphTitle=Injury%20Syndrome&xAxisLabel=Date&yAxisLabel=Count”.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=31Dec2022&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count"
# Rnssp option---------------------------------------------------------------------------------------------------
api_png <- myProfile$get_api_tsgraph(url)
knitr::include_graphics(api_png$tsgraph)
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])),
write_disk("timeseries1.png", overwrite = TRUE))
knitr::include_graphics("timeseries1.png")
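Titles and axis labels must be URL-encoded (for example, spaces become %20) before they are appended to the URL. The following is a minimal sketch that builds the label parameters with URLencode() from base R. The base_url shown is truncated for illustration and stands in for the full graph URL.

```r
# Truncated stand-in for the full graph URL
base_url <- paste0(
  "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph",
  "?endDate=31Dec2022&startDate=03Oct2022"
)

graph_title <- "National - Injury Syndrome Daily Counts"

# reserved = TRUE also encodes characters such as "&" and "=" that would
# otherwise be read as part of the URL syntax
labels <- paste0(
  "&graphTitle=", utils::URLencode(graph_title, reserved = TRUE),
  "&xAxisLabel=", utils::URLencode("Date", reserved = TRUE),
  "&yAxisLabel=", utils::URLencode("Count", reserved = TRUE)
)

url_with_labels <- paste0(base_url, labels)
labels
```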
With the table builder API, ESSENCE does the heavy lifting by summarizing your query and presenting results in tabular format where you can define rows, nested rows, and column variables for output. If the query is supported in the ESSENCE query manager, you should be able to use table builder to summarize the output, which is usually more efficient than manipulating large amounts of line-level data yourself. In this example, we use the CDC Opioid Overdose v3 Chief Complaint Discharge Diagnosis (CCDD) category and create a table of counts per month by U.S. Department of Health & Human Services (HHS) region and age group. Visits are limited to emergency department visits by selecting Has been Emergency = “Yes”. Output formats for table builder results are CSV and JSON. The CSV option will pull in data that match the tabular format seen in the ESSENCE interface, whereas the JSON option will pull in data that are transformed to a long, pivoted format. The latter is recommended because it avoids the initial transformation into a long format and is compatible with functions/libraries based on tidyverse principles.
When the CSV option is used, you will need to specify this by using the fromCSV argument as shown below. By default, $get_api_data() assumes that the JSON option was selected.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022&ccddCategory=cdc%20opioid%20overdose%20v3&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=nodetectordetector&startDate=03Oct2022&ageNCHS=11-14&ageNCHS=15-24&ageNCHS=25-34&ageNCHS=35-44&ageNCHS=45-54&ageNCHS=55-64&ageNCHS=65-74&ageNCHS=75-84&ageNCHS=85-1000&ageNCHS=unknown&timeResolution=monthly&hasBeenE=1&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TableBuilder&rowFields=timeResolution&rowFields=geographyhospitaldhhsregion&columnField=ageNCHS"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)
## Rows: 33 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): timeResolution, geographyhospitaldhhsregion
## dbl (10): 11-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+, Unknown
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
# Extract the response body as text so that read_csv() can parse it
api_response_csv <- content(api_response, as = "text", encoding = "UTF-8")
api_data <- read_csv(api_response_csv)
## Rows: 33 Columns: 12
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): timeResolution, geographyhospitaldhhsregion
## dbl (10): 11-14, 15-24, 25-34, 35-44, 45-54, 55-64, 65-74, 75-84, 85+, Unknown
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(api_data)
## Rows: 33
## Columns: 12
## $ timeResolution <chr> "2022-10", "2022-10", "2022-10", "2022-10"…
## $ geographyhospitaldhhsregion <chr> "OTHER_REGION", "Region 1", "Region 10", "…
## $ `11-14` <dbl> 0, 1, 4, 2, 3, 5, 5, 5, 2, 1, 4, 0, 3, 4, …
## $ `15-24` <dbl> 0, 108, 132, 89, 128, 476, 246, 160, 64, 7…
## $ `25-34` <dbl> 0, 407, 375, 382, 551, 1551, 821, 319, 68,…
## $ `35-44` <dbl> 0, 471, 281, 402, 502, 1532, 804, 322, 45,…
## $ `45-54` <dbl> 0, 311, 172, 349, 375, 973, 625, 181, 27, …
## $ `55-64` <dbl> 0, 279, 182, 395, 420, 918, 692, 146, 29, …
## $ `65-74` <dbl> 0, 126, 132, 168, 180, 540, 357, 111, 20, …
## $ `75-84` <dbl> 0, 18, 57, 29, 41, 232, 67, 48, 6, 27, 45,…
## $ `85+` <dbl> 0, 8, 11, 13, 17, 81, 20, 14, 4, 4, 17, 0,…
## $ Unknown <dbl> 0, 39, 13, 28, 50, 49, 27, 7, 1, 2, 5, 0, …
The following example demonstrates the necessary data transformations to achieve the long format that is output from the JSON option by default.
api_data_long <- api_data %>%
pivot_longer(cols = -c(timeResolution, geographyhospitaldhhsregion), names_to = "ageNCHS", values_to = "count") %>%
filter(geographyhospitaldhhsregion != "OTHER_REGION")
api_data_long
## # A tibble: 300 × 4
## timeResolution geographyhospitaldhhsregion ageNCHS count
## <chr> <chr> <chr> <dbl>
## 1 2022-10 Region 1 11-14 1
## 2 2022-10 Region 1 15-24 108
## 3 2022-10 Region 1 25-34 407
## 4 2022-10 Region 1 35-44 471
## 5 2022-10 Region 1 45-54 311
## 6 2022-10 Region 1 55-64 279
## 7 2022-10 Region 1 65-74 126
## 8 2022-10 Region 1 75-84 18
## 9 2022-10 Region 1 85+ 8
## 10 2022-10 Region 1 Unknown 39
## # … with 290 more rows
If the JSON option is selected, the only difference from the API URL for the CSV option is that the string “/csv” is missing after “api/tableBuilder”.
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder?endDate=31Dec2022&ccddCategory=cdc%20opioid%20overdose%20v3&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=nodetectordetector&startDate=03Oct2022&ageNCHS=11-14&ageNCHS=15-24&ageNCHS=25-34&ageNCHS=35-44&ageNCHS=45-54&ageNCHS=55-64&ageNCHS=65-74&ageNCHS=75-84&ageNCHS=85-1000&ageNCHS=unknown&timeResolution=monthly&hasBeenE=1&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TableBuilder&rowFields=timeResolution&rowFields=geographyhospitaldhhsregion&columnField=ageNCHS"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
filter(geographyhospitaldhhsregion != "OTHER_REGION")
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
filter(geographyhospitaldhhsregion != "OTHER_REGION")
glimpse(api_data)
## Rows: 300
## Columns: 4
## $ timeResolution <chr> "2022-10", "2022-10", "2022-10", "2022-10"…
## $ geographyhospitaldhhsregion <chr> "Region 1", "Region 1", "Region 1", "Regio…
## $ ageNCHS <chr> "11-14", "15-24", "25-34", "35-44", "45-54…
## $ count <dbl> 1, 108, 407, 471, 311, 279, 126, 18, 8, 39…
You may have noticed that the table builder in the ESSENCE user interface is limited to creating tables of up to 30,000 cells. A table this large should suffice in most cases. Keep in mind, however, that the table builder API does not impose a limit on the output table size. To write the API URL yourself instead of having ESSENCE create it for you with the “API URL” button, first familiarize yourself with the structure of the API, and then add parameters that follow that structure. To help with this, look at some examples where ESSENCE has created API URLs for you using the available buttons.
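To judge whether a planned table would fit within the 30,000-cell limit of the user interface, you can estimate its size before running the query. The following is a rough sketch that assumes the cell count is the product of the number of distinct values in each row and column field. For the example query above, this reproduces the 33 rows by 10 age-group columns seen in the CSV output.

```r
# Number of distinct values in each row and column field of the example query
n_levels <- c(
  timeResolution = 3,               # October, November, December 2022
  geographyhospitaldhhsregion = 11, # 10 HHS regions plus OTHER_REGION
  ageNCHS = 10                      # the 10 age-group columns
)

# Assumption: cells = product of the number of values in each field
n_cells <- prod(n_levels)
n_cells

# Compare with the 30,000-cell limit of the user interface
n_cells <= 30000
```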
Sometimes you just need line-level data, and this example describes how to extract those data from ESSENCE. This method can create a very large data set quickly. As with any query with the potential to create large data sets, first test the query on a small amount of data. Consider making multiple calls over smaller time ranges and then combining the separate data frames to create a final data set. This API also lets you specify the variables to include and whether you want a data set with raw or reference values. Downloads with reference values take longer to stream into RStudio because ESSENCE must create those reference values. Output formats for data details are CSV and JSON.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails/csv?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=30Dec2022&endDate=31Dec2022"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)
## Rows: 5005 Columns: 29
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (20): Date, Category_flat, SubCategory_flat, Patient_Class, HospitalDHH...
## dbl (8): HasBeenE, HasBeenI, HasBeenO, DDAvailable, DDInformative, CCAvail...
## dttm (1): FirstDateTimeAdded
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
# Extract the response body as text, then parse it as CSV
api_data <- content(api_response, as = "text", encoding = "UTF-8") %>%
read_csv()
## Rows: 5005 Columns: 29
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (20): Date, Category_flat, SubCategory_flat, Patient_Class, HospitalDHH...
## dbl (8): HasBeenE, HasBeenI, HasBeenO, DDAvailable, DDInformative, CCAvail...
## dttm (1): FirstDateTimeAdded
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(api_data)
## Rows: 5,005
## Columns: 29
## $ Date <chr> "12/31/2022", "12/31/2022", "12/31/2022", "…
## $ Category_flat <chr> ";Injury;", ";Neuro;Injury;", ";Injury;", "…
## $ SubCategory_flat <chr> ";CutOrPierce;", ";Seizure;Fall;", ";CutOrP…
## $ Patient_Class <chr> "E", "E", "E", "E", "E", "E", "E", "E", "E"…
## $ HospitalDHHSRegion <chr> "Region I", "Region I", "Region I", "Region…
## $ dhhsregion <chr> "Region I", "Region I", "Region I", "Region…
## $ AgeGroup <chr> "18-44", "18-44", "05-17", "65-1000", "65-1…
## $ Sex <chr> "F", "F", "M", "F", "F", "F", "F", "M", "F"…
## $ Race_flat <chr> ";2106-3;", ";2106-3;", ";2106-3;", ";2106-…
## $ Ethnicity_flat <chr> ";2186-5;", ";2186-5;", ";2186-5;", ";2186-…
## $ DispositionCategory <chr> "DISCHARGED", "none", "none", "none", "none…
## $ ICD_CCSR_Desc_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_CCSR_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Desc_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Desc_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ AdmissionTypeCategory <chr> "NR", "NR", "NR", "NR", "NR", "NR", "NR", "…
## $ HasBeenE <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ HasBeenI <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ HasBeenO <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ DDAvailable <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1…
## $ DDInformative <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1…
## $ CCAvailable <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ CCInformative <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ FirstDateTimeAdded <dttm> 2022-12-31 23:50:29, 2022-12-31 15:11:34, …
## $ HasBeenAdmitted <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ CRace_CEth_Combined_Broad <chr> "White and non-Hispanic", "White and non-Hi…
## $ CRace_CEth_Combined_Narrow <chr> "White and non-Hispanic", "White and non-Hi…
To reduce the file size, you can request only the variables you need by adding field parameters to the URL, for example “&field=Date&field=HospitalDHHSRegion&field=AgeGroup&field=Sex”, as shown below. In this example, only the date, HHS region, age group, and patient sex fields are included. Limiting the pull to a subset of desired fields is much more efficient when pulling line-level data from the full details data sources, which include 201 fields as of January 2022.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails/csv?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=30Dec2022&endDate=31Dec2022&field=Date&field=HospitalDHHSRegion&field=AgeGroup&field=Sex"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)
## Rows: 5005 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Date, HospitalDHHSRegion, AgeGroup, Sex
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
# Extract the response body as text, then parse it as CSV
api_data <- content(api_response, as = "text", encoding = "UTF-8") %>%
read_csv()
## Rows: 5005 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): Date, HospitalDHHSRegion, AgeGroup, Sex
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(api_data)
## Rows: 5,005
## Columns: 4
## $ Date <chr> "12/31/2022", "12/31/2022", "12/31/2022", "12/31/20…
## $ HospitalDHHSRegion <chr> "Region I", "Region I", "Region I", "Region I", "Re…
## $ AgeGroup <chr> "00-04", "18-44", "18-44", "18-44", "45-64", "05-17…
## $ Sex <chr> "M", "U", "F", "M", "F", "F", "U", "F", "M", "U", "…
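With only a few fields pulled, simple summaries become quick to compute. The following is a minimal sketch that parses the Date column and tabulates visits by age group and sex. It uses a small stand-in data frame copied from the first values of the output above so that it runs without an ESSENCE connection.

```r
# Stand-in for api_data, copied from the first values of the output above
api_data <- data.frame(
  Date     = rep("12/31/2022", 4),
  AgeGroup = c("00-04", "18-44", "18-44", "18-44"),
  Sex      = c("M", "U", "F", "M"),
  stringsAsFactors = FALSE
)

# Dates arrive as month/day/year text, so a format string is required
api_data$Date <- as.Date(api_data$Date, format = "%m/%d/%Y")

# Visits by age group and sex
visit_counts <- table(api_data$AgeGroup, api_data$Sex)
visit_counts
```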
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=30Dec2022&endDate=31Dec2022"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
pluck("dataDetails")
## No encoding supplied: defaulting to UTF-8.
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
## No encoding supplied: defaulting to UTF-8.
api_data <- fromJSON(api_response_json) %>%
pluck("dataDetails")
glimpse(api_data)
## Rows: 5,005
## Columns: 29
## $ Date <chr> "12/31/2022", "12/31/2022", "12/31/2022", "…
## $ Category_flat <chr> ";Injury;", ";Neuro;Injury;", ";Injury;", "…
## $ SubCategory_flat <chr> ";CutOrPierce;", ";Seizure;Fall;", ";CutOrP…
## $ Patient_Class <chr> "E", "E", "E", "E", "E", "E", "E", "E", "E"…
## $ HospitalDHHSRegion <chr> "Region I", "Region I", "Region I", "Region…
## $ dhhsregion <chr> "Region I", "Region I", "Region I", "Region…
## $ AgeGroup <chr> "18-44", "18-44", "05-17", "65-1000", "65-1…
## $ Sex <chr> "F", "F", "M", "F", "F", "F", "F", "M", "F"…
## $ Race_flat <chr> ";2106-3;", ";2106-3;", ";2106-3;", ";2106-…
## $ Ethnicity_flat <chr> ";2186-5;", ";2186-5;", ";2186-5;", ";2186-…
## $ DispositionCategory <chr> "DISCHARGED", "none", "none", "none", "none…
## $ ICD_CCSR_Desc_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_CCSR_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Desc_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Chapter_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Desc_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ ICD_Section_Flat <chr> ";Unmapped;", ";Unmapped;", ";Unmapped;", "…
## $ AdmissionTypeCategory <chr> "NR", "NR", "NR", "NR", "NR", "NR", "NR", "…
## $ HasBeenE <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1"…
## $ HasBeenI <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ HasBeenO <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ DDAvailable <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ DDInformative <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ CCAvailable <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1"…
## $ CCInformative <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1"…
## $ FirstDateTimeAdded <chr> "2022-12-31 23:50:29.557", "2022-12-31 15:1…
## $ HasBeenAdmitted <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0"…
## $ CRace_CEth_Combined_Broad <chr> "White and non-Hispanic", "White and non-Hi…
## $ CRace_CEth_Combined_Narrow <chr> "White and non-Hispanic", "White and non-Hi…
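Comparing this output with the CSV output shown earlier, the JSON option returns every column as character, including the 0/1 indicator fields and the timestamp. The following is a minimal sketch of converting a few of these columns explicitly. It uses a one-row stand-in copied from the output above, and the UTC time zone is an assumption because the output does not state one.

```r
# One-row stand-in for a few JSON-derived columns, copied from the output above
api_data <- data.frame(
  HasBeenE           = "1",
  HasBeenI           = "0",
  DDAvailable        = "0",
  FirstDateTimeAdded = "2022-12-31 23:50:29.557",
  stringsAsFactors = FALSE
)

# Convert the 0/1 indicator columns by name
flag_cols <- c("HasBeenE", "HasBeenI", "DDAvailable")
api_data[flag_cols] <- lapply(api_data[flag_cols], as.numeric)

# %OS reads seconds including the fractional part
api_data$FirstDateTimeAdded <- as.POSIXct(
  api_data$FirstDateTimeAdded,
  format = "%Y-%m-%d %H:%M:%OS",
  tz = "UTC"  # assumption: the time zone is not stated in the output
)

str(api_data)
```

Converting columns by name is more predictable here than automatic type guessing, which can misread single-letter codes such as "F" as logical values.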
If you are pulling line-level data details over a longer range of time (e.g., months or a year), it is best to split your API data pulls into time chunks to reduce the strain put on the ESSENCE system. Please keep in mind that heavy data pulls can affect all ESSENCE users. The following example is meant for demonstrative purposes and only pulls data for 4 days. It can be used as a template to pull data for longer periods of time.
api_data <- data.frame(
date = seq.Date(from = as.Date("2022-12-28"), to = as.Date("2022-12-31"), by = "1 day")
) %>%
mutate(
date = format(date, "%d%b%Y"),
url = paste0("https://essence.syndromicsurveillance.org/nssp_essence/api/dataDetails?medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=dataDetails&startDate=", date, "&endDate=", date),
chunk = row_number()
) %>%
nest(data = -chunk) %>%
mutate(
pull = map(.x = data, .f = function (.x) {
myProfile$get_api_data(.x$url) %>%
pluck("dataDetails")
})
) %>%
select(-data) %>%
unnest(pull)
## No encoding supplied: defaulting to UTF-8.
## No encoding supplied: defaulting to UTF-8.
## No encoding supplied: defaulting to UTF-8.
## No encoding supplied: defaulting to UTF-8.
nrow(api_data)
## [1] 10216
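For longer periods, one request per day may be more granular than needed. The following is a minimal sketch of splitting a date range into multi-day chunks instead. The helper name make_chunks and the 30-day chunk size are our own assumptions, and the month abbreviations produced by format() depend on the session locale (an English locale is assumed).

```r
# Hypothetical helper: split a date range into consecutive chunks of
# `days` days, with the last chunk truncated at `end`
make_chunks <- function(start, end, days = 30) {
  starts <- seq(as.Date(start), as.Date(end), by = days)
  ends <- pmin(starts + days - 1, as.Date(end))
  data.frame(
    chunk = seq_along(starts),
    # Format dates the way the startDate and endDate parameters are written
    start_date = format(starts, "%d%b%Y"),
    end_date = format(ends, "%d%b%Y"),
    stringsAsFactors = FALSE
  )
}

chunks <- make_chunks("2022-10-03", "2022-12-31", days = 30)
chunks
```

Each row's start_date and end_date can then be pasted into the startDate and endDate parameters of the URL, in place of the single date used in the template above.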
The Summary Stats ESSENCE API counts regions (called “counties” in ESSENCE) and facilities in your query by whatever time resolution you define (daily, weekly, monthly, quarterly, or yearly). One difference between this API and others is that it is only available on full details data sources (the only data sources that expose this level of information). This API is particularly useful for understanding the number of hospitals with results for your query.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/summaryData?endDate=31Dec2022&medicalGrouping=injury&geography=region%20i&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hosp&detector=probrepswitch&startDate=28Dec2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
pluck("summaryData")
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
pluck("summaryData")
glimpse(api_data)
## Rows: 4
## Columns: 5
## $ date <chr> "28Dec22", "29Dec22", "30Dec22", "31Dec22"
## $ HospitalState <dbl> 6, 6, 6, 6
## $ State <dbl> 26, 24, 26, 20
## $ Region <dbl> 129, 124, 131, 115
## $ Hospital <dbl> 211, 208, 213, 201
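One way to use these counts is to flag days on which fewer hospitals than usual contributed results. The following is a minimal sketch using the values from the output above. The 95% threshold is an arbitrary assumption chosen for illustration, not an ESSENCE or NSSP rule.

```r
# Stand-in for api_data, copied from the output above
api_data <- data.frame(
  date     = c("28Dec22", "29Dec22", "30Dec22", "31Dec22"),
  Hospital = c(211, 208, 213, 201),
  stringsAsFactors = FALSE
)

# Share of the period maximum with results on each day
api_data$share_of_max <- api_data$Hospital / max(api_data$Hospital)

# Days below an arbitrary 95% threshold (an assumption; pick your own)
low_days <- api_data$date[api_data$share_of_max < 0.95]
low_days
```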
The Alert List API gives you programmatic access to the alert list tables in the ESSENCE user interface. Temporal alert tables based on syndrome definitions are available at the region (“county”) and county Federal Information Processing Standards (FIPS) code approximation levels.
The results in these tables are updated a few times daily. Alert lists based on patient location include:
Alert lists based on facility location include:
Additionally, term-based alerts became available in ESSENCE2 as of July 2021. Examples of pulling word alerts using the Alert List API are presented in the New Functionality section. Most alert lists only maintain alerts from the last 30 days. If historical data exceeding this time range are pulled, an empty (NULL) data frame will be returned. Note: All users have access to patient and hospital region-level alerts. You will only be able to drill down to line-level data details for alerts occurring in regions for which you have data access.
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/regionSyndromeAlerts?end_date=31Dec2022&start_date=28Dec2022"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
pluck("regionSyndromeAlerts")
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
pluck("regionSyndromeAlerts")
glimpse(api_data)
## Rows: 34,094
## Columns: 12
## $ date <chr> "2022-12-28", "2022-12-29", "2022-12-30", "2022-12…
## $ datasource <chr> "va_er", "va_er", "va_er", "va_er", "va_er", "va_e…
## $ age <chr> "all", "all", "all", "05-17", "18-44", "18-44", "1…
## $ sex <chr> "all", "all", "all", "all", "all", "all", "all", "…
## $ detector <chr> "probrepswitch", "probrepswitch", "probrepswitch",…
## $ level <dbl> 1.446976e-02, 4.342886e-02, 2.659571e-02, 4.398468…
## $ count <int> 8, 4, 8, 2, 2, 2, 3, 3, 3, 2, 3, 3, 2, 3, 1, 1, 1,…
## $ expected <dbl> 3.00000000, 1.60714286, 3.50000000, 0.53571429, 0.…
## $ region <chr> "GA_Lamar", "GA_Lamar", "GA_Lamar", "GA_Lanier", "…
## $ syndrome <chr> "Injury", "Neuro", "Resp", "Resp", "GI", "Neuro", …
## $ timeResolution <chr> "daily", "daily", "daily", "daily", "daily", "dail…
## $ `observed/expected` <dbl> 2.666667, 2.488889, 2.285714, 3.733333, 2.074074, …
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/hospitalSyndromeAlerts?end_date=31Dec2022&start_date=28Dec2022"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
pluck("hospitalSyndromeAlerts")
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
pluck("hospitalSyndromeAlerts")
glimpse(api_data)
## Rows: 55,162
## Columns: 13
## $ date <chr> "2022-12-30", "2022-12-31", "2022-12-30", "2022-12-30…
## $ datasource <chr> "va_hosp", "va_hosp", "va_hosp", "va_hosp", "va_hosp"…
## $ age <chr> "all", "00-04", "18-44", "45-64", "45-64", "65-1000",…
## $ sex <chr> "all", "all", "all", "all", "all", "all", "all", "all…
## $ detector <chr> "probrepswitch", "probrepswitch", "probrepswitch", "p…
## $ level <dbl> 0.0283797715, 0.0138202309, 0.0183616856, 0.011742648…
## $ count <int> 1, 2, 6, 5, 3, 3, 2, 1, 1, 20, 8, 9, 18, 19, 2, 9, 2,…
## $ expected <dbl> 0.1071429, 0.2857143, 2.3928571, 1.9285714, 1.8928571…
## $ hospitalName <chr> "MA-Brockton Hospital", "MA-Cape Cod Falmouth Hospita…
## $ regionOfHospital <chr> "MA_Plymouth", "MA_Barnstable", "MA_Barnstable", "MA_…
## $ syndrome <chr> "Hemr_Ill", "Injury", "Resp", "Injury", "Injury", "Fe…
## $ hospital <chr> "6136", "6139", "6139", "6139", "6139", "6139", "6139…
## $ timeResolution <chr> "daily", "daily", "daily", "daily", "daily", "daily",…
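Because most alert lists keep only the last 30 days, a scheduled report can receive an empty result. One way to guard against that before further processing is sketched below. has_alerts() is a hypothetical helper, and the two mock objects stand in for what pluck() might return; they are not real ESSENCE output.

```r
# Hypothetical helper: TRUE only when the API result is a data frame with rows
has_alerts <- function(x) {
  is.data.frame(x) && nrow(x) > 0
}

# Mock results: an empty pull and a pull with two alerts
empty_pull <- list()
alert_pull <- data.frame(
  date = c("2022-12-30", "2022-12-31"),
  syndrome = c("Injury", "Resp"),
  count = c(2L, 6L)
)

for (result in list(empty_pull, alert_pull)) {
  if (has_alerts(result)) {
    message(nrow(result), " alert(s) returned")
  } else {
    message("No alerts returned; check that the dates fall within the last 30 days")
  }
}
```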
Tables containing facility-level data quality metrics are located under the Data Quality tab in the ESSENCE user interface. The coefficient of variation (CoV) measures the overall volatility of weekly facility volume over a specified time period (e.g., 1 year to date or 2 years to date). The CoV is calculated by dividing the standard deviation by the mean of weekly encounters over this period and multiplying by 100 to report as a percentage. Small CoV values (e.g., < 35%) represent facilities with consistent overall volume and reporting, while high CoV values represent facilities with volatile weekly volume, potentially attributable to facility onboarding during the specified time period or to data dropouts caused by data quality or reporting issues. As of December 2022, NSSP-ESSENCE includes the following data quality CoV filters:
There are 2 options for each of these filters: 1) general CoV filters for all encounters; 2) HasBeenE filters for emergency department encounters. Users can select operators associated with the defined CoV threshold within the ESSENCE user interface. These operators include “Equal”, “Does Not Equal”, “Less Than”, “Less Than or Equal”, “Greater Than”, “Greater Than or Equal”, “Between”, or “Is Null”. The “Less Than” or “Less Than or Equal” operators are most commonly used with CoV thresholds between 30% and 40%.
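To make the CoV calculation concrete, the sketch below applies the formula described above (standard deviation divided by the mean, times 100) to two made-up series of weekly encounter counts. The counts and the cov_percent() helper are illustrative assumptions, not ESSENCE output.

```r
# Hypothetical helper: coefficient of variation as a percentage
cov_percent <- function(weekly_counts) {
  sd(weekly_counts) / mean(weekly_counts) * 100
}

# Made-up weekly encounter counts for two facilities
steady_facility <- c(100, 110, 90, 105, 95, 100)
volatile_facility <- c(0, 0, 150, 160, 10, 155) # e.g., onboarding and a dropout

cov_steady <- cov_percent(steady_facility)
cov_volatile <- cov_percent(volatile_facility)

# CoV for each facility, rounded to one decimal place
round(c(steady = cov_steady, volatile = cov_volatile), 1)

# Apply a "Less Than or Equal" style filter at a 35% threshold
c(steady = cov_steady, volatile = cov_volatile) <= 35
```

In this mock example, the steady facility passes the 35% filter and the volatile facility does not, which mirrors how the filter is typically used to exclude facilities with inconsistent reporting.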
In 2022, a new API was added to allow ESSENCE users to pull CoV statistics for facilities reporting data to NSSP-ESSENCE. This API can be used to pull information for all facilities or limited to facilities within a specific site. The first API pull below pulls data for all facilities, while the second example limits to facilities in Florida by specifying the appropriate siteid value at the end of the API URL.
# All NSSP-ESSENCE facilities------------------------------------------------------------------------------------
api_url1 <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/dataQuality/COVWklyAvgData"
cov_data1 <- myProfile$get_api_data(api_url1)
# Limited to facilities in a specific site-----------------------------------------------------------------------
api_url2 <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/dataQuality/COVWklyAvgData?siteid=884"
cov_data2 <- myProfile$get_api_data(api_url2)
dim(cov_data1)
## [1] 7881 172
These data include 172 total fields. In addition to CoV information, the data include facility-level metrics for informativeness of discharge diagnosis and chief complaint, total volume, and date of last update. The HTML table below displays all variables and their category for use.
Users can now pull values for available query fields in ESSENCE2 via the APIs. As an example, you may want to pull the list of all current CCDD categories, syndromes, and subsyndromes. The Syndrome Subsyndrome CCDD Combined Category query field in ESSENCE2 provides a complete list of all existing syndrome definitions. The following API URL can be used as a template to pull values from other query fields. Metadata will be pulled in JSON format.
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/QueryWizard/ParamMetaData.json?action=getParameterMetaData&datasourceId=va_er&paramName=combinedCategory&parentParamValue=&parentParamName=&filterParamValList="
syndromes <- myProfile$get_api_data(url) %>%
pluck("valueDisplayFields")
## No encoding supplied: defaulting to UTF-8.
head(syndromes, 10)
## valueField
## 1 a_ccdd_air quality-related respiratory illness v1
## 2 a_ccdd_all traffic related v1
## 3 a_ccdd_all traffic related v2
## 4 a_ccdd_cdc acute flaccid myelitis dd v1
## 5 a_ccdd_cdc acute hepatitis c v1
## 6 a_ccdd_cdc afm broad v1-limit to pediatric
## 7 a_ccdd_cdc afm narrow v1-limit to pediatric
## 8 a_ccdd_cdc alcohol v1
## 9 a_ccdd_cdc all drug v1
## 10 a_ccdd_cdc all drug v2
## displayField
## 1 CCDD Air Quality-related Respiratory Illness v1
## 2 CCDD All Traffic Related v1
## 3 CCDD All Traffic Related v2
## 4 CCDD CDC Acute Flaccid Myelitis DD v1
## 5 CCDD CDC Acute Hepatitis C v1
## 6 CCDD CDC AFM Broad v1-Limit to Pediatric
## 7 CCDD CDC AFM Narrow v1-Limit to Pediatric
## 8 CCDD CDC Alcohol v1
## 9 CCDD CDC All Drug v1
## 10 CCDD CDC All Drug v2
Alternatively, you may want to pull in an up-to-date list of all existing ESSENCE CCDD categories and their underlying queries. This is useful for providing additional context in category-specific reports or summaries.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/servlet/SyndromeDefinitionsServlet_CCDD?action=getCCDDTerms"
ccdd_queries <- myProfile$get_api_data(url) %>%
pluck("categories")
ccdd_queries %>%
filter(category == "CDC Influenza DD v1")
## updateInfo termId groupName dateCreated notes lastUpdate
## 1 No Updates 63 Uncategorized 2020-03-06 No Notes 2020-03-06
## description
## 1 Description to be added
## definition
## 1 Search CC and DD field: (,^[;/ ]J09^,or,^[;/ ]J10^,or,^[;/ ]J11^,or,^[;/ ]487.[018][;/ ]^,or,^[;/ ]487[018][;/ ]^,or,^[;/ ]488.[018][19][;/ ]^,or,^[;/ ]488[018][19][;/ ]^,or,^[;/ ]442696006[;/ ]^,or,^[;/ ]442438000[;/ ]^,or,^[;/ ]6142004[;/ ]^,or,^[;/ ]195878008[;/ ]^,)
## isAdmin category fieldsSearched
## 1 TRUE CDC Influenza DD v1 Undefined
Term-based alerts are available in ESSENCE2. This algorithm was developed to identify unusual distributions of chief complaint terms of interest without dependence on syndrome definitions. Technical details of the word alert algorithm can be found at https://essence2.syndromicsurveillance.org/nssp_essence/usersguide/algorithms/TermBasedAlerts.jsp. You can pull word alerts from ESSENCE2 via the Alert List API. Term alert API URLs need to be defined in RStudio and are not populated in the ESSENCE2 user interface. This API requires the following parameters:
geo_system: region, facility, or regionFacility
start_date: start date in ddMMMyyyy format
end_date: end date in ddMMMyyyy format
Optional query parameters include:
NSSP staff with admin access will be able to add and manage additional requested stop words in a list of “ignored” terms. This list will be modified over time to optimize the identification of informative terms and word pairs.
An example of an API URL only including required parameters is shown below. Fields in the corresponding alert table include term ID, date of the alert, data source ID and name, geography system, region (or “county”), anomalous term or word pair, detector algorithm, alert color ID (red = 3), frequency of occurrence, and expected values based on the 30-day baseline used. Note: Term-based alerts are only available for data from the last 7 days.
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/wordAlerts?geo_system=region&start_date=31Dec2022&end_date=31Dec2022"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
pluck("termAlerts")
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
pluck("termAlerts")
glimpse(api_data)
## list()
This second example demonstrates how users should form the URL with optional parameters; in this example, the parameters filter to Broward County, FL, and include syndromic terms.
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/wordAlerts?geo_system=region&start_date=27Dec2022&end_date=31Dec2022&filters=fl_broward&show_syndromic_words=true"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
pluck("termAlerts")
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
api_data <- fromJSON(api_response_json) %>%
pluck("termAlerts")
glimpse(api_data)
## list()
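Because term-based alerts cover only the last 7 days, it can be convenient to build the URL from the current date instead of typing dates by hand. The following is a minimal sketch; build_word_alert_url() and essence_date() are hypothetical helpers, and only the required parameters named in the text (geo_system, start_date, end_date) are used.

```r
# Hypothetical helper: format a Date as ddMMMyyyy using English month
# abbreviations regardless of the session locale
essence_date <- function(d) {
  paste0(format(d, "%d"), month.abb[as.integer(format(d, "%m"))], format(d, "%Y"))
}

# Hypothetical helper: assemble a word alert URL from the required parameters
build_word_alert_url <- function(geo_system, start_date, end_date) {
  stopifnot(geo_system %in% c("region", "facility", "regionFacility"))
  paste0(
    "https://essence2.syndromicsurveillance.org/nssp_essence/api/alerts/wordAlerts",
    "?geo_system=", geo_system,
    "&start_date=", essence_date(start_date),
    "&end_date=", essence_date(end_date)
  )
}

# Most recent 7 days, ending today
url <- build_word_alert_url("region", Sys.Date() - 6, Sys.Date())
url
```

The resulting url can be passed to either the Rnssp or keyring option in the same way as the hand-written URLs above.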
Before this update, there was not an efficient way of pulling in daily stratified alerts. This update gave users the ability to pull historic alerts across stratifications in a long, tabular format. This functionality is only available from ESSENCE2: https://essence2.syndromicsurveillance.org/. For example, a user could choose multiple CCDD categories, Has been Emergency = “Yes”, CCDD Category for “As Percent Parameter”, and within the time series interface choose a geography level such as Hospital HHS Region for “Across Graphs Stratification” and CCDD Category for “Within Graph Stratification”. After selecting these configurations, you will need to select the “Update” button in the user interface to generate the stratified time series and API URL for the corresponding data table. Additionally, if “As Percent Parameter” is specified, the data pulled into RStudio will contain p-values and alert indicators specific to both counts and percentages. Indicators specific to counts are specified with “_dataCount” tags in the column names.
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?endDate=31Dec2022&ccddCategory=cdc%20pneumonia%20ccdd%20v1&ccddCategory=cdc%20coronavirus-dd%20v1&ccddCategory=cli%20cc%20with%20cli%20dd%20and%20coronavirus%20dd%20v2&percentParam=ccddCategory&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=03Oct2022&timeResolution=daily&hasBeenE=1&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TimeSeries&stratVal=ccddCategory&multiStratVal=geography&graphOnly=true&numSeries=3&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=January&stratVal=ccddCategory&multiStratVal=geography&graphOnly=true&numSeries=3&graphOptions=multipleSmall&seriesPerYear=false&startMonth=January&nonZeroComposite=false"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url) %>%
pluck("timeSeriesData")
## No encoding supplied: defaulting to UTF-8.
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_json <- content(api_response, as = "text")
## No encoding supplied: defaulting to UTF-8.
api_data <- fromJSON(api_response_json) %>%
pluck("timeSeriesData")
glimpse(api_data)
## Rows: 2,700
## Columns: 21
## $ date <chr> "2022-10-03", "2022-10-04", "2022-10-05", "…
## $ count <dbl> 1.944354, 1.757632, 1.726513, 1.864693, 1.8…
## $ expected <chr> "1.445", "1.46", "1.479", "1.726", "1.675",…
## $ levels <chr> "0.122", "0.138", "0.167", "0.407", "0.383"…
## $ colorID <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
## $ color <chr> "blue", "blue", "blue", "blue", "blue", "bl…
## $ altText <chr> "Data: Date: 03Oct22, Level: 0.122, Count: …
## $ details <chr> "/nssp_essence/servlet/DataDetailsServlet?g…
## $ graphType <chr> "percent", "percent", "percent", "percent",…
## $ dataCount <dbl> 420, 361, 351, 398, 394, 323, 348, 387, 372…
## $ expected_dataCount <dbl> 385.4749, 359.1769, 338.4023, 348.0518, 329…
## $ levels_dataCount <dbl> 0.085922784, 0.470764209, 0.300411004, 0.02…
## $ colorID_dataCount <int> 1, 1, 1, 2, 3, 1, 1, 1, 1, 1, 1, 2, 1, 1, 2…
## $ color_dataCount <chr> "blue", "blue", "blue", "yellow", "red", "b…
## $ allCount <dbl> 21601, 20539, 20330, 21344, 21283, 19174, 1…
## $ lineLabel <chr> "CDC Pneumonia CCDD v1 - Region 1", "CDC Pn…
## $ title <chr> "CDC Pneumonia CCDD v1 - Region 1", "CDC Pn…
## $ ccddCategory_id <chr> "CDC Pneumonia CCDD v1", "CDC Pneumonia CCD…
## $ ccddCategory_display <chr> "CDC Pneumonia CCDD v1", "CDC Pneumonia CCD…
## $ hospitaldhhsregion_id <chr> "Region I", "Region I", "Region I", "Region…
## $ hospitaldhhsregion_display <chr> "Region 1", "Region 1", "Region 1", "Region…
names(api_data)
## [1] "date" "count"
## [3] "expected" "levels"
## [5] "colorID" "color"
## [7] "altText" "details"
## [9] "graphType" "dataCount"
## [11] "expected_dataCount" "levels_dataCount"
## [13] "colorID_dataCount" "color_dataCount"
## [15] "allCount" "lineLabel"
## [17] "title" "ccddCategory_id"
## [19] "ccddCategory_display" "hospitaldhhsregion_id"
## [21] "hospitaldhhsregion_display"
A benefit of pulling the data and alerts in this manner is that you can create customized figures with ggplot or plotly that can be incorporated into static or interactive R Markdown reports:
hhs_region_data <- api_data %>%
  select(
    date,
    hhs_region = hospitaldhhsregion_display,
    ccdd_category = ccddCategory_display,
    percent = count,
    color
  ) %>%
  mutate(
    date = as.Date(date),
    hhs_region = factor(hhs_region, levels = paste("Region", 1:10))
  ) %>%
  filter(ccdd_category == "CLI CC with CLI DD and Coronavirus DD v2") %>%
  arrange(date, hhs_region)
date_range <- paste(format(min(hhs_region_data$date), "%B %d, %Y"), "to", format(max(hhs_region_data$date), "%B %d, %Y"))
ggplot(hhs_region_data, aes(x = date, y = percent)) +
  geom_line(linewidth = 0.7, color = "#046C9A") +
  geom_point(data = subset(hhs_region_data, color == "red"), color = "red", size = 0.5) +
  geom_point(data = subset(hhs_region_data, color == "yellow"), color = "yellow", size = 0.5) +
  theme_bw() +
  labs(
    title = paste("CLI v2 by HHS Region:", date_range),
    x = "Date",
    y = "Percent of ED Visits"
  ) +
  facet_wrap(facets = ~hhs_region, nrow = 2, scales = "free_x") +
  scale_y_continuous(
    limits = c(0, NA),
    expand = expansion(c(0, 0.1))
  ) +
  theme(
    strip.background = element_blank(),
    strip.text = element_text(face = "bold"),
    panel.spacing = unit(1, "lines")
  )
ESSENCE2 includes facility county FIPS and patient county FIPS as available query fields (both are technically approximations since ESSENCE regions are populated by ZIP codes). Users may choose FIPS codes as a row (or column) field in the table builder. The following example assumes that a user has chosen the Facility Location (Full Details) data source, the CDC Coronavirus-DD v1, CDC Pneumonia CCDD v1, and Coronavirus-like illness (CLI) CC with CLI DD and Coronavirus DD v2 CCDD categories, “Yes” for “Has Been Emergency”, “CC and DD Category” for “As Percent Query”, and all counties within their state for Facility County FIPS Approximation. By selecting Date and Facility County FIPS Approximation for row fields and CC and DD Category for column fields, the API URL generated in the user interface will have the following structure:
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022&facilityfips=...&percentParam=ccddCategory&datasource=va_hosp&startDate=03Oct2022&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TableBuilder&ccddCategory=cdc%20coronavirus-dd%20v1&ccddCategory=cdc%20pneumonia%20ccdd%20v1&ccddCategory=cli%20cc%20with%20cli%20dd%20and%20coronavirus%20dd%20v2&geographySystem=hospital&detector=nodetectordetector&timeResolution=daily&hasBeenE=1&rowFields=timeResolution&rowFields=facilityfips&columnField=ccddCategory"
After endDate is specified, all facility FIPS codes are defined with the following syntax: “&facilityfips=fipscode1&facilityfips=fipscode2&…&facilityfips=fipscodeN&”.
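If the FIPS codes are already in an R vector, the repeated facilityfips parameter can be generated rather than typed. This sketch uses a few example county codes purely for illustration; the object names and the shortened base URL are assumptions, and a real query carries many more parameters.

```r
# Example county FIPS codes (illustrative; substitute the codes for your site)
fips_codes <- c("51001", "51003", "51005")

# Build "&facilityfips=code1&facilityfips=code2..." with nothing added between
# the repeated parameters
fips_params <- paste0("&facilityfips=", fips_codes, collapse = "")

# Shortened base URL for illustration only
base_url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022"
url_start <- paste0(base_url, fips_params)
url_start
```

The remaining query parameters can then be appended to url_start with another call to paste0().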
Note: Currently, users need to manually insert “&refValues=false” after specifying facilityfips as a row field in order to pull in the actual codes instead of the county names. Define this URL and API pull as follows:
url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/tableBuilder/csv?endDate=31Dec2022&facilityfips=...&percentParam=ccddCategory&datasource=va_hosp&startDate=03Oct2022&medicalGroupingSystem=essencesyndromes&userId=2362&aqtTarget=TableBuilder&ccddCategory=cdc%20coronavirus-dd%20v1&ccddCategory=cdc%20pneumonia%20ccdd%20v1&ccddCategory=cli%20cc%20with%20cli%20dd%20and%20coronavirus%20dd%20v2&geographySystem=hospital&detector=nodetectordetector&timeResolution=daily&hasBeenE=1&rowFields=timeResolution&rowFields=facilityfips&refValues=false&columnField=ccddCategory"
# Rnssp option---------------------------------------------------------------------------------------------------
api_data <- myProfile$get_api_data(url, fromCSV = TRUE)
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])))
api_response_csv <- content(api_response, as = "text", encoding = "UTF-8")
api_data <- read_csv(api_response_csv)
Occasionally, the API URL you created for the ESSENCE query will be very, VERY long. For example, you might have reason to explicitly include all facilities, all counties, or all ZIP codes for a site. In this situation, the character length of your URL might be too long to assign to the url <- object as shown in the preceding examples. When this occurs, you can split the URL into two or more strings when creating objects, and then pull them together later. In the code chunk shown below, we start with one long URL (use your imagination here) and break it into two pieces. Then, to create the object URL, we join them using the paste0() function. Unlike the paste() function, paste0() does not separate the combined character strings with a space. This final URL can be passed to ESSENCE along with your credentials to return your results. As a rule of thumb, do not exceed 20 lines of characters per URL.
url1 <- "https://essence.syndromicsurveillance.org/nssp_essence/api/very_very_very_long_URL_very_long_use_your_imagination_here...."
url2 <- "still_going_even_longer_here..........."
url <- paste0(url1, url2)
# Resulting URL
print(url)
## [1] "https://essence.syndromicsurveillance.org/nssp_essence/api/very_very_very_long_URL_very_long_use_your_imagination_here....still_going_even_longer_here..........."
For reports run weekly or daily, you can automate the start and end dates rather than change the dates in the API URL manually before knitting. There are multiple ways to do this. You may split the URL into three pieces, similar to how the URL in the previous example was split. Or you may use str_extract() and str_replace() to substitute the appropriate dates. For example, if the report is based on the most recent 90 days, the start and end dates can be determined automatically by using base R’s Sys.Date() and format() to ensure appropriate date formatting. format(Sys.Date(), "%d%b%Y") will give today’s date, 03Jan2023, while format(Sys.Date() - 89, "%d%b%Y") will give the start date of the recent 90-day period, 06Oct2022. You can insert the dates by splitting the URL into three pieces and then pasting them back together:
end_date <- format(Sys.Date(), "%d%b%Y")
start_date <- format(Sys.Date() - 89, "%d%b%Y")
url1 <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?"
url2 <- paste0("endDate=", end_date, "&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&")
url3 <- paste0("startDate=", start_date, "&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count")
url <- paste0(url1, url2, url3)
# Rnssp option---------------------------------------------------------------------------------------------------
api_png <- myProfile$get_api_tsgraph(url)
knitr::include_graphics(api_png$tsgraph)
# keyring option-------------------------------------------------------------------------------------------------
api_response <- GET(url,
authenticate(key_list("essence")[1,2],
key_get("essence",
key_list("essence")[1,2])),
write_disk("timeseries3.png", overwrite = TRUE))
knitr::include_graphics("timeseries3.png")
Additionally, the start and end dates in the URL can remain fixed if you set up the code to extract the old dates and replace them with new dates. The URL results will be the same.
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=30Jun2021&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=1Jun2021&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count"
end_date_old <- regmatches(url, regexpr("endDate=.+?&", url))
end_date_old <- str_extract(end_date_old, "[0-9]{1,2}[A-Za-z]{3}[0-9]{2,4}")
end_date_new <- format(Sys.Date(), "%d%b%Y")
start_date_old <- regmatches(url, regexpr("startDate=.+?&", url))
start_date_old <- str_extract(start_date_old, "[0-9]{1,2}[A-Za-z]{3}[0-9]{2,4}")
start_date_new <- format(Sys.Date() - 89, "%d%b%Y")
url <- str_replace(url, end_date_old, end_date_new)
url <- str_replace(url, start_date_old, start_date_new)
url
## [1] "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=17Jan2023&medicalGrouping=injury&percentParam=noPercent&geographySystem=hospitaldhhsregion&datasource=va_hospdreg&detector=probrepswitch&startDate=20Oct2022&timeResolution=daily&medicalGroupingSystem=essencesyndromes&userId=455&aqtTarget=TimeSeries&graphTitle=National%20-%20Injury%20Syndrome%20Daily%20Counts&xAxisLabel=Date&yAxisLabel=Count"
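An alternative that avoids extracting the old dates first is to match on the parameter name and replace whatever value follows it. The sketch below uses base R sub() on a shortened URL; set_url_param() is a hypothetical helper that assumes the parameter appears only once in the URL, and the replacement dates are fixed example values.

```r
# Hypothetical helper: replace the value of one query parameter in a URL.
# The pattern captures the leading "?" or "&" so it can be written back.
set_url_param <- function(url, param, value) {
  pattern <- paste0("([?&])", param, "=[^&]*")
  sub(pattern, paste0("\\1", param, "=", value), url)
}

# Shortened URL for illustration
url <- "https://essence.syndromicsurveillance.org/nssp_essence/api/timeSeries/graph?endDate=30Jun2021&medicalGrouping=injury&startDate=1Jun2021&timeResolution=daily"

url <- set_url_param(url, "endDate", "17Jan2023")
url <- set_url_param(url, "startDate", "20Oct2022")
url
```

Because the match is anchored on the parameter name, this approach does not depend on the format of the date already in the URL.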
ESSENCE works within web standards. Sometimes this raises concerns with users when their configured queries contain so many parameters that the URLs being generated exceed 2,000 characters. To work around this limitation on HTTP GET (and its interaction with servers and browsers), ESSENCE will automatically create an alias for long URLs called a session attribute ID.
The session attribute ID appears as sessionAttributeID in the API URL and is a normal URL parameter set to a fixed value that corresponds to a single saved URL string.
There are several scenarios in which ESSENCE will always generate session attribute IDs:
In each scenario, ESSENCE will receive the URL or parameters via non-HTTP GET means, enabling it to receive the excess characters and properly convert the URL for future use and convenience (e.g., to paste into the web browser address bar). The process is automatic and requires no user intervention, aside from acknowledging that the system generated the URL. Once ESSENCE creates the session attribute ID, it will last indefinitely.
To modify parameters like start and end dates, you can add parameter specifications to the end of the API URL that contains the session attribute ID. The example below displays an ESSENCE-generated URL for a time series query in which all CDC-developed CCDD categories were selected with across graph stratification applied across these categories. In the first example below, the week start and end date parameters referenced by the session attribute ID are December 4, 2022 and January 7, 2023, respectively. The second example demonstrates how you could update the start date to be November 27, 2022, and the end date to be December 24, 2022. For weekly time series, the end date in the API URL should be the Saturday ending date of an MMWR week; however, the last week in the data will be represented by the Sunday starting date of the corresponding week.
# ESSENCE-generated API URL with session attribute ID------------------------------------------------------------
attr_id_url <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?sessionAttributeID=2362_03_01_2023_19_48_46_5189&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=1&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&startMonth=1&nonZeroComposite=false"
api_data <- myProfile$get_api_data(attr_id_url) %>%
pluck("timeSeriesData")
range(as.Date(api_data$date))
## [1] "2022-10-02" "2023-01-01"
# Updated API URL with custom start and end dates----------------------------------------------------------------
attr_id_url_updated <- "https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?sessionAttributeID=2362_03_01_2023_19_48_46_5189&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=1&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&startMonth=1&nonZeroComposite=false&startDate=27Nov2022&endDate=24Dec2022"
api_data_updated <- myProfile$get_api_data(attr_id_url_updated) %>%
pluck("timeSeriesData")
range(as.Date(api_data_updated$date))
## [1] "2022-11-27" "2022-12-18"
Or, you can update the start and end dates by using the paste0() function (similar to using this function to dynamically set dates in an API URL). In this scenario, the date parameters are pasted onto the end of the URL. Because this is a time series API with a weekly time resolution, the start and end dates in this example are set to pull data from the most recent 4 complete MMWR weeks.
end_date <- format(floor_date(Sys.Date(), unit = "1 week") - 1, "%d%b%Y")
start_date <- format(floor_date(Sys.Date(), unit = "1 week") - 7 * 4, "%d%b%Y")
attr_id_url2 <- paste0("https://essence2.syndromicsurveillance.org/nssp_essence/api/timeSeries?sessionAttributeID=2362_03_01_2023_19_48_46_5189&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&nonZeroComposite=false&removeZeroSeries=true&startMonth=1&stratVal=&multiStratVal=ccddCategory&graphOnly=true&numSeries=0&graphOptions=multipleSmall&seriesPerYear=false&startMonth=1&nonZeroComposite=false&startDate=", start_date, "&endDate=", end_date)
api_data <- myProfile$get_api_data(attr_id_url2) %>%
pluck("timeSeriesData")
range(as.Date(api_data$date))
## [1] "2022-12-18" "2023-01-08"
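The date arithmetic above can be checked against a fixed reference date. The sketch below reproduces the same Sunday-to-Saturday logic in base R so the window is easy to verify; mmwr_window() is a hypothetical helper, and the reference date is arbitrary.

```r
# Hypothetical helper: start (Sunday) and end (Saturday) dates covering the
# most recent `n_weeks` complete MMWR weeks before `today`
mmwr_window <- function(today = Sys.Date(), n_weeks = 4) {
  days_since_sunday <- as.POSIXlt(today)$wday # 0 = Sunday, ..., 6 = Saturday
  current_week_start <- today - days_since_sunday
  end_date <- current_week_start - 1 # Saturday of the last complete week
  start_date <- current_week_start - 7 * n_weeks
  c(start = start_date, end = end_date)
}

week_window <- mmwr_window(as.Date("2023-01-17"))
format(week_window, "%Y-%m-%d")
weekdays(week_window)
```

Passing a fixed date makes it straightforward to confirm that the start date falls on a Sunday and the end date on a Saturday before the dates are pasted into an API URL.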
The preceding examples will help you pull data into your RStudio environment, but the next steps are really up to you. If you are unfamiliar with R and RStudio, here are some open-source resources to move your analysis forward.