Accessing DAYMET data from NASA’s Archives
Requirements
- Earthdata login (EDL) credentials.
- Concept Collection ID or DOI for relevant DAYMET collection.
- Python >= 3.11.
- Mamba-forge (or conda-forge) installed on the machine.
- Familiarity with Jupyter notebooks and Jupyter Lab.
Optional:
- Store all EDL credentials in a .netrc file.
- Basic knowledge of conda environment installation.
Objectives
To download 10 years of Daymet precipitation data over a region of the Mid-Atlantic area of the continental US. The spatial and temporal range is defined by the following parameters:
- Time range: 04/15/2014 – 04/15/2024.
- Spatial range: -80.3 < longitude < -74, and 36.5 < latitude < 40.5
To accomplish this goal, the tutorial will demonstrate how to:
- Authenticate via earthaccess.
- Search for all available NASA OPeNDAP URLs for a specific NASA collection, further filtering the search by time range.
- Subset with OPeNDAP, by variable name and spatial / temporal range.
Install required Python dependencies
In a terminal shell, use mamba (or conda) with the conda-forge channel to install all required dependencies, then activate the environment and launch an interactive Jupyter notebook in the browser.
$ mamba create -n opendap_env -c conda-forge python=3.12 ipython pydap jupyterlab earthaccess netCDF4
$ mamba activate opendap_env
$ jupyter lab
Once in the Jupyter notebook environment, import in the first cell all packages and methods that will be used to stream remote data into local files:
import xarray as xr
import datetime as dt
import earthaccess
import numpy as np
# import pydap-specific tools
from pydap.client import get_cmr_urls, open_url
from pydap.client import to_netcdf as dap_to_netcdf
Finding OPeNDAP URLs with PyDAP
The parameter needed to search for all Daymet data available through OPeNDAP is:
- Concept Collection ID = C2531982907-ORNL_CLOUD.
Daymet data is a Level 4 data product, meaning all remote files share the same longitude and latitude coordinate arrays. In this case it is not necessary to filter the search for relevant data URLs by a bounding box; any subset by coordinate values will be done by OPeNDAP.
Below are the required parameters to search for all OPeNDAP URLs using PyDAP's get_cmr_urls:
daymet_ccid = "C2531982907-ORNL_CLOUD"
time_range = [dt.datetime(2014, 4, 15), dt.datetime(2024, 4, 15)]
cmr_urls = get_cmr_urls(ccid=daymet_ccid, time_range=time_range, limit=1000)
prcp_na_urls = [url for url in cmr_urls if url.split(".nc")[0].split("Daymet_Annual_V4R1.daymet_v4_")[-1].startswith("prcp_annttl_na")]
The list comprehension above further filters the URLs returned by the CMR, selecting only those associated with annual-total precipitation data over the North America (na) region.
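To see how this filename filter behaves, it can be exercised on a few hypothetical URLs that mimic the Daymet OPeNDAP naming pattern (these example URLs are illustrative, not real endpoints):

```python
# hypothetical URLs mimicking the Daymet OPeNDAP filename pattern (not real endpoints)
example_urls = [
    "https://opendap.example.org/Daymet_Annual_V4R1.daymet_v4_prcp_annttl_na_2014.nc",
    "https://opendap.example.org/Daymet_Annual_V4R1.daymet_v4_prcp_annttl_hi_2014.nc",
    "https://opendap.example.org/Daymet_Annual_V4R1.daymet_v4_tmax_annavg_na_2014.nc",
]

# same filter as above: keep only annual-total precipitation over North America
prcp_na = [
    url for url in example_urls
    if url.split(".nc")[0].split("Daymet_Annual_V4R1.daymet_v4_")[-1].startswith("prcp_annttl_na")
]
# only the first URL (prcp, annttl, na) survives the filter
```

The Hawaii (hi) file is dropped by the region suffix, and the tmax file is dropped by the variable prefix.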
EDL Authentication with earthaccess and OPeNDAP
There are various ways to authenticate with NASA, and here we will use earthaccess to retrieve a session object containing all required credentials to access data.
When using earthaccess to "login", you need to define a strategy, and you have two options:
- If you already have a .netrc file with your EDL credentials stored on your machine, set strategy="netrc".
- If you do NOT have a .netrc file with your EDL credentials, or you are not sure, set strategy="interactive" instead.
from earthaccess.exceptions import LoginStrategyUnavailable
try:
auth = earthaccess.login(strategy="netrc", persist=True)
except LoginStrategyUnavailable:
# you will be prompted to add your EDL credentials
auth = earthaccess.login(strategy="interactive", persist=True)
# pass Token Authorization to a new Session
my_session = auth.get_session()
The object my_session contains your EDL credentials and will be used to retrieve data from OPeNDAP. Moreover, by passing persist=True as an argument to earthaccess.login, a .netrc file is created to store your EDL credentials on the machine for later reuse.
Use OPeNDAP to subset data by coordinate values and variable names
The goal is to run the following code block:
dap_to_netcdf(
prcp_na_urls,
session=my_session,
output_path=output_path,
dim_slices=dim_slices,
keep_variables=keep_vars_prec,
)
where:
- output_path: a user-defined directory path where the files will be stored. If not specified, PyDAP streams data into the current directory.
- dim_slices: a dictionary where spatial (index) slices are defined.
- keep_variables: a list declaring all variables in the remote file that will be downloaded.
The API above, dap_to_netcdf is an alias to pydap.client.to_netcdf (see the import!). It works exclusively with the DAP4 protocol.
Below we outline how to define dim_slices and keep_variables using OPeNDAP metadata and downloading only minimal data to identify the correct dimension slices to subset by coordinate values.
Subset by variable names
With OPeNDAP one can download only the metadata of a remote file. PyDAP is a Python client that facilitates this in a Jupyter notebook environment, for example (the protocol argument is assumed available in recent PyDAP versions):
pyds = open_url(prcp_na_urls[0], session=my_session, protocol="dap4")
pyds holds the Python representation of the remote dataset. It describes all variables, their shapes and sizes, as well as the name of the remote file (here, daymet_v4_prcp_annttl_na_2014.nc).
Because we are only interested in precipitation data and its coordinates to make visualizations, we define one of the parameters that declares which variables will be downloaded:
keep_vars_prec = ["/time", "/y", "/x", "/lon", "/lat", "/prcp"] # variables to download
NOTE: the leading slash (/) on variable names is a requirement of the DAP4 protocol, since it supports hierarchical data structures such as Groups. Groups act as directories within the remote file. In this case the file does not have any Groups, but variables still need to be referenced by their full path, where / identifies the root.
Subset by coordinate values
In DAP4, the remote server does not currently subset by coordinate value, only by dimension (index) slices. In the present case of curvilinear coordinates, latitude and longitude are two-dimensional, and both share the dimensions /x and /y. To subset in space, we need to identify the ranges of indexes along /x and /y that produce the desired spatial subset.
The code block below downloads lat and lon coordinate data, and identifies the index pairs that produce the spatial subset of interest.
# define min/max lon/lat values of the bounding box
lon_min, lon_max = -80.3, -74
lat_min, lat_max = 36.5, 40.5
# download data using pydap - only lon and lat are downloaded
lon = pyds['lon'][:].data
lat = pyds["lat"][:].data
lon, lat = np.asarray(lon), np.asarray(lat)
# identify data of interest by index values
mask = ((lon >= lon_min) & (lon <= lon_max) & (lat >= lat_min) & (lat <= lat_max))
rows, cols = np.where(mask)
y0, y1 = rows.min(), rows.max()
x0, x1 = cols.min(), cols.max()
The pairs x0,x1 and y0,y1 define the dimension slices that we need to subset the remote file. We can now construct the remaining argument to download all our data of interest.
dim_slices = {'/y': (y0, y1), '/x': (x0, x1)}  # index slices per dimension, format: (first, last)
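As a sanity check, the same bounding-box logic can be exercised on a small synthetic curvilinear grid (the coordinate values below are made up for illustration, not actual Daymet coordinates):

```python
import numpy as np

# synthetic 5 x 5 curvilinear grid: lon varies along x, lat varies along y
lon = np.repeat(np.linspace(-82.0, -72.0, 5)[None, :], 5, axis=0)
lat = np.repeat(np.linspace(35.0, 42.0, 5)[:, None], 5, axis=1)

lon_min, lon_max = -80.3, -74.0
lat_min, lat_max = 36.5, 40.5

# same masking logic as above: True wherever both coordinates fall in range
mask = (lon >= lon_min) & (lon <= lon_max) & (lat >= lat_min) & (lat <= lat_max)
rows, cols = np.where(mask)
y0, y1 = rows.min(), rows.max()
x0, x1 = cols.min(), cols.max()
# the bounding indexes recover the interior 3 x 3 block of the grid
```

Note that (y0, y1, x0, x1) describe the tightest index-aligned rectangle containing the mask; on a curved grid this rectangle may include some points just outside the lon/lat bounds.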
Stream data into a local directory
Finally, we can stream a subset of data across all remote files of interest, and store them in a local directory. We define the path below:
output_path = "./data"
We now stream only the data of interest into local files, using PyDAP and the remote OPeNDAP Hyrax data server (all via the DAP4 protocol):
dap_to_netcdf(
    prcp_na_urls,
    session=my_session,
    output_path=output_path,
    dim_slices=dim_slices,
    keep_variables=keep_vars_prec,
)
References
Thornton, M. M., Shrestha, R., Wei, Y., Thornton, P. E., & Kao, S.-C. (2022). Daymet: Annual Climate Summaries on a 1-km Grid for North America, Version 4 R1 (Version 4.1). ORNL Distributed Active Archive Center. https://doi.org/10.3334/ORNLDAAC/2130
Cite this Tutorial
Jimenez-Urias, M. A. (2026). Access Precipitation Data From DAYMET Via OPeNDAP. Zenodo. https://doi.org/10.5281/zenodo.19476333
@misc{jimenez_urias_2026_19476333,
author = {Jimenez-Urias, Miguel Angel},
title = {Access Precipitation Data From DAYMET Via OPeNDAP},
month = apr,
year = 2026,
publisher = {Zenodo},
doi = {10.5281/zenodo.19476333},
url = {https://doi.org/10.5281/zenodo.19476333},
}
