The OPeNDAP Catalog/Aggregation Server is an OPeNDAP server that creates virtual datasets from collections of other OPeNDAP datasets or from NetCDF files. It can also serve single NetCDF files as (non-aggregated) OPeNDAP datasets. It uses Thematic Real-time Environmental Distributed Data Services (THREDDS) catalogs to specify which datasets it serves and how to aggregate them. Each catalog is written as an XML document.
OPeNDAP (Open-source Project for a Network Data Access Protocol) is organized around the concept that the relevant unit of data storage is a disk file. To retrieve data, you must know which file contains it. To allow users to identify specific files, data providers create lists of files, or catalogs, that may be browsed or searched. This approach simplifies certain aspects of data management, and is a natural consequence of several popular data storage APIs that work the same way. However, many users find it a burden to keep track of file names, which are not always chosen to be easy to remember. Worse, the approach becomes quite awkward when the slice of data you are interested in spans many files. Extracting a time series from an archive of satellite data, for example, might require thousands of requests to individual data files.
The OPeNDAP Catalog/Aggregation Server is designed to address these problems by providing a single point of entry to collections of many files. Conceptually, it consists of two components: a catalog of data files, and the logic needed to determine how to use those data files to satisfy user requests. The files in question may be local files, residing on the same computer as the server, or remote files, specified with an OPeNDAP URL. The catalog is a file of XML declarations that identify how individual data files can be aggregated to look like single larger files.
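As a rough illustration of the catalog idea, the following Python sketch models an aggregated dataset as a list of member files, each covering a range of time steps. It is not the server's implementation or its XML schema; the class names, file paths, and the example.org URL are all hypothetical, and time is simplified to integer indices.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Member:
    """One file in a hypothetical aggregation."""
    location: str  # local path or OPeNDAP URL (illustrative values only)
    start: int     # first time index covered (inclusive)
    stop: int      # end of the covered range (exclusive)

    @property
    def is_remote(self) -> bool:
        # Treat anything that looks like an HTTP URL as a remote file.
        return self.location.startswith(("http://", "https://"))


@dataclass
class AggregatedDataset:
    """A virtual dataset that presents many members as one larger file."""
    name: str
    members: List[Member]

    def extent(self) -> Tuple[int, int]:
        # The range of time indices the virtual dataset appears to cover.
        return (min(m.start for m in self.members),
                max(m.stop for m in self.members))

    def members_for(self, start: int, stop: int) -> List[Member]:
        # Members whose range overlaps the requested range [start, stop).
        return [m for m in self.members if m.start < stop and start < m.stop]


catalog = AggregatedDataset(
    name="sst_timeseries",
    members=[
        Member("/data/sst_week01.nc", 0, 7),
        Member("/data/sst_week02.nc", 7, 14),
        Member("http://example.org/opendap/sst_week03.nc", 14, 21),
    ],
)
```

With this model, a request for time steps 5 through 8 would touch only the first two members, while the user sees one dataset spanning steps 0 through 20.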
In operation, the Aggregation Server accepts a query from a user and determines how to use the data files listed in its catalog to fulfill that query. It then makes the subsidiary queries necessary to satisfy the original request, aggregates the returned data into the form expected by the user who initiated the request, and returns the result to that user. The user need not even know that the result is an aggregation. Although a complex aggregation will take longer than a simple OPeNDAP data request, the user interaction should otherwise be the same.
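The request flow described above (plan, subsidiary queries, aggregation) can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions, not the server's actual code: the files are in-memory lists standing in for NetCDF or OPeNDAP sources, the file names and values are made up, and the functions `plan`, `fetch`, and `aggregate_query` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for data files: name -> (first global index, values).
FILES: Dict[str, Tuple[int, List[float]]] = {
    "day1.nc": (0, [10.0, 10.5, 11.0]),
    "day2.nc": (3, [11.5, 12.0, 12.5]),
    "day3.nc": (6, [13.0, 13.5, 14.0]),
}


def plan(start: int, stop: int) -> List[Tuple[str, int, int]]:
    """Return (file, local_start, local_stop) sub-queries covering [start, stop)."""
    subqueries = []
    # Visit files in order of their position along the aggregated axis.
    for name, (offset, values) in sorted(FILES.items(), key=lambda kv: kv[1][0]):
        lo = max(start, offset)
        hi = min(stop, offset + len(values))
        if lo < hi:  # this file overlaps the requested range
            subqueries.append((name, lo - offset, hi - offset))
    return subqueries


def fetch(name: str, lo: int, hi: int) -> List[float]:
    """Stand-in for one subsidiary request to a single file."""
    return FILES[name][1][lo:hi]


def aggregate_query(start: int, stop: int) -> List[float]:
    """Answer a request against the virtual dataset as if it were one file."""
    result: List[float] = []
    for name, lo, hi in plan(start, stop):
        result.extend(fetch(name, lo, hi))
    return result
```

For example, `aggregate_query(2, 7)` issues three sub-queries (the last value of `day1.nc`, all of `day2.nc`, and the first value of `day3.nc`) and returns a single list, so the caller never sees the file boundaries.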