
1 What To Do With A DODS URL

Suppose someone gives you a hot tip that there's a lot of good data at:

http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc

This URL points to monthly means of sea surface temperature, worldwide, compiled by Richard Reynolds at the Climate Modeling branch of NOAA, but pretend you don't know that yet. 

The simplest thing you can do with this URL is to download the data it points to. You could feed it to a DODS-enabled data analysis package like Ferret, or you could append .asc and feed the URL to a regular web browser like Netscape. This will work, but you don't really want to do it: in binary form, there are about 28 megabytes of data at that URL.

NOTE: A DODS server will work with many different clients, some of which are supported by the DODS team, and some of which are supported by others. The operation of any individual package is beyond the scope of this manual. This guide explains how to use a typical web browser such as Netscape Navigator to discover information about the data that will be useful when analyzing data in any package.

A better strategy is to find out some information about the data. DODS has sophisticated methods for subsampling data at a remote site, but to use them you need to know something about the data. We'll start by looking at the data's Dataset Descriptor Structure (DDS). This provides a description of the "shape" of the data, using a vaguely C-like syntax. You get a dataset's DDS by appending .dds to the URL.
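If you would rather build these URLs in a script than type them into a browser, the suffix rule can be sketched in a few lines of Python. The helper name service_url is made up for this illustration and is not part of DODS; it only does the string manipulation described above.

```python
# Dataset URL from the example above.
BASE_URL = "http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc"


def service_url(base_url: str, suffix: str) -> str:
    """Append a DODS service suffix such as 'dds' or 'asc' to a dataset URL.

    Hypothetical helper for illustration only: it builds the string and
    makes no request to the server.
    """
    return f"{base_url}.{suffix.lstrip('.')}"


# The DDS (shape description) and the ASCII form of the same dataset.
print(service_url(BASE_URL, "dds"))
print(service_url(BASE_URL, "asc"))
```

Paste either printed URL into a browser to see the response, keeping in mind that the unconstrained .asc request returns the whole dataset.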

A DODS DDS (sst.mnmean.nc.dds)

From the DDS shown, you can see that the dataset consists of five pieces:

The Grid is a special DODS data type that includes a multidimensional array, and map vectors that indicate the independent variable values. That is, you can use a Grid to store an array where the rows are not at regular intervals. Here's a simple grid:  

A Grid

The array part of the grid would contain the data points measured at each one of the squares, the X map vector would contain the positions of the columns, and the Y map vector would contain the positions of the rows.

Of course you can also use a Grid to store arrays where the columns and rows are at regular intervals, and you'll often see DODS data that way.
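To make the idea concrete, here is a rough Python sketch of a two-dimensional Grid: an array plus one map vector per dimension. The class, its field names, and the sample values are all invented for this illustration; this is not how DODS itself represents a Grid.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Grid:
    """Illustrative stand-in for a 2-D DODS Grid (not the real DODS type)."""

    array: List[List[float]]  # data points, indexed as array[row][column]
    x_map: List[float]        # position of each column
    y_map: List[float]        # position of each row

    def value_at(self, x: float, y: float) -> float:
        """Look up the data point stored at map position (x, y)."""
        column = self.x_map.index(x)
        row = self.y_map.index(y)
        return self.array[row][column]


# Made-up sample data: the columns are NOT evenly spaced (0, 1, 4).
grid = Grid(
    array=[[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0]],
    x_map=[0.0, 1.0, 4.0],
    y_map=[10.0, 25.0],
)

print(grid.value_at(4.0, 25.0))  # second row, third column
```

The map vectors are what let the array describe irregular spacing: the array only knows "third column", while x_map says that column sits at position 4.0.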

(The other special DODS data type worth worrying about is the Sequence. You'll see more about them in section 1.2. There are also Structures and Lists, but they exist largely for internal uses, and you don't often see these used in real datasets.)

You can see from the DDS that the Reynolds data is in a 180x360x226 element grid, and the dimensions of the Grid are called "lat", "lon", and "time". This is suggestive, but not as helpful as one could wish. To find out more about what the data is, you can look at the other important DODS structure: the DAS, or Data Attribute Structure. This is somewhat similar to the DDS, but contains information about the data, such as units and the name of the variable. Part of the DAS for the Reynolds data we saw above is shown in the figure below.

 

A DODS DAS (sst.mnmean.nc.das)

NOTE: The DAS is populated at the data provider's discretion. Because of this, the quality of the data in it (the metadata) varies widely. The data in the Reynolds dataset used in this example are COARDS compliant. Other metadata standards you may encounter with DODS data are HDF-EOS, EPIC, FGDC, or no metadata at all.

Now we can tell something more about the data. Apparently the lat vector contains latitude, in degrees north, and the range is from 89.5 to -89.5. Since this is a global grid, the latitude values probably go in order. We can check this by asking for just the latitude vector, like this:

http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc.asc?lat

What we've done here is to append a constraint expression to the DODS URL, to indicate how to constrain our request for data. Constraint expressions can take many forms. This guide will only describe a few of them. (You can refer to The OPeNDAP User Guide for more complete information about constraint expressions.) Try requesting the time and longitude vectors to see how this works.
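The same kind of request can be assembled in a script. The following Python sketch builds one ASCII URL for each of the three map vectors; the function name ascii_url_for is an assumption made for this example, and the code only prints the URLs without contacting the server.

```python
# Dataset URL from the example above.
BASE_URL = "http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc"


def ascii_url_for(base_url: str, variable: str) -> str:
    """Build a .asc URL whose constraint expression names a single variable.

    Hypothetical helper: the constraint expression is simply the text
    after the '?', here just the variable name.
    """
    return f"{base_url}.asc?{variable}"


# One request per map vector named in the DDS.
for name in ("lat", "lon", "time"):
    print(ascii_url_for(BASE_URL, name))
```

Each printed URL asks the server for only the named vector, which is far smaller than the full dataset.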

According to the DAS, time is kept in "days since 1-1-1 00:00:00" in this dataset. You can also learn from the DAS the actual time period recorded in the data which, because of your familiarity with the Julian calendar, you instantly recognize as beginning in November, 1981. You might also notice that the mask array is used to indicate land and sea, and has only the values 0 and 1. 
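If your familiarity with the calendar is less than instant, a few lines of Python can do the conversion. This is only a sketch: it assumes the day count can be read against Python's proleptic Gregorian calendar, whereas software that interprets "days since 1-1-1" with a mixed Julian/Gregorian calendar may give dates that differ by a few days. The function name is invented for the example.

```python
from datetime import date, timedelta


def date_from_days_since_year_one(days: float) -> date:
    """Convert 'days since 1-1-1 00:00:00' to a calendar date.

    Assumption: proleptic Gregorian calendar (what Python's datetime uses).
    Any fractional part of a day is dropped, because a date has no time of day.
    """
    return date(1, 1, 1) + timedelta(days=days)


print(date_from_days_since_year_one(0))    # 0001-01-01
print(date_from_days_since_year_one(365))  # 0002-01-01
```

Feed it a value taken from the time vector to get an approximate date, and treat the result as good to within a few days unless you know which calendar the data provider used.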

DODS provides an info service that returns all the information we've seen so far in a single request. The returned information is also formatted differently (some would say "nicer"), and you can occasionally find server-specific documentation here, as well. Some will find this the easiest way to read the attribute and structure information. You can see what information is available by appending .info to a URL, like this:

http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc.info
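To pull together the suffixes used in this section (.dds, .das, .asc, and .info), here is a minimal Python sketch that lists the corresponding URLs and defines a small function for fetching one of them as text. The names SERVICES, service_urls, and fetch_text are invented for this example, and the sketch assumes the server is still reachable at this address, which may not be the case.

```python
import urllib.request

# Dataset URL from the example above.
BASE_URL = "http://www.cdc.noaa.gov/cgi-bin/nph-nc/Datasets/reynolds_sst/sst.mnmean.nc"

# Suffixes introduced in this section, with a short reminder of each.
SERVICES = {
    "dds": "shape of the data",
    "das": "attributes such as units and names",
    "asc": "data values as ASCII text",
    "info": "structure and attributes on one page",
}


def service_urls(base_url: str) -> dict:
    """Return a mapping from each suffix in SERVICES to its full URL."""
    return {suffix: f"{base_url}.{suffix}" for suffix in SERVICES}


def fetch_text(url: str, timeout: float = 30.0) -> str:
    """Download a URL and return the body as text (makes a network request)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Printing the URLs touches no network; fetch_text is defined but not called.
for suffix, url in service_urls(BASE_URL).items():
    print(f"{suffix:>4}  ({SERVICES[suffix]}): {url}")
```

To retrieve a page, call fetch_text(service_urls(BASE_URL)["info"]). Avoid fetching the bare .asc URL without a constraint expression, since that asks for the entire dataset.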

Tom Sgouros, 2004/07/07
