
1.1 Methods of Aggregation

Any data can be included in a given data catalog, but not all data can (or should) be aggregated. Two data files that can usefully be joined in an Aggregation Server might hold the same kinds of data on different dates, different kinds of data at the same location, or data at adjoining locations at the same time. That is, the data must share some common features for aggregation to be worth the trouble. If you can't imagine a user wanting to make the same query to two different data files, they probably aren't worth aggregating.

Even if two data files are not aggregated, they may still be included in the same data catalog. The catalog can help users find data whether or not that data is aggregated.

The OPeNDAP Catalog/Aggregation Server can handle three different forms of aggregation. We call these three JoinNew, JoinExisting, and Union. We'll examine each of these in turn.

NOTE: As of version 0.6 of the Aggregation Server, only Grid and Array data types can be aggregated.

JoinNew
The JoinNew form of aggregation is used to join datasets along a new dimension. For example, if you have a set of measurements of the same spatial area taken on different dates, you could arrange them in time order to create a time series of measurements. If a user wanted to see the time evolution of a measurement over a subsection of the larger measured area, the situation might look like the one pictured in figure 1.

Here, A, B, and C represent sets of measurements in the X and Y dimensions. The three sets are aggregated along the Z axis. This means that none of A, B, or C contains a Z variable, but each one contains (or represents) a single Z value. In our example above, X and Y are spatial dimensions, and Z would be time.
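
The JoinNew idea can be sketched in a few lines of Python. This is only an illustration of the concept, not the Aggregation Server's implementation or its configuration syntax. The function name join_new, the nested lists standing in for Grid data, and the date strings are all assumptions made up for this example.

```python
# Hypothetical illustration only: nested lists stand in for 2-D Grid data.

def join_new(grids, new_coords):
    """Stack equally shaped 2-D grids along a new outermost dimension."""
    if len(grids) != len(new_coords):
        raise ValueError("need exactly one new coordinate value per grid")
    shape = (len(grids[0]), len(grids[0][0]))
    for g in grids:
        if len(g) != shape[0] or any(len(row) != shape[1] for row in g):
            raise ValueError("all grids must share the same X/Y shape")
    # The result gains a Z coordinate that none of the inputs carried.
    return {"z": list(new_coords), "data": [[row[:] for row in g] for g in grids]}


# A, B, and C: the same 2 x 3 area measured on three dates (made-up values).
A = [[1, 2, 3], [4, 5, 6]]
B = [[11, 12, 13], [14, 15, 16]]
C = [[21, 22, 23], [24, 25, 26]]

joined = join_new([A, B, C], ["2004-07-01", "2004-07-02", "2004-07-03"])

# Time evolution of the single cell at row 1, column 2.
series = [grid[1][2] for grid in joined["data"]]
print(series)  # [6, 16, 26]
```

Each input grid contributes exactly one value of the new coordinate, which is why the caller must supply one Z value per grid.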

 

Figure: JoinExisting aggregation

JoinExisting
If you have several datasets that consist of data in adjoining regions of space or time, you may be able to aggregate them with the JoinExisting form of aggregation. For example, if you have a time series that begins right after another one ends, you can combine the two along the time axis. Similarly, if you have two spatial grids whose edges adjoin, you might be able to join them along the dimension orthogonal to the common edge. Something of this nature is shown in figure 1.1, where three datasets, each of which contains independent variables X, Y, and Z, are joined on the Z axis.
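
As a rough sketch of the JoinExisting idea, the Python below concatenates three datasets along a Z coordinate that each of them already carries. It is not the server's implementation. The function name join_existing, the dictionary layout, and the rule that rejects overlapping Z ranges are assumptions chosen for illustration.

```python
# Hypothetical illustration only: each dataset already has its own Z values.

def join_existing(datasets):
    """Concatenate datasets along their existing Z dimension."""
    joined = {"z": [], "data": []}
    for ds in datasets:
        if len(ds["z"]) != len(ds["data"]):
            raise ValueError("each Z value needs exactly one X/Y slice")
        # Assumed rule: each dataset must start after the previous one ends.
        if joined["z"] and ds["z"][0] <= joined["z"][-1]:
            raise ValueError("datasets must adjoin without overlapping in Z")
        joined["z"].extend(ds["z"])
        joined["data"].extend(ds["data"])
    return joined


# Three datasets covering adjoining Z ranges (made-up values, 1 x 2 slices).
first = {"z": [0, 1], "data": [[[1, 2]], [[3, 4]]]}
second = {"z": [2, 3, 4], "data": [[[5, 6]], [[7, 8]], [[9, 10]]]}
third = {"z": [5], "data": [[[11, 12]]]}

combined = join_existing([first, second, third])
print(combined["z"])  # [0, 1, 2, 3, 4, 5]
```

Unlike JoinNew, no dimension is added here: the Z axis of the result is simply the inputs' Z axes placed end to end.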

 

Figure: Union aggregation

Union
If you need to aggregate datasets that cover the same region of space and time but contain different kinds of data, you can use the Union aggregation to join them. For example, you might have a grid of satellite sea surface temperature values, and another grid of wind speed observations. You can use the Union aggregation to combine the two into one dataset containing both temperature and wind speed.
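
The Union idea can be sketched the same way: the coordinates stay as they are, and the variables from each dataset are collected into one. This is an illustration only, not the server's implementation. The function name union, the variable names sst and wind_speed, and the choice to reject duplicate variable names are assumptions.

```python
# Hypothetical illustration only: datasets share coordinates, not variables.

def union(datasets):
    """Merge the variables of datasets defined on identical coordinates."""
    coords = datasets[0]["coords"]
    merged = {"coords": coords, "variables": {}}
    for ds in datasets:
        if ds["coords"] != coords:
            raise ValueError("union requires identical coordinates")
        for name, values in ds["variables"].items():
            # Assumed rule: a variable name may appear in only one dataset.
            if name in merged["variables"]:
                raise ValueError(f"duplicate variable name: {name}")
            merged["variables"][name] = values
    return merged


# Two grids over the same 2 x 2 area (made-up coordinates and values).
grid_coords = {"lat": [40.0, 41.0], "lon": [-71.0, -70.0]}
temperature = {"coords": grid_coords,
               "variables": {"sst": [[18.2, 18.4], [17.9, 18.1]]}}
wind = {"coords": grid_coords,
        "variables": {"wind_speed": [[5.0, 6.5], [4.2, 7.1]]}}

both = union([temperature, wind])
print(sorted(both["variables"]))  # ['sst', 'wind_speed']
```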

Now that you have an idea of what is meant by aggregation, the next section will show how to specify the method and parameters for aggregation using the necessary XML syntax. The following chapter will explain how to install the server. Chapter 3 describes how to configure the aggregation server so it will show your data in the way you want.


Tom Sgouros, 2004/07/07
