Prev Up Next Index
Go backward to Examples
Go up to 4.1 Selecting Data: Using Constraint Expressions
Go forward to 4.2 A Word About Data Translation

4.1.6 Optimizing the Query

Using the tools provided by OPeNDAP, a user can build quite elaborate and sophisticated constraint expressions that will return precisely the data he or she wishes to examine. However, as the complexity of the constraint expression increases, so does the time necessary to process that expression. There are some techniques a user may user to optimize the evaluation of a constraint that will ease the load on the server, and provide faster replies to OPeNDAP dataset queries.

The OPeNDAP constraint expression evaluator uses a "lazy evaluation" algorithm. This means that the sub-clauses of the selection clause are evaluated in order, and parsing halts when any sub-clause returns FALSE. Consider a constraint expression that looks like this:  

?station&station.cast.O2>15.0&station.cast.temp>22.0

If the server encounters a station with no oxygen values over 15.0, it does not bother to look at the temperature records at all. The first sub- clause evaluates FALSE, so the second clause is never even parsed.

A careful user may use this feature to his or her advantage. In the above example, the order of the clauses does not really matter; there are the same number of temperature and oxygen measurements at each station. However, consider the following expression:

?station&station.cast.O2>15.0&station.month={3,4,5}

For each station there is only one month value, while there are many oxygen values. Passing a constraint expression like this one will force the server to sort through all the oxygen data for each station (which could be in the thousands of points), only to throw the data away when it finds that the month requested does not match the month value stored in the station data. This would be far better done with the clauses reversed:

?station&station.month={3,4,5}&station.cast.O2>15.0

This expression will evaluate much more quickly because unwanted stations may be quickly discarded by the first sub-clause of the selection. The server will only examine each oxygen value in the station if it already knows that the station might be worth keeping.  

This sort of optimization becomes even more important when one of the clauses contains a URL. In general, any selection sub-clause containing a URL should be left to the end of the selection. This way, the OPeNDAP server will only be forced to go to the network for data if absolutely necessary to evaluate the constraint expression.


Tom Sgouros, August 25, 2004

Prev Up Next