

Characteristics of DODS and WWW URLs

DODS URLs are a proper subset of WWW URLs. URLs defined by the WWW project cover a wide variety of client-server systems, while parts of DODS URLs are interpreted only by DODS CGIs. WWW URLs include special cases for servers which support protocols such as ftp, wais and gopher. In these cases, the Access component of a WWW URL indicates which protocol to use. When clients receive a URL, they use this information to select the protocol for accessing the data the URL references (i.e., the Access information selects the type of server with which the client will communicate). DODS URLs are only meaningful to CGI modules accessible over the network via HTTP servers. Thus, the Access component of a DODS URL never varies: it is always http.

In addition to a protocol restriction, DODS URLs always reference a CGI module on the server. This is necessary because all DODS data access takes place via one or more CGI modules. The CGI specification must follow the Host component of the URL.
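
The two restrictions described so far (the Access component is always http, and something naming a CGI must follow the Host component) can be sketched as a small check. This is only an illustration: the function name check_dods_restrictions and the example URL are hypothetical, and the check cannot tell which part of the path is the CGI name, since that depends on how the HTTP server is configured.

```python
from urllib.parse import urlsplit


def check_dods_restrictions(url):
    """Return (host, path_after_host) if url satisfies the two restrictions.

    Hypothetical helper: it only checks that the Access component is http
    and that a non-empty path (where the CGI would be named) follows the host.
    """
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("Access component must be http, got %r" % parts.scheme)
    if not parts.netloc:
        raise ValueError("URL has no Host component")
    # Something must follow the host, because every DODS URL references a CGI.
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("nothing follows the Host component")
    return parts.netloc, parts.path


# Made-up URL, used only to exercise the sketch.
host, path = check_dods_restrictions("http://dods.example.org/cgi-bin/nc/CDT/u1")
```

A URL with any other Access component, such as ftp, is rejected by this check even though it may be a perfectly valid WWW URL.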

WWW URLs are the input to WWW browsers, which are written with knowledge of URLs and, in many cases, the ability to parse parts of them. DODS URLs, by contrast, are passed to programs which nominally have no knowledge of URL notation. DODS--Data Delivery Architecture presents a data system architecture in which user programs are re-linked with new implementations of data-access APIs. These client libraries understand and can manipulate URLs (and in particular understand the restrictions DODS places on URLs), but the user programs do not. Some of the user programs were written before any of the URL software existed.

A DODS URL references a data set or a portion of a data set. Accessing a DODS URL returns a binary Multipurpose Internet Mail Extensions (MIME) document of an experimental type. Such a document cannot be displayed by browsers, nor does it contain URLs which link it to other documents on the Internet. By contrast, URLs that are part of the WWW system of HTTP servers reference documents which can be displayed by a large number of different browsers. DODS URLs are not browsable at all; that is, they cannot meaningfully serve as input to one of the standard WWW browsers. DODS uses URLs simply to reference data sets on the Internet, not for the more general purpose of linking a document within the larger context of all documents publicly accessible on the Internet.

There are two ways a DODS CGI module can interpret the part of the URL which the HTTP server passes to it: as a constraint expression or as an access to a virtual file system (VFS). The virtual file system interpretation is a specialization of the general behavior of HTTPD, where the text following the CGI name is passed to the CGI module unprocessed. Accessing a data set as if it were a virtual file system means specifying variables within the data set as if they were files, or files within directories, subordinate to the CGI. For example, in Table [*] the file name separator (/) is used to indicate that within the named data set (the netCDF data set called CDT) only the variable u1 is to be accessed. If the data set contains several levels of variables, as in the examples for the three-level JGOFS data set, then a specific level, or a variable within a level, may be specified. However, the virtual file system syntax cannot specify an arbitrary set of variables from a data set unless those variables happen to be the sole members of a structure or sequence (see DODS--Data Access Protocol for more information on the data types DODS supports). The VFS syntax is a special case of constraint expressions. It is included in DODS because file systems are familiar tools to many users, much more so than boolean expressions, and because they are flexible enough for many users' demands.
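
As a rough illustration of the virtual file system interpretation, the sketch below splits the text that follows the CGI name into a data set name and a list of nested variable names. Under CGI this extra path text is conventionally delivered to the module in the PATH_INFO environment variable. The function name split_virtual_path is hypothetical, and treating the first path segment as the data set name is an assumption based on the CDT/u1 example above, not a statement of how the DODS CGI modules are actually written.

```python
def split_virtual_path(path_info):
    """Split the text after the CGI name into (data_set, variable_levels).

    Hypothetical helper. Assumes the first path segment names the data set
    and every later segment selects one level or variable beneath it.
    """
    segments = [s for s in path_info.split("/") if s]
    if not segments:
        raise ValueError("no data set named after the CGI")
    data_set, variable_levels = segments[0], segments[1:]
    return data_set, variable_levels


# The netCDF example from the text: only variable u1 of data set CDT.
print(split_virtual_path("/CDT/u1"))
```

An empty list of levels means the whole data set is referenced. Note that a single path can descend into one level or variable only, which is why this syntax cannot name an arbitrary set of variables.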

The other syntax DODS supports for the portion of the URL past the CGI name is the constraint expression syntax. Constraint expressions are a way of passing, along with the URL, a set of restrictions to be applied to the data set when the client software (i.e., the surrogate library) accesses it. These constraints are used during access to limit the variables and variable values extracted from the data set. The constraint expression, or portions of it, is evaluated by the server at the time of the data access. The syntax of the expression itself depends on the types of variables within the data set (see DODS--Data Access Protocol).
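
The effect of a constraint, limiting both which variables and which values are extracted, can be pictured with a toy example. The sketch below does not use or parse the DODS constraint expression syntax, which is defined in DODS--Data Access Protocol. The function apply_constraint, the record layout and the variable names time, u1 and v1 are all assumptions made for illustration.

```python
def apply_constraint(records, variables, predicate):
    """Keep only records satisfying predicate, and only the named variables.

    Toy model of what a constraint achieves; not the DODS syntax or
    the server's implementation.
    """
    return [
        {name: record[name] for name in variables}
        for record in records
        if predicate(record)
    ]


# Made-up data standing in for a data set held by the server.
records = [
    {"time": 1, "u1": 0.5, "v1": 1.0},
    {"time": 2, "u1": 0.7, "v1": 1.2},
]

# Extract only u1, and only where time is greater than 1.
subset = apply_constraint(records, ["u1"], lambda r: r["time"] > 1)
```

Because the restriction is evaluated on the server side at access time, only the reduced subset would need to travel back to the client.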


James Gallagher 2004-04-21