Next: Process Configuration Up: DODS Data Delivery Design Previous: Contents   Contents

Introduction

This document describes the design and construction of the data delivery components of the Distributed Oceanographic Data System (DODS). The architecture of DODS is described in detail in "DODS--Data Delivery Architecture". DODS is a client-server system, but it differs from conventional client-server systems such as ftp in that it has many different clients rather than one. Instead of building a single client program to read and display data, the implementors of DODS build replacements for existing data access application programmer interfaces (APIs). These replacement, or surrogate, API implementations are then linked with any data access, display or analysis software that uses the corresponding API. The surrogate API implementations use the DODS tool kit, in conjunction with the HTTP daemon (httpd), to access data stored on remote computers that have installed DODS data servers. Because the reimplemented APIs can access data over the network through DODS data servers, application programs linked with them can access that data as well, even though the API calls they make are unchanged.
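The surrogate-API idea can be illustrated with a minimal Python sketch. Every name here (`local_read_var`, `surrogate_read_var`, `fake_http_get`, the example URL and the sample values) is hypothetical and chosen only for illustration; none of it is the DODS tool kit API, and the remote server is simulated with an in-memory table so that the sketch runs without a network. Treating any name that starts with `http://` as remote is likewise an assumption of this sketch.

```python
# Hypothetical names throughout; this is not the DODS tool kit API.

# Original API implementation: reads from local storage (simulated).
LOCAL_FILES = {"sst.dat": {"temp": [14.2, 14.8, 15.1]}}

def local_read_var(dataset, name):
    """Original API call: read one variable from a local data set."""
    return LOCAL_FILES[dataset][name]

# Stand-in for a remote data server (assumed URL, no real network I/O).
REMOTE_SERVER = {"http://example.org/dods/sst": {"temp": [13.9, 14.1, 14.6]}}

def fake_http_get(url, name):
    """Simulates the request a surrogate library would send to a server."""
    return REMOTE_SERVER[url][name]

# Surrogate API implementation: same signature as the original call.
def surrogate_read_var(dataset, name):
    """Drop-in replacement for local_read_var that also accepts URLs."""
    if dataset.startswith("http://"):
        return fake_http_get(dataset, name)
    return local_read_var(dataset, name)

# Application code, written once against the API and never modified.
def mean_of(read_var, dataset, name):
    """Average a variable using whichever API implementation is linked in."""
    values = read_var(dataset, name)
    return sum(values) / len(values)
```

In this sketch, passing `surrogate_read_var` to `mean_of` plays the role of relinking the application: the body of `mean_of` is unchanged, yet it can now average a variable named by a URL as well as one stored locally.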

In addition to remote access to data sets, DODS provides a limited degree of API transparency. All information exchanged within DODS is carried by the Data Access Protocol (DAP): the DODS surrogate libraries issue requests and read data using the DAP, and the DODS data servers answer those requests and provide information using the DAP. Both the surrogate libraries and the data servers can therefore be said to translate the native access mode of an API or storage format into or out of the DAP. Because a single transmission and access protocol moves all information within DODS, any surrogate library can, in principle, access data from any data server, regardless of the storage format of the data behind that server. Thus, for example, data stored in NetCDF files can be read using software designed to work with the JGOFS system.
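The role of a single intermediate protocol can be sketched as follows. The two storage layouts, the message shape (a dictionary with `name` and `values`), and all function names are assumptions made for this illustration; they do not represent the actual DAP encoding, the NetCDF file format, or the JGOFS system.

```python
# Two hypothetical native storage layouts (not the real formats).
netcdf_like = {"variables": {"temp": {"data": [1.0, 2.0]}}}
jgofs_like = [("temp", 1.0), ("temp", 2.0), ("salt", 35.0)]

# Server side: each server translates its native layout INTO the
# common message (a stand-in for the DAP).
def serve_netcdf_like(store, name):
    return {"name": name, "values": list(store["variables"][name]["data"])}

def serve_jgofs_like(store, name):
    return {"name": name, "values": [v for (n, v) in store if n == name]}

# Client side: each surrogate library translates the common message
# OUT OF the protocol and into the form its API's callers expect.
def client_a_read(message):
    """Surrogate for a hypothetical API A: returns a plain list."""
    return message["values"]

def client_b_read(message):
    """Surrogate for a hypothetical API B: returns (name, value) rows."""
    return [(message["name"], v) for v in message["values"]]
```

Because both servers emit the same message shape, either client function can consume the output of either server function. With N storage formats and M client APIs, this arrangement needs N + M translators rather than one for each of the N x M pairings.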

Using the data access protocol has a price. Each data server must translate DAP requests into calls to the data set's API or, if the data set has no associated API, into the reads needed to extract information from the data set directly. Section [*] describes the design and construction of these data servers. Likewise, each surrogate library must translate calls to the API it replaces into DAP requests. Section [*] describes how these surrogate libraries are built.

Several designs were considered for the data delivery mechanism of DODS: socket-based peer-to-peer communications, RPC-based peer-to-peer communications, virtual file systems, and HTTP/CGI-based client/server systems. The first three of these designs are compared in "Report on the First Workshop for the Distributed Oceanographic Data System", in "Proposed System Architectures", and in "DODS--Data Delivery", which presents our rationale for prototyping the RPC-based design for DODS. However, as a result of those prototypes and of the emergence of HTTP as a de facto data communications standard, we changed the data delivery design to an HTTP/CGI-based system.

By using HTTP as a transport protocol, we are able to tap into a large base of existing software which will likely evolve along with the Internet as a whole. Because the development of large-scale distributed systems is relatively new, many problems must still be addressed before these systems can be robust. These problems include naming resources independently of their physical location, and choosing between two objects which appear to be the same but which differ in quality. They are general problems, and they are hard to solve because an effective solution requires the Internet community to reach a consensus on which of the available approaches is best. HTTP, because it is so widely accepted, provides a reasonable base for such solutions. This view is supported by the recent Internet Engineering Task Force (IETF) work on extending the HyperText Transfer Protocol and standards.



James Gallagher 2004-04-21