
9 Implementing the DataDDS object

Building the DataDDS object handler follows the same pattern as before with the DDS and the DAS. In fact, the DDS handler can be modified and used as the DataDDS handler by simply changing the call to DODSFilter::send_dds() to a call to send_data(). However, before the send_data() method will work, we must return to the data type child classes and add more functionality.

In the data type child classes you created, we must now implement the method read(). This method will be called by the DAP software that sends data values. To understand how the read() method will be used, it's instructive to look at the code that calls it. In the DAP data type classes, each of the scalar, vector and constructor types has a method called serialize(). Byte's version of this method is shown below. (Look at the code for DDS if you'd like to see how serialize() is used, and pay particular attention to DDS::send_data().)

NOTE: You don't have to write your own version of serialize(). It is shown here to provide you with some background information about the role of the read() method in building the DataDDS response.

bool
Byte::serialize(const string &dataset, DDS &dds, XDR *sink, bool ce_eval)
{
    if (!read_p())
        read(dataset);          // read() throws Error and InternalErr

    if (ce_eval && !dds.eval_selection(dataset))
        return true;

    if (!xdr_char(sink, (char *)&_buf))
        throw Error(
"Network I/O Error. Could not send byte data.\n\
This may be due to a bug in DODS, on the server or a\n\
problem with the network connection.");

    return true;
}

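The serialize() method is broken into three parts. First, it calls read() if the variable's values have not already been read. Second, if ce_eval is true, it evaluates the selection part of the constraint expression (CE) and returns without sending anything when the selection fails. Third, it writes the value to the XDR sink and throws an Error if the write fails.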

NOTE: CE evaluation actually happens in two phases. In the first phase, the expression is parsed. During this process, variables that are 'projected' are marked as such and a linked list of 'selection' nodes is built. The serialize() method is called only for variables that are part of the projection (that is, variables that are to be sent back to the client). The second phase of CE evaluation happens inside serialize(), when the DDS::eval_selection() method is used to evaluate the selection clauses. If all the clauses evaluate to true, then the current variable is sent.
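The control flow described above can be mimicked without any DAP code. The following is a minimal sketch, not the library's implementation: SketchByte, its members, and the use of a std::function for the selection and a std::vector for the sink are all hypothetical stand-ins chosen for illustration.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a DAP scalar variable; not the real Byte class.
class SketchByte {
public:
    explicit SketchByte(unsigned char stored) : stored_(stored) {}

    bool read_p() const { return read_p_; }
    int read_count() const { return read_count_; }

    // Plays the role of read(): load the value once and mark it as read.
    bool read() {
        if (read_p_)
            return false;       // nothing to do
        buf_ = stored_;
        ++read_count_;
        read_p_ = true;
        return false;
    }

    // Mirrors the three steps of serialize(): read, evaluate, send.
    bool serialize(const std::function<bool()> &selection,
                   std::vector<unsigned char> &sink, bool ce_eval) {
        if (!read_p())
            read();
        if (ce_eval && !selection())
            return true;        // selection failed: send nothing
        sink.push_back(buf_);   // stands in for the XDR write
        return true;
    }

private:
    unsigned char stored_;      // pretend value held by the data set
    unsigned char buf_ = 0;     // value buffer inside the variable
    bool read_p_ = false;
    int read_count_ = 0;
};
```

Calling serialize() twice on the same object reads the value only once, because the read_p flag is set after the first call. A selection that evaluates to false still triggers the read but sends nothing.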

Here's the read() method for the Matlab server's Array class (MATArray). This is by far the most complex-looking piece of code in this tutorial, but it's really not very complicated once broken down.

bool
MATArray::read(const string &dataset)
{
    if (read_p())  // Nothing to do
        return false;

    MATFile *fp = matOpen(dataset.c_str(), "r");
    if (fp == NULL)
        throw Error(string("Could not open the file: ") + dataset.c_str());
  
    Array::Dim_Iter p = dim_begin();
    int start = dimension_start(p,true);
    int stride = dimension_stride(p, true);
    int stop = dimension_stop(p, true); 

    p++;
    int start_2 = dimension_start(p,true);
    int stride_2 = dimension_stride(p, true);
    int stop_2 = dimension_stop(p, true); 


    // get real part of the complex  matrix
    double *DataPtr;
    Matrix *mp;
    size_t pos;
    if ((pos = name().find("_Real")) != name().npos) {  
        string Rname = name().substr(0,pos);
        mp = matGetMatrix(fp,Rname.data());
        DataPtr = mxGetPr(mp); 
    }
    else{
        // get Img part of the complex matrix
        if ((pos = name().find("_Imaginary")) != name().npos) {  
            string Iname = name().substr(0,pos);
            mp = matGetMatrix(fp,Iname.data());
            DataPtr = mxGetPi(mp); 
        }
        else{
            mp = matGetMatrix(fp,name().data());
            DataPtr = mxGetPr(mp); // get the matrix structure
        }
    }

    if (DataPtr == NULL)
        throw Error(string("Error reading matrix"));

    if(start+stop+stride == 0){ //default rows
        start = 0;
        stride = 1;
        stop = mxGetM(mp)-1;
    }
    if(start_2+stop_2+stride_2 == 0){ //default columns
        start_2 = 0;
        stride_2 = 1;
        stop_2 = mxGetN(mp)-1;
    }

    int Len = (((stop-start)/stride)+1)*(((stop_2-start_2)/stride_2)+1);
  
    int Tcount = 0;
    dods_float64 *BufFlt64 = new dods_float64 [Len];    
  
    for (int row = start; row <= stop; row += stride) {
        for(int column = start_2; column <= stop_2; column+=stride_2) {
            *(BufFlt64+Tcount) = (dods_float64)
            *(DataPtr+row+column*mxGetM(mp));  
            Tcount++;
        }
    }

    set_read_p(true);      
    val2buf((void *)BufFlt64);
    delete [] BufFlt64;
          
    mxFreeMatrix(mp);
    matClose(fp);
    return false;
}

First, on entry into the method, we check to see if the data have already been read. This can happen if the data were previously needed for the evaluation of the CE. Note that in an earlier version of the DAP library, the return value of read() was used to signal whether the method needed to be called again to read more data (false indicated that all the data had been read). Now calls to read() always get all the data, but the return type is still bool because older code checks the return value. In software that uses version 3.2 or newer, read() should always exit by returning false unless it encounters an error, in which case it should throw an exception.

    if (read_p())  // Nothing to do
        return false;

If the data have not yet been read, the method then opens the Matlab data set. Each data source is different, but conceptually this action has to be performed somewhere. In some cases, the data set would be opened once somewhere else, and the read() methods would use some sort of pointer or other access token.

    MATFile *fp = matOpen(dataset.c_str(), "r");
    if (fp == NULL)
        throw Error(string("Could not open the file: ") + dataset.c_str());

Next, read the data for the variable from the data set. Again, this will vary with each type of data set. In the Matlab server, a complex matrix is represented in the DAP by two different matrices, one with the suffix _Imaginary and one with the suffix _Real. This code looks at the name of the variable and uses that to find the correct variable and read its values. More complex data sets will probably need a more sophisticated lookup scheme.

    // get real part of the complex  matrix
    double *DataPtr;
    Matrix *mp;
    size_t pos;
    if ((pos = name().find("_Real")) != name().npos) {  
        string Rname = name().substr(0,pos);
        mp = matGetMatrix(fp, Rname.data());
        DataPtr = mxGetPr(mp); 
    }
    else{
        // get Img part of the complex matrix
        if ((pos = name().find("_Imaginary")) != name().npos) {  
            string Iname = name().substr(0,pos);
            mp = matGetMatrix(fp, Iname.data());
            DataPtr = mxGetPi(mp); 
        }
        else{
            mp = matGetMatrix(fp,name().data());
            DataPtr = mxGetPr(mp); // get the matrix structure
        }
    }

    if (DataPtr == NULL)
        throw Error(string("Error reading matrix"));
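The name-based lookup can be isolated into a small helper, which makes the rule easy to test on its own. This is a sketch under assumptions: resolve_matrix(), MatrixRef and Part are hypothetical names, not part of the Matlab server or the DAP library. It mirrors the find() calls above, so it matches the marker anywhere in the name rather than only at the end.

```cpp
#include <string>

enum class Part { Real, Imaginary };

struct MatrixRef {
    std::string matrix_name;   // name to look up in the data set
    Part part;                 // which half of the complex values to fetch
};

// Map a DAP variable name to the underlying matrix name and the part to read.
// Names without a marker are treated as ordinary real matrices.
MatrixRef resolve_matrix(const std::string &dap_name) {
    std::string::size_type pos = dap_name.find("_Real");
    if (pos != std::string::npos)
        return {dap_name.substr(0, pos), Part::Real};

    pos = dap_name.find("_Imaginary");
    if (pos != std::string::npos)
        return {dap_name.substr(0, pos), Part::Imaginary};

    return {dap_name, Part::Real};
}
```

With this helper, the variable z_Imaginary resolves to the matrix z with Part::Imaginary, which corresponds to the mxGetPi() branch in the code above.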

Once the data have been read from the data set we need to check for sub-sampling that may have been specified by the client and passed to the server via the CE. The CE was automatically parsed by the boilerplate code, but we need to explicitly look at the values because data set types are fairly idiosyncratic about how they use this information.

While the Matlab server reads the entire array from the data set and then applies the sub-sampling information, many data set types provide ways to subsample variables through their own API (e.g., HDF, NetCDF, ...). In such cases, you'd read the CE information, then use it to read the data values. The most important point is that you don't always have to read the entire variable from the data set when using the DAP. In fact, most servers don't. They use the data set's underlying API in the most efficient way possible, something that the DAP was designed to make possible.

Since the Matlab server supports only Matlab 5, and since Matlab 5 supported only two-dimensional matrices, we get an iterator to the first dimension with Array::dim_begin() and increment it to reach the second. The Array::dimension_start(), dimension_stride() and dimension_stop() methods are used to read the start and stop indices and the sub-sampling stride for the dimension referenced by the Array::Dim_Iter p (see note 5).

    Array::Dim_Iter p = dim_begin();
    int start_1 = dimension_start(p,true);
    int stride_1 = dimension_stride(p, true);
    int stop_1 = dimension_stop(p, true); 

    p++;                        // increment iterator to get the second dim.
    int start_2 = dimension_start(p,true);
    int stride_2 = dimension_stride(p, true);
    int stop_2 = dimension_stop(p, true); 

    if(start_1 + stop_1 + stride_1 == 0){ //default rows
        start_1 = 0;
        stride_1 = 1;
        stop_1 = mxGetM(mp)-1;
    }
    if(start_2 + stop_2 + stride_2 == 0){ //default columns
        start_2 = 0;
        stride_2 = 1;
        stop_2 = mxGetN(mp)-1;
    }
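The defaulting rule used above (a start, stride and stop that sum to zero mean "no constraint on this dimension") can be written once and applied to each dimension. This is a sketch of that rule only: DimConstraint and apply_default() are hypothetical names, and the zero-sum test simply reproduces the check in the Matlab server code rather than any documented library behavior.

```cpp
#include <cassert>

// Start, stride and (inclusive) stop for one dimension, as read from the CE.
struct DimConstraint {
    int start;
    int stride;
    int stop;
};

// Replace an "empty" constraint with one that selects the whole dimension.
// extent is the number of elements along the dimension.
DimConstraint apply_default(DimConstraint c, int extent) {
    assert(extent > 0);
    if (c.start + c.stop + c.stride == 0)
        return {0, 1, extent - 1};
    return c;
}
```

For a dimension of five elements, an all-zero constraint becomes start 0, stride 1, stop 4, while a constraint such as start 1, stride 2, stop 3 is returned unchanged.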

Using the information from the CE, the array values are sub-sampled and copied to new storage. Again, this step is generally not necessary when it's possible to subsample variables in the data set using an API, et cetera.

    int Len = (((stop_1 -start_1)/stride_1)+1)*(((stop_2-start_2)/stride_2)+1);
  
    int Tcount = 0;
    dods_float64 *BufFlt64 = new dods_float64[Len];    
  
    for (int row = start_1; row <= stop_1; row += stride_1) {          
        for(int column = start_2; column <= stop_2; column+=stride_2) {
            *(BufFlt64+Tcount) = (dods_float64) *(DataPtr+row+column*mxGetM(mp));  
            Tcount++;
        }
    }
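The same sub-sampling arithmetic can be tried in isolation. The sketch below assumes, as the indexing expression above does, that the source matrix is stored column by column and that the output is written row by row. Slice, slice_length() and subsample() are hypothetical names, and std::vector replaces the raw buffers for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Start, stride and inclusive stop for one dimension.
struct Slice {
    int start;
    int stride;
    int stop;
};

// Number of indices visited by start, start + stride, ... up to stop.
int slice_length(const Slice &s) {
    assert(s.stride > 0 && s.start >= 0 && s.stop >= s.start);
    return (s.stop - s.start) / s.stride + 1;
}

// Copy the selected elements of a column-major matrix with nrows rows
// into a new buffer, visiting rows in the outer loop.
std::vector<double> subsample(const std::vector<double> &data, int nrows,
                              const Slice &rows, const Slice &cols) {
    std::vector<double> out;
    out.reserve(static_cast<std::size_t>(slice_length(rows)) *
                static_cast<std::size_t>(slice_length(cols)));
    for (int r = rows.start; r <= rows.stop; r += rows.stride) {
        for (int c = cols.start; c <= cols.stop; c += cols.stride) {
            // Column-major layout: element (r, c) lives at r + c * nrows.
            std::size_t index = static_cast<std::size_t>(r) +
                                static_cast<std::size_t>(c) *
                                static_cast<std::size_t>(nrows);
            out.push_back(data.at(index));
        }
    }
    return out;
}
```

The output length equals the product of the two slice lengths, which is the same quantity the tutorial code computes as Len before allocating BufFlt64.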

The values, now read and sub-sampled, are copied into the DAP variable object. The DAP methods enforce a strict policy that all memory allocated outside of the library must be deleted outside the library, and vice versa. So the values stored in the BufFlt64 array are copied to new storage allocated inside the Array object, and this code then deletes the BufFlt64 array itself.

Also important is the call to set_read_p() with the value true. This sets the read_p property (see note 6) so that, should this function be run again while building this particular response, it will know the data have already been read.

    val2buf((void *)BufFlt64);
    delete [] BufFlt64;
    set_read_p(true);      
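The ownership rule behind val2buf() can be illustrated with a stand-in class. This is a sketch under assumptions, not the library's code: SketchArray and its members are hypothetical, and the copy is done with std::memcpy into a std::vector owned by the object.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a DAP Array of 64-bit floats; not the libdap class.
class SketchArray {
public:
    explicit SketchArray(std::size_t length) : storage_(length) {}

    // Copy length() values out of caller-owned memory into storage owned
    // by this object. The caller's buffer is never kept or freed here.
    void val2buf(const void *src) {
        if (storage_.empty())
            return;
        std::memcpy(storage_.data(), src, storage_.size() * sizeof(double));
    }

    std::size_t length() const { return storage_.size(); }
    double value(std::size_t i) const { return storage_.at(i); }

private:
    std::vector<double> storage_;   // allocated and released inside the class
};
```

Because the object holds its own copy, the caller can delete its buffer immediately after val2buf() returns, which is exactly the order of operations in the read() method above.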

The remaining code frees resources allocated via the Matlab data access API. As was explained earlier, the false return value is a holdover from an earlier version of the DAP library.

    mxFreeMatrix(mp);
    matClose(fp);
    return false;
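Since read() reports errors by throwing, cleanup placed at the end of the function is skipped whenever an exception leaves the function early. One common C++ way to guard against that is to tie the handle to an object whose destructor releases it. The sketch below shows the idea with a fake handle: FakeHandle, fake_open(), fake_close() and read_with_guard() are hypothetical names standing in for the data access API, not Matlab or DAP functions.

```cpp
#include <memory>
#include <stdexcept>

// Fake data set handle that counts how many times it has been closed.
struct FakeHandle {
    int *close_count;
};

// Stand-in for the API's close function.
void fake_close(FakeHandle *h) {
    ++*h->close_count;
    delete h;
}

// A handle that closes itself when it goes out of scope.
using HandlePtr = std::unique_ptr<FakeHandle, void (*)(FakeHandle *)>;

// Stand-in for the API's open function.
HandlePtr fake_open(int *close_count) {
    return HandlePtr(new FakeHandle{close_count}, fake_close);
}

// Skeleton of a read() method: the handle is released on every exit path.
bool read_with_guard(int *close_count, bool fail) {
    HandlePtr fp = fake_open(close_count);
    if (fail)
        throw std::runtime_error("Error reading matrix");
    return false;   // same convention as read(): false means success
}
```

Whether the function returns normally or throws, the handle is closed exactly once per call.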

To build a main() function that will return the DataDDS, copy the one for the DDS but change the call to DODSFilter::send_dds() into a call to DODSFilter::send_data(). This works because the code leading up to the send_data() call builds the DDS object, and the send_data() call then arranges to build the DataDDS response using that DDS. During this process it will parse and evaluate the CE and call the read() methods for the variables in the DDS. Thus the DDS will contain variables loaded with values, which can then be used to create the information in the DataDDS object/response.


James Gallagher <jgallagher@gso.uri.edu>, 2006-08-17, Revision: 14349
