DODS Meeting
January, 1999
Draft Notes
1/25/99 8:30 Meeting Begins
Richard Chinman summarized the schedule for the meeting and asked for a head
count of people who will be attending the working dinner at the Oasis.
Attendees:
| Name | Affiliation | Email |
| Ethan Alpert | SCD | ethan@ucar.edu |
| Dierdre Byrne | University of Maine | dbyrne@grayling.umeoce.maine.edu |
| Richard Chinman | IITA | chinman@ucar.edu |
| Peter Cornillon | University of Rhode Island | pete@petes.gso.uri.edu |
| Ethan Davis | Unidata | edavis@ucar.edu |
| Elaine Dobinson | JPL | dobinson@mail1.jpl.nasa.gov |
| Ben Domenico | Unidata | bdomenico@unidata.ucar.edu |
| Peter Fox | HAO | pfox@uluru.hao.ucar.edu |
| Dave Fulker | Unidata | fulker@unidata.ucar.edu |
| James Gallagher | University of Rhode Island | jimg@dcz.dods.org |
| Jose Garcia | HAO | jgarcia@ucar.edu |
| Lynn Halpern | GCMD | halpern@gcmd.nasa.gov |
| Steve Hankin | NOAA/PMEL | hankin@pmel.noaa.gov |
| Dan Holloway | University of Rhode Island | dholloway@gso.uri.edu |
| Hyunmin Hur | AGU | |
| Apu Kapadia | NCSA | akapadia@ncsa.uiuc.edu |
| Kwoklin Lee | University of Rhode Island | kwoklin@rhody.gso.uri.edu |
| Linda Miller | Unidata | lmiller@unidata.ucar.edu |
| Robert Morris | Contractor for JPL | rmorris@tone.jpl.nasa.gov |
| Reza Nekovei | University of Texas | rnekovei@interconnect.net |
| Ted Habermann | NOAA/NGDC (here for Mark Ohrenschall) | haber@ngdc.noaa.gov |
| Lola Olson | NASA Goddard | olsen@lilgcmd.gsfc.nasa.gov |
| Roland Schweitzer | Climate Diagnostics Center | |
| John Sears | AGU | |
| Tom Sgouros | University of Rhode Island | tomss@ids.net |
| Dave Uhlir | Research Systems Inc. | |
| Patrick Kellogg | HAO | pkellogg@ucar.edu |
Introduction (Richard Chinman)
DODS has attracted many groups such as ESIP. The Federation had endorsed
the idea of clustering for "interoperability". There may be money
available in pursuit of "interoperability". DODS had indicated its
willingness to form clusters. DODS had negotiated with some of the
DAACs (Distributed Active Archive Centers) within NASA (EOS). Peter
Cornillon explained the Federation as follows: 24 groups requested
and received funding from NASA.
The 24 groups were put into 2 subgroups. The first is data development
and includes ESIP2. The second is a commercial group that contains
12 commercial ESIPs and 8 or 9 DAACs. These groups make up the Federation.
At the AGU meeting in December 1998, AGU provided DODS some of its
booth space, which attracted the attention of data developers and users.
There were 50 downloads of the GUI after the AGU meeting. DODS
was demonstrated to 200 people. Richard noted Peter Fox's endorsements
of DODS within NCAR/UCAR.
Richard went over the DODS milestones (see table).
Status of User Services (Ethan Davis)
Mailing lists are still up. Software server sites still up.
Ethan had a poster at AMS and talked to a few people; there was some interest
in DODS. Web pages have moved to Unidata, with forwarding from URI.
Statistics on who accessed the site: 37 IP addresses including
Unidata, HAO, and JPL. Some IP addresses were unknown.
Meeting Objectives - The Scientist's Perspective (Peter Cornillon)
DODS will have these meetings every 6 months to come up with a short-term
plan. The main objective for the next 6 months is to develop the user
base. The Matlab GUI had 50 downloads from outside the project.
One objective is to sort out a way to come up with statistics. Another
objective is to build tools to make data easier to use. Think of
"User Base" from an oceanographic perspective. Other objectives:
-
Make more data accessible via DODS GUI's.
-
Need IDL GUI
-
Port client side (DODS core and loaddods) to PC
-
Develop tools to facilitate serving data via DODS. If not in standard
format, then as easy as it can be.
-
Consider PI's needs (freeform server, JGOFS server)
Peter summarized the schedule and noted that "chunking" is a way of handling
large data sets. Peter also noted that AGU has given DODS booth space for
this past year's meeting and for two more years.
Two handouts were passed out on the breakout groups. One is the
long version; one is the short version. Note: It is possible
to add to the list of breakout groups. We will post the summary and
reports of this meeting on the web. Peter displayed the working groups
for Monday afternoon and explained how people were put into the groups.
The DODS Approach (see Peter's overheads).
Data Systems in Earth Sciences have generally been built from the top down.
This leads to lesser functionality at the (!?!) level. DODS has taken
the opposite approach. Peter displayed an image representing GCMD,
NOAA, EOSDIS, and DODS which shows the DODS arrow going in the opposite
direction as the arrows for the other groups and noted that DODS needs
to make sure the arrows meet. (see image)
Where are we now?
Basic component for data transport level is complete (level 1)
Level 1 data transport (see slide):
Color scheme:
Green: done and robust
Yellow: almost all written (in testing)
Red: know where we are going but not complete ("loaddods")
HDF includes HDF/EOS.
Slide:
Core Software
-C++ class libraries
-Java libraries (client)
System components
Servers
-
-
Clients
-
-asciival - ASCII tab-delimited data (user gets back packed data)
-get url -
Level 2-Interface/Services
User Interfaces
GUIs:
-
Matlab: translates data, data will be returned in same format
-
Ferret: pre-existing gui
-
IDL: hobbled by loaddods
(comment about gradsdods interface)
Registration:
-
GCMD working on code for registration of projects wanting to put their
data on a DODS server.
Support Tools:
-
Support existing functions
-
Build a web site with an interface that asks questions of data holder in
order to build a freeform descriptor.
Web Interface:
-
James will talk about this.
-
Peter confirmed that we do need a web interface.
-
We need to think about whether this warrants a breakout group.
-
What other interfaces do we need?
Systems Interface
Catalog Services
-
File Servers
-
Grid Selection
-
Translation Issue: DODS can make sequences; but how do we get sequences
into Netcdf?
(comment: as soon as you use a client side library, you run into
a translation problem.)
(comment: to make DODS work with existing tools is a big opportunity)
-
Peter requested that somebody write down specific objectives of translation
The Developer's Perspective (James Gallagher)
Level 1: Data Transport
-
http: Move data using http by asking a server for 1 or more objects
-
Mime documents: attribute object (general info about a dataset),
structure: type declaration.
-
A client uses http to request mime documents that contain...
-
Server architecture: pretty low tech; cgi reads parameters from httpd;
runs Unix filter programs.
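A minimal sketch of the request pattern described above: the client asks the server for one object at a time by appending a suffix to the dataset URL (.das for attributes, .dds for the structure, .dods for data). The server address, dataset path, and variable name are hypothetical, and the helper name dods_url is ours, not part of any DODS library.

```python
# Suffixes appended to a dataset URL to request each kind of object.
SUFFIXES = {"attributes": ".das", "structure": ".dds", "data": ".dods"}


def dods_url(dataset_url, kind, constraint=""):
    """Build the URL for one object; an optional constraint follows a '?'."""
    url = dataset_url + SUFFIXES[kind]
    if constraint:
        url += "?" + constraint
    return url


# Hypothetical dataset location, for illustration only.
base = "http://server.example.org/cgi-bin/nph-nc/data/sst.nc"

print(dods_url(base, "attributes"))
print(dods_url(base, "structure"))
print(dods_url(base, "data", "sst[0:0][10:20][30:40]"))
```

Each printed URL corresponds to one http transaction; the last one asks only for an index subset of the variable.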
Level 2: Standardization
Standard Attributes:
-
"Standard" says that for a certain subset some names, ranges, units, etc.
will be standard
Translation will allow us to send data from server to client without the client
knowing much about the server. Peter Cornillon commented that the DODS philosophy
is to make it easy to serve data because we do not want to force people to standardize.
(what does color coding mean?)
We can set server up this way, but not required
to.
Catalog Services
Catalog services is an idea we have been talking about for eons and have
finally decided to do. Problems include that the transport protocol
doesn't see a set of files as one data set. Need to be able to set something up
that would allow catalog services. See overhead.
Overhead:
Catalog Services (items to develop):
-
1. standard names
-
2. standard variables
-
3. grid and array selection: people want
to ask questions in geographical terms, not systems terms - this will arrange
that.
Standard Names:
-
1. provides a way for clients to discover facts
about datasets. Example: "Lat" means latitude.
-
2. standard names built on top of existing
attribute
-
3. Not just for graphical interface.
Two groups of standard names. Standard variables
and query interfaces..... (on overhead)
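As a rough illustration of the standard-names idea (a client discovering that "Lat" means latitude), the sketch below uses a small lookup table. The table contents, the standard-name spellings, and both function names are assumptions made for the example; the notes do not define them.

```python
# Hypothetical ancillary table: dataset-specific name -> standard name.
ALIASES = {
    "Lat": "latitude",
    "lat": "latitude",
    "Lon": "longitude",
    "SST": "sea_surface_temperature",
}


def standard_names(variables, aliases=ALIASES):
    """Map each variable a dataset declares to a standard name (None if unknown)."""
    return {name: aliases.get(name) for name in variables}


def find_variable(variables, wanted, aliases=ALIASES):
    """Return the dataset's own name for a standard quantity, or None."""
    for name in variables:
        if aliases.get(name) == wanted:
            return name
    return None


dataset_vars = ["Time", "Lat", "Lon", "SST"]
print(standard_names(dataset_vars))
print(find_variable(dataset_vars, "latitude"))
```

A client written against the standard names can then work with any dataset that supplies such a table, without the data provider renaming anything.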
James showed some samples. Basic framework:
use attributes to describe dataset. Provide list of sample URLs.
Clarification: The container refers to the sequence; not the dataset.
Client needs to be fairly sophisticated in the way it processes information from
DODS servers. Jose said they discussed the potential conflict from
catalog services and the solution may involve creating a new catalog server.
The point: If a particular server reports
data and time interfaces, you are not restricted. You can mix and
match within date/time interface.
James skipped some slides and showed more examples.
Common problem: a grid contains a byte array. Associated with the byte
array are two vectors. Data is stored in SST but is actually lat &
lon. In order to be a user and make sense of this, you need to do
a lot of work (read array). This process should be made easier.
James has written a grid selection to make it
easier for clients by making it possible for clients to ask standardized
questions.
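A minimal sketch of what a grid selection has to do: turn a question asked in geographical terms into the index ranges the transport level understands. It assumes the two map vectors are sorted in ascending order; the function names, the variable name "sst", and the sample axes are illustrative only, not the grid selection function James wrote.

```python
from bisect import bisect_left, bisect_right


def index_range(axis, low, high):
    """Return (first, last) indices of ascending axis values within [low, high]."""
    first = bisect_left(axis, low)
    last = bisect_right(axis, high) - 1
    if first > last:
        raise ValueError("no grid points in the requested range")
    return first, last


def grid_constraint(var, lat, lon, south, north, west, east):
    """Translate a lat/lon box into an index constraint for one grid variable."""
    i0, i1 = index_range(lat, south, north)
    j0, j1 = index_range(lon, west, east)
    return f"{var}[{i0}:{i1}][{j0}:{j1}]"


lat = [-10.0, -5.0, 0.0, 5.0, 10.0]
lon = [100.0, 110.0, 120.0, 130.0]
print(grid_constraint("sst", lat, lon, -5, 5, 105, 125))  # prints sst[1:3][1:2]
```

The user only supplies the box (5S to 5N, 105E to 125E); reading the map vectors and computing indices is done once, on the user's behalf.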
Some discussion followed about netcdf and the
netcdf interface. In comparison, James' example has more functionality,
yet a more complex user interface.
Netcdf can represent grid but does not provide
the functionality here. There is something built in that can allow
this.
Q: Is there a list of catalog functions?
A: Yes, a brief one. Complete list
not enumerated yet.
James mentioned the "white paper" and will provide
the url. Need the url.
The way to implement functionality is through
a standard interface. Catalog services do not address the issue of
how user gets data.
comment: what is being done here is to build
a really strong infrastructure. Worry about interface later.
Background on "interface from hell": sometimes
it works. It works in the sense that you can get a list of URLs.
DODS Servers
Development of new servers. Comment:
halfway through TS server. Updating existing servers. Simplifying
custom server creation. There will be breakout groups on these.
GUI
Intro
Graphical interface meshed with plain servers.
Large amount of metadata required. GUI is specialized for set of
metadata. Fundamental topic: discovery versus hard coded metadata.
MatLab (Dierdre Byrne)
Dierdre briefly explained her background. MatLab
is used for a lot of data analysis so it was obviously considered for DODS.
Dierdre showed an example of how the MatLab interface worked and identified
some problems. You can view a map or view all data sets. The
list is not static. MatLab shows geographical reference and available
DODS variables. You can then choose how to display the acquired data.
Go back to the MatLab window and it lists the requests of the user.
GUI keeps list of data and physical parameters available.
When you press "get details," the gui goes to
the subset and requests the info that would satisfy the certain constraints.
This function makes up a list of url's and says it has n# of urls.
User requests # from 1 to n for specific url. When you
select a dataset, you can view many different variables. The gui
automatically only gives you selections that are available for those datasets.
It's simple; you don't have to know how DODS works to use it.
A function specific to the data set will go out
and dereference the URL.

Problems:
-
Early on made decision to support MatLab4 and MatLab5.
Bad decision because ML4 only supports 2-D arrays.
-
Difficult to request polar data
-
Meant to be intro; not meant to do everything
-
Location of catalog server is currently requested
-
It is very quickly going to become very difficult
to navigate with huge list of datasets. Subsets not solution.
Groups may be solution.
Q: Can it be user configurable?
Gui can go and get a lot of 2-d data sets.
Ferret (Steve Hankin)
Steve gave a brief demo. Principal users are
modelers. There are about 1000 downloads/year. "Open dataset"
gives list of datasets including DODS datasets (delay result of Netcdf
open). You can then see coordinates. Click plot; view data.
You enter different variables and pick the way you want to view.
Short discussion followed on Netcdf Operators. Steve emphasized the
power of standards. Through standards, DODS can make obtaining data
easier.
Peter Cornillon asked a question (possibly about
"inventory" - 20 thousand files look the same; only differentiated by time).
Steve's response was that there are two sets of problems:
-
1. How to access the data
-
2. how to get the metadata that's available
in order to make a meaningful conclusion from that data.
Peter mentioned that we are looking at data from
the bottom up. Problem: no consistent way of looking at a large
number of objects.
Q: Is there a working group on MetaData
problem?
IDL (Peter Fox)
History of why this was developed: At the last
developers meeting, it was determined that IDL was needed. Since RSI is
in Boulder, Peter took on the task of development and looked at the specifications
document.
Peter explained how it fits in with MatLab GUI.
-
1. loaddods: native interface between
IDL and DODS server. Biggest restriction: loaddods would not
turn variables into main scope of DODS server. Fundamental limitation.
-
2. GUI Status: almost complete; documentation
almost complete; implemented using IDL widgets.
-
3. What works: basic interface, catalog,
dataset and variable selections, constraints, get data, (loaddods), level
2 and 3.
-
4. What is left: zoom selection, catalog
descriptions, data plotting, tuning and state checking.
-
5. Constraints/Problems: current catalog
structure is adapted from MatLab Scheme, constraint construction specific
to ocean datasets, this gui meant to be for a much larger audience.
-
6. Updating (maintenance) - who should update
and maintain the gui? P. Fox does not want to do it. Therefore,
this needs to be discussed.
The IDL gui used a test display. P. Fox gave
a brief demonstration on how IDL works. It uses metadata instead
of functions.
GRIB (Ethan Alpert)
Main Issues
-
Functionality - What will software do for the user;
What flavors of GRIB to support.
-
Conventions - How to represent GRIB information in
DDS and DAS
-
Maintenance - GRIB is a moving target; the site specific
parameter, grid and time rep tables are hard to keep up to date.
What Is GRIB?
-
File composed of atomic 2D records on some type of
Geo-referenced GRID
-
No global file structure
-
Headers used to communicate: time, level, parameter,
data representation
PDS-Product Definition Section
-
Parameter #
-
Grid #
-
Type of vertical coordinate system
-
Time representation
GDS-Grid Description Section
-
Optional section but usually is available
-
Specifies type of GRID and provides information to
generate coordinates.
BDS-Binary Data Section
-
NCEP only uses the simple scheme for packing
-
ECMWF uses both simple and complex packing
Functionality
-
Simplest level is to just decode and transmit each header
and leave everything up to the user.
-
More advanced functionality would be to sort and
group individual records of data to present data as multidimensional sets
-
Even more advanced would be to present the GRID information
in a useful fashion
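A minimal sketch of the "more advanced" level above: sorting and grouping individual 2D records so they can be presented as multidimensional sets. It assumes the headers have already been decoded into plain dictionaries; the field names and sample values are illustrative and do not come from any GRIB library.

```python
from collections import defaultdict


def group_records(records):
    """Group 2D records by (parameter, level type, grid), ordered by time and level."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["parameter"], rec["level_type"], rec["grid"])
        groups[key].append(rec)
    for recs in groups.values():
        recs.sort(key=lambda r: (r["time"], r["level"]))
    return dict(groups)


# Illustrative decoded headers: the same parameter number appears with
# different level indicators, so it ends up in separate groups.
records = [
    {"parameter": 11, "level_type": "ISBL", "grid": 3, "time": 6, "level": 500},
    {"parameter": 11, "level_type": "SFC", "grid": 3, "time": 0, "level": 0},
    {"parameter": 11, "level_type": "ISBL", "grid": 3, "time": 0, "level": 850},
    {"parameter": 11, "level_type": "ISBL", "grid": 3, "time": 0, "level": 500},
]
groups = group_records(records)
for key, recs in groups.items():
    print(key, [(r["time"], r["level"]) for r in recs])
```

Keying on more than the parameter number is the design choice that matters here: it lets a single parameter have multiple entries, one per level indicator and grid.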
Conventions
-
Parameter Naming and Representation: The same
parameter number can appear several times within the same file as a different
data set (different level indicators - SFC, TRO, ISBL; different time indicators
- AVE, ACC; different grids - polar stereographic, lambert conformal)
-
So what should the DAS and DDS fields for DODS look
like? Develop structures for parameters that allows a single parameter
to have multiple entries. Questions: How should the tangent
projections grids which are common in GRIB be represented?; Should lat/lon
coordinate grids be provided? How should thinned grids and spherical
harmonic coefficients be dealt with?
Maintenance
-
Main problem with GRIB parameter tables is that
every site develops their own local conventions.
-
How can a system be designed to support extension
of parameter tables?
-
How can you find out about new data sets, parameters,
grids, and levels used at various centers?
Conclusion
-
Many issues need to be hammered out with respect
to translation of the GRIB data model to the netCDF model
-
NCL shows how a generic interface can work but not
ideal
-
Great care should be taken when designing the DODS
server for GRIB.
Q: Is there much software that uses this type
of data?
A: This data comes from the models - there
is tons of data in GRIB; that is the advantage of GRIB
Note: In the other GUIs you can view the
data through space.
"Chunking" (NCSA, JPL, Apu, Rob)
Chunking allows you to break up a data set into smaller
pieces and then compress. The compression is transparent to
the user. The biggest advantage is in the presence of subsetting.
The user only has to decompress what he needs.
Apu set out to determine the optimal chunking
parameters. He used GZIP to compress the files. Used to take
1-2 minutes for download; with chunking, got that time down to .5 seconds
in some cases (5 to 20 times better). CPU is faster than IO access.
The smaller the chunk, the more overhead.
The bigger the chunk, the more time it takes to download. For each
data set there is an ideal chunk size. Apu and others gathered statistics
and determined that 4 to 16 chunks was best depending on IO or CPU.
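A minimal sketch of the chunking idea, assuming a one-dimensional byte stream and fixed-size chunks. It uses Python's zlib module (the same DEFLATE compression that gzip uses) and is not the NCSA/JPL implementation; function names and sizes are illustrative.

```python
import zlib


def write_chunks(data, chunk_size):
    """Split data into fixed-size chunks and compress each one independently."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def read_subset(chunks, chunk_size, start, stop):
    """Return (data[start:stop], chunks read), decompressing only overlapping chunks."""
    first = start // chunk_size
    last = (stop - 1) // chunk_size
    touched = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    offset = first * chunk_size
    return touched[start - offset:stop - offset], last - first + 1


data = bytes(range(256)) * 64        # 16384 bytes of sample data
chunks = write_chunks(data, 1024)    # 16 compressed chunks
subset, n_read = read_subset(chunks, 1024, 3000, 3100)
print(len(subset), "bytes returned;", n_read, "of", len(chunks), "chunks decompressed")
```

The trade-off discussed above shows up directly in chunk_size: smaller chunks mean more per-chunk overhead, larger chunks mean more data decompressed than the subset needs.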
Q: Did you have a RAID file system?
A: Yes
Advantage of chunking: it is faster.
Disadvantage: more complex, less flexible, more effort to implement/maintain.
Storage is independent of user. Would not have to change DODS Server.
Results (Rob)
It took 250% longer to download unchunked data.
Dataset specific, dataset dependent, time dependent.
Data Fusion (Steve Hankin)
Reminder: There are data suppliers and data
consumers; Steve is a middleman.
What is Data Fusion?
Data Fusion is combining scientific data sets by
reconciling differing file formats, data structures, and coordinate systems
for purposes of merged analysis or visualization. Is this important?
Yes, very. (get url)
It is lack of standards, not lack of technology
that is the stumbling point. Steve explained the strategy of DODS
and showed demo (drag rectangle over area on map, select type of image,
click OK). Web server prototype (server in a box). Desktop
Application (Ferret). Web access to the desktop. You can reference
two datasets not on the same grid. Second variable regridded on first
variable grid dimensions.
A DODS data request from a Web Server requires:
DDS ("open"): 1 transaction
DAS ("open"): 1 transaction
Coordinates ("open"): 4 transactions
Data ("read"): 1 transaction
Total: 7 transactions
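The tally above can be written out as a small worked example. The per-step counts are the ones listed in the notes; the second call, where the "open" results are already cached, is only an assumption about how caching might reduce the count and was not presented at the meeting.

```python
# Transactions per step for one data request, as listed in the notes.
REQUEST_STEPS = [
    ("DDS open", 1),
    ("DAS open", 1),
    ("Coordinates open", 4),
    ("Data read", 1),
]


def total_transactions(steps, cached=()):
    """Sum the transactions, skipping any step whose result is already cached."""
    return sum(count for name, count in steps if name not in cached)


print(total_transactions(REQUEST_STEPS))  # prints 7
# Hypothetical repeat request with all "open" results cached.
print(total_transactions(REQUEST_STEPS,
                         cached={"DDS open", "DAS open", "Coordinates open"}))  # prints 1
```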
With DAS/DDS, ?
Considerations
-
Performance - caching and/or persistent connections
-
security.....
Applicability
(on website...)
AGU View (John Sears)
(can we get overheads?)
AGU sees DODS and AGU as a possible fit.
AGU:
-
has 37,000 members
-
4000 journal articles/year
-
supplemental data sets peer reviewed/permanently
archived
-
ftp:/dosmos.agu.org/apend
-
creating GCMD DID records
Electronic Journals
-
earth interaction (EarthInteractions.org)(incorrect
link)
-
Geophysical Research Letters
-
Water Resources Research
-
Reviews of Geophysics
-
JGR -- coming
-
TOC's and abstracts
-
Links to/from datasets
AGU DODS Server
-
PR - meetings, EOS, Web
-
Important tools for scientists
-
Access to datasets
-
What is involved?
-
Timeframe
-
Tech Support Overhead
Some discussion followed on PDF Servers.
Selection of Breakout Groups
END OF DAY 1
DODS Day 2
January 26, 1999
Report/Discussion on First Working Groups
Population (Peter Fox)
This working group wanted to have some very specific action items.
Getting DODS accessible to many users is a major priority in the near
term. They defined "population strategy" as follows: the primary focus
is global data sets because they are becoming of much more interest to
the community. There are four groups of datasets: NOAA, NASA
Climatology, NASA PO.DAAC, and AGU. The question is: what do
we need to do to get these data sets accessible?
NOAA
Problem: We need to know what is available via DODS Servers.
Available does not equal accessible. Need a list of what is available/accessible.
Some metadata is required to get the datasets into the GUI's. Idea is
to impact Roland as little as possible.
NASA (Lola)
Four CD's worth of data sets. Currently served by ftp (not in HDF).
Lola is going to look into what would be required to get these served by
DODS.
NASA(Elaine, JPL)
Lots of data. Difficult to sort out. Elaine will go through
the data sets that are publicly available or online and figure out which
are in the format we need and determine what would be involved to get the
data sets in a file server.
AGU Archive
Small data sets. Smaller than 5 Mb each. 150 data sets.
John Sears is going to provide a list of what's there. Probably easy
to build server because data in table format. Decided not to
just do oceanographic data sets, but all of them.
Peter Fox answered Dave Fulker's question on the status of CSN: currently
not in netCDF format. A group should be formed to figure out how to
serve the data.
Note: there are other groups involved in population.
Dave Fulker asked: Would a suitable metric be some sort of
inventory of data? (!?!)
Tried to estimate the fraction of data sets in the GCMD.
Freeform/JGOFS
Two Questions:
-
1. What are the cases that we use one over the other?
-
2. (!?!)
JGOFS
You can write program - more flexibility. Can modify data more
easily. Limited to ascii.
Solution: Create a matrix and add documentation to help people decide
between JGOFS and Freeform.
Step 1: Improve documentation (simplify). Currently, one
is too long and one is too short.
Step 2: Put FAQ some place.
Problem with format relates to providing a way to check data syntax.
Add tail and header. James - there are tools like this in freeform.
Q: How do you point users to these tools?
A: In Documentation
Q: With respect to JGOFS, there is a programming issue there.
It's difficult to install the server because you have to set up users...
Is it possible to change the software so it doesn't have to be installed
on someone's machine?
A: JGOFS needs access to the object database. It was done
so the database didn't all need to be in the same place. Yes, it
could be fixed.
The previous questions address an issue that needs to be revisited.
Comments (Peter Cornillon)
Web Form; whenever you get a manual it tends to be intimidating.
Is it possible to ask question of data holder to determine some things
about the data. For example, the computer says, "Tell me about your
data set," and receives a response to some questions. Then computer
can say, "That's easy, do this," or "Look at the manual." Can the
computer determine the complexity of the data? Answer: Maybe;
check tools may be more useful. Point: Can we make it easier.
Tom added:
-
1. There is the potential capacity that freeform can deal with multiple
files.
-
2. For a certain class of data sets, there is a tool called "make
HDF" that makes a header so that data can be served with an HDF Server.
Peter Fox: Issue is how to decide between freeform versus JGOFS.
JGOFS requires extra installation. Freeform comes with links to netCDF
and ?!?. JGOFS not distributed with DODS. Issue: What
do you need to install a DODS Server?
Catalog Services (James Gallagher)
In White Paper - 3 Sections (Add link to White Paper)
Group did not talk about grid constraint. Did talk about:
-
1. File Server, relational table, relational umbrella
-
2. Standard names
File Server
By taking a group of files which the DODS low level transfer sees as a
group of datasets, how do you make the computer look at 25,000 urls as
1 data set? 1 file contains data and URL's. Each would require
type promotion to promote simple arrays to grids. The file server
is nice, but if we can go one step forward to this array thing...
You can see both views.
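One way to picture the file-server view is a table with one row per file, so that many URLs can be queried as if they were one data set. The sketch below assumes each file covers a single date; the table layout, URLs, and function name are hypothetical and are not taken from the white paper.

```python
from datetime import date

# Hypothetical file-server table: (date covered by the file, URL serving it).
FILE_TABLE = [
    (date(1998, 1, 3), "http://server.example.org/cgi-bin/nph-hdf/sst/1998003.hdf"),
    (date(1998, 1, 1), "http://server.example.org/cgi-bin/nph-hdf/sst/1998001.hdf"),
    (date(1998, 1, 2), "http://server.example.org/cgi-bin/nph-hdf/sst/1998002.hdf"),
]


def select_urls(table, start, end):
    """Return the URLs of files whose date falls within [start, end], in time order."""
    return [url for day, url in sorted(table) if start <= day <= end]


urls = select_urls(FILE_TABLE, date(1998, 1, 2), date(1998, 1, 3))
for url in urls:
    print(url)
```

The client asks one question in terms of time and gets back only the files it needs, which also avoids asking the archive for the whole collection at once.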
Comment: it could be really taxing on the archive. With some
big databases, you can't ask for the whole thing or you will get an error
message.
Standard Names
Concerns: We would end up falling into the problems that GRIB did.
Things could become too complex. Standard names are too hard to find.
Easy to talk about/Hard to implement. Driven by requirements of GUI.
Question is how do you know what the variables are.
Notion of Profiles: GILS is a profile.
Idea: You create disciplines and within disciplines, you create
standards.
Point: We are not talking about standardizing the names, we are
talking about people writing ancillary metadata files that provide information
about standards.
Q: Do you have to standardize to serve your data?
A: No.
Q: Did we make any progress; Is there a starting point?
A: The names in the white paper are the starting point.
Idea: Online dictionary that allows people to make suggestions
to it.
Note: We haven't had problems with our naming convention.
Two issues: Computer knowing the variable and people knowing the
variable.
MetaComment: 2 Questions: Did you look at DDS? No.
Comment: Think more about interface with query that asks for
standard names.
END OF MORNING DAY 2
DODS DAY 2
January 26, 1999
Tuesday Afternoon
Report/Discussion Second Working Groups
System Wide Performance (Richard Chinman)
Discussed metrics which they will recommend to the Federation.
Neat Idea: http daemon has as part of an access log, a number
of parameters with changes to the cgi scripts. We could alter script
to capture parameters, put them into a file, and serve them to DODS Client.
The log becomes the data set. Freeform server could then be available
to evaluate performance/metrics.
It would be possible to automate this process.
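A minimal sketch of turning the access log into data, assuming the http daemon writes lines in the Common Log Format. The sample lines, addresses, paths, and field names are invented for illustration; a real server's log layout may differ.

```python
import re
from collections import Counter

# Host, timestamp, request line, status, and size of a Common Log Format entry.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of metrics for one log line, or None if it does not match."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    path, _, query = rec["path"].partition("?")
    rec["object"] = path.rsplit(".", 1)[-1] if "." in path else ""
    rec["constraint"] = query
    rec["bytes"] = 0 if rec["size"] == "-" else int(rec["size"])
    return rec


sample = [
    '192.0.2.1 - - [25/Jan/1999:08:30:00 -0700] '
    '"GET /cgi-bin/nph-nc/data/sst.nc.dds HTTP/1.0" 200 312',
    '192.0.2.7 - - [25/Jan/1999:08:30:02 -0700] '
    '"GET /cgi-bin/nph-nc/data/sst.nc.dods?sst[0:0][1:3][1:2] HTTP/1.0" 200 4096',
]
parsed = [rec for rec in map(parse_line, sample) if rec is not None]
counts = Counter(rec["object"] for rec in parsed)
total_bytes = sum(rec["bytes"] for rec in parsed)
print(counts, total_bytes)
```

Writing the parsed records out as a table is what would let the log itself be served as a data set.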
Some notes on what statistics to collect
and how.
Green: easily measure
Orange:
Red: difficult to measure
Q: Can you add it on to the URL?
A: Bad idea to put it in the url. But, you can put it in
the header.
Richard noted that getting the metrics on the human client would be
difficult. Some discussion followed about ways to "survey" the human
client. Can it be determined by looking at other metrics? How
about actual survey?
Need to decide which metrics we need to collect/implement (1) for milestone
and (2) to talk to the Federation.
James: On the server side: time constant (tau), how much
time is it taking to crank up the data? Is it on the list?
Good to know in terms of server performance.
Peter Fox: Any discussion of report writing? Creating summary
statistics... Answer: there was no discussion; sense that we
would come up with some application to analyze data.
Jose: make sure results are accurate.
Dierdre: Will be incomplete look as some servers won't want to
participate in data collection/distribution.
Peter: Summarize statistics in a way that recognizes the provider.
Look at linked usage.
Aside: CDC stands for Climate Diagnostics Center.
Richard: Select a group of people and ask if they would mind
having their usage monitored.
Note: need to be sensitive to possibly privileged information/statistics.
Can collect demographics.
To determine user satisfaction, maybe check box at end of form asking
about satisfaction or "People who got this data, also got this type of
data".
Translation (James Gallagher)
There was one milestone six months ago and there will be another in six months, so
they are right in the midst of it and on schedule. DODS can provide
a number of different data types. The netCDF client library is a DODS client
which cannot understand the sequence data type; operationally, netCDF
is not very ... without size info.
Solution: Client tell server: "I don't understand"
Server can turn sequence into arrays in finite sizes. Some work on
client side; some on server side.
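A minimal sketch of the translation described above: a sequence of unknown length is turned into arrays of a fixed, finite size that an array-only client can accept. The record layout, field names, and fill value are assumptions made for the example.

```python
from itertools import islice


def sequence_to_arrays(records, fields, max_rows, fill=None):
    """Read at most max_rows records and return one fixed-length array per field.

    Returns (arrays, n_valid); rows beyond n_valid are padded with fill.
    """
    rows = list(islice(records, max_rows))
    arrays = {
        f: [row[f] for row in rows] + [fill] * (max_rows - len(rows))
        for f in fields
    }
    return arrays, len(rows)


# Illustrative sequence: three records of a cast, length not known in advance.
casts = iter([
    {"depth": 0.0, "temp": 18.2},
    {"depth": 10.0, "temp": 17.9},
    {"depth": 20.0, "temp": 16.4},
])
arrays, n_valid = sequence_to_arrays(casts, ["depth", "temp"], max_rows=4, fill=-999.0)
print(arrays["depth"], n_valid)
```

Returning the count of valid rows alongside the padded arrays is one way to supply the size information the array-based client is missing.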
Peter Cornillon: Idea: In my analysis tool, I may not be
able to get the ...?!?. Cube can be shipped out; cube can be subset
in a number of ways. This is easy to do. Application.
IDL/Java GUI (Peter Fox)
IDL
Current IDL Gui and its functionality.
GUI Status:
-
Almost complete v 1.0 of the IDL GUI and documentation (2/12)
-
Implemented using IDL widgets.
What works - basic interface, catalog, datasets and variable selections,
constraints, get data (loaddods), level 2/3.
What is left - zoom selection, catalog descriptions, data plotting,
tuning and state checking.
Constraints/Problems:
-
current catalog structure is adapted from MatLab Scheme
-
Constraint construction specific to ocean datasets.
Issue: Who should maintain/update the IDL GUI.
Discussed overall design of IDL Gui. Lack of catalog server and
file server was an issue.
Interface Issues: the desirability of giving strong support for
the gui's

Could implement IDL gui through netcdf interface.
First Issue: RSI effort for EOS support. Argument for HDF
library comes up again. Argument: it is possible that there is
info in DODS that cannot be translated faithfully into "client".
Strong consideration needs to be given to a standard API.
Important Issue: You may be able to get more information from
the loaddods path to make the url. All url's must be interchangeable.
Types and ways that people want/need to select data should be opened up
a lot.
Overall point for the future: is it desirable to have a common look/functionality
between different gui's? Peter C.'s answer is "no". It might be
useful to have the IDL gui be more powerful than the MatLab gui. The issue is
the viability of how we maintain metadata for MatLab, given that IDL is
doing it a very different way.
Suggestion: in documentation, describe enhanced metadata, put
out documentation and determine if it would be useful for metadata.
Q: Will some data sets be in both guis?
A: In IDL, you choose the catalog.
Might be useful for Peter Fox to jot down what would make it easier
and send it to P. Cornillon.
All worried about scalability of the catalog and file servers to handle
very large numbers of metadata files. Also concerned about implementation.
Every DODS Server site can build their own catalog server that complies
with the interface.
Java GUI
Java: good; Applets: not so good.
Aside: Discussion of limited version (like a student version)
of IDL, MatLab as marketing tool.
The "Web client from Hell" aka "Brooklyn" tries to do too much.
It might be useful to go for lessened functionality.
James: "Brooklyn" is a "Rat's nest of PERL scripts", a "URL within
a URL".
The priority of the interface is low because of the forms.
Need to make a list of the forms. It is straight programming; not
DODS. Some stuff was very useful.
DODS Day 3
January 27, 1999
Wednesday Morning
Summary of Third Working Groups
Location (James Gallagher)
Def: being able to find data sets.
Ought to be some static list of all data sets in DODS.
Registration
Registration of Data Sets. Working with Global Change Master Directory
(GCMD) as they have beta software. Need to start using that interface.
Information about DODS gets into GCMD.
Q: Is there a global way to enter info into GCMD relational database?
A: James - No
A: Elaine - they do have a batch interface but a batch job would
be too big.
There are some DODS datasets already known by GCMD but not known
to be DODS databases. Our registration interface is meant
for people who only want to do a small amount of work but want their data
available.
comment: metadata very important.
suggestion: DODS might think about a stronger recommendation to
scientists to produce really good metadata -- if DODS takes off, lack of
metadata could become an Achilles heel.
Over time, we can prompt people to provide the metadata that is really
important. Getting scientists to do metadata that is really important.
Lack of metadata is a huge problem at NOAA.
Linda Miller recommended step by step web form that prompts entry of
metadata.
Debate: how much stuff it prompts for.
Initially, only ask for very few things. Interface might evolve
into something more automatic. May be able to build smart interface
into data that could give user metadata info.
P. Cornillon: Problem: difficult getting scientists to register
datasets because they are asked for too much info.
"Autometadata Generation"
2 classes of data:
-
1. data from JPL, Goddard
-
2. data from people like Peter Cornillon
Conclusion: Fine line between getting enough data and asking for
too much data. Don't hide the fact that DODS would like as much metadata
as possible. Incremental benefits come from more metadata.
Glen Ferrell Idea: put data sets into the gui with little metadata
and let the community add the metadata.
Elaine: Since the URL's are going to change, how are we going
to handle updating URL's. Some discussion followed:
Q: Can it be automated?
A: Not a hard problem. Currently this is done manually
but we need to think about it and make sure we come up with a solution.
GCMD has real people working there.
We have records, two locators (making them not difficult):
-
1. Essentially static list of html links updated title and url's
of all known DODS databases
-
2. goes through search engine. For example, "all DODS databases
with wind stress" and you get back a list.
Client must keep track of server side functions.
GIS (Ted)
GIS: Geographical Info Systems
(insert picture)
No Data Type mismatch immediately apparent
GIS data -- very rich attribute features; e.g.
paths taken by hurricanes.
Q: What is the advantage of taking data out
of GIS?
A: For people without GIS.
Q: Are there GIS data providers that we
are interested in getting data from?
A: That is our assumption.
GIS --> IMAGE --> GRIDS
will change soon with release of ESRI SDE 4.0 (spatial database engine).
ArcView, ArcInfo, Airdos...: all ways
to access data in SDE.
Potential ESRI Connections - NOAA, ESIP 3's
Therefore, DODS could get into this.
ESRI "Feature Server" seems like best bet.
ESRI very interested in provision of data over the internet; very interested
in data discovery. Need to understand feature data system.
Some discussion followed regarding how popular
ESRI is - benefits/problems with ESRI.
Q:
A: no relationship between netCDF and GRASS.
Some discussion on grass.
Do any of the GIS systems have an underlying data
model on a higher level than DODS?
Conclusion: a lot of potentially exciting
things.