Retrieving Data from the PSU/CEI Database

Access by non-EESI Users

All dataset documentation and other descriptive files are accessible to all WWW browsers, and may be downloaded to local disk using the browser's "Save as" facility. Users who do not have accounts on the EESI workstation network, however, may be unable to download some of the actual data files. Data files which are available for downloading by non-EESI users are identified by the presence of hypertext links. Non-EESI users are requested to address requests for other data files to the database administrator, dbadmin@essc.psu.edu.

Multiple files from the same dataset can also be downloaded using explicit FTP (file transfer protocol) commands, as follows:

  1. Change to the local directory which is to receive the file
  2. Type ftp dbftp.essc.psu.edu
  3. Enter anonymous as the username
  4. Enter your complete email address as the password
  5. Type cd pub/data
  6. Type cd yyyy-nnnn where yyyy-nnnn is the dataset reference code
  7. Type binary (unless downloading ASCII files to PC)
  8. Type prompt to permit using mget command
  9. Type mget filespec to download all files which satisfy filespec (e.g., mget *.bsq.gz)
  10. Type quit to terminate the ftp session
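
The interactive steps above can also be scripted. The following is a minimal sketch, assuming a command-line ftp client that accepts the common -i (no per-file prompting) and -n (no auto-login) options. The function name ftp_batch_commands and the example email address are illustrative only and are not part of the database software. Because -i already disables prompting, the prompt step is omitted.

```shell
# Print the ftp commands for fetching every file matching a filespec
# from one dataset directory. Nothing is sent over the network here.
#   $1 = dataset reference code (yyyy-nnnn)
#   $2 = filespec, e.g. '*.bsq.gz' (quote it so the local shell does not expand it)
#   $3 = email address used as the anonymous password
ftp_batch_commands() {
    printf '%s\n' \
        "user anonymous $3" \
        "cd pub/data" \
        "cd $1" \
        "binary" \
        "mget $2" \
        "quit"
}
```

From the local directory which is to receive the files, the output would then be piped to the client, for example:

    ftp_batch_commands 2002-0016 '*.bsq.gz' you@example.edu | ftp -i -n dbftp.essc.psu.edu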

Access by EESI Users

EESI users will generally find it easier to copy data files directly from the database to their working directories. Suggestions for so doing are given below.

Dataset Reference Codes

To simplify directory references, each directory containing a dataset has been assigned a sequential reference code, of the form yyyy-nnnn, where yyyy indicates the year of acquisition and nnnn is a serial number for that year. For example, the 16th dataset created in 2002 has dataset reference code 2002-0016. This dataset reference code is displayed on the second line of the WWW page for the dataset.
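
As an illustration of the yyyy-nnnn form, the following sketch builds and checks reference codes in the shell. The function names dsref_code and is_dsref_code are hypothetical helpers, not tools provided by the database.

```shell
# Build a dataset reference code from a year and a serial number.
# Pass the serial number without leading zeros: printf's %d would read
# a value such as 0016 as octal.
dsref_code() {
    printf '%04d-%04d\n' "$1" "$2"
}

# Succeed only if the argument has the form yyyy-nnnn (all digits).
is_dsref_code() {
    case $1 in
        [0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}
```

For the example in the text, dsref_code 2002 16 prints 2002-0016.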

For users on the EESI network, this dataset reference code is linked to the actual dataset directory via entries in a top-level link directory which is pointed to by the environment variable DSREF. This permits referencing files in the dataset directory as

 
    $DSREF/yyyy-nnnn/filename
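
A minimal sketch of building such a path in a script follows. The function name dsref_path is hypothetical, and the check for an unset DSREF is an added precaution for sessions outside the EESI network rather than something the database requires.

```shell
# Print the full path of a file in a dataset directory.
#   $1 = dataset reference code, $2 = file name
# Fails with a message if DSREF is not set.
dsref_path() {
    if [ -z "${DSREF:-}" ]; then
        echo "DSREF is not set" >&2
        return 1
    fi
    printf '%s/%s/%s\n' "$DSREF" "$1" "$2"
}
```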


Copying Multi-file Datasets

Depending on the type of data, their source, and the format in which the data are most compactly stored, the datasets may be in any of several formats; for frequently used data, a given dataset may be in more than one format. For formats such as those used by Arc/Info and LAS (the Land Analysis System), a single dataset occupies several files. For Arc/Info, these are packaged in an "info" subdirectory and one or more data directories. For LAS, the image data are in a .img file, and associated geographic positioning and processing history data are in .ddr and .his files.

Copying these multi-file datasets into the user's working directory using ftp or the Unix cp command can be risky. This is especially true for Arc/Info if the user's working directory already contains an "info" subdirectory, because the "info" directory for the new dataset may contain files having the same names as those already in the working directory. For this reason it is safer to use the Arc/Info or LAS copy commands.
Note -- Both Arc/Info and LAS may try to write temporary files into the user's current working directory. For this reason, the user should chdir to a working directory to which (s)he has write access.
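
Both precautions above can be checked from the shell before copying. This is a sketch only; the function names has_info_dir and is_writable_dir are hypothetical, and the checks do not replace the Arc/Info or LAS copy commands.

```shell
# Succeed if the given directory (default: current directory) already
# contains an "info" subdirectory that a plain cp or ftp could overwrite.
has_info_dir() {
    [ -d "${1:-.}/info" ]
}

# Succeed if the given directory (default: current directory) is writable,
# so that temporary files can be created there.
is_writable_dir() {
    [ -w "${1:-.}" ]
}
```

For example, has_info_dir && echo "existing info directory found" reports a working directory in which a plain copy could overwrite files.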

For Arc/Info, the commands

    chdir user-working-directory
    arc copy $DSREF/yyyy-nnnn/data-directory-name output
may be used directly from the Unix prompt, where output is the name the user wishes to give the dataset. Note that as of Arc/Info version 7.0, a multi-layer grid file ("stack") can be copied as a single unit using the copystack command. To check whether a directory contains grid, polygon, or other data, the Arc/Info describe command may be used.

For LAS, an image file and its associated .ddr and .his files may be copied using the commands

    chdir user-working-directory
    las
and at the LAS72> prompt
    copy $DSREF/yyyy-nnnn/image-name output-image


Unzipping (Uncompressing) Data

Many datasets are stored in compressed format. Datasets with a file extension of .gz have been compressed using the GNU gzip utility; they need to be uncompressed using gunzip or (for MSDOS) gzip -d.

Datasets with a .Z file extension are in Unix compressed format. They may be uncompressed using the Unix uncompress command or with gunzip, which automatically determines which compression method was used.
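
Since gunzip handles both extensions, a script can treat them alike. The following sketch assumes GNU gunzip is installed; the function name unpack_dataset_file is hypothetical.

```shell
# Uncompress a .gz or .Z file in place, as gunzip does by default
# (the compressed file is replaced by the uncompressed one).
# Files with any other extension are left untouched.
unpack_dataset_file() {
    case $1 in
        *.gz|*.Z) gunzip "$1" ;;
        *) echo "$1: no .gz or .Z extension, left as is" >&2 ;;
    esac
}
```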

Converting Data Between Formats

Arc/Info, LAS, and ERDAS Imagine all provide a variety of tools for converting between different data formats. For Arc/Info, these include GRIDIMAGE, IMAGEGRID, and DEMLATTICE. For LAS, programs ARC2LAS, LAS2ARC, DEM2LAS, ERD2LAS, and LAS2ERD are available. These are described in greater detail elsewhere.

Using USGS DEM data

The USGS distributes digital elevation model (DEM) data in two formats. The traditional ASCII format reports elevations in a series of south-to-north profiles. A newer format, which adheres to the new Spatial Data Transfer Standard (SDTS), packages a number of binary files into a Unix-style tar archive. The USGS is in the process of converting all 30-meter DEMs to the SDTS format. These DEMs are currently (as of August 2003) available for free download from sources listed on the USGS EROS Data Center Geo Data server.

Software for reading the USGS ASCII DEM format is provided by many GIS and image analysis software packages, some of which also support the SDTS format. LAS program DEM2LAS handles both the ASCII and SDTS formats, and can directly process the gzipped files. A standalone program which can read a DEM in either ASCII or SDTS format and convert it to a 16-bit binary array is available in both C-language source code and MSDOS executable form.

For Macintosh users, Brian Wagner, San Diego, CA, has developed and made available DEM Reader software, which directly displays ASCII-format DEMs on the screen. He is also developing support for the SDTS-format DEMs.

Reblocking DEM files

Some DEM files are in a fixed-record-length format which some programs, such as the Arc/Info DEMLATTICE, seem unable to process correctly. On Unix systems, a DEM file may be uncompressed, its blocked/unblocked status checked, and unblocking performed if necessary using the commands
  gunzip filename.gz
  od -c filename | more
  dd if=filename of=newfilename ibs=4096 cbs=1024 conv=unblock
The dd command is not needed if the output from od contains a linefeed character ("\n") at or before address 0002000, since this indicates the file is already unblocked.
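
The check and the conditional dd step can be combined in a small script. This is a sketch only; the function names needs_unblock and unblock_dem are hypothetical. Address 0002000 in the od listing is octal for byte offset 1024, so the test below looks for a linefeed in the first 1025 bytes of the file.

```shell
# Succeed if the file has no linefeed at or before byte offset 1024
# (octal 0002000), i.e. it still looks like fixed-length records.
needs_unblock() {
    n=$(dd if="$1" bs=1025 count=1 2>/dev/null | wc -l | tr -d ' ')
    [ "$n" -eq 0 ]
}

# Write an unblocked copy of $1 to $2 using the dd options given above;
# a file that is already unblocked is copied unchanged.
unblock_dem() {
    if needs_unblock "$1"; then
        dd if="$1" of="$2" ibs=4096 cbs=1024 conv=unblock 2>/dev/null
    else
        cp "$1" "$2"
    fi
}
```

For example, unblock_dem filename newfilename leaves the original file in place and writes the result to newfilename in either case.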
Please send questions and comments to dbadmin@essc.psu.edu
Page creation date: 2004 Nov 19