Dataset File Formats

Datasets are stored in several formats, depending on their type. In general, each dataset is stored in the format in which it was created or acquired, with compression applied when appropriate. A dataset may be stored in more than one format when it is expected to be used extensively in another format, or when it may be needed by users with limited access to software for converting between formats.

Some GIS and image processing software packages, such as Arc/Info and LAS, use multiple files or directories to store a single dataset. In such cases, it is recommended that the tools of the particular software package be used to copy multifile datasets.

Most GIS and image processing software packages provide support for converting between file formats.

The dataset format can usually be identified by the directory name or file extension. The file types can be roughly classified as follows:

Generic binary array files

Band-interleaved-by-line: data are stored in the order (line 1, band 1), (line 1, band 2), ..., (line 1, band N), (line 2, band 1), (line 2, band 2), ...
Band sequential: data are stored in the order (line 1, band 1), (line 2, band 1), ..., (line M, band 1), (line 1, band 2), (line 2, band 2), ..., (line M, band 2), ...
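
The two orderings above differ only in how a (line, band) pair maps to a position in the file. The following Python sketch shows that arithmetic. It is illustrative only: the function names are hypothetical, indices are 0-based (the text above counts from 1), and it assumes a headerless file in which the samples of each (line, band) row are stored consecutively by column.

```python
def bil_offset(line, band, col, n_cols, n_bands, bytes_per_sample):
    """Byte offset of one sample in a band-interleaved-by-line file.

    Rows are ordered (line 0, band 0), (line 0, band 1), ..., so the band
    index varies fastest. All indices are 0-based (an assumption of this
    sketch).
    """
    row_index = line * n_bands + band
    return (row_index * n_cols + col) * bytes_per_sample


def bsq_offset(line, band, col, n_lines, n_cols, bytes_per_sample):
    """Byte offset of one sample in a band-sequential file.

    Rows are ordered (line 0, band 0), (line 1, band 0), ..., so every line
    of one band is stored before the next band begins.
    """
    row_index = band * n_lines + line
    return (row_index * n_cols + col) * bytes_per_sample
```

For a 16-bit image with 4 columns, 5 lines, and 3 bands, the second line of the first band starts at a different offset in each layout: `bil_offset(1, 0, 0, 4, 3, 2)` versus `bsq_offset(1, 0, 0, 5, 4, 2)`.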

Attention PC users: 16-bit integer and 32-bit floating-point data are in "big-endian" format; i.e., the most significant byte comes first.
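
On a little-endian machine, the byte order has to be stated explicitly when the values are decoded. A minimal Python sketch using the standard `struct` module follows. The function names are hypothetical, and the 16-bit integers are assumed to be signed, which the text above does not specify.

```python
import struct


def decode_big_endian_int16(raw):
    """Decode a bytes object holding big-endian 16-bit integers.

    Assumption: the integers are signed ("h"); use "H" for unsigned data.
    """
    if len(raw) % 2 != 0:
        raise ValueError("length is not a multiple of 2 bytes")
    count = len(raw) // 2
    # ">" selects big-endian order regardless of the host machine.
    return list(struct.unpack(">%dh" % count, raw))


def decode_big_endian_float32(raw):
    """Decode a bytes object holding big-endian 32-bit IEEE floats."""
    if len(raw) % 4 != 0:
        raise ValueError("length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(">%df" % count, raw))
```

For example, the two bytes `01 00` decode to 256 in big-endian order, whereas a little-endian reader would interpret the same bytes as 1.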

A subroutine to facilitate reading and unpacking "flat" binary files in band-sequential (BSQ) format is available in both Fortran 77 and C-language versions.
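
For readers without a Fortran or C compiler, the same idea can be sketched in Python. This is not the subroutine mentioned above, only an illustration under stated assumptions: the function name is hypothetical, the file has no header, and the samples are signed 16-bit big-endian integers. The image dimensions must be taken from the dataset documentation.

```python
import struct


def read_bsq_int16(path, n_lines, n_cols, n_bands):
    """Read a headerless BSQ file into nested lists: bands[band][line][col].

    Assumptions of this sketch: signed 16-bit big-endian samples, and
    dimensions supplied by the caller.
    """
    samples_per_band = n_lines * n_cols
    total_samples = samples_per_band * n_bands
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) != total_samples * 2:
        raise ValueError(
            "file holds %d bytes, expected %d" % (len(raw), total_samples * 2)
        )
    values = struct.unpack(">%dh" % total_samples, raw)
    bands = []
    for band in range(n_bands):
        start = band * samples_per_band  # each band is one contiguous block
        bands.append(
            [
                list(values[start + line * n_cols:start + (line + 1) * n_cols])
                for line in range(n_lines)
            ]
        )
    return bands
```

The size check catches the most common mistakes, such as swapped dimensions or a file that is still compressed, before any values are unpacked.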

Macintosh users can use Photoshop to ingest 16-bit .bsq data. After uncompressing the file, open it as a "raw" file (make sure you can Open All Documents). In the ensuing dialog box, enter the image width and height (from the documentation page for the dataset) and enter the number of bands as the "Channel count". The depth is 16 bits, the byte order is "Mac", and the header size is 0.

Compressed files

Files compressed using the GNU gzip utility. The gzip extension is usually appended to the file's existing extension.
Files compressed using the Unix "compress" utility.
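
When the extension is missing or ambiguous, the two compression types can also be told apart by the first two bytes of the file. The Python sketch below checks those bytes and decompresses gzip files with the standard `gzip` module. The function names are hypothetical. Python's standard library has no decoder for Unix "compress" files, so the sketch only identifies them; GNU gzip itself can usually decompress them.

```python
import gzip
import shutil

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of a gzip file
COMPRESS_MAGIC = b"\x1f\x9d"  # first two bytes of a Unix "compress" file


def detect_compression(path):
    """Return "gzip", "compress", or None based on the file's first bytes."""
    with open(path, "rb") as f:
        magic = f.read(2)
    if magic == GZIP_MAGIC:
        return "gzip"
    if magic == COMPRESS_MAGIC:
        return "compress"
    return None


def gunzip(src, dst):
    """Decompress the gzip file at src into a new file at dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```

Checking the leading bytes first avoids passing an uncompressed or differently compressed file to `gunzip`, which would otherwise fail partway through with a less helpful error.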

Generic text files

General descriptive text
Tables in ASCII format

Arc/Info GIS files

(no file extension) Arc/Info directory, under which INFO tables are maintained.
(no file extension) Arc/Info data directory, containing data files for a grid or coverage.
Arc/Info EXPORT format.

ERDAS files

ERDAS-7.X format single-band file.
ERDAS-7.X format multi-band file.

LAS files

Data descriptor record (DDR): projection and georeferencing information. A program for printing DDR contents is available in both C-language source code and MS-DOS executable versions.
Processing history file.
Image data, compressed (using a LAS-specific algorithm) or uncompressed. The uncompressed version is identical to a .bsq file.

USGS DEM file formats

Digital elevation model data. NOTE: some software may require that these files be reblocked before use. A program to read a DEM file and convert it to a 16-bit binary array is available in both C-language source code and MS-DOS executable versions.

Last change: 2004 Dec. 1, R. A. White