Dataset File Formats

Datasets are stored in a variety of formats. In general, each dataset is stored in the format in which it was created or acquired, with compression where appropriate. When a dataset is expected to be used extensively in another format, or may be needed by users with limited access to format-conversion software, it may be stored in more than one format.

Some GIS and image processing software, such as Arc/Info and LAS, uses multiple files or directories to store a single dataset. In such cases, it is recommended that the tools of that software package be used to copy multifile datasets.

Most GIS and image processing software packages provide support for converting between file formats.

The dataset format can usually be identified by the directory name or file extension. The file types can be roughly classified as follows:

Generic binary array files

.bil
Band-interleaved-by-line: data are stored in the order (line 1, band 1), (line 1, band 2), ..., (line 1, band N), (line 2, band 1), (line 2, band 2), ... .
.bsq
Band sequential: data are stored in the order (line 1, band 1), (line 2, band 1), ... (line M, band 1), (line 1, band 2), (line 2, band 2), ... (line M, band 2), ... .
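
The two orderings above can be expressed as byte-offset formulas. The following is a minimal sketch, not part of any distributed software; the function names are hypothetical, and it assumes zero-based line, band, and column indices (the descriptions above count from 1), no file header, and a fixed number of bytes per sample.

```python
def bil_offset(line, band, col, n_bands, n_cols, bytes_per_sample=2):
    """Byte offset of one sample in a band-interleaved-by-line file.

    Assumes zero-based indices and no header (illustrative only).
    """
    # Rows are grouped by line first, then by band within each line.
    row = line * n_bands + band
    return (row * n_cols + col) * bytes_per_sample


def bsq_offset(line, band, col, n_lines, n_cols, bytes_per_sample=2):
    """Byte offset of one sample in a band-sequential file.

    Assumes zero-based indices and no header (illustrative only).
    """
    # Rows are grouped by band first, then by line within each band.
    row = band * n_lines + line
    return (row * n_cols + col) * bytes_per_sample
```

In both layouts a full row of samples for one (line, band) pair is contiguous; only the order of the rows differs.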

Attention PC users: 16-bit integer and 32-bit floating-point data are in "big-endian" format; i.e., the most significant byte comes first.

A subroutine to facilitate reading and unpacking "flat" binary files in band-sequential (BSQ) format is available in both Fortran 77 and C-language versions.
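
For readers without the Fortran 77 or C subroutine at hand, the same idea can be sketched in Python. This is not the distributed subroutine: the function name is hypothetical, and it assumes a headerless file of signed 16-bit samples (the signedness is an assumption; check the documentation page for the dataset). The big-endian byte order is handled explicitly, so the result is the same on a PC.

```python
import struct


def decode_bsq_int16(data, n_lines, n_cols, n_bands):
    """Unpack big-endian 16-bit BSQ bytes into bands[band][line][col].

    Hypothetical helper: assumes signed samples and no header.
    """
    samples_per_band = n_lines * n_cols
    n_samples = samples_per_band * n_bands
    if len(data) != n_samples * 2:
        raise ValueError(
            "expected %d bytes, got %d" % (n_samples * 2, len(data))
        )
    # ">" selects big-endian (most significant byte first);
    # "h" is a signed 16-bit integer.
    values = struct.unpack(">%dh" % n_samples, data)
    bands = []
    for b in range(n_bands):
        start = b * samples_per_band
        band = [
            list(values[start + r * n_cols:start + (r + 1) * n_cols])
            for r in range(n_lines)
        ]
        bands.append(band)
    return bands
```

To use it on a file, read the uncompressed contents in binary mode, for example `data = open(path, "rb").read()`, and pass the image dimensions from the dataset's documentation page. For unsigned data, the format character `"H"` would replace `"h"`.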

Macintosh users can use Photoshop to ingest 16-bit .bsq data. After uncompressing the file, open it as a "raw" file (make sure that Open All Documents is available). In the ensuing dialog box, enter the image width and height (from the documentation page for the dataset) and enter the number of bands as the "Channel count". The depth is 16 bits, the byte order is "Mac", and the header size is 0.

Compressed files

.gz
Files compressed using the GNU gzip utility. Usually appended to another file extension.
.Z
Files compressed using the Unix "compress" utility.
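
When an extension is missing or misleading, the two compression types can also be told apart by their first two bytes. The sketch below uses hypothetical function names and relies on the standard magic numbers (1F 8B for gzip, 1F 9D for compress). Python's standard library can decompress gzip files but has no decoder for the compress format, so the sketch only reports .Z files; they can be expanded beforehand with `uncompress` or `gzip -d`.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of a gzip file
COMPRESS_MAGIC = b"\x1f\x9d"  # first two bytes of a Unix compress file


def detect_compression(header):
    """Return "gzip", "compress", or None from the leading bytes of a file."""
    if header[:2] == GZIP_MAGIC:
        return "gzip"
    if header[:2] == COMPRESS_MAGIC:
        return "compress"
    return None


def read_dataset_bytes(path):
    """Return the uncompressed contents of a plain or gzip-compressed file."""
    with open(path, "rb") as f:
        kind = detect_compression(f.read(2))
    if kind == "gzip":
        with gzip.open(path, "rb") as f:
            return f.read()
    if kind == "compress":
        # No standard-library decoder exists for this format.
        raise NotImplementedError("expand .Z files with uncompress first")
    with open(path, "rb") as f:
        return f.read()
```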

Generic text files

.txt
General descriptive text.
.ascii
Tables in ASCII format.

Arc/Info GIS files

info
(no file extension) Arc/Info directory, under which INFO tables are maintained.
<dirname>
(no file extension) Arc/Info data directory, containing data files for a grid or coverage.
.e00
Arc/Info EXPORT format.

ERDAS files

.gis
ERDAS-7.X format single-band file.
.lan
ERDAS-7.X format multi-band file.

LAS files

.ddr
Data descriptor record: projection and georeferencing information. A program for printing DDR contents is available in both C-language source code and MSDOS executable versions.
.his
Processing history file.
.img
Image data, compressed (using a LAS-specific algorithm) or uncompressed. The uncompressed version is identical to a .bsq file.

USGS DEM file formats

.dem
Digital elevation model data. NOTE: some software may require that these files be reblocked before use. A program to read a DEM file and convert it to a 16-bit binary array is available in both C-language source code and MSDOS executable versions.
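
The classification by file extension given in the lists above can be sketched as a lookup table. This is an illustration only: the names `FORMATS` and `classify` are hypothetical, the descriptions are abbreviated from the entries above, and directory-based Arc/Info datasets, which have no extension, are not covered. A trailing .gz or .Z is removed first, since compression suffixes are usually appended to another extension.

```python
import os

# Extension -> short description, abbreviated from the lists above.
FORMATS = {
    ".bil": "generic binary array, band-interleaved-by-line",
    ".bsq": "generic binary array, band-sequential",
    ".txt": "general descriptive text",
    ".ascii": "table in ASCII format",
    ".e00": "Arc/Info EXPORT format",
    ".gis": "ERDAS-7.X single-band file",
    ".lan": "ERDAS-7.X multi-band file",
    ".ddr": "LAS data descriptor record",
    ".his": "LAS processing history file",
    ".img": "LAS image data",
    ".dem": "USGS digital elevation model data",
}

COMPRESSION_SUFFIXES = (".gz", ".Z")


def classify(name):
    """Return (format description or None, compression suffix or None)."""
    compression = None
    for suffix in COMPRESSION_SUFFIXES:
        if name.endswith(suffix):
            compression = suffix
            name = name[:-len(suffix)]  # strip the compression suffix
            break
    ext = os.path.splitext(name)[1].lower()
    return FORMATS.get(ext), compression
```

For example, `classify("scene.bsq.gz")` reports a band-sequential array compressed with gzip, while a name with no recognized extension yields `None` for the format.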


Last change: 2004 Dec. 1, R. A. White / raw@essc.psu.edu