User's Guide

STATS

Compute statistics for training sites and classes

Function:

Computes mean vectors, covariance and correlation matrices, and histograms for training sites and classes. The user may exclude certain grey values from the calculations by specifying subcommand -CUT; if subcommand -ALL is specified, all the data are used. Polygons representing training sites must have been selected previously; their coordinates are contained in the statistics file. Up to 256 classes and 256 sites per class may be processed.

Parameters:

Subcommand -ALL:
Computations will be based on all input pixel values. The subcommand ALL is used to compute statistics based on all the input pixel values for all input bands.

IN
Input image. Statistics will be computed for the classes and sites that are located on this image. The data type of IN may be BYTE, INTEGER*2, INTEGER*4, or REAL*4. No windowing is allowed. The image may contain up to 256 bands. Any polygons that are not entirely within the input image will be excluded from calculations.

INSTAT
Statistics file. The input/output statistics file. It contains the coordinates of the polygons as well as the statistics. Existing statistics are overwritten only where requested through the MEANFLG, COVARFLG, and HISTFLG parameters, and the user is prompted interactively before any existing statistics are overwritten. If a polygon does not fit within the image, the mean vector, covariance matrix and the number of points for that polygon will all be set to zero and that polygon will not contribute to calculations for the class statistics. If none of the polygons fit within the image, processing will halt and INSTAT will not be changed.
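
The polygon rule described for INSTAT can be sketched as follows. This is only an illustration, not the STATS implementation: the function names are hypothetical, vertices are assumed to be 1-based (line, sample) pairs already expressed in the coordinates of the input image, and the error text paraphrases the [stats-nosit] message.

```python
def polygon_within_image(vertices, n_lines, n_samples):
    """Return True when every (line, sample) vertex lies inside the image.

    The image is a rectangle, so a polygon whose vertices are all inside
    lies entirely inside.
    """
    return all(1 <= line <= n_lines and 1 <= sample <= n_samples
               for line, sample in vertices)


def usable_sites(sites, n_lines, n_samples):
    """Split site names into those that fit the image and those that do not.

    `sites` maps a site name to its list of polygon vertices (assumed layout).
    Sites in the second list would have their statistics set to zero and
    would not contribute to the class statistics.
    """
    inside, outside = [], []
    for name, vertices in sites.items():
        if polygon_within_image(vertices, n_lines, n_samples):
            inside.append(name)
        else:
            outside.append(name)
    if not inside:
        # Mirrors the documented behaviour: processing halts, INSTAT unchanged.
        raise ValueError("No sites within image bounds")
    return inside, outside
```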

MEANFLG(YES)
Mean option. Determines whether or not STATS will calculate mean vectors for the polygons.


  = YES:  Calculate mean vector.  New mean vectors
          will be calculated and written into the
          input statistics file.  Any existing mean
          vectors will be lost.
  = NO:   Do not calculate mean vector.  Existing
          mean vectors are retained.

COVARFLG(YES)
Covariance option. Determines whether or not STATS will calculate covariance and correlation matrices for the current set of input polygons.


  = YES:  Calculate covariance and correlation matrices.  
          New matrices will be calculated and
          written into the input statistics file.
          Any existing matrices will be lost.
  = NO:   Do not calculate covariance and correlation 
          matrices. Existing matrices are retained.

HISTFLG(YES)
Histogram option. Specifies whether or not STATS will calculate histograms for the data contained in the polygons.


  = YES:  Generate histogram.  New histograms are
          calculated and written into the input
          statistics file.  Existing histograms will
          be lost.
  = NO:   No histogram.  Existing histograms in the
          statistics file are retained.

Note:  The following naming convention is used for
       histograms:

           HIST001  - band 1
           HIST002  - band 2
              .         .
              .         .
           HIST00n  - band n

NCLASS(--)
Number of classes. If NCLASS is not equal to 0, the user will be prompted NCLASS times for the following:


1.  "ENTER CLASS NAME" (name of class to be processed)

2.  "ENTER NUMBER OF SITES FOR THIS CLASS"
     (# of sites to process; "0" or carriage return
      for all sites)
 
     If the specified number of sites is not equal to
     the total number of sites in the class, the user
     is then prompted:

3.  "ENTER SITE NAMES, 1 PER LINE" (names of sites to
     be processed)

OPTION(--)
List option. Allows the user to display the class names, the site names, or both the class and site names in the statistics file prior to processing.


  = CLASS:  List class names.  Allows the user to
            display the class names in the statistics
            file prior to processing.
  = SITE:   List site names.  Allows the user to
            display the site names in the statistics
            file prior to processing.
  = BOTH:   List both class and site names.  Allows
            user to display both the class and site
            names in the statistics file prior to
            processing.

Subcommand -CUT:
Allows a given pixel range to be omitted from calculations. The subcommand CUT is used to eliminate a range of pixel values from the computations. One value range may be specified for each input band.

IN
Input image. Statistics will be computed for the classes and sites that are located on this image. The data type of IN may be BYTE, INTEGER*2, INTEGER*4, or REAL*4. No windowing is allowed. The image may contain up to 256 bands. Any polygons that are not entirely within the input image will be excluded from calculations.

INSTAT
Statistics file. The input/output statistics file. It contains the coordinates of the polygons as well as the statistics. Existing statistics are overwritten only where requested through the MEANFLG, COVARFLG, and HISTFLG parameters, and the user is prompted interactively before any existing statistics are overwritten. If a polygon does not fit within the image, the mean vector, covariance matrix and the number of points for that polygon will all be set to zero and that polygon will not contribute to calculations for the class statistics. If none of the polygons fit within the image, processing will halt and INSTAT will not be changed.

MEANFLG(YES)
Mean option. Determines whether or not STATS will calculate mean vectors for the polygons.


  = YES:  Calculate mean vector.  New mean vectors
          will be calculated and written into the
          input statistics file.  Any existing mean
          vectors will be lost.
  = NO:   Do not calculate mean vector.  Existing
          mean vectors are retained.

COVARFLG(YES)
Covariance option. Determines whether or not STATS will calculate covariance and correlation matrices for the current set of input polygons.


  = YES:  Calculate covariance and correlation matrices.  
          New matrices will be calculated and
          written into the input statistics file.
          Any existing matrices will be lost.
  = NO:   Do not calculate covariance and correlation 
          matrices. Existing matrices are retained.

HISTFLG(YES)
Histogram option. Specifies whether or not STATS will calculate histograms for the data contained in the polygons.


  = YES:  Generate histogram.  New histograms are
          calculated and written into the input
          statistics file.  Existing histograms will
          be lost.
  = NO:   No histogram.  Existing histograms in the
          statistics file are retained.

Note:  The following naming convention is used for
       histograms:

           HIST001  - band 1
           HIST002  - band 2
              .         .
              .         .
           HIST00n  - band n

NCLASS(--)
Number of classes. If NCLASS is not equal to 0, the user will be prompted NCLASS times for the following:


1.  "ENTER CLASS NAME" (name of class to be processed)

2.  "ENTER NUMBER OF SITES FOR THIS CLASS"
     (# of sites to process; "0" or carriage return
      for all sites)
 
     If the specified number of sites is not equal to
     the total number of sites in the class, the user
     is then prompted:

3.  "ENTER SITE NAMES, 1 PER LINE" (names of sites to
     be processed)

OPTION(--)
List option. Allows the user to display the class names, the site names, or both the class and site names in the statistics file prior to processing.


  = CLASS:  List class names.  Allows the user to
            display the class names in the statistics
            file prior to processing.
  = SITE:   List site names.  Allows the user to
            display the site names in the statistics
            file prior to processing.
  = BOTH:   List both class and site names.  Allows
            user to display both the class and site
            names in the statistics file prior to
            processing.

BANDS
Band numbers. Specifies the band numbers for which corresponding cutting ranges have been given. The number of values for BANDS, HIGHVAL, and LOWVAL must be equal.

LOWVAL
Low cutting value. Specifies the lower value(s) of the cutting range for the band(s) specified by the BANDS parameter. All pixels with values between LOWVAL and HIGHVAL inclusive for the appropriate image band specified in BANDS will be excluded from statistics calculations. Corresponding pixels in all other image bands will also be excluded. For example, if values from 12 to 20 are specified for cutting in band 2, then the corresponding pixels in all other bands will also be excluded from statistics calculations. The number of pixels used for calculations will therefore be the same for all bands.

HIGHVAL
High cutting value. Specifies the upper value(s) of the cutting range for the band(s) specified by the BANDS parameter. All pixels with values between LOWVAL and HIGHVAL, inclusive, for the appropriate image band specified in BANDS, will be excluded from statistics calculations. See HELP for LOWVAL for further information.
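
The cutting rule for BANDS, LOWVAL, and HIGHVAL can be sketched as follows. This is an illustration under stated assumptions rather than the STATS code: the function names are hypothetical, each pixel is assumed to be a tuple holding one value per band, and the error texts paraphrase the fatal messages for mismatched counts, invalid ranges, and unknown bands.

```python
def validate_cut(bands, lowval, highval, n_bands):
    """Check the -CUT parameters before any pixel is examined."""
    if not (len(bands) == len(lowval) == len(highval)):
        raise ValueError("BANDS, LOWVAL, and HIGHVAL must have the same number of values")
    for band, low, high in zip(bands, lowval, highval):
        if not 1 <= band <= n_bands:
            raise ValueError(f"band {band} not found")
        if not low < high:
            raise ValueError("LOWVAL must be less than HIGHVAL")


def apply_cut(pixels, bands, lowval, highval):
    """Return the pixels that survive cutting.

    A pixel is dropped from every band as soon as its value in any listed
    band lies between that band's LOWVAL and HIGHVAL, inclusive, so all
    bands keep the same number of pixels.
    """
    kept = []
    for pixel in pixels:
        cut = any(low <= pixel[band - 1] <= high
                  for band, low, high in zip(bands, lowval, highval))
        if not cut:
            kept.append(pixel)
    return kept


# Ranges taken from Example 3: 63-68 in band 2 and 27-31 in band 3.
validate_cut([2, 3], [63, 27], [68, 31], n_bands=3)
survivors = apply_cut([(10, 65, 40), (10, 70, 29), (10, 70, 40)],
                      [2, 3], [63, 27], [68, 31])
```

In this usage, only the third pixel survives: the first is cut by its band 2 value and the second by its band 3 value.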

Examples:

  1. LAS> stats-all in=color.img instat=scene.dat nclass=3 option=both

    The mean vectors, correlation and covariance matrices, and histograms for three classes in the statistics file SCENE.DAT are computed for the image COLOR.IMG. The user is prompted for the class names. The class and site names in the file are listed before processing.

  2. LAS> stats-all in=water.img instat=land.dat histflg=no

    The mean vectors and covariance and correlation matrices for all classes in the statistics file LAND.DAT are computed for the image WATER.IMG. The histograms are not calculated.

  3. LAS> stats-cut in=water.img instat=land.dat bands=(2,3) lowval=(63,27) highval=(68,31) histflg=no

    This example is the same as Example 2 except that pixels with values between 63 and 68 in band 2 and pixels with values between 27 and 31 in band 3 are excluded from computations. Note that these pixels are excluded from all the bands.

Description/Algorithm:

Using the specified classes and sites, the statistics are calculated as follows:

Site Mean Vector:

$$\bar{x}_1 = \left( \bar{x}_1(1),\ \bar{x}_1(2),\ \ldots,\ \bar{x}_1(d) \right)$$

  where

$$\bar{x}_1(p) = N^{-1} \sum_{i=1}^{N} x_{i1}(p)$$

Site Covariance:

$$C_1(p,q) = (N-1)^{-1} \sum_{i=1}^{N} \left[ x_{i1}(p) - \bar{x}_1(p) \right] \left[ x_{i1}(q) - \bar{x}_1(q) \right]$$

where	p,q   = 1, ... d
          d   = number of bands of input image
          N   = number of pixels in a training site
          x_{i1}(p) denotes the gray level at pixel i at site 1 in
          band p
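
The site formulas above can be sketched in plain Python as follows. This is a minimal illustration, not the STATS source: the function names are hypothetical, and the pixels of a site are assumed to be given as a list of tuples with one gray level per band.

```python
def site_mean(pixels):
    """Mean vector of a site: one mean per band."""
    n = len(pixels)
    d = len(pixels[0])
    return [sum(px[p] for px in pixels) / n for p in range(d)]


def site_covariance(pixels):
    """Covariance matrix of a site using the (N-1) divisor."""
    n = len(pixels)
    if n < 2:
        raise ValueError("at least two pixels are needed for a covariance")
    mean = site_mean(pixels)
    d = len(mean)
    return [[sum((px[p] - mean[p]) * (px[q] - mean[q]) for px in pixels) / (n - 1)
             for q in range(d)]
            for p in range(d)]


# Three pixels from a two-band image.
pixels = [(1, 2), (3, 6), (5, 10)]
mean = site_mean(pixels)        # [3.0, 6.0]
cov = site_covariance(pixels)   # [[4.0, 8.0], [8.0, 16.0]]
```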

The site statistics are then combined to generate class statistics.

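Class Mean Vector:

$$\bar{x}(p) = (NP)^{-1} \left( N_1 \bar{x}_1(p) + N_2 \bar{x}_2(p) + \ldots + N_m \bar{x}_m(p) \right)$$

Class Covariance:

$$C(p,q) = (NP-1)^{-1} \left[ \sum_{i=1}^{m} (N_i - 1)\, C_i(p,q) + \sum_{i=1}^{m} N_i \left[ \bar{x}_i(p) - \bar{x}(p) \right] \left[ \bar{x}_i(q) - \bar{x}(q) \right] \right]$$

where	N_i = number of pixels at site i
        NP  = total number of pixels in a class
        m   = number of sites in a class
        C_i(p,q) and \bar{x}_i(p) are the covariance and mean of site i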

Site histograms are calculated and then combined to form class histograms.
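
The combination of site statistics into class statistics can be sketched as follows. This is an illustration only: the function names are hypothetical, each site is assumed to be supplied as a tuple (number of pixels, mean vector, covariance matrix), and the histogram step assumes that combining means adding counts bin by bin over identical bins, which this guide does not spell out.

```python
def combine_sites(sites):
    """Combine per-site (n, mean, cov) tuples into class (NP, mean, cov)."""
    # Sites whose statistics were zeroed (n == 0) do not contribute.
    sites = [site for site in sites if site[0] > 0]
    total = sum(n for n, _, _ in sites)  # NP
    if total < 2:
        raise ValueError("at least two pixels are needed for class statistics")
    d = len(sites[0][1])
    class_mean = [sum(n * mean[p] for n, mean, _ in sites) / total
                  for p in range(d)]
    class_cov = [[0.0] * d for _ in range(d)]
    for n, mean, cov in sites:
        for p in range(d):
            for q in range(d):
                within = (n - 1) * cov[p][q]
                between = n * (mean[p] - class_mean[p]) * (mean[q] - class_mean[q])
                class_cov[p][q] += (within + between) / (total - 1)
    return total, class_mean, class_cov


def combine_histograms(site_histograms):
    """Add site histograms bin by bin (assumes identical binning)."""
    return [sum(bins) for bins in zip(*site_histograms)]


# Two sites of a two-band class, described by their own statistics.
site_a = (3, [3.0, 6.0], [[4.0, 8.0], [8.0, 16.0]])
site_b = (2, [8.0, 3.0], [[2.0, 2.0], [2.0, 2.0]])
n_total, class_mean, class_cov = combine_sites([site_a, site_b])
```

The result agrees, up to rounding, with the mean and covariance computed directly from the five pooled pixels, which is why site statistics can be merged without revisiting the image.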

Nonfatal Error Messages:

  1. [stats-class] Class not found.. Total 3 chances.. Try again

    The user is given three chances to enter a valid class name.

  2. [stats-site] Site not found.. Total 3 chances.. Try again

    The user is given three chances to enter a valid site name.

  3. [stats-det] Determinant of covariance matrix .LE. 0 for.. CLASS: <class name> SITE: <site name>

    Check data in statistics file.

  4. [stats-diag] Diagonal element .LE. 0 in covariance matrix for.. CLASS: <class name> SITE: <site name> Correlation matrix not computed

    Check data in statistics file.

  5. [stats-info] MINMAX is being called prior to computing histograms

  6. [stats-cut] All pixels cut from.. CLASS: <class name> SITE: <site name>

  7. [stats-data] Input statistics file already contains data

    Either specify new file or overwrite current file.
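
Nonfatal message 4 ties the correlation matrix to the diagonal of the covariance matrix. This guide does not state the correlation formula, so the sketch below assumes the usual definition r(p,q) = C(p,q) / sqrt(C(p,p) C(q,q)); the function name is hypothetical, and returning None stands in for "Correlation matrix not computed".

```python
import math


def correlation_from_covariance(cov):
    """Return the correlation matrix, or None if a diagonal element is <= 0."""
    d = len(cov)
    if any(cov[p][p] <= 0 for p in range(d)):
        return None  # corresponds to the [stats-diag] condition
    return [[cov[p][q] / math.sqrt(cov[p][p] * cov[q][q]) for q in range(d)]
            for p in range(d)]


corr = correlation_from_covariance([[4.0, 8.0], [8.0, 16.0]])
skipped = correlation_from_covariance([[0.0, 0.0], [0.0, 1.0]])
```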

Fatal Error Messages:

  1. [stats-wind] No window is allowed

    The input image name must be specified without a window.

  2. [stats-in] Error accessing input file names

    Check the value given for the parameter IN and rerun.

  3. [stats-nclass] Too many classes requested

    STATS was asked to process more classes than the statistics file holds. Check the value of the parameter NCLASS.

  4. [stats-i/o] Statistics I/O error for CLASS. Check statistics file

  5. [stats-i/o] Statistics I/O error for SITE. Check statistics file

  6. [stats-i/o] Statistics I/O error for # sites. Check statistics file

  7. [stats-cname] Three errors occurred for CLASS name

    The user is given three chances to specify a valid class name. After three failures, the program terminates.

  8. [stats-sname] Three errors occurred for SITE name

    The user is given three chances to specify a valid site name. After three failures, the program terminates.

  9. [stats-maxvert] Maximum number of SITE vertices is 200

    The maximum number of site vertices is 200. Simplify site shapes and rerun program.

  10. [stats-match] Mismatch between number of NBANDS, HIGHVAL, and LOWVAL

    The BANDS, HIGHVAL, and LOWVAL parameters must have the same number of values.

  11. [stats-range] Invalid cutting range, LOWVAL's must be less than HIGHVAL's

    Respecify LOWVAL and HIGHVAL such that LOWVAL(1) < HIGHVAL(1), LOWVAL(2) < HIGHVAL(2), etc.

  12. [stats-band] Invalid band specification...band <XXX> not found

    Cutting was requested on an image band not given as input.

  13. [stats-nosit] No sites within image bounds

    The polygon vertices for all sites fell outside the given image. This occurs most often when the image given for IN is not the image used to generate the polygons. (See User Note 5.)

  14. [stats-wrstat] Error writing data to statistics file

    Check statistics file.

  15. [stats-vert] Polygon vertex is outside image area for.. CLASS: <class name> SITE: <site name>

    The polygon vertices fell outside the given image. This occurs most often when the image given for IN is not the image used to generate the polygons. (See User Note 5.)

  16. [stats-abort] STATS halted by user with no updates to statistics file

    Did not replace current statistics in file.

  17. [stats-open] Unable to open input image file

    Respecify input image file.

  18. [stats-sopen] Unable to open statistics file

    Respecify statistics file.

User Notes:

  1. INSTAT contains the coordinates for the site polygons and may contain statistics as well. Any existing statistics are overwritten if specified via the MEANFLG, COVARFLG, and HISTFLG parameters. If the polygon for a given site falls partially or entirely outside the image, the statistics for that site are all set to 0 and the site is excluded from class level computations.

  2. The -CUT subcommand allows certain value ranges to be eliminated from the computations. All pixels with values between LOWVAL and HIGHVAL, inclusive, in the image band specified in BANDS are excluded. Note that if a pixel is cut from one band, the corresponding pixels in the remaining bands are cut as well. Hence, the number of pixels used for the calculations is the same for all bands.

  3. STATS handles up to 256 bands, 256 classes, and 256 sites per class. STATS does not handle a window.

  4. The BANDS parameter refers to ordinal, as opposed to spectral, band numbers. For example, if IN="IMG(:21,22,23)", the valid values for BANDS are 1, 2, and 3 where BANDS(1) refers to band 21 of IMG, BANDS(2) to band 22 of IMG, and BANDS(3) to band 23 of IMG. The number of values in BANDS must be the same as that in LOWVAL and HIGHVAL.

  5. Polygons may fall outside the input image if IN is not the image used to generate them. The two possible methods of resolving this are:

        o  Use the program EDITSTAT.  Copy the polygons to a new
           statistics file (to preserve the original) and use the
           EDITSITE function within EDITSTAT to make the polygons
           conform to the input image master scene coordinates. 
           (This method is recommended.)
    
        o  Use the program EDITDDR to modify the master line
           and sample coordinates of the input image to conform to 
           the polygons.
    

    The program DSPDDR may be used to show the master line and sample coordinates.
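
The ordinal band numbering described in User Note 4 can be illustrated with a short sketch. The function name is hypothetical, and the list of selected spectral bands is assumed to have already been extracted from the IN specification; no parsing of the IN syntax is attempted here.

```python
def spectral_band(selected_bands, ordinal):
    """Map a 1-based BANDS ordinal to the spectral band it refers to."""
    if not 1 <= ordinal <= len(selected_bands):
        raise ValueError(f"band {ordinal} not found")
    return selected_bands[ordinal - 1]


# IN="IMG(:21,22,23)" selects spectral bands 21, 22, and 23.
selected = [21, 22, 23]
first = spectral_band(selected, 1)   # BANDS value 1 refers to band 21 of IMG
```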