User's Guide

KMEANS

Performs a clustering classification on an input image.

Function:

Performs a clustering classification on an input image. An image of pixel cluster assignments is produced, and the cluster means, standard deviations, and separability information are printed. KMEANS can optionally produce a file of class statistics. The input image may be of any supported data type (see IN); the output classification image is BYTE.

Parameters:

IN
Input image. Data type may be BYTE, INTEGER*2, INTEGER*4, or REAL*4.

OUT
Output cluster assignment image. Contains pixel cluster assignments. Data type is BYTE and values will be in the range of 1 to NCLUST.

OUTSTAT(--)
Output statistics file. Program MASKSTAT will be run using the input image and output classified image to produce the statistics file.

Default: No statistics file is produced.

NCLUST(8)
Number of clusters desired. The possible range is from 2 to 255.

MAXNUMIT(2)
Maximum number of assignment iterations. Program execution will terminate once MAXNUMIT iterations have occurred, or the clustering has converged. Refer to parameter THRPERC for information about convergence.

THRPERC(25)
Threshold value of percent change. If the percentage of pixels changing cluster assignment between iterations falls below this value, the clustering is considered converged and execution ends.

PRINT(LP)
Output destination for class means, standard deviations, and number of members in each class.

  = --:        No report
  = TERM:      Terminal
  = LP:        Line printer
  = filename:  User-specified file

VIEWIT(0)
Iterations at which the cluster statistics are to be printed. If a value of "N" is specified, the statistics will be printed at intervals of N iterations. If left at the default of 0, only the initial and final cluster statistics will be printed.
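
The interaction of MAXNUMIT, THRPERC, and VIEWIT can be illustrated with a minimal Python sketch. It is not the KMEANS source; the function names are hypothetical, and the reading of VIEWIT (report on every Nth iteration, counting from 1) is an assumption based on the parameter descriptions above.

```python
def percent_changed(previous, current):
    """Percentage of pixels whose cluster label differs between two passes."""
    changed = sum(1 for old, new in zip(previous, current) if old != new)
    return 100.0 * changed / len(current)


def should_stop(iteration, pct_changed, maxnumit=2, thrperc=25.0):
    """True once MAXNUMIT passes have run or the change falls below THRPERC."""
    return iteration >= maxnumit or pct_changed < thrperc


def should_print(iteration, viewit=0):
    """Assumed reading of VIEWIT: report every Nth pass when N > 0."""
    return viewit > 0 and iteration % viewit == 0


before = [1, 1, 2, 2, 3, 3, 3, 1]
after = [1, 2, 2, 2, 3, 3, 1, 1]
pct = percent_changed(before, after)  # 2 of the 8 labels differ
print(pct, should_stop(1, pct), should_print(2, viewit=2))
```

In this sketch a change of exactly THRPERC percent does not count as converged, because the description says the percentage must fall below the threshold.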

Examples:

  1. LAS> kmeans in="washdc(100,100,512,512:1,2,3)" out=washdc.clust nclust=10 maxnumit=4

    Cluster classification will be performed on the input image WASHDC using the 512 x 512 window specified, producing an output image WASHDC.CLUST with 10 classes. The clustering will have converged when fewer than 25 percent (the default THRPERC) of the pixels have changed classification. Clustering will stop at convergence or when four iterations have been performed, whichever occurs first. No class statistics file is produced.

  2. LAS> kmeans in="washdc(:1-5)" out=washdc.clust thrperc=30

    Same as above, but with eight clusters (the default), a maximum of two iterations (the default), and a percent-change threshold of 30 percent. The entire image area will be used.

Description/Algorithm:

1. Compute image means and standard deviations.

2. Determine locations of initial cluster centers.

Center for cluster k, for band i =

$$X_i + S_i - (k - 1)\,\frac{2 S_i}{\mathrm{NCLUST} - 1}$$

      where X_i is the mean value of the ith band,
            S_i is the standard deviation of the ith band,
            and k ranges from 1 to NCLUST.

3. Assign data to clusters using minimum Euclidean distance rule.

4. Update cluster centers using assignments made in step 3.

5. Stop iterations if MAXNUMIT has been reached or if the percentage of pixels that changed clusters in step 3 falls below the user-supplied threshold (THRPERC). Otherwise, return to step 3.

6. Compute cluster means, cluster standard deviations, and cluster separability information of final cluster image.
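
Steps 1 through 5 above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the KMEANS implementation: the function names are hypothetical, pixels are given as a list of per-band tuples, the population standard deviation is used in step 1, and a cluster that loses all of its members keeps its previous center.

```python
import math


def band_stats(pixels):
    """Step 1: per-band mean and (population) standard deviation."""
    n = len(pixels)
    bands = list(zip(*pixels))
    means = [sum(band) / n for band in bands]
    stds = [math.sqrt(sum((v - m) ** 2 for v in band) / n)
            for band, m in zip(bands, means)]
    return means, stds


def initial_centers(means, stds, nclust):
    """Step 2: spread centers evenly from mean + std down to mean - std."""
    return [[m + s - (k - 1) * 2.0 * s / (nclust - 1)
             for m, s in zip(means, stds)]
            for k in range(1, nclust + 1)]


def assign(pixels, centers):
    """Step 3: label each pixel with its nearest center (labels start at 1)."""
    return [min(range(len(centers)),
                key=lambda k: math.dist(p, centers[k])) + 1
            for p in pixels]


def update(pixels, labels, centers):
    """Step 4: move each center to the mean of its current members."""
    new_centers = []
    for k, old in enumerate(centers, start=1):
        members = [p for p, lab in zip(pixels, labels) if lab == k]
        # Assumption: an empty cluster keeps its previous center.
        new_centers.append([sum(b) / len(members) for b in zip(*members)]
                           if members else list(old))
    return new_centers


def kmeans(pixels, nclust=8, maxnumit=2, thrperc=25.0):
    means, stds = band_stats(pixels)
    centers = initial_centers(means, stds, nclust)
    labels = [0] * len(pixels)  # 0 means "not yet assigned"
    for _ in range(maxnumit):
        new_labels = assign(pixels, centers)             # step 3
        changed = sum(a != b for a, b in zip(labels, new_labels))
        labels = new_labels
        centers = update(pixels, labels, centers)        # step 4
        if 100.0 * changed / len(pixels) < thrperc:      # step 5
            break
    return labels, centers


pixels = [(0, 0), (1, 1), (0, 1), (10, 10), (11, 11), (10, 11)]
labels, centers = kmeans(pixels, nclust=2)
print(labels)
```

Because the first center starts at mean + standard deviation, cluster 1 ends up on the bright side of the data in this sketch and the highest-numbered cluster on the dark side.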

Cluster separability information is computed using the following equations:

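For the rth and sth clusters,

$$\mathrm{QUOT} = \frac{D_{rs}}{D_r + D_s}$$

where

$$D_{rs} = \left\{ \sum_{i=1}^{\mathrm{DIM}} (u_{ir} - u_{is})^2 \right\}^{1/2}$$

and

$$D_k = (\mathrm{DIM} + 2)^{1/2} \left\{ \frac{\sum_{i=1}^{\mathrm{DIM}} (u_{ir} - u_{is})^2}{\sum_{i=1}^{\mathrm{DIM}} \dfrac{(u_{ir} - u_{is})^2}{s_{ik}^2}} \right\}^{1/2}$$

for k = r, s. Here DIM is the number of bands, u_ik is the mean of cluster k in band i, and s_ik is the standard deviation of cluster k in band i. If QUOT is greater than 1, the clusters are well separated. For QUOT less than 0.75, the clusters can probably be merged without resulting in a multimodal class density. The situation is unclear for QUOT between 0.75 and 1.0. D_r is a measure of the spread of class r in the direction of the mean of class s, and similarly for D_s.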
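
As a rough illustration of the separability measure, the sketch below computes QUOT for one pair of clusters from their per-band means and standard deviations. It assumes that Dk is the square root of (DIM + 2) times the ratio of the sum of squared mean differences to the sum of squared mean differences each divided by the squared standard deviation of cluster k. The function names and the labels returned by interpret() are hypothetical, and nonzero standard deviations are assumed.

```python
import math


def separability(mean_r, std_r, mean_s, std_s):
    """Return QUOT = Drs / (Dr + Ds) for two clusters with per-band stats."""
    dim = len(mean_r)
    diff_sq = [(ur - us) ** 2 for ur, us in zip(mean_r, mean_s)]
    total = sum(diff_sq)
    if total == 0.0:
        return 0.0  # identical means: no separation at all
    d_rs = math.sqrt(total)

    def spread(stds):
        # Spread of one cluster along the line joining the two means.
        scaled = sum(d / s ** 2 for d, s in zip(diff_sq, stds))
        return math.sqrt((dim + 2) * total / scaled)

    return d_rs / (spread(std_r) + spread(std_s))


def interpret(quot):
    """Apply the thresholds quoted in the text."""
    if quot > 1.0:
        return "well separated"
    if quot < 0.75:
        return "merge candidate"
    return "unclear"


quot = separability([0.0], [1.0], [10.0], [1.0])
print(quot, interpret(quot))
```

With a single band and unit standard deviations, each spread reduces to the square root of 3, so two clusters whose means are 10 apart come out well separated.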

Nonfatal Error Messages:

    None.

Fatal Error Messages:

  1. [kmeans_assign] Error allocating dynamic memory

    There was an error allocating memory for the data buffers. If the error persists, contact the system administrator.

User Notes:

    None.