Spatial Analysis Functions and Data Management in GIS

Spatial Analysis Functions

Focal Functions

Focal Sum

Assigns the sum of the subject variable’s values within each cell’s neighborhood in the input layer to the corresponding cell in the output layer. Requires an input layer of at least interval type and outputs a ratio variable. Useful as an intermediate step for determining local densities.
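
As an illustration, the neighborhood sum can be sketched with NumPy. The square window, the `radius` parameter, and the zero padding at the layer edges are assumptions of this sketch, not part of the definition above.

```python
import numpy as np

def focal_sum(layer, radius=1):
    """Sum of the values in the square window centered on each cell."""
    layer = np.asarray(layer, dtype=float)
    # Assumption: cells outside the layer contribute zero to the sum.
    padded = np.pad(layer, radius, mode="constant", constant_values=0)
    rows, cols = layer.shape
    out = np.zeros_like(layer)
    # Add one shifted copy of the layer per position in the window.
    for dr in range(2 * radius + 1):
        for dc in range(2 * radius + 1):
            out += padded[dr:dr + rows, dc:dc + cols]
    return out

grid = np.array([[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1]])
result = focal_sum(grid)
```

Dividing the result by the window area would turn the sum into a local density.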

Focal Percentile

Assigns the percentage of cells with lower values within a cell’s neighborhood in the input layer to each output cell. Applicable to ordinal variables and higher. Useful for locating local maxima and minima.
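
A minimal sketch of this calculation, assuming a square window clipped at the layer edges and a percentage taken over all window cells (the focal cell included in the total). Both choices are assumptions of the example.

```python
import numpy as np

def focal_percentile(layer, radius=1):
    """Percentage of window cells whose value is lower than the focal cell."""
    layer = np.asarray(layer, dtype=float)
    rows, cols = layer.shape
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            # Window clipped to the layer extent.
            window = layer[max(r - radius, 0):r + radius + 1,
                           max(c - radius, 0):c + radius + 1]
            lower = np.count_nonzero(window < layer[r, c])
            out[r, c] = 100.0 * lower / window.size
    return out

grid = np.array([[1, 2, 3],
                 [4, 9, 6],
                 [7, 8, 5]])
pct = focal_percentile(grid)
```

Cells scoring 0 are not exceeded from below by any neighbor (local minima); the highest scores mark local maxima.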

Focal Variety

Assigns the number of distinct values within a cell’s immediate neighborhood in the input layer to each output cell. Works with all variable types and is suitable for approximating zone boundaries.
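
The distinct-value count can be sketched as follows. The function name `focal_distinct_count` and the clipped 3 x 3 window are assumptions of this example.

```python
import numpy as np

def focal_distinct_count(layer, radius=1):
    """Number of distinct values in the window around each cell."""
    layer = np.asarray(layer)
    rows, cols = layer.shape
    out = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            window = layer[max(r - radius, 0):r + radius + 1,
                           max(c - radius, 0):c + radius + 1]
            out[r, c] = np.unique(window).size
    return out

zones = np.array([[1, 1, 2, 2],
                  [1, 1, 2, 2],
                  [1, 1, 2, 2]])
variety = focal_distinct_count(zones)
```

Cells with a count above 1 lie next to a different value, so they trace the boundary between the two zones.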

Focal Minimum

Assigns the minimum value within a cell’s immediate neighborhood in the input layer to each output cell. Requires at least ordinal data and helps locate local minima.

Focal Proximity

Assigns the Euclidean distance to the nearest non-zero cell in the input layer to each output cell. The input can be of any variable type; the output is a ratio variable.

Focal Neighbor

Assigns the value of the nearest non-zero cell in the input layer to each output cell. Works with all variable types.
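
This function and the distance-based one before it rest on the same search for the nearest non-zero cell, so one brute-force sketch can return both layers. The name `nearest_nonzero`, the `cell_size` parameter, and the tie-breaking rule (first candidate in row-major order) are assumptions of the example; distances are measured between cell centers.

```python
import numpy as np

def nearest_nonzero(layer, cell_size=1.0):
    """Return (distance, value) layers for the nearest non-zero cell."""
    layer = np.asarray(layer)
    targets = np.argwhere(layer != 0)            # (k, 2) row/col pairs
    if len(targets) == 0:
        raise ValueError("layer has no non-zero cells")
    rows, cols = np.indices(layer.shape)
    cells = np.stack([rows.ravel(), cols.ravel()], axis=1)   # (n, 2)
    diffs = cells[:, None, :] - targets[None, :, :]          # (n, k, 2)
    dist = np.sqrt((diffs ** 2).sum(axis=2))                 # (n, k)
    nearest = dist.argmin(axis=1)   # ties resolve to the first target
    distance = dist.min(axis=1).reshape(layer.shape) * cell_size
    value = layer[targets[nearest, 0], targets[nearest, 1]].reshape(layer.shape)
    return distance, value

layer = np.array([[0, 0, 7],
                  [0, 0, 0],
                  [5, 0, 0]])
distance, value = nearest_nonzero(layer)
```

The pairwise distance matrix grows with cells times targets, so this is only suitable for small layers.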

Focal Insularity

Assigns a unique identifier to each group of contiguous cells that share the same value in a layer. Disjoint groups with the same input value receive different identifiers, so they appear as separate areas in the output layer.
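
The grouping amounts to connected-component labeling. The sketch below uses a breadth-first flood fill and assumes 4-connectivity (edge neighbors only); whether diagonal neighbors count as contiguous is not stated above.

```python
from collections import deque

def label_regions(grid):
    """Give each group of contiguous equal-valued cells its own identifier."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]   # 0 means "not labeled yet"
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                continue
            next_id += 1
            labels[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                # Assumption: only the four edge neighbors are contiguous.
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not labels[nr][nc]
                            and grid[nr][nc] == grid[cr][cc]):
                        labels[nr][nc] = next_id
                        queue.append((nr, nc))
    return labels

grid = [["A", "B", "A"],
        ["A", "B", "A"],
        ["B", "B", "A"]]
labels = label_regions(grid)
```

Here the two separate "A" groups end up with identifiers 1 and 3, while the single "B" group gets 2.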

Focal Gravitation

Assigns the weighted average of non-zero input cell values to each output cell, weighted by the inverse square of the distance. Applies to numeric (interval or ratio) variables and uses an extended neighborhood.
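
A rough sketch of the inverse-square weighting, assuming the extended neighborhood is the whole layer and that a non-zero cell simply keeps its own value (its distance to itself is zero, so the weight is undefined). Both points are assumptions of this example.

```python
import numpy as np

def focal_gravitation(layer):
    """Average of non-zero cells, weighted by 1 / distance squared."""
    layer = np.asarray(layer, dtype=float)
    sources = np.argwhere(layer != 0)
    if len(sources) == 0:
        raise ValueError("layer has no non-zero cells")
    values = layer[sources[:, 0], sources[:, 1]]
    out = np.empty_like(layer)
    for r in range(layer.shape[0]):
        for c in range(layer.shape[1]):
            d2 = (sources[:, 0] - r) ** 2 + (sources[:, 1] - c) ** 2
            if np.any(d2 == 0):
                out[r, c] = layer[r, c]      # assumption: source cells keep their value
            else:
                weights = 1.0 / d2
                out[r, c] = np.sum(weights * values) / np.sum(weights)
    return out

layer = np.array([[10, 0, 0, 0, 20]])
out = focal_gravitation(layer)
```

The middle cell is equidistant from both sources and gets their plain average, while cells closer to one source are pulled toward its value.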

Local Functions

Local Mean

Computes the average value of corresponding cells across multiple input layers for each output cell. Input layers must be numeric (interval or ratio).

Local Maximum

Computes the maximum value of corresponding cells across multiple input layers for each output cell. Input layers must be at least ordinal.

Local Mode

Computes the modal value of corresponding cells across multiple input layers for each output cell. Input layers can be of any variable type, nominal included.
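
The three local functions can be sketched by stacking the layers and reducing along the stack axis. The layer names and the tie rule for the mode (first value encountered wins) are assumptions of the example.

```python
from collections import Counter

import numpy as np

def local_mode(stack):
    """Most frequent value per cell across the stacked layers."""
    stack = np.asarray(stack)
    _, rows, cols = stack.shape
    out = np.empty((rows, cols), dtype=stack.dtype)
    for r in range(rows):
        for c in range(cols):
            # Assumption: ties go to the value seen first in layer order.
            out[r, c] = Counter(stack[:, r, c].tolist()).most_common(1)[0][0]
    return out

layer_a = np.array([[1, 2], [3, 4]])
layer_b = np.array([[1, 5], [3, 0]])
layer_c = np.array([[4, 5], [6, 4]])
layers = np.stack([layer_a, layer_b, layer_c])   # shape (3, 2, 2)

local_mean = layers.mean(axis=0)
local_max = layers.max(axis=0)
mode = local_mode(layers)
```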

Zonal Functions

Zonal Sum

Assigns the sum of input layer values within each zone to all cells in that zone in the output layer.

Zonal Mean

Assigns the average of input layer values within each zone to all cells in that zone in the output layer. Applicable to interval or ratio data.

Zonal Minimum, Maximum, and Mode

Assigns the minimum, maximum, or modal value of input layer values within each zone to all cells in that zone in the output layer.
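
All of the zonal functions above follow one pattern: reduce the input values inside each zone, then write the result back to every cell of that zone. A minimal sketch, where the helper name `zonal` and the sample layers are assumptions:

```python
import numpy as np

def zonal(values, zones, reducer):
    """Apply `reducer` per zone and broadcast the result to the zone's cells."""
    values = np.asarray(values, dtype=float)
    zones = np.asarray(zones)
    out = np.empty_like(values)
    for zone_id in np.unique(zones):
        mask = zones == zone_id
        out[mask] = reducer(values[mask])
    return out

values = [[1, 2, 3],
          [4, 5, 6]]
zones = [[1, 1, 2],
         [1, 2, 2]]

zonal_sum = zonal(values, zones, np.sum)
zonal_mean = zonal(values, zones, np.mean)
zonal_max = zonal(values, zones, np.max)
```

Passing `np.min` gives the zonal minimum in the same way; a mode would need a custom reducer.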

Incremental Gradient

Calculates spatial variation by relating explicit variable values to implicit distance within a cell’s neighborhood.
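
One common reading of this definition is slope: the change in cell value (explicit) divided by the distance between cell centers (implicit in the grid spacing). The sketch below takes that reading as an assumption and uses NumPy's finite-difference gradient; `cell_size` is a hypothetical parameter.

```python
import numpy as np

def gradient_magnitude(surface, cell_size=1.0):
    """Rate of change of the surface per unit distance at each cell."""
    surface = np.asarray(surface, dtype=float)
    # Finite differences along rows and columns, scaled by the cell spacing.
    d_row, d_col = np.gradient(surface, cell_size)
    return np.hypot(d_row, d_col)

# A plane that rises 2 units per cell toward the east.
surface = np.array([[0, 2, 4],
                    [0, 2, 4],
                    [0, 2, 4]])
slope = gradient_magnitude(surface)
```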

Raster Data Variables

  • Ratio: Values on a calibrated scale with a fixed origin, where ratios between values are meaningful.
  • Interval: Values on a calibrated scale without a fixed origin, where differences between values are meaningful.
  • Ordinal: Values on a non-calibrated scale, where comparisons of greater/lesser are meaningful.
  • Nominal: Values representing qualities, not quantities, with no numerical relationship.

GIS Data Management

Data Dictionaries

Data dictionaries define data structures and facilitate interpretation of GIS data files.

Spatial Data Storage in Relational Databases

  • Standard Data Types: Each column stores a single coordinate value. Point, line, and polygon tables can be linked to entity tables.
  • Bitstream Columns: Store complete coordinate sequences for an entity in a single column, either in the same table or a separate geometry table.
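
The two storage options can be compared in a small SQLite sketch. The table names (`entity`, `vertex`, `geometry`), the column names, and the packing of coordinates as little-endian doubles are all assumptions of this example, not a standard layout.

```python
import sqlite3
import struct

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT)")
# Option 1: one coordinate value per column, one row per vertex.
conn.execute("CREATE TABLE vertex (entity_id INTEGER REFERENCES entity(id), "
             "seq INTEGER, x REAL, y REAL)")
# Option 2: the whole coordinate sequence in a single binary column.
conn.execute("CREATE TABLE geometry (entity_id INTEGER REFERENCES entity(id), "
             "coords BLOB)")

line = [(0.0, 0.0), (5.0, 2.5), (10.0, 0.0)]
conn.execute("INSERT INTO entity VALUES (1, 'road')")
conn.executemany("INSERT INTO vertex VALUES (1, ?, ?, ?)",
                 [(i, x, y) for i, (x, y) in enumerate(line)])
flat = [v for point in line for v in point]
conn.execute("INSERT INTO geometry VALUES (1, ?)",
             (struct.pack(f"<{len(flat)}d", *flat),))

# Reading back: option 1 needs an ordered multi-row query,
# option 2 needs one row plus decoding in the application.
rows = conn.execute("SELECT x, y FROM vertex WHERE entity_id = 1 "
                    "ORDER BY seq").fetchall()
(stored,) = conn.execute("SELECT coords FROM geometry "
                         "WHERE entity_id = 1").fetchone()
unpacked = struct.unpack(f"<{len(stored) // 8}d", stored)
decoded = list(zip(unpacked[0::2], unpacked[1::2]))
```

The per-column layout keeps coordinates queryable with plain SQL; the binary column fetches a whole geometry in one read but is opaque to the database.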

Fictitious Polygons (Slivers)

Narrow, elongated polygons resulting from geometric mismatches. Can be identified by a low area-to-perimeter-squared ratio.
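
The ratio can be computed from the vertex list with the shoelace formula. In this sketch the function name `compactness` is hypothetical, and any cut-off used to flag slivers would have to be chosen per dataset.

```python
import math

def compactness(vertices):
    """Area divided by perimeter squared for a simple polygon."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]       # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1      # shoelace term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return (abs(twice_area) / 2.0) / perimeter ** 2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sliver = [(0, 0), (100, 0), (100, 0.1), (0, 0.1)]

square_ratio = compactness(square)
sliver_ratio = compactness(sliver)
```

The ratio is scale-independent: a square always yields 1/16, while the 100 x 0.1 strip falls far below it.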

Basic Spatial Operators

A complete and minimal set of spatial operators, where for any spatial location, one and only one operator applies.

Entity-Relationship Model

  • Roles: Function of each entity in the relationship.
  • Cardinality: Maximum number of related entities (1:1, 1:M, M:M).
  • Necessity-Contingency: Whether the relationship is mandatory or optional for each entity.

Spatial Indexes

Accelerate spatial queries (locate, locate by site, link by spatial relationship) by associating cell identifiers with entity identifiers.
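
A uniform grid is one simple way to associate cell identifiers with entity identifiers. The sketch below assumes each entity is represented by its bounding box and that cells are identified by integer (column, row) pairs; all names are hypothetical.

```python
from collections import defaultdict

def build_grid_index(entities, cell_size):
    """Map each grid cell to the ids of entities whose bounding box touches it.

    entities: {entity_id: (xmin, ymin, xmax, ymax)}
    """
    index = defaultdict(set)
    for entity_id, (xmin, ymin, xmax, ymax) in entities.items():
        for cx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
            for cy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
                index[(cx, cy)].add(entity_id)
    return index

def candidates_at(index, x, y, cell_size):
    """Entities that may contain the point (x, y)."""
    return index.get((int(x // cell_size), int(y // cell_size)), set())

entities = {"road": (0, 0, 25, 5), "lake": (12, 12, 18, 18)}
index = build_grid_index(entities, cell_size=10)
hits = candidates_at(index, 15, 15, cell_size=10)
```

The lookup only returns candidates; an exact geometric test on that short list is still needed to confirm each match.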

Composite Entities

Groups of entities whose members may themselves be composite entities (recursive grouping). Used for representing complex geometries, multiple geometries, or entities with varying attribute values.

Data Updates

  • Smallest unit for update/selection: Column
  • Smallest unit for insertion/deletion: Row

Data Integration

  • Format conversion and classification unification.
  • Point density adjustment for lines during projection transformations.
  • Projection and coordinate system transformations (linear or nonlinear).
  • Geometric editing for residual discrepancies.
  • Partition reorganization.

Harmonization

  • Vertical: Unifying overlapping datasets.
  • Horizontal: Adjusting boundaries between adjacent datasets.

Error Detection in Vertical Harmonization

  • Common Area: Comparing intersection and union areas.
  • Elongation Ratio: Comparing area-to-perimeter-squared ratios.
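
The common-area check can be illustrated with axis-aligned rectangles, for which intersection and union areas are easy to compute. Real polygons would need a geometry library; the rectangle representation and the name `overlap_ratio` are assumptions of this sketch.

```python
def overlap_ratio(a, b):
    """Intersection area divided by union area for two boxes.

    Each box is (xmin, ymin, xmax, ymax).
    """
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    intersection = max(width, 0) * max(height, 0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union

parcel_v1 = (0, 0, 10, 10)
parcel_v2 = (1, 0, 11, 10)     # same parcel, shifted one unit east
ratio = overlap_ratio(parcel_v1, parcel_v2)
```

A ratio near 1 suggests the two versions represent the same entity; a low ratio points to a mismatch worth inspecting.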

Candidate Entity Pairs for Horizontal Harmonization

  • Angle: Maximum deviation from a straight line.
  • Maximum Distance: Maximum separation between line ends.
  • Minimum Length: Minimum length of each entity in mismatch compensation.

Normal Forms

  • First: Atomic values, no duplicate rows, order irrelevant.
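  • Second: First normal form, and every non-key attribute is fully functionally dependent on the primary key.
  • Third: Second normal form, and no non-key attribute depends transitively on the primary key.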
  • Boyce-Codd: All determinants are candidate keys.