Voser 1998c - Towards Hybrid Analysis - Specification of High Level Analytical GIS Operators
Stefan A. Voser, Stefan Jung 1998
Towards Hybrid Analysis - Specification of High Level Analytical GIS Operators
First AGILE Conference (Association of Geographic Information Laboratories in Europe), 23-25 April 1998, ITC, Enschede (NL), ITC Publications.
Abstract
The demands for interoperability and for the integration of GIS and remote sensing point towards hybrid spatial analysis. This implies the combined processing of spatial data of any representation and data type. Hybrid analysis covers more than the overlay of raster and vector data, although the combined interaction between raster and vector data is the main hybrid requirement for integrating GIS and remote sensing.
Hybrid spatial data processing extends the traditional monomorphic processing of spatial data and has to be embedded into existing functionality. In this sense, hybridity means processing multiple datasets in any kind of data representation and semantics. Many spatial analysis tasks can be realised in either raster or vector structures, depending on the available data. For these types of operations, the extension with hybrid functionality includes geometrically combined processing in both structural representations. Other operations, for instance, need boundaries from the vector space to restrict a raster analysis locally, or vice versa.
To design and develop hybrid algorithms, the semantics of the data must also be considered, because raster data may describe raw satellite imagery, interpreted raster data or terrain models. Because geometric representations can carry such different meanings, an implemented algorithm conforms only to a limited set of semantics. If a spatial operation is to cover a wide range of input data semantics, it needs a polymorphic implementation comprising many algorithms.
Semantics matter least in the processing of data, which is regarded as a low level of abstraction. At the next higher level of abstraction, the controlling level, the semantics of data types and the interpretation of metadata interact and form the core of an automated control of spatial analysis. This level acts as the turntable for managing the different kinds of geometric and structural representations of spatial data.
At the highest level of abstraction, the user's cognition, the application and the task of the operation are central and govern the management of the analysis. Here, the structure and geometric representation of spatial data recede into the background, whereas the handling of content-rich metadata moves to the foreground. Metadata are needed both for the spatial data and for the operation.
These aspects, namely the levels of abstraction (management, controlling and processing) and the components they require, together build up high level spatial analysis operators.
In the following, a generic and conceptual design of such high level spatial analysis operators is presented in order to introduce the surrounding field where hybrid spatial analysis is embedded. At the end, an overview of principles for hybrid analysis is given.
1. Introduction
GIS or environmental analysis tasks are specific to individual Information Communities (ICs) and relate to the spatial questions and problems they have to solve. The corresponding GIS tasks vary in complexity. These tasks can be solved through well-defined GIS-Operations. Users demand a metadata-driven design of processes and operations. For this, a conceptual framework and a generic design of analytical GIS-Operators have to be built up. To fulfil all requirements of spatial analysis, the simultaneous and combined processing of vector and raster data is needed. This in turn calls for a closer look at the functionality of hybrid analysis.
All spatial data analysis systems, including image processing systems and GIS, offer a large number of algorithms and analytical operations to solve spatial problems. For non-experts, this variety and the complexity of the operations, which are mostly technical in nature, often impede fast digital solutions. Necessary system decisions, different data formats, data structures and data models, as well as the lack of compatibility, are additional obstacles to the use of GIS for different spatially related tasks. What is needed is a broad technical toolbox capable of supplying full functionality for a wide range of complex spatial problems. Consequently, there are strong demands to design operators that are closer to the user’s needs but still universal in nature. That requires independence from any Information Community and its specific applications.
During the design of each analysis task, these operations have to be modelled in a data-independent manner. This means that the operations have to be designed at a high level of abstraction as universal analytical GIS-Operators. At this level, the operators are independent of any data catalogue, data type or data structure.
At the task level, which manages the operation according to the application, the operators work only with metadata, and the end-user is responsible for the conceptual task. The operation is defined by the input data and by the task with its underlying functionality. Both are selected from catalogues and their metadata: the input data are chosen from a data catalogue, and the functionality is selected from an operation library and its metadata.
The results are data in a data catalogue, accompanied by new metadata that record the results, description and characteristics of the operation.
Based on this conceptual, technology-independent modelling of an operator, a mapping to the technological tools is needed. For that, the requirements of interoperability in geoprocessing have to be fulfilled. The generic design of the operator’s functionality should be able to process any kind of geographic data type (point, line, area, nodes, edges, meshes, grid, TIN, raster ...).
Especially for the integration and combination of vector and raster technology (e.g. for the integration of GIS and remote sensing), there is an increasing need for hybrid analysis. This paper presents an approach to designing and structuring such universal and generic operators.
An overview is shown in figure 1: an operator needs input, and its results are the output. The input, the operator itself and the output consist of data and metadata, all concerning geometry and semantics (chapter 2). Chapter 3 zooms in on the architecture of such an operator and classifies its levels of abstraction as management, controlling and processing. Chapter 4 specifies the different components, such as the user interface, the plausibility check, data types and algorithms. The descriptive and cognitive control of the operations is based on the metadata for the input, the output and the operator. The relationships and mappings between them have to be known (see chapter 5). The increasing need for hybrid analysis calls for hybrid functionality. The basic principles for that are shown in chapter 6.
Figure 1: The generic design of an Operator
2. The Design of Analytical GIS Operators
The design of analysis operators in GIS predominantly concerns their structure and behaviour. A universal analytical GIS-Operator is characterised by its functionality. It is able to analyse the input data in order to choose the appropriate algorithm. As a result, new output data are generated. A high level operator has to be metadata driven: only metadata are used for selecting data and functionality. The following survey describes the concept of operators and their input and output data. Their relationships are also shown in figure 2.
Input Data
The input data are described through a data catalogue including its metadata and lineage information. The link to their digital representation in the related and implemented data schema has to be known.
Operator
The operator is defined by a generic method, which is specified by its characteristic parameters. The operation is described by the operator's metadata. The operator generates specific metadata as a protocol for the lineage.
The underlying control has to select the correct algorithm to fit the types and structures of the input data. Each algorithm has its own profile. The process control analyses the data in order to determine the right algorithm. The operator is designed in a polymorphic way, which includes hybrid operations.
Output Data
The output data are mapped into a data catalogue which may already exist, or has to be newly generated or extended. It includes metadata and lineage information. Its representation is linked to the underlying data schema.
The output data catalogue may be derived from the operator, or the operator has to meet the conditions of the data model chosen by the user.
The functionality of a GIS-Analysis-Operator can be modelled by the following three categories: geometric converter, semantic translator and metadata processor. Each category concerns a different kind of data.
Geometric Converter
All geographic data have a geometric component. One of the main goals of spatial analysis is to solve geometric (metric and topological) questions. Generally, a GIS analysis produces data with new geometric information. Consequently, a GIS-Analysis-Operator converts the geometric input into the new geometry of the output data.
Semantic Translator
The semantics of the data are given by the Information Community and its conceptual model, with all its attributes and descriptions. They form an overlay on the underlying geometry. The geometric process of the operator is linked to the thematic component of the data: a new theme is generated as the geometric and semantic overlays are combined. The result is new semantics.
Metadata Processor
The operator is controlled and prepared for processing on the basis of the metadata of the input data. The metadata are translated into process control parameters. The process produces new metadata. The generation of the output metadata is controlled by the operation's metadata.
The control of the operator includes the preconditions that are derived from the input data and from the chosen operation with its implemented algorithms. For the output, the postconditions include the requirements of the output data. The operation has to be recorded in a protocol, including the lineage and other metadata.
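The three functional categories can be illustrated with a minimal sketch. It is not taken from the paper: the `Dataset` class, the bounding-box operation and all function names are hypothetical, chosen only to show one operator producing new geometry, new semantics and new metadata (including lineage) from its input.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical catalogue entry: geometry, theme and metadata together."""
    geometry: object          # here: a list of (x, y) points or a bounding box
    theme: str                # thematic meaning given by the Information Community
    metadata: dict = field(default_factory=dict)


def convert_geometry(points):
    """Geometric converter: input points -> bounding box (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def translate_semantics(theme):
    """Semantic translator: derive the theme of the output from the input theme."""
    return f"extent of {theme}"


def process_metadata(input_meta, operator_name):
    """Metadata processor: copy the input metadata and extend the lineage."""
    lineage = list(input_meta.get("lineage", [])) + [operator_name]
    return {**input_meta, "lineage": lineage}


def bounding_box_operator(data):
    """Assumed example operator combining the three categories."""
    return Dataset(
        geometry=convert_geometry(data.geometry),
        theme=translate_semantics(data.theme),
        metadata=process_metadata(data.metadata, "bounding_box"),
    )


wells = Dataset(
    geometry=[(1.0, 2.0), (4.0, 0.5), (3.0, 5.0)],
    theme="wells",
    metadata={"crs": "local", "lineage": ["survey"]},
)
result = bounding_box_operator(wells)
print(result.geometry, result.theme, result.metadata["lineage"])
```

The input dataset is left unchanged; the operator returns a new catalogue entry whose lineage names the operation that produced it.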
Figure 2: The conceptual design of a spatial operator
3. Different Abstraction Levels of Analytical GIS-Operators
The technical implementation of the conceptual design of analysis operations described above calls for concrete solutions. Therefore, the operator is divided into three levels of abstraction. The design is a top-down concept because of its user-oriented and user-friendly approach. At the high level, which concerns management, the operator works with metadata only. This is the level the user is faced with. At the mid level, the control of the operations, including the handling of the data types and the metadata, is performed. At the low level, the algorithms are processed and recorded in a protocol.
The universality of an operator becomes apparent at the highest level of abstraction, because its design there is independent of data structure and data schema. Because the levels are hierarchical, each level interacts directly only with its neighbouring levels. The core of such a universal operator is at the control level. The operator is executed at the processing level.
Management
(High Level)
At the high level, the semantics of the operation are defined. The data to be analysed are selected from the data catalogue with their associated metadata. The operator is specified by its parametrisation, which is also metadata driven. This level manages the whole operator. It is the constituent part of the analysis task. Users define the operation on the basis of their cognitive experience.
The data description at this level is independent of any spatial data structure (e.g. raster or vector) or data schema.
Controlling
(Mid Level)
At the mid level, the control of the operation is carried out. The input data, selected at the high level, are linked to their corresponding data representations at the geometric and structural level. The operator analyses the metadata given at the management level and chooses the algorithm that matches the data types and other requirements. Here, the requirements for hybrid analysis functionality arise because of the different data representations.
At this level, the cognitive semantics of the data and the operation are lost: the information is translated into syntactically structured information for processing, in which its cognitive context is unimportant.
Processing
(Low Level)
The low level is the processing level. The algorithm processes the data numerically; all characteristics of the operation are processed and sent back to the mid level.
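The flow between the three levels can be sketched as follows. This is only an illustration under stated assumptions: the catalogue content, the trivial "count" operation and the names `manage`, `control`, `CATALOGUE` and `ALGORITHMS` are hypothetical and do not come from the paper.

```python
# Hypothetical data catalogue: the metadata record the structural representation.
CATALOGUE = {
    "elevation": {"structure": "raster", "data": [[1, 2], [3, 4]]},
    "parcels": {"structure": "vector", "data": [(0, 0), (2, 0), (2, 2)]},
}


# Processing level: purely numerical, unaware of what the data mean.
def count_raster_cells(grid):
    return sum(len(row) for row in grid)


def count_vector_vertices(vertices):
    return len(vertices)


# Control level: maps (operation, data structure) to an algorithm.
ALGORITHMS = {
    ("count", "raster"): count_raster_cells,
    ("count", "vector"): count_vector_vertices,
}


def control(operation, dataset_name):
    """Mid level: link the catalogue entry to its representation and dispatch."""
    entry = CATALOGUE[dataset_name]
    algorithm = ALGORITHMS[(operation, entry["structure"])]
    value = algorithm(entry["data"])
    # The characteristics of the run are handed back upwards as a protocol.
    protocol = {
        "operation": operation,
        "input": dataset_name,
        "algorithm": algorithm.__name__,
    }
    return value, protocol


def manage(operation, dataset_name):
    """High level: the user names a task and a catalogue entry, nothing else."""
    return control(operation, dataset_name)


print(manage("count", "elevation"))
print(manage("count", "parcels"))
```

The caller of `manage` never states whether the data are raster or vector; that decision is taken at the control level from the catalogue metadata.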
4. Specification of Analytical GIS-Operators
Based on the operator design described above, a specification of a generic high level spatial analysis operator is needed in order to create instances of an operator class "high level operators". An instance of that class could be any high level operator, e.g. buffer, overlay or shortest path. At the current state of development, the specification introduced in this section does not claim to be complete or formal in mathematical terms. Rather, it serves as a more detailed description of the components, as a move towards a formal specification. It is an attempt to outline the general conditions necessary to create high level operators. It therefore has to be regarded as a step towards the realisation of the operator design.
Given the variety of instances of high level operators, their design and specification comprise different aspects. The data structure independence of the management level has to be implemented through data type driven polymorphism at the control level. Owing to this polymorphism, a high level operator is able to execute different algorithms or sequences of algorithms at the processing level.
In addition, cognitive and semantic aspects cannot be neglected in operator design. Yet the structure of the data plays a central role in building high level operators and affects different parts of the specification.
User Interface
The user interface specifies what kind of interaction can take place between the user and the operator. User interaction includes any form of communication between the user and the operator, such as:
operator control
  • Selecting data through a data catalogue
  • Input or determination of parameters for the operator
user support, documentation
  • Access to documentation and description (metadata) of operations and data
  • Transparency of the operator by describing architecture and algorithms
process control, messages
  • User comfort with messages and warnings
  • Error-control

Plausibility Control
Plausibility control of an operator should be able to determine whether an operation is meaningful or not. The representation of this advanced form of semantics has to sit one level above correct data type management. It comprises several aspects.
Semantics Control
Analysis and comparison of semantics in input and output data should enable the operator to distinguish between allowed and forbidden combinations. Predefined semantic results for given input/output combinations could also be attached to the operator (semantic templates).
Property Control
A relation between geometric and semantic properties of data should be established. For example, geometric accuracy control should decide which combination of different resolutions is allowed. Tolerated ranges of accuracy for different semantics of spatial data can be supplied.
Lineage Control
Lineage information attached to output data could include plausibility control documentation and results.
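A plausibility check along these lines might look like the sketch below. The semantic template table, the resolution ratio threshold of 4 and the function name `check_plausibility` are assumptions made for illustration; the paper does not prescribe concrete rules or tolerances.

```python
# Hypothetical semantic templates: allowed theme combinations and their result.
SEMANTIC_TEMPLATES = {
    frozenset({"land cover", "parcels"}): "land cover per parcel",
}


def check_plausibility(theme_a, res_a, theme_b, res_b, max_ratio=4.0):
    """Return (ok, resulting theme or None, report) for two input datasets.

    res_a and res_b are spatial resolutions in the same unit; max_ratio is an
    assumed tolerance for how far they may differ.
    """
    if res_a <= 0 or res_b <= 0:
        raise ValueError("resolutions must be positive")

    report = []

    # Semantics control: is this combination of themes allowed at all?
    template = SEMANTIC_TEMPLATES.get(frozenset({theme_a, theme_b}))
    if template is None:
        report.append(f"no semantic template for {theme_a!r} with {theme_b!r}")

    # Property control: are the resolutions close enough to be combined?
    coarse, fine = max(res_a, res_b), min(res_a, res_b)
    if coarse / fine > max_ratio:
        report.append(f"resolutions differ by more than a factor of {max_ratio}")

    # Lineage control: the report can be attached to the output metadata.
    return (not report), template, report


print(check_plausibility("land cover", 30.0, "parcels", 10.0))
print(check_plausibility("land cover", 1000.0, "parcels", 10.0))
```

The returned report is plain text so that it could be stored with the lineage information of the output data.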

Types
Assuming that spatial data are organised in data types, it has to be specified exactly which data types can be processed by an operator and which data types the operator will produce after processing. Data types distinguish between different representations of geometry within spatial data.
Catalogues
  • Specification of allowed data types and their definitions
  • data model (topological vector model, task specific models, ...)
  • structures (raster, vector, TIN ...)
Combinations
  • List of input-output-couples (combinations of input and output data types related to an operator)
Restrictions
  • Restriction for operator use with regard to the allowed data couples

Polymorphism
Polymorphism means that the same high level operator is backed by different algorithms for different data types. Program structures are created at the control level by decisions made according to data types, user parametrisation and semantic specifications. The following conditions have to be fulfilled:
Uniqueness
  • Non-ambiguous decision rules
Completeness
  • A function must exist for every combination of allowed input data
Correctness
  • A method to test consistency and correctness of decision rules
Extensibility
  • Operations at control level must be extensible to new data types and new algorithms
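The uniqueness and completeness conditions can be checked mechanically if the decision rules are stored as data. The following sketch assumes a hypothetical rule list for an overlay operator; the algorithm names and the helper `check_rules` are illustrative only.

```python
from itertools import product

ALLOWED_TYPES = ["raster", "vector"]

# Decision rules as a list, so that duplicate keys (ambiguity) stay detectable.
RULES = [
    (("raster", "raster"), "cellwise_overlay"),
    (("vector", "vector"), "polygon_overlay"),
    (("raster", "vector"), "hybrid_overlay"),
    (("vector", "raster"), "hybrid_overlay"),
]


def check_rules(rules, allowed_types, arity=2):
    """Return (ambiguous, missing) input type combinations for a rule list."""
    keys = [key for key, _ in rules]
    # Uniqueness: no combination may be claimed by more than one rule.
    ambiguous = sorted({key for key in keys if keys.count(key) > 1})
    # Completeness: every combination of allowed input types needs a rule.
    missing = [
        combo for combo in product(allowed_types, repeat=arity)
        if combo not in keys
    ]
    return ambiguous, missing


print(check_rules(RULES, ALLOWED_TYPES))

# Extensibility: adding a new data type immediately reveals the missing rules.
print(check_rules(RULES, ALLOWED_TYPES + ["tin"]))
```

Running the check again after registering a new data type lists exactly the combinations for which an algorithm still has to be supplied.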

Algorithms
At the processing level, standard software development guidelines have to be followed. An object-oriented approach is planned, since it offers certain advantages for the operator design described here.
Transparency
  • Transparency of algorithms by documentation
Uniqueness
  • Avoiding identical processing and geometric results for different high level operators using the same algorithm
Redundancy Free
  • Non-redundant implementation of algorithms (modules, function libraries, etc.)

Application Specification
An optional extension of high level operators is their restricted and/or expanded use in certain information communities. That would require:
Limitations or Reductions
  • Restriction in functionality according to a reduction of defined operations in addition to the specifications mentioned so far
Specialisation
  • Specialisation of certain operators through the input of domain specific data and functionality (e.g. integration of domain specific rules and algorithms)
5. Metadata
Metadata are used to manage all operators at the highest level of abstraction. At the interface or management level, only metadata control the user interaction. Explicit and implicit metadata are analysed to control the operator at the mid level. The metadata have different meanings at the three levels described in chapter 3.
In the following, the processing of metadata for high level-GIS-operators is divided into two parts: metadata of spatial data and metadata of the operator.
Metadata of Data
The metadata describe the data at a high level of abstraction in a data catalogue with its related information. The information is the link to the database and to its data schema in which all other information is stored implicitly.
Management
(High Level)
The metadata of the data describe the content and the conceptual organisation of the data.
Controlling
(Mid Level)
At the mid level, the semantics of the high level metadata have no meaning. The information about the data types in which the data are represented is mainly important for the control of the respective polymorphism. The main aim at the mid level is the control of the structural information of the input and output data. This includes the control of the polymorphism of the operator.
The operator transforms the semantics of the metadata to parameters used for processing, whereas the semantics of the data is of no relevance for the parameters of a process.
Processing
(Low Level)
At the low level, metadata focus on the values and categories of the parameters. The characteristic parameters of the operator and their values are instantiated and assigned to the data to be processed.
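The translation of descriptive metadata into process parameters can be sketched with a small example. It assumes a hypothetical buffer task whose distance is given in the user's unit and a dataset whose metadata state the coordinate unit; the key names and the function `to_process_parameters` are not from the paper.

```python
# Assumed conversion table between the units that may occur in the metadata.
UNIT_TO_METRES = {"m": 1.0, "km": 1000.0}


def to_process_parameters(task_meta, data_meta):
    """Mid level: turn task and data metadata into plain numeric parameters.

    The result carries no semantics; it is only a value in the coordinate
    unit of the data, ready to be used at the processing level.
    """
    distance_m = task_meta["distance"] * UNIT_TO_METRES[task_meta["distance_unit"]]
    return {"distance": distance_m / UNIT_TO_METRES[data_meta["coordinate_unit"]]}


# High level: what the user states about the task and what the catalogue knows.
task_metadata = {"operation": "buffer", "distance": 0.5, "distance_unit": "km"}
data_metadata = {"theme": "rivers", "coordinate_unit": "m"}

params = to_process_parameters(task_metadata, data_metadata)
print(params)
```

The theme "rivers" plays no part in the derived parameter, which mirrors the statement that the semantics of the data are of no relevance for the parameters of a process.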
Metadata of Operators
The operator metadata control their spatial analysis process. They map and process the metadata of input and output and protocol the operation and the lineage.
Management
(High Level)
The metadata of the operators describe the task, the functional behaviour and the required information. The operator's metadata include the semantics, which are related to the data.
Controlling
(Mid Level)
The operator's metadata control the operation. At this level, the mapping of the data types to the algorithm of the polymorphic implementation takes place.
Processing
(Low Level)
This level generates the input data for the lineage of the operation which is related to the derived data.
6. Hybrid Analysis
Hybrid analysis is part of the polymorphic implementation of GIS analysis operations. Hybrid analysis is needed for the integration of GIS and remote sensing, for terrain analysis (DTMs in raster format overlaid with vector data), etc.
As input data, we have raster and vector data, which have to interact correctly. The result of a hybrid analysis is new raster data, new vector data or both.
Some examples of hybrid analysis functionality
  • Fencing off raster processes by the overlay of vector data
  • Verification of geometry and attributes
  • Transferring extracted geometry from raster to vector
  • Transferring attributes from image interpretation to vector data
  • Transferring attributes from vector databases to raster data
  • Using raster information for determination of uncertainty of vector data
  • Equalising Data
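The first item, fencing off a raster process by a vector overlay, can be sketched without any conversion of the inputs. The example assumes a raster stored as a list of rows with its origin at (0, 0) and square cells, and a polygon given as a list of vertices; the ray casting test and the function names are illustrative choices, not the authors' algorithm.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting test; polygon is a list of (x, y) vertices, not closed."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does the edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_mean(grid, cell_size, polygon):
    """Mean of the raster cells whose centres lie inside the polygon.

    Assumes row 0 starts at y = 0 and column 0 at x = 0. Returns the mean
    (or None if no cell is selected) and the number of cells used.
    """
    values = []
    for row, line in enumerate(grid):
        for col, value in enumerate(line):
            centre_x = (col + 0.5) * cell_size
            centre_y = (row + 0.5) * cell_size
            if point_in_polygon(centre_x, centre_y, polygon):
                values.append(value)
    mean = sum(values) / len(values) if values else None
    return mean, len(values)


raster = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
fence = [(0, 0), (2, 0), (2, 2), (0, 2)]   # vector polygon limiting the analysis

print(zonal_mean(raster, 1.0, fence))
```

The cell count returned alongside the mean is the kind of overlay statistic that could be recorded in the metadata of the result.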

The complexity of the hybrid functionality may be shown by the following survey:
Projecting Information
  • geometric overlay
  • semantic overlay
Combined Analysis
  • derivation by aggregation
  • derivation by interaction
  • accumulation by information transfer
Knowledge-based Conversion
  • knowledge-based conversion
  • extraction of objects

The full range of spatial analysis can be reached with operations that interact with raster and vector data without the need for conversion. The requirements of hybrid analysis are divided into geometric interaction, semantic analysis and metadata processing.
Geometric Analysis
  • positional location: generating the relation of co-ordinates of different data and identifying identical positions of geometric primitives
  • linking geometric features: identifying features covering the same location
  • extracting and transferring geometry: generating new geometry by extraction, transfer and interpolation
Semantic Analysis
  • geometric interpretation: giving new semantics to the data
  • thematic analysis: linking, projecting and deriving to integrate new information
  • combination of geometric and thematic interpretation: combined analysis of geometric and thematic information
Metadata Processing
  • analysis of metadata
  • description of the geometric, semantic and combined analysis
  • description of the semantic analysis
  • statistics about geometric overlay
Figure 3: The hybrid principle
Conclusion
The concept of universal analytical GIS-Operators independent of Information Communities has been outlined and specified. Universality in this context includes domain independence as well as the aim to work with data. This can only be achieved by metadata-driven user interaction.
To achieve the desired data structure independence of universal operators, hybrid analysis techniques have to be integrated. Owing to the currently increasing demand for the integration of remote sensing and GIS, hybrid analysis is gaining significance. Our future work will focus on a detailed specification of hybrid analysis operations and their prototype implementation.