Stefan A. Voser, Stefan Jung 1998
Towards Hybrid Analysis - Specification of High Level Analytical GIS Operators
First AGILE Conference (Association of Geographic Information Laboratories in Europe),
23-25 April 1998, ITC, Enschede (NL), ITC-Publications.
Abstract
The demands for interoperability and for the integration of GIS and Remote Sensing point
towards hybrid spatial analysis. This implies the combined processing of spatial data of any
representation and data type. Hybrid analysis covers more than the overlay of raster and vector
data, although the combined interaction between raster and vector data is the main hybrid
need for the integration of GIS and Remote Sensing.
Hybrid spatial data processing is an extension of the traditional monomorphic processing of
spatial data and has to be embedded into the existing functionality. In this sense, hybridity means
the processing of multiple datasets in any kind of data representation and semantics. Many spatial
analysis tasks may be realised either in raster or in vector structures, depending on the available
data. For these types of operations, the extension with hybrid functionality includes the
geometrically combined processing in both structural representations. Other operations, for
instance, need limiting boundaries from the vector space to process raster analysis locally, or vice
versa.
To design and develop hybrid algorithms, the semantics of the data must also be considered,
because raster data may describe raw satellite imagery, interpreted raster data or terrain models.
Because geometric representations carry such different meanings, an implemented algorithm conforms
only to a limited set of semantics. If the functionality of a spatial operation is to cover a wide
range of input data semantics, the operation needs a polymorphic implementation comprising many
algorithms.
Semantics matters least for the processing of data itself, which is regarded as a low level of
abstraction. At the next higher level of abstraction, the controlling level, the semantics of data
types and the interpretation of metadata interact and form the core of an automated control of
spatial analysis. This level acts as the turntable for managing the different kinds of geometric and
structural representations of spatial data.
At the highest level of abstraction, the user's cognition, the application and the task of the
operation are central and govern the management of the analysis. Here, the structure and the
geometrical representation of spatial data are pushed into the background, whereas the handling of
content-rich metadata is in the foreground. The metadata is needed both for the spatial data and for
the operation.
These different aspects, namely the levels of abstraction (management, controlling and processing)
and the components they require, together build up high level spatial analysis operators.
In the following, a generic and conceptual design of such high level spatial
analysis operators is
presented in order to introduce the surrounding field where hybrid spatial analysis is embedded. At
the end, an overview of principles for hybrid analysis is given.
|
|
GIS or environmental analysis tasks are specific to individual Information Communities (ICs) and
related to the spatial questions and problems they have to solve. The corresponding GIS tasks vary
widely in complexity. They can be solved through well-defined GIS-Operations. Users demand a
metadata-driven design of processes and operations. For this, a conceptual framework and a
generic design of analytical GIS-Operators have to be built up. To fulfil all requirements of spatial
analysis, the simultaneous and combined processing of vector and raster data is needed. To get
there, a closer look at the functionality of hybrid analysis is needed.
All spatial data analysis systems, including image processing and GIS, offer a large number of
algorithms and analytical operations to solve spatial problems. For non-experts, this variety and
the complexity of the operations, which are mostly technical in nature, often impede fast digital
solutions. Necessary system decisions, different data formats, data structures and data models, as
well as the lack of compatibility, are additional obstacles to the use of GIS for different spatially
related tasks. What is needed is a broad technical toolbox that is capable of supplying full
functionality for a wide range of complex spatial problems. Consequently, there are strong demands
to design operators closer to the user’s needs, but still of a universal nature. That requires
independence from any Information Community and its specific applications.
During the design of each analysis task, these operations have to be modelled in a data-independent
manner. This means that the operations have to be designed at a high level of abstraction
as universal analytical GIS-Operators. At this level, the operators are independent of any data
catalogue, data type or data structure.
At the task level, which manages the operation according to the application, the operators work
only with metadata, and the end-user is responsible for the conceptual task. The operation is
defined by the input data and by the task with its underlying functionality. Both are selected from
catalogues and their metadata: the input data is chosen from a data catalogue, the functionality is
selected from an operation library and its metadata.
The results are data in a data catalogue, with new metadata attached that contain the results,
descriptions and characteristics of the operation.
Based on this conceptual, technology-independent modelling of an operator, a mapping to the
technological tools is needed. For that, the requirements of interoperability in geoprocessing have
to be fulfilled. The generic design of the operator’s functionality should be able to process any
kind of geographic data type (point, line, area, nodes, edges, meshes, grid, tin, raster ...).
Especially for the integration and combination of vector and raster technology (e.g. for the
integration of GIS and remote sensing), there is an increasing need for hybrid analysis. This paper
presents an approach to designing and structuring such universal and generic operators.
An overview is shown in figure 1: An operator needs input, and its results are the output. Input,
the operator itself and the output consist of data and metadata, all concerning geometry and
semantics (chapter 2). Chapter 3 zooms in on the architecture of such an operator, where the
different levels of abstraction are classified as management, controlling and processing. Chapter 4
describes specifications of different components such as the user interface, plausibility check,
data types and algorithms. The descriptive and cognitive control of the operations is based on
the metadata for input, output and the operator. The relationships and mapping between them have
to be known (see chapter 5).
The increasing need for hybrid analysis calls for hybrid functionality. The basic principles for
that are shown in chapter 6.
|
|
|
|
The design of Analysis-Operators in GIS predominantly concerns their structure and behaviour. A
universal analytical GIS-operator is characterised by its functionality. It is able to analyse the
input data in order to choose the appropriate algorithm. As a result, new output data are
generated. A High-Level-Operator has to be metadata driven: for selecting data and functionality,
only metadata are used. The following survey describes the concept of operators and their input
and output data. Their relationships are also shown in figure 2.
Input Data
|
The input data are described through a data catalogue including its metadata
and lineage information. The link to their digital representation in the related
and implemented data schema has to be known.
|
Operator
|
The operator is defined by a generic method, which is specified by its
characteristic parameters. The operation is described by the operator's
metadata. The operator generates specific metadata as a protocol for the
lineage.
The underlying control has to select the correct algorithm that fits the types
and structures of the input data. Each algorithm has its own profile. The
process control analyses the data in order to find the right algorithm. The
operator is designed in a polymorphic way, which includes hybrid operations.
|
Output Data
|
The output data are mapped into a data catalogue which may already exist,
or has to be newly generated or extended. It includes metadata and lineage
information. Its representation is linked to the underlying data schema.
The output data catalogue may be derived from the operator, or the operator
has to meet the conditions of the data model chosen by the user.
|
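The relationship between input data, operator and output data can be illustrated with a minimal sketch. It is not part of the specification: the names `CatalogueEntry`, `Operator`, `scale_values` and all example values are assumptions made for illustration only. The sketch shows an operator that is defined by a generic method and its parameters and that writes a lineage entry into the metadata of its output.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class CatalogueEntry:
    """A dataset as seen through the data catalogue (hypothetical structure)."""
    name: str
    data_type: str                                   # e.g. "raster" or "vector"
    metadata: Dict[str, Any] = field(default_factory=dict)
    lineage: List[str] = field(default_factory=list)
    payload: Any = None                              # stand-in for the stored data


@dataclass
class Operator:
    """A generic method, its characteristic parameters and a name."""
    name: str
    parameters: Dict[str, Any]
    method: Callable[[Any, Dict[str, Any]], Any]

    def apply(self, source: CatalogueEntry, output_name: str) -> CatalogueEntry:
        result = self.method(source.payload, self.parameters)
        # The operator records what it did so the output carries its lineage.
        step = f"{self.name}({source.name}, {self.parameters})"
        return CatalogueEntry(
            name=output_name,
            data_type=source.data_type,
            metadata={**source.metadata, "derived_by": self.name},
            lineage=source.lineage + [step],
            payload=result,
        )


def scale_values(values, params):
    """Toy method: multiply every value by a factor."""
    return [v * params["factor"] for v in values]


# Made-up example data, for illustration only.
heights = CatalogueEntry("heights", "raster", {"unit": "m"}, ["field survey"], [1.0, 2.0])
doubled = Operator("scale", {"factor": 2}, scale_values).apply(heights, "heights_x2")
```

The input entry is left unchanged; the output is a new catalogue entry whose lineage extends that of the input by one step.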
The functionality of a GIS-Analysis-Operator can be modelled by the following three categories:
geometric converter, semantic translator and metadata processor. Each category concerns
a different kind of data.
Geometric Converter
|
All geographic data have a geometric component. One of the main goals of
spatial analysis is to solve geometric (metric and topological) questions.
Generally a GIS analysis produces data with new geometric information.
Consequently a GIS-Analysis-Operator converts the geometric input into the
new geometry of the output data.
|
Semantic Translator
|
The semantics of the data is given by the Information Community and its
conceptual model with all its attributes and descriptions. It is an overlay on
the underlying geometry. The geometric process of the operator is linked to
the thematic component of the data: a new theme is generated during the
combination process with the geometric and semantic overlay. The result is a
new semantics.
|
Metadata Processor
|
The operator is controlled and prepared for processing based on the
metadata of the input data. The metadata is translated into process control
parameters. The process produces new metadata. The generation of the
output metadata is controlled by the operation's metadata.
|
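As an illustration of the three categories, the following sketch splits a very simple overlay into a geometric, a semantic and a metadata part. It assumes axis-aligned bounding boxes as geometry and plain dictionaries as metadata; the function names and the metadata keys `name` and `accuracy_m` are hypothetical.

```python
from typing import Any, Dict, Optional, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def convert_geometry(a: Box, b: Box) -> Optional[Box]:
    """Geometric converter: the intersection of two boxes, or None if disjoint."""
    xmin, ymin = max(a[0], b[0]), max(a[1], b[1])
    xmax, ymax = min(a[2], b[2]), min(a[3], b[3])
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def translate_semantics(theme_a: str, theme_b: str) -> str:
    """Semantic translator: derive the theme of the overlay result."""
    return f"{theme_a} AND {theme_b}"


def process_metadata(meta_a: Dict[str, Any], meta_b: Dict[str, Any],
                     operation: str) -> Dict[str, Any]:
    """Metadata processor: describe the output from the input metadata."""
    return {
        "operation": operation,
        "sources": [meta_a["name"], meta_b["name"]],
        # Assumption: the result is no more accurate than the worst input.
        "accuracy_m": max(meta_a["accuracy_m"], meta_b["accuracy_m"]),
    }
```

A call such as `convert_geometry((0, 0, 4, 4), (2, 1, 6, 3))` yields the new geometry, while the other two functions produce the new theme and the output metadata for the same operation.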
The controlling of the operator includes the preconditions that are derived from the input data
and from the chosen operation with its implemented algorithms. For the output, the postconditions
include the requirements of the output data. The operation has to be recorded in a protocol,
including the lineage and other metadata.
|
|
|
|
The technical implementation of the conceptual design of analysis operations described above calls
for concrete solutions. Therefore the operator is divided into three levels of abstraction. The
design is a top-down concept because of its user-oriented and user-friendly approach. At the high
level, which concerns the management, the operator works with metadata only. It is this level that
the user faces. At the mid level, the control of the operations, including the handling of the data
types and the metadata, is performed. At the low level, the algorithms are processed and recorded.
The universality of an operator shows at the highest level of abstraction, because its design
there is independent of data structure and data schema. Because the levels are hierarchical, each
level interacts directly only with its neighbouring levels. The core of such a universal operator
is at the control level. The operator is executed at the processing level.
Management
(High Level)
|
At high level, the semantics of the operation is defined. The data to be
analysed are selected from the data catalogue with their associated
metadata. The operator is specified through its parametrisation, which is also
metadata driven. This level manages the whole operator. It is the constituent
part of the analysis task. The user defines the operation based on their
cognitive experience.
The data description at this level is independent of any spatial data structure
(e.g. raster or vector) or data schema.
|
Controlling
(Mid Level)
|
At mid level, the controlling of the operation is carried out. The input data,
selected at high level, are linked to their corresponding data representations
at a geometric and structural level. The operator analyses the metadata
given at the management level and chooses the algorithm that matches the
data types and other requirements. Here, the requirements for hybrid
analysis functionality arise because of the different data representations.
At this level, the cognitive semantics of the data and operation gets lost: the
information is translated into syntactically structured information for the
processing, in which its cognitive context is unimportant.
|
Processing
(Low Level)
|
The low level is the processing level. The algorithm processes the data
numerically; all characteristics of the operation are processed and sent back
to mid level.
|
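The interplay of the three levels can be sketched as three nested calls. This is only an illustration under simplifying assumptions: the catalogue is a dictionary, the only operation is a hypothetical `total`, and all names (`CATALOGUE`, `ALGORITHMS`, `manage`, `control`, `process`) are invented for the example.

```python
# Hypothetical data catalogue: the user only ever refers to the names.
CATALOGUE = {
    "landuse": {"data_type": "vector", "values": [3, 5, 8]},
    "elevation": {"data_type": "raster", "values": [[1, 2], [3, 4]]},
}


def sum_vector(values):
    return sum(values)


def sum_raster(rows):
    return sum(sum(row) for row in rows)


# One algorithm per (operation, data type) pair.
ALGORITHMS = {
    ("total", "vector"): sum_vector,
    ("total", "raster"): sum_raster,
}


def process(algorithm, values):
    """Low level: run the algorithm numerically and report what was done."""
    return algorithm(values), {"algorithm": algorithm.__name__}


def control(operation, entry):
    """Mid level: choose the algorithm that matches the data type."""
    algorithm = ALGORITHMS[(operation, entry["data_type"])]
    return process(algorithm, entry["values"])


def manage(operation, dataset_name):
    """High level: works with catalogue names and operation names only."""
    entry = CATALOGUE[dataset_name]
    result, protocol = control(operation, entry)
    protocol.update({"operation": operation, "input": dataset_name})
    return result, protocol
```

Each level calls only its direct neighbour: `manage` never sees a data structure, and `process` never sees a catalogue name.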
|
|
Based upon the operator design described above, the specification of a generic high level spatial
analysis operator is necessary to create instances of an operator class "high level operators". An
instance of that class could be any high level operator, e.g. buffer, overlay, shortest path. At the
current state of development the specification introduced in this section does not claim to be
complete or formal in mathematical terms. It rather serves as a more detailed description of
components in order to approach a formal specification. It is an attempt to outline the general
conditions necessary to create high level operators. Therefore it has to be regarded as a step
towards the realisation of operator design.
Given the variety of instances of high level operators, their design and specification comprise
different aspects. The data structure independence of the management level has to be implemented
through a data type driven polymorphism at the control level. Due to this polymorphism, a high
level operator is able to execute different algorithms or sequences of algorithms at the processing
level.
In addition, cognitive and semantic aspects cannot be neglected in operator design. Yet the
structure of data plays a central role in the building process of high level operators, concerning
different parts of the specification.
User Interface
The user interface specifies what kind of interaction can take place between the user
and the
operator. User interaction includes any form of communication between the user and the operator,
such as:
operator control
|
- Selecting data through a data catalogue
- Input or determination of parameters for the operator
|
user support, documentation
|
- Access to documentation and description (metadata) of operations and data
- Transparency of the operator by describing architecture and algorithms
|
process control, messages
|
- User comfort with messages and warnings
- Error control
|
Plausibility Control
Plausibility control of an operator should be able to determine whether an operation
proves to be
meaningful or not. Representation of this advanced form of semantics has to be one level above the
creation of the correct data type management. It comprises several aspects.
Semantics Control
|
Analysis and comparison of semantics in input and output data should
enable the operator to distinguish between allowed and forbidden
combinations. Predefined semantic results for given input/output
combinations could also be attached to the operator (semantic templates).
|
Property Control
|
A relation between geometric and semantic properties of data should be
established. For example, geometric accuracy control should decide which
combination of different resolutions is allowed. Tolerated ranges of accuracy
for different semantics of spatial data can be supplied.
|
Lineage Control
|
Lineage information attached to output data could include plausibility control
documentation and results.
|
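A plausibility check along these lines might look as follows. The sketch is an assumption-laden illustration: the template table, the threshold `MAX_RESOLUTION_RATIO` and the metadata keys `theme` and `resolution_m` are invented, and a real operator would need far richer rules.

```python
# Hypothetical semantic templates: allowed theme pairs and the resulting theme.
SEMANTIC_TEMPLATES = {
    frozenset({"land use", "terrain model"}): "land use by elevation zone",
}

# Hypothetical tolerance: the coarser dataset may be at most 4 times coarser.
MAX_RESOLUTION_RATIO = 4.0


def check_plausibility(meta_a, meta_b):
    """Return a record that could be attached to the output lineage."""
    problems = []

    # Semantics control: is this combination of themes allowed at all?
    themes = frozenset({meta_a["theme"], meta_b["theme"]})
    result_theme = SEMANTIC_TEMPLATES.get(themes)
    if result_theme is None:
        problems.append(f"no semantic template for {sorted(themes)}")

    # Property control: are the resolutions close enough to be combined?
    coarse = max(meta_a["resolution_m"], meta_b["resolution_m"])
    fine = min(meta_a["resolution_m"], meta_b["resolution_m"])
    if coarse / fine > MAX_RESOLUTION_RATIO:
        problems.append(f"resolution ratio {coarse / fine:.1f} exceeds {MAX_RESOLUTION_RATIO}")

    return {"plausible": not problems, "result_theme": result_theme, "problems": problems}
```

The returned dictionary plays the role of the lineage control: it documents both the decision and the reasons for it.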
Types
Assuming that spatial data are organised in data types, it has to be specified exactly which data
types can be processed by an operator and which data types the operator will produce after
processing. Data types distinguish between different representations of geometry within spatial
data.
Catalogues
|
- Specification of allowed data types and their definitions
- data model (topological vector model, task specific models, ...)
- structures (raster, vector, tin ...)
|
Combinations
|
- List of input-output couples (combinations of input and output data types
related to an operator)
|
Restrictions
|
- Restriction for operator use with regard to the allowed data couples
|
Polymorphism
Polymorphism stands for different algorithms, related to different data types, behind the same
high level operator. Program structures are created at the control level by decisions made
according to data types, user parametrisation and semantic specifications. The following conditions
have to be fulfilled:
Uniqueness
|
- Non-ambiguous decision rules
|
Completeness
|
- A function must exist for every combination of allowed input data
|
Correctness
|
- A method to test consistency and correctness of decision rules
|
Extensibility
|
- Operations at control level must be extensible to new data types and new algorithms
|
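The conditions of uniqueness, completeness and extensibility can be made concrete with a small dispatch table. The following is a sketch only; the class `PolymorphicOperator` and its methods are hypothetical and not taken from any existing GIS library. Uniqueness is enforced when a rule is registered, completeness can be tested by listing the combinations that still lack a function, and extensibility follows from the fact that new rules can be registered at any time.

```python
from itertools import product


class PolymorphicOperator:
    """One high level operator with one algorithm per combination of input types."""

    def __init__(self, name, allowed_types, arity=2):
        self.name = name
        self.allowed_types = tuple(allowed_types)
        self.arity = arity
        self._rules = {}

    def register(self, *input_types):
        """Decorator that binds an algorithm to a combination of data types."""
        if len(input_types) != self.arity:
            raise ValueError(f"{self.name} expects {self.arity} input types")

        def decorator(func):
            # Uniqueness: a second rule for the same combination is rejected.
            if input_types in self._rules:
                raise ValueError(f"ambiguous rule for {input_types}")
            self._rules[input_types] = func
            return func

        return decorator

    def missing_combinations(self):
        """Completeness test: combinations of allowed types without a function."""
        every = product(self.allowed_types, repeat=self.arity)
        return [combo for combo in every if combo not in self._rules]

    def __call__(self, *datasets):
        key = tuple(d["data_type"] for d in datasets)
        return self._rules[key](*datasets)


overlay = PolymorphicOperator("overlay", ["raster", "vector"])


@overlay.register("raster", "raster")
def overlay_raster_raster(a, b):
    return "cell-by-cell overlay"


@overlay.register("vector", "vector")
def overlay_vector_vector(a, b):
    return "polygon intersection"


@overlay.register("raster", "vector")
def overlay_raster_vector(a, b):
    return "hybrid overlay"
```

At this point `overlay.missing_combinations()` still reports `("vector", "raster")`, so the operator is not yet complete; registering a fourth function closes the gap without touching the existing ones.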
Algorithms
At the processing level, standard software development guidelines have to be observed. An
object-oriented approach is planned, as it offers certain advantages for the operator design
described here.
Transparency
|
- Transparency of algorithms by documentation
|
Uniqueness
|
- Avoiding identical processing and geometric results for different high level
operators using the same algorithm
|
Redundancy Free
|
- Non-redundant implementation of algorithms (modules, function libraries, etc.)
|
Application Specification
Optionally, high level operators can be extended towards restricted and/or expanded use in
certain information communities. That would require:
Limitations or Reductions
|
- Restriction in functionality according to a reduction of defined operations
in addition to the specifications mentioned so far
|
Specialisation
|
- Specialisation of certain operators through the input of domain specific
data and functionality (e.g. integration of domain specific rules and
algorithms)
|
|
|
Metadata is used to manage all operators at the highest level of abstraction. At the interface or
management level, only metadata control the user interaction. Explicit and implicit metadata are
analysed to control the operator at mid level. The metadata have different meanings at the three
levels as described in chapter ??.
In the following, the processing of metadata for high level GIS operators is divided into two parts:
metadata of spatial data and metadata of the operator.
Metadata of Data
The metadata describe the data at a high level of abstraction in a data catalogue
with its related
information. The information is the link to the database and to its data schema in which all other
information is stored implicitly.
Management
(High Level)
|
The metadata of the data describe the content and the conceptual
organisation of the data.
|
Controlling
(Mid Level)
|
At mid level, the semantics of the high level metadata has no meaning.
The information about the data types in which the data are represented is
what mainly matters for the control of the respective polymorphism. The main
aim at mid level is the control of the structural information of input and output
data. This includes the control of the polymorphism of the operator.
The operator transforms the semantics of the metadata into parameters used
for processing, whereas the semantics of the data is of no relevance for the
parameters of a process.
|
Processing
(Low Level)
|
At low level, metadata focus on the values and categories of the parameters.
The characteristic parameters of the operator and their values are instantiated
and assigned to the data to be processed.
|
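How metadata turn into process parameters on the way down can be sketched for a buffer operation. The example assumes that the user states a distance in metres at the management level and that the data catalogue provides a cell size for raster data or a unit scale for vector data; the function name, the metadata keys and the algorithm labels are hypothetical.

```python
import math


def to_process_parameters(request, data_meta):
    """Mid level: translate a user request plus data metadata into parameters.

    The meaning of the request (a buffer of so many metres) is reduced to
    plain numbers that the processing level can use without interpretation.
    """
    distance_m = request["buffer_distance_m"]
    data_type = data_meta["data_type"]

    if data_type == "raster":
        # Round up so that the buffer is never narrower than requested.
        radius_cells = math.ceil(distance_m / data_meta["cell_size_m"])
        return {"algorithm": "raster_buffer", "radius_cells": radius_cells}

    if data_type == "vector":
        distance_units = distance_m / data_meta["metres_per_unit"]
        return {"algorithm": "vector_buffer", "distance_units": distance_units}

    raise ValueError(f"no buffer algorithm for data type {data_type!r}")
```

The same request therefore leads to different parameter sets depending on the representation of the data, which is exactly the information the polymorphism at mid level needs.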
Metadata of Operators
The operator metadata control the spatial analysis process. They map and process the metadata
of input and output, and record the operation and the lineage.
Management
(High Level)
|
The metadata of the operators describe the task, the functional behaviour and
the required information. The operator's metadata include the semantics,
which is related to the data.
|
Controlling
(Mid Level)
|
The operator's metadata control the operation. At this level the mapping of the
data types to the algorithms of the polymorphic implementation takes place.
|
Processing
(Low Level)
|
This level generates the input data for the lineage of the operation which is
related to the derived data.
|
|
|
Hybrid analysis is part of the polymorphic implementation of GIS analysis operations. Hybrid
analysis is needed for the integration of GIS and remote sensing, for terrain analysis (DTMs in
raster format with an overlay of vector data), etc.
As input data, we have raster and vector data which have to interact correctly. The result of a
hybrid analysis is new raster data, new vector data or both.
Some examples of hybrid analysis functionality
- Fencing off raster processes by the overlay of vector data
- Verification of geometry and attributes
- Transferring extracted geometry from raster to vector
- Transferring attributes from image interpretation to vector data
- Transferring attributes from vector databases to raster data
- Using raster information for determination of uncertainty of vector data
- Equalising Data
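The first example in the list, fencing off a raster process by a vector overlay, can be sketched without converting either dataset. The code below is an illustrative assumption, not a prescribed algorithm: it uses a standard ray-casting point-in-polygon test on cell centres, takes the raster as a list of rows with row 0 at the lowest y value and the origin at (0, 0), and the function names are invented.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def fenced_mean(raster, polygon, cell_size=1.0):
    """Mean of the raster cells whose centres lie inside the vector polygon."""
    selected = []
    for row, values in enumerate(raster):
        for col, value in enumerate(values):
            centre_x = (col + 0.5) * cell_size
            centre_y = (row + 0.5) * cell_size
            if point_in_polygon(centre_x, centre_y, polygon):
                selected.append(value)
    if not selected:
        return None  # the fence contains no cell centre
    return sum(selected) / len(selected)
```

The raster statistic is computed only where the vector boundary permits it, while the polygon stays a polygon and the raster stays a raster.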
The complexity of the hybrid functionality may be shown by the following survey:
Projecting Information
|
- geometric overlay
- semantic overlay
|
Combined Analysis
|
- derivation by aggregation
- derivation by interaction
- accumulation by information transfer
|
Knowledge-based
Conversion
|
- knowledge-based conversion
- extraction of objects
|
The full range of spatial analysis may be reached with operations that interact with raster and
vector data without the need for conversion. The requirements of hybrid analysis are divided into
geometric interaction, semantic analysis and metadata processing.
Geometric Analysis
|
- positional location: generating the relation of co-ordinates of
different data and identifying identical positions of geometric
primitives
- linking geometric features: identifying features covering the
same location
- extracting and transferring geometry: generating new
geometry by extraction, transfer and interpolation
|
Semantic Analysis
|
- geometric interpretation: giving new semantics to the data
- thematic analysis: linking, projecting and deriving integrates
new information
- combination of geometric and thematic interpretation:
combined analysis of geometric and thematic information
|
Metadata Processing
|
- analysis of metadata
- description of the geometric, semantic and combined analysis
- description of the semantic analysis
- statistics about geometric overlay
|
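The geometric requirements listed above, positional location and the linking of features at the same location, can be illustrated by relating vector point co-ordinates to raster cells and transferring the cell value to the point. The sketch assumes a raster stored as a list of rows with row 0 at the lowest y value; the function names and the attribute key `raster_value` are hypothetical.

```python
import math


def world_to_cell(x, y, origin, cell_size):
    """Positional location: the (row, col) of the cell containing (x, y)."""
    col = math.floor((x - origin[0]) / cell_size)
    row = math.floor((y - origin[1]) / cell_size)
    return row, col


def transfer_raster_values(points, raster, origin=(0.0, 0.0), cell_size=1.0):
    """Link each vector point to its raster cell and copy the value across."""
    enriched = []
    for point in points:
        row, col = world_to_cell(point["x"], point["y"], origin, cell_size)
        inside = 0 <= row < len(raster) and 0 <= col < len(raster[0])
        value = raster[row][col] if inside else None
        # A new attribute is added; the original point is not modified.
        enriched.append({**point, "raster_value": value})
    return enriched
```

Points outside the raster extent receive `None`, which a metadata processor could count as part of the statistics about the geometric overlay.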
|
|
|
|
The concept of universal analytical GIS-Operators independent of Information Communities has
been outlined and specified. Universality in this context includes domain independence as well as
the aim to work with data independently of their structure. This can only be achieved by
metadata-driven user interaction.
To achieve the desired data structure independence of universal operators, hybrid analysis
techniques have to be integrated. Due to the currently increasing demand for the integration of
remote sensing and GIS, hybrid analysis is gaining significance. Our future work will focus on a
detailed specification of hybrid analysis operations and their prototype implementation.