Handbook of Research on Geoinformatics - Hassan A. Karimi (Part 5)

Geospatial Image Metadata Catalog Services

1. INTRODUCTION

As earth observation continues worldwide, large volumes of remotely sensed data on the Earth’s climate and environment have been collected and archived. To maintain the data archives efficiently and to help users discover the data they want in the holdings, each data provider normally maintains a digital metadata catalog. Some online catalogs provide services that let users search the catalog and discover the data they need through a well-established Application Programming Interface (API). Such services are called Catalog Services. The information in the catalog is the searchable metadata that describe individual data entries in the archives. Currently, most Catalog Services are provided through Web-based interfaces.

This chapter analyzes three open catalog service systems. It reviews the metadata standards, catalog service conceptual schemas and protocols, and the components of catalog service specifications.

2. REVIEW OF GEOSPATIAL IMAGE CATALOG SERVICES

2.1 Pilot Catalog Service Systems

The Federal Geographic Data Committee (FGDC) Clearinghouse is a virtual collection of digital spatial data distributed over many servers in the United States and abroad. The primary intention of the Clearinghouse is to provide discovery services for digital data, allowing users to evaluate its quality through metadata. Most metadata provide information on how to acquire the data; in many cases, links to the data or an order form are available online.

The NASA Earth Observing System ClearingHOuse (ECHO) is a clearinghouse of spatial and temporal metadata that enables the science community to exchange data and information. ECHO technology can provide metadata discovery services and serve as an order broker for clients and data partners. All the NASA Distributed Active Archive Centers (DAACs), as data providers, generate metadata information and ingest it into ECHO.

The Open Geospatial Consortium (OGC) has promoted standardization and interoperability among the geospatial communities. For catalog services, OGC has defined the Catalog Service implementation standard (OpenGIS, 2004) and published two recommendation papers (OpenGIS, 2005a; OpenGIS, 2005b). The George Mason University (GMU) CSISS Catalog Service for Web (CSW) system is an OGC-compliant catalog service, which demonstrates how the earth science community can publish geospatial resources by searching pre-registered spatial and temporal metadata information. In particular, the GMU CSISS CSW catalog service is based on the OpenGIS implementation standard and the ebRIM application profile (OpenGIS, 2005a). It provides users with an open and standard means to access more than 15 terabytes of global Landsat datasets.

2.2 Conceptual System Architecture

Since these geospatial catalog services address similar needs, it is not surprising that they have almost the same conceptual system architecture, as shown in Figure 1.

From the point of view of metadata circulation, a catalog service usually consists of three components: metadata generation and ingestion, a conceptual schema for catalog service, and a query interface for catalog service.
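
To make the three components concrete, the following is a minimal in-memory sketch, not a description of any of the reviewed systems. The class names, the record fields, and the keyword-matching rule are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataRecord:
    """One catalog entry. The field names are illustrative, not taken from a standard."""
    granule_id: str
    title: str
    keywords: List[str] = field(default_factory=list)


class CatalogService:
    """Toy catalog showing ingestion, a conceptual schema, and a query interface."""

    # Conceptual schema (hypothetical): the fields every record must fill in.
    REQUIRED_FIELDS = ("granule_id", "title")

    def __init__(self) -> None:
        self._holdings: Dict[str, MetadataRecord] = {}

    def ingest(self, record: MetadataRecord) -> None:
        """Metadata ingestion: check the record against the schema, then store it."""
        for name in self.REQUIRED_FIELDS:
            if not getattr(record, name):
                raise ValueError(f"missing required field: {name}")
        self._holdings[record.granule_id] = record

    def query(self, keyword: str) -> List[MetadataRecord]:
        """Query interface: case-insensitive match on the title or an exact keyword."""
        needle = keyword.lower()
        return [
            record
            for record in self._holdings.values()
            if needle in record.title.lower()
            or needle in [k.lower() for k in record.keywords]
        ]
```

A client would call `ingest` once per record and `query` with a search term; the data holdings themselves stay outside the catalog, which stores only the descriptions.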

Metadata generation and ingestion is always based on applicable metadata standards, such as the Dublin Core (DCMI, 2003), Geographic information – Metadata (19115) from the International Organization for Standardization (ISO, 2003), the Content Standard for Digital Geospatial Metadata (CSDGM) from the Federal Geographic Data Committee (FGDC, 1998), or the ECS Earth Science Information Model from the National Aeronautics and Space Administration (NASA, 2006).
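
As a small illustration of what a standards-based record can look like, the sketch below serializes a few Dublin Core elements as XML using the Python standard library. The wrapper element name `metadata`, the helper name, and all field values are assumptions made for this example; only the Dublin Core element names and namespace come from the standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dublin_core_record(fields: dict) -> str:
    """Serialize a mapping of Dublin Core element names to values as XML.

    The root element name "metadata" is a placeholder, not part of the standard.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; they do not describe a real granule.
record_xml = build_dublin_core_record(
    {
        "title": "Example Landsat scene",
        "identifier": "example-granule-001",
        "format": "GeoTIFF",
        "date": "2000-01-01",
    }
)
```

The richer geospatial standards named above (ISO 19115, CSDGM, the ECS model) carry far more structure than this flat list of elements.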

Metadata structures, relationships and definitions, known as conceptual schemas, play a key role in catalog services. They define what kind of metadata information can be provided and how the metadata are organized. The conceptual schemas are closely related to those of the pre-ingested metadata information, but are not necessarily identical. Catalog service conceptual schemas are always oriented toward the field of application and may be tailored to particular application profiles.

The query interface for a catalog service defines the necessary operations, the syntax of each operation, and the binding protocol. To facilitate access and promote interoperability among catalog services, the interface definition may be kept open.

2.3 Metadata Generation

In this section, the three open catalog services identified in Section 2.1 are analyzed with respect to the following two aspects of metadata generation.

2.3.1 Base Metadata Standard

The base metadata standard is the public geospatial metadata standard on which the catalog service is based and to which it is tailored to meet a given agency’s requirements. In addition to international and national geospatial metadata standards, such as ISO 19115 and FGDC CSDGM, several agencies have de facto standards in their production environments, such as NASA ECS.

The metadata used by the FGDC Clearinghouse follow FGDC CSDGM. Each affiliated catalog service site must organize its metadata information according to the CSDGM standard before it joins the clearinghouse.

The ECHO Science Metadata Conceptual Model has been developed based on the NASA Earth Observing System Data and Information System (EOSDIS) Core System (ECS) Science Data Model, with modifications to suit project needs.

GMU CSISS CSW builds its metadata conceptual model by combining the ebRIM information model and the ECS science data model.

2.3.2 Automatic Generation of Metadata

As the volume of spatial datasets keeps growing, generating metadata becomes increasingly time-consuming. An automatic mechanism for generating metadata facilitates both the creation and the frequent update of metadata.

Metadata information needs to be organized as TXT, SGML, or HTML files before a node joins the FGDC clearinghouse. Some metadata generation tools are available in addition to the commercial software packages. These tools are advertised on the FGDC website. To help the user set up a clearinghouse node easily, a software package, ISite, is provided. With this software, a qualified clearinghouse node server can be set up in minutes.

[Figure 1. Conceptual Architecture of Catalog Service. Labeled elements: User, Catalog Service Client, Catalog Service (Query Interface, Conceptual Schema), Metadata Holdings, Data Holdings.]

All the ECHO metadata holdings are obtained directly from the data providers. DAACs can use some ECS tools to automatically generate metadata information.

GMU CSISS is developing Java-based tools to automatically extract metadata information from each granule. The Hierarchical Data Format (HDF), Hierarchical Data Format - Earth Observing System (HDF-EOS), GeoTIFF and NetCDF data formats are currently supported.
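
A plausible first step in any such extractor is deciding which container format a granule uses. The sketch below does this from the leading bytes of a file; it is written in Python for brevity and is not the GMU CSISS tool, whose design the text does not describe. The function names are hypothetical. It checks only offset 0 and identifies the container, not the profile: HDF-EOS files are HDF4 or HDF5 containers, and a GeoTIFF is a TIFF file carrying georeferencing tags.

```python
from typing import Optional

# Leading-byte signatures for the container formats behind the names in the text.
_SIGNATURES = [
    (b"\x89HDF\r\n\x1a\n", "HDF5"),
    (b"\x0e\x03\x13\x01", "HDF4"),
    (b"CDF\x01", "NetCDF classic"),
    (b"CDF\x02", "NetCDF 64-bit offset"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
]


def sniff_format(header: bytes) -> Optional[str]:
    """Return the container format suggested by the first bytes, or None."""
    for magic, name in _SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read the first eight bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8))
```

Once the container is known, a format-specific reader can pull out the spatial extent, acquisition time, and other fields the catalog schema requires.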

2.4 Metadata Ingestion

2.4.1 Metadata Distribution

This function deals with the physical distribution of metadata information within the catalog service.

The FGDC Clearinghouse is a decentralized system of servers that contain field-level metadata descriptions of available digital spatial data located on the Internet. The metadata information is physically managed within the affiliated server node.

In the ECHO scenario, although the metadata information is periodically generated by distinct data centers, it is centrally managed by the ECHO operation team. That is, metadata information in ECHO is distributed at design time but managed centrally at run time.

The GMU CSISS CSW maintains more than 15 terabytes of global Landsat images. All the metadata information for these images has been registered into a centralized metadata database.

2.4.2 Ingestion Type

This section examines how each catalog service ingests metadata. It focuses on two aspects: remote vs. local and automatic vs. manual.

In the FGDC Clearinghouse, all the metadata information is manipulated only in the affiliated server node. Remote ingestion is not supported in server nodes. Ingestion has to be done manually.

Because metadata information in ECHO is centralized, a database approach is taken. Metadata ingestion in ECHO involves two steps. Data centers upload their current metadata information remotely to a dedicated File Transfer Protocol (FTP) server, and the ECHO operation team is responsible for ingesting this metadata information into the ECHO operational system.

GMU CSISS CSW provides published interfaces. As long as the metadata information is well organized, it can be remotely ingested into the GMU CSISS CSW metadata database. All the metadata information in that database is online and ready for client queries.

2.5 Conceptual Schema

We examine how the metadata conceptual schema is defined in each catalog service.

In each FGDC Clearinghouse collection, all the metadata information is organized according to the FGDC CSDGM. The conceptual schema of an FGDC Clearinghouse collection is exactly the same as that of the FGDC CSDGM.

In ECHO, all the metadata information collected in the NASA DAACs is based on the ECS science data model, with some modifications necessary to suit project needs.

GMU CSISS CSW defines its conceptual schema based on the ECS science data model combined with ISO 19115. Since GMU CSISS CSW supports metadata queries and data retrieval (through the OGC services), an ebRIM-based profile has been selected to support defining the association between a data granule instance and applicable geospatial service instances.
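
The idea of linking granules to services can be sketched with two small record types. In ebRIM, an Association connects a source object to a target object and carries an association type; the sketch below mirrors that shape. The class layout, the `"Serves"` type name, the direction of the link, and the helper function are assumptions for illustration and are not taken from the GMU CSISS CSW profile.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RegistryObject:
    """A registered item, such as a data granule or a service instance."""
    id: str
    object_type: str
    name: str


@dataclass(frozen=True)
class Association:
    """A typed, directed link between two registry objects, identified by id."""
    source_object: str
    target_object: str
    association_type: str


def services_for(
    granule_id: str,
    objects: Iterable[RegistryObject],
    associations: Iterable[Association],
    association_type: str = "Serves",  # hypothetical type name
) -> List[RegistryObject]:
    """Return the registry objects linked to a granule by the given association type."""
    by_id = {obj.id: obj for obj in objects}
    return [
        by_id[link.source_object]
        for link in associations
        if link.target_object == granule_id
        and link.association_type == association_type
        and link.source_object in by_id
    ]
```

With this structure, a client that finds a granule through a metadata query can follow its associations to the services able to deliver or process it.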

2.6 Transfer Protocol

A catalog service usually provides a standard, API-based interface to support client queries. This “design-by-contract” mechanism encourages third parties to develop new query interfaces in addition to the Web-based query interfaces provided by the catalog server itself.

The backbone of the FGDC Clearinghouse is Z39.50 (ISO, 1998). This protocol was initially developed by the library community to discover bibliographic records using a standard set of attributes. To guide the implementation of FGDC metadata elements within a Z39.50 service, the FGDC has developed an application profile for geospatial metadata called "GEO," which provides sets of attributes, operators, and rules of implementation that suit geospatial needs. In fact, the node server is a Z39.50 server, which enables FGDC query utilities to search its metadata holdings on the fly through the Z39.50 protocol and the GEO profile.

ECHO exposes the Session Manager and a limited set of the ECHO services as Web Services defined via the Web Services Description Language (WSDL). ECHO also provides two client packages, Façade and EchoTalk, for client developers. The syntax of the communication protocol between client and ECHO is based on the Web Services Interoperability (WS-I) Basic Profile. However, the semantics of the communication protocol are defined by ECHO itself. Specific query syntax, in Extensible Markup Language (XML) format, has been proposed and implemented.

GMU CSISS CSW’s communication protocol is based on the OGC Catalog Service Implementation Specification, which specifies the interfaces and several applicable bindings for catalog services. Operations, core information schema and query language encodings are included. The transport-related communication protocol follows this specification.
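
As a rough illustration of an HTTP binding, the sketch below assembles a key-value-pair GetCapabilities request, the operation an OGC client typically issues first to learn what a service supports. The endpoint `http://example.org/csw` is a placeholder and not the GMU CSISS CSW address, and the helper name is hypothetical. The code only builds the URL; it does not contact any server.

```python
from urllib.parse import urlencode


def build_get_capabilities_url(endpoint: str) -> str:
    """Append the KVP parameters for a CSW GetCapabilities request to an endpoint."""
    query = urlencode({"service": "CSW", "request": "GetCapabilities"})
    # Reuse an existing query string if the endpoint already has one.
    separator = "&" if "?" in endpoint else "?"
    return f"{endpoint}{separator}{query}"


# Placeholder endpoint, assumed for illustration only.
url = build_get_capabilities_url("http://example.org/csw")
```

Record searches are usually more involved than this, since the query constraints are encoded in the request according to the specification's query language encodings.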

2.7 System Distribution

This section examines the physical distribution of catalog service systems.

The FGDC Clearinghouse had 400 registered nodes worldwide as of March 22, 2006. FGDC maintains several Web-based search interfaces to carry out distributed searches across multiple clearinghouse nodes.
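
The distributed-search pattern can be sketched as a fan-out over independent nodes whose answers are gathered per node. The example below is a generic illustration, not the FGDC implementation: each node is modeled as a plain Python callable standing in for a remote Z39.50 query, and the function and type names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A node is modeled as a function from a search term to matching record titles.
SearchFn = Callable[[str], List[str]]


def distributed_search(nodes: Dict[str, SearchFn], term: str) -> Dict[str, List[str]]:
    """Send the same query to every node in parallel and collect results per node."""
    results: Dict[str, List[str]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        futures = {name: pool.submit(search, term) for name, search in nodes.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                # A failing node yields no hits instead of aborting the whole search.
                results[name] = []
    return results
```

A centralized catalog such as ECHO or GMU CSISS CSW avoids this step at run time because all metadata already sit in one store.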

ECHO acts as an intermediary between data partners and client partners. Data partners provide information about their data holdings, and client partners develop software to access this information through the ECHO Query and Order Web Service interface. End users who want to search ECHO's metadata must use one of the ECHO clients. Although ECHO has close connections with the DAACs and ECHO clients, ECHO itself is not a distributed system. It does not need to run a distributed search across multiple agencies and nodes at run time.

GMU CSISS CSW is a standalone service. Like ECHO, it is not a distributed system.

2.8 Review Summaries

Table 1 summarizes the results of the analysis.

3. CONCLUSION AND DISCUSSION

We have reviewed three public catalog services (FGDC Clearinghouse, NASA ECHO, and GMU CSISS CSW), considering the following aspects: metadata generation, metadata ingestion, catalog service conceptual schema, query protocols and system distribution. This review shows how it is becoming possible to query metadata holdings through public, standard Web-based query interfaces.

The review results also show that catalog service providers must still define a catalog service schema that meets their particular needs. These application-oriented approaches can meet project requirements, but they will make it more difficult to create future cross-federation multi-catalog services. We recommend that a standard, common schema based on discipline-oriented metadata be used for future implementations of catalog services in the same or related fields.

References

DCMI. (2003). DCMI Metadata Terms. Retrieved March 8, 2007, from http://dublincore.org/documents/dcmi-terms/

ECHO. (2005). Earth Observing System Clearinghouse. Retrieved March 8, 2007, from http://www.echo.eos.nasa.gov/

FGDC. (1998). Content Standard for Digital Geospatial Metadata (CSDGM). Retrieved March 8, 2007, from http://fgdc.er.usgs.gov/metadata/contstan.html

FGDC. (2005). FGDC Geospatial Data Clearinghouse Activity. Retrieved March 8, 2007, from http://www.fgdc.gov/clearinghouse/clearinghouse.html

ISO. (1998). ISO 23950: Information and documentation - Information retrieval (Z39.50) - Application service definition and protocol specification.

ISO. (2003). ISO 19115: Geographic Information - Metadata.

LAITS. (2005). LAITS OGC Catalog Service for Web - Discovery Interface. Retrieved March 8, 2007, from http://geobrain.laits.gmu.edu/csw/discovery/

NASA. (2006). EOSDIS Core System Data Model. Retrieved March 8, 2007, from http://spg.gsfc.nasa.gov/standards/heritage/eosdis-core-system-data-model

OpenGIS. (2004). OpenGIS Catalogue Service Implementation Specification. Retrieved March 8, 2007, from http://www.opengeospatial.org/specs/?page=specs

OpenGIS. (2005a). OGC Recommendation Paper 04-17r1: OGC Catalogue Services - ebRIM (ISO/TS 15000-3) profile of CSW. Retrieved March 8, 2007, from http://www.opengeospatial.org/specs/?page=recommendation

Table 1. Review summaries

| Evaluation Points | FGDC Clearinghouse | NASA ECHO | GMU CSISS CSW |
|---|---|---|---|
| Metadata generation - Base standard | FGDC CSDGM | ECS Core | ECS Core/ISO 19115 |
| Metadata generation - Generation automation | manually with tools | manually with tools | automatically |
| Metadata ingestion - Metadata distribution | distributed | centralized | centralized |
| Metadata ingestion - Ingestion type | N/A | Remotely and automatically | Locally and automatically |
| Conceptual schema | FGDC CSDGM | Based on ECS Core | Based on ISO 19115 and ebRIM |
| Transfer protocol | Z39.50 and GEO profile | Proprietary and based on Web Service | OGC Catalog Service and HTTP binding |
| System distribution | Distributed | Centralized | Centralized |
