Query Processing in RDF/S-based P2P
Database Systems
George Kokkinidis, Lefteris Sidirourgos and Vassilis Christophides
Institute of Computer Science - FORTH
Vassilika Vouton, PO Box 1385, GR 71110, Heraklion, Greece and
Department of Computer Science, University of Crete
GR 71409, Heraklion, Greece
{kokkinid, lsidir, christop}@ics.forth.gr
1 Introduction
Peer-to-peer (P2P) computing is currently attracting enormous attention,
spurred by the popularity of file sharing systems such as Napster [31],
Gnutella [15], Freenet [9], Morpheus [30] and Kazaa [25]. In P2P systems a
very large number of autonomous computing nodes (the peers) pool together
their resources and rely on each other for data and services. P2P computing
introduces an interesting paradigm of decentralization going hand in hand
with an increasing self-organization of highly autonomous peers. This new
paradigm bears the potential to realize computing systems that scale to very
large numbers of participating nodes while ensuring fault-tolerance.
However, existing P2P systems offer very limited data management facilities. In most cases, searching relies on simple selection conditions on
attribute-value pairs or IR-style string pattern matching. These limitations
are acceptable for file-sharing applications, but in order to support highly
dynamic, ever-changing, autonomous social organizations (e.g., scientific or
educational communities) we need richer facilities for exchanging, querying
and integrating (semi-)structured data hosted by peers. To this end, we essentially need to adapt the P2P computing paradigm to a distributed data
management setting. More precisely, we would like to support loosely coupled
communities of peer bases, where each base can join and leave the network
at will, while groups of peers can collaboratively undertake the responsibility
of query processing.
The importance of intensional (i.e., schema) information for integrating and querying peer bases has been highlighted by a number of recent
projects [4, 34, 17, 1]. A natural candidate for representing descriptive
schemata of information resources (ranging from simple structured vocabularies to complex reference models [40]) is the Resource Description Framework/Schema Language (RDF/S). In particular, RDF/S (a) enables a modular design of descriptive schemata based on the mechanism of namespaces;
(b) allows easy reuse or refinement of existing schemata through subsumption
of both class and property definitions; (c) supports partial descriptions, since
properties associated with a resource are by default optional and repeatable; and
(d) permits super-imposed descriptions in the sense that a resource may be
multiply classified under several classes from one or several schemata. These
modelling primitives are crucial for P2P data management systems where
monolithic RDF/S schemata and resource descriptions cannot be constructed
in advance and peers may have only partial descriptions about the available
resources.
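The modelling primitives (b) to (d) above can be illustrated with a minimal Python sketch. It is not part of SQPeer or of any RDF/S toolkit: the namespaces `n1` and `n2`, the class and property names, the example URIs, and the `Resource` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical schema fragment: class -> direct superclasses (subsumption).
# The prefixes n1 and n2 stand for two independently designed schemata.
SUBCLASS_OF = {
    "n1:Painter": {"n1:Artist"},
    "n1:Sculptor": {"n1:Artist"},
    "n1:Artist": set(),
    "n2:Employee": set(),
}


def superclasses(cls):
    """Return cls together with all of its transitive superclasses."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


class Resource:
    """A resource description: any number of classes, optional properties."""

    def __init__(self, uri, classes):
        self.uri = uri
        self.classes = set(classes)      # multiple classification
        self.props = defaultdict(list)   # each property may repeat or be absent

    def add(self, prop, value):
        self.props[prop].append(value)

    def is_instance_of(self, cls):
        # Membership is inherited along the subsumption hierarchy.
        return any(cls in superclasses(c) for c in self.classes)


# A resource classified under classes from two different schemata.
r = Resource("http://example.org/picasso", ["n1:Painter", "n2:Employee"])
r.add("n1:paints", "http://example.org/guernica")
r.add("n1:paints", "http://example.org/demoiselles")
```

Here `r` is an instance of `n1:Artist` only through subsumption, carries two values for `n1:paints`, and has no value at all for any other property, which is the kind of partial, superimposed description a peer may hold.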
In this chapter, we present the ongoing SQPeer middleware for routing and
planning declarative queries in peer RDF/S bases by exploiting the schema
of peers. More precisely, we make the following contributions:
• In Section 2.1 we illustrate how peers can formulate complex (conjunctive)
queries against an RDF/S schema using RQL query patterns [23].
• In Section 2.2 we detail how peers can advertise their base at a fine-grained
level. In particular, we employ RVL view patterns [29] for declaring
the parts of an RDF/S schema which are actually (or can be) populated
in a peer base.
• In Section 2.3 we introduce a semantic routing algorithm that matches a
given RQL query against a set of RVL peer views in order to localize relevant peer bases. More precisely, this algorithm relies on the query/view
subsumption techniques introduced in [8] to produce query patterns annotated with localization information.
• In Section 2.4 we describe how SQPeer query plans are generated by taking
into account the involved data distribution (e.g., vertical, horizontal) in
peer bases. To this end, we employ an object algebra for RQL queries
introduced in [24].
• In Section 2.5 we discuss several compile and run-time optimization opportunities for SQPeer query plans.
• In Section 3 we sketch how the SQPeer query routing and planning phases
can actually be used by groups of peers in order to deploy hybrid (i.e.,
super-peer) and structured P2P database systems.
Finally, Section 4 discusses related work and Section 5 summarizes our
contributions.
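To give a rough feel for the routing step listed among the contributions, the following Python sketch annotates each query pattern with the peers whose advertised view patterns it subsumes. It is only a simplified stand-in, not the algorithm of Section 2.3 or the subsumption techniques of [8]: patterns are reduced to (domain, property, range) triples, and the schema terms (`C1`, `prop1`, ...), the peer names (`P1` to `P4`), and the functions `is_subsumed_by`, `pattern_subsumed` and `route` are assumptions made for illustration.

```python
# Hypothetical subsumption hierarchy over classes and properties:
# term -> direct parents. It is assumed to be acyclic.
PARENTS = {
    "C5": {"C1"},
    "C6": {"C2"},
    "prop4": {"prop1"},
}


def is_subsumed_by(term, ancestor):
    """True if term equals ancestor or is a transitive specialization of it."""
    if term == ancestor:
        return True
    return any(is_subsumed_by(p, ancestor) for p in PARENTS.get(term, ()))


def pattern_subsumed(view, query):
    """A (domain, property, range) view pattern is subsumed by a query pattern
    when every component is subsumed by the corresponding query component."""
    return all(is_subsumed_by(v, q) for v, q in zip(view, query))


def route(query_patterns, peer_views):
    """Annotate each query pattern with the peers that may contribute to it."""
    annotated = {}
    for qp in query_patterns:
        annotated[qp] = sorted(
            peer
            for peer, views in peer_views.items()
            if any(pattern_subsumed(vp, qp) for vp in views)
        )
    return annotated


# A conjunctive query made of two path patterns joined on C2.
query = [("C1", "prop1", "C2"), ("C2", "prop2", "C3")]

# What each (hypothetical) peer advertises as populated in its base.
views = {
    "P1": [("C1", "prop1", "C2")],
    "P2": [("C5", "prop4", "C6")],   # specializations of the first pattern
    "P3": [("C2", "prop2", "C3")],
    "P4": [("C1", "prop1", "C2"), ("C2", "prop2", "C3")],
}

annotations = route(query, views)
```

In this toy setting the first pattern is annotated with `P1`, `P2` and `P4`, and the second with `P3` and `P4`. `P2` qualifies only because its advertised classes and property specialize those requested, which is what makes the routing schema-based rather than keyword-based.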
2 The SQPeer Middleware
In order to design an effective query routing and planning middleware for peer
RDF/S bases, we need to address the following issues:
1. How do peer nodes formulate queries?
2. How do peer nodes advertise their bases?
3. How do peer nodes route a query?
4. How do peer nodes process a query?
5. How are distributed query plans optimized?