Beginning RSS and Atom Programming
Danny Ayers
Andrew Watt
Beginning RSS and Atom Programming
Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Copyright © 2005 by Wiley Publishing
Published by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN-13: 978-0-7645-7916-5
ISBN-10: 0-7645-7916-9
Manufactured in the United States of America
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means,
electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of
the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization
through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA
01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal
Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355,
www.wiley.com/go/permissions.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF
THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY
SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE
SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS
NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE
SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION
AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE
PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN
THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND
WHEN IT IS READ.
For general information on our other products and services or to obtain technical support, please contact our Customer
Care Department within the U.S. at (800) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available
in electronic books.
Library of Congress Cataloging-in-Publication Data
Ayers, Danny.
Beginning RSS and atom programming / Danny Ayers, Andrew Watt.
p. cm.
Includes index.
ISBN-13: 978-0-7645-7916-5
ISBN-10: 0-7645-7916-9 (paper/website)
1. Internet programming. 2. Web site development. 3. MPLS standard. I. Watt, Andrew, 1953- II. Title.
QA76.625.A93 2005
006.7'6—dc22
2005003120
Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries,
and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley
Publishing, Inc., is not associated with any product or vendor mentioned in this book.
About the Authors
Danny Ayers is a freelance developer, technical author, and consultant specializing in cutting-edge Web
technologies. He has worked with XML since its early days and got drawn into RSS development
around four years ago. He is an active member of the Atom Working Group, the Semantic Web Interest
Group, and various other Web-related community groups and organizations. He has been a regular
blogger for several years, generally posting on technical or feline issues. Originally from Tideswell in the
north of England, he now lives in a village near Lucca in Northern Italy with his wife, Caroline, a dog,
and a herd of cats.
I dedicate my contribution to this book to my wife, Caroline, and our four-legged companions, who have
tolerated my air of irritable distraction these past few months. Okay, actually for several years now.
Andrew Watt is an independent consultant and computer book author with an interest and expertise in
various XML technologies. Currently, he is focusing primarily on the use of XML in Microsoft technologies. He is a Microsoft Most Valuable Professional for Microsoft InfoPath 2003.
I dedicate my contribution to this book to the memory of my late father, George Alec Watt, a very special
human being.
Credits
Acquisitions Editor: Jim Minatel
Development Editor: Kezia Endsley
Technical Editor: Brian Sletten
Editorial Manager: Mary Beth Wakefield
Vice President & Executive Group Publisher: Richard Swadley
Vice President and Publisher: Joseph B. Wikert
Project Coordinator: Erin Smith
Graphics and Production Specialists: Karl Brandt, Lauren Goddard, Jennifer Heleine, Amanda Spagnuolo, Julie Trippetti
Quality Control Technicians: Susan Moritz, Carl William Pierce, Brian Walls
Proofreading and Indexing: TECHBOOKS Production Services
Acknowledgments
Danny Ayers: Many thanks first of all to Andrew for getting this book started and more generally for his
encouragement and role model of good-humored determination. Thanks to Jim Minatel for all the effort
that went into making this project happen and for his diplomacy when I needed to be nagged out of procrastination. Many thanks to Kezia Endsley for taking care of the translation from Broad Derbyshire to
U.S. English and to Brian Sletten for keeping a keen eye on technical matters (and remembering my
birthday!).
I am extremely grateful to all the people who have helped me personally with various issues throughout
the book. Unfortunately, if I were to thank them individually this would read like an Oscars ceremony
screed. Worse, I’d also be bound to forget someone, and that just wouldn’t be nice. I can at least show
a little gratitude in my ongoing appreciation of their work, some of which will hopefully have been
reflected in this book. More generally, I’d like to thank the developers behind the Web, RSS, Atom, and
related technologies for providing such a rich seam of material to draw on and helping my own learning
through mailing-list discussions and blog conversations. The material is alive out there! Finally, I’d like
to thank the reader for showing an interest in a field that I personally believe has a lot to offer everyone
and is certain to play a significant role in the shaping of at least the Web landscape over the next few
years. Be inquisitive; be creative.
Andrew Watt: I thank Jim Minatel, acquisitions editor, for patience above and beyond the call of duty
as the writing of this book took much longer than we had all originally anticipated. I also thank Kezia
Endsley for helpful and patient editing and Brian Sletten for his constructive and assiduous technical
assessment.
Contents
Acknowledgments
Foreword by Dare Obasanjo
Foreword by Greg Reinacker
Introduction
Part I: Understanding the Issues and Taking Control
Chapter 1: Managing the Flow of Information: A Crucial Skill
New Vistas of Information Flow
The Information Well and Information Flow
The Information Well
The Information Flow
The Information Flood
Managing Information
What Do You Want to Do with Information?
Browse and Discard
Read
Study and Keep
Taking Control of Information
Determining What Is Important to You
Avoiding Irrelevant Information
Determining the Quality of Information
Information Flows Other Than the Web
Books
Magazines
Newspapers
Broadcast Media
The Web and Information Feeds
New Information Opportunities
New Information Problems
The Need to Keep Up-to-Date
Distractions
Summary
Exercise
Chapter 2: Where Did Information Feeds Start?
The Nature of the Web
HTTP
HTML
XML
Polling the Web
Precursors to RSS
MCF and HotSauce
Netscape Channels
The Microsoft Channel Definition Format
RSS: An Acronym with Multiple Meanings
RSS 0.9
RSS 0.91
RSS 1.0
RSS 0.92, 0.93, and 0.94
RSS 2.0
Use of RSS and Atom Versions
Summary
Exercises
Chapter 3: The Content Provider Viewpoint
Why Give Your Content Away?
Selling Your Content
Creating Community
Content to Include in a Feed
The Importance of Item Titles
Internal Links
One Feed or Several?
Structuring Topics
Blogging Tools
Wikis
Publicizing Your Information Feed
Deciding on a Target Audience
Registering with Online Sites
How Information Feeds Can Affect Your Site’s Visibility
Advertisements and Information Feeds
Power to the User?
Filtering Out Advertisements
Summary
Exercise
Chapter 4: The Content Recipient Viewpoint
Access to Information
Convenience
Timeliness of Access
Timeliness and Data Type
Timeliness and Data Source
Newsreaders and Aggregators
Aggregating for Intranet Use
Security and Aggregators
Directories
Finding Information about Interesting Feeds
The Known Sites Approach
The Blogroll Approach
The Directory Approach
Filtering Information Feeds
Filtering Blogs
Summary
Chapter 5: Storing, Retrieving, and Exporting Information
Storing Information
Storing URLs
Storing Content
Storing Static Files
Relational Databases
RDF Triple Stores
Two Examples of Longer-Term Storage
Onfolio 2.0
OneNote 2003
Retrieving Information
Search in Onfolio 2.0
Search in OneNote 2003
Exporting Information
Exporting in Onfolio 2.0
Exporting in OneNote 2003
Summary
Part II: The Technologies
Chapter 6: Essentials of XML
What Is XML?
XML Declaration
XML Names
XML Elements
XML Attributes
XML Comments
Predefined Entities
Character References
XML Namespaces
HTML, XHTML, and Feed Autodiscovery
Summary
Exercises
Chapter 7: Atom 0.3
Introducing Atom 0.3
The Atom 0.3 Specification
The Atom 0.3 Namespace
Atom 0.3 Document Structure
The feed Element
The title Element
The link Element
The author Element
The id Element
The generator Element
The copyright Element
The info Element
The modified Element
The tagline Element
The entry Element
Using Modules with Atom 0.3
Summary
Exercises
Chapter 8: RSS 0.91 and RSS 0.92
What Is RSS 0.91?
The RSS 0.91 Document Structure
The rss Element
The channel Element
Required Child Elements of channel
Optional Child Elements of channel
The image Element
The textInput Element
The item Element
Introducing RSS 0.92
The RSS 0.92 Document Structure
New Child Elements of the item Element
The cloud Element
Summary
Exercises
Chapter 9: RSS 1.0
What Is RSS 1.0?
RSS 1.0 Is RDF
RSS 1.0 Uses XML Namespaces
RSS 1.0 Uses Modules
The RSS 1.0 Document Structure
The channel Element
The items Element
The image Element
The item Element
The textinput Element
Some Real-World RSS 1.0
Summary
Exercise
Chapter 10: RSS 1.0 Modules
RSS Modules
The RSS 1.0 Official Modules
RDF Parser Compatibility
Module Compatibility
The Content Module
The content:items Element
The content:item Element
The Dublin Core Module
The Syndication Module
Including Other Modules in RSS 1.0 Feed Documents
Adding the Namespace Declaration
The Admin Module
The FOAF Module
Summary
Chapter 11: RDF: The Resource Description Framework
What Is RDF?
Simple Metadata
Simple Facts Expressed in RDF
The RDF Triple
Using URIs in RDF
Directed Graphs
How RDF and XML Are Related
What RDF Is Used For
RDF and RSS 1.0
RDF Vocabularies
Dublin Core
FOAF
RDF Toolkits
Jena
Redland
RDFLib
rdfdata.org
Summary
Chapter 12: RSS 2.0: Really Simple Syndication
What Is RSS 2.0?
XML Namespaces in RSS 2.0
New Elements in RSS 2.0
The RSS 2.0 Document Structure
The rss Element
The channel Element
The image Element
The cloud Element
The textinput Element
The item Element
An Example RSS 2.0 Document
RSS 2.0 Extensions
The blogChannel RSS Module
Summary
Chapter 13: Looking Forward to Atom 1.0
Why Another Specification?
Aiming for Clarity
Archiving Feeds