Beginning RSS and Atom Programming

Danny Ayers

Andrew Watt

Published by

Wiley Publishing, Inc.

10475 Crosspoint Boulevard

Indianapolis, IN 46256

www.wiley.com

Copyright © 2005 by Wiley Publishing

Published by Wiley Publishing, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN-13: 978-0-7645-7916-5

ISBN-10: 0-7645-7916-9

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

1MA/RZ/QU/QV/IN

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, www.wiley.com/go/permissions.

LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFESSIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM. THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.

For general information on our other products and services or to obtain technical support, please contact our Customer Care Department within the U.S. at (800) 762-2974, outside the U.S. at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Cataloging-in-Publication Data

Ayers, Danny.

Beginning RSS and atom programming / Danny Ayers, Andrew Watt.

p. cm.

Includes index.

ISBN-13: 978-0-7645-7916-5

ISBN-10: 0-7645-7916-9 (paper/website)

1. Internet programming. 2. Web site development. 3. MPLS standard. I. Watt, Andrew, 1953- II. Title.

QA76.625.A93 2005

006.7'6—dc22

2005003120

Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley Publishing, Inc., is not associated with any product or vendor mentioned in this book.

About the Authors

Danny Ayers is a freelance developer, technical author, and consultant specializing in cutting-edge Web technologies. He has worked with XML since its early days and got drawn into RSS development around four years ago. He is an active member of the Atom Working Group, the Semantic Web Interest Group, and various other Web-related community groups and organizations. He has been a regular blogger for several years, generally posting on technical or feline issues. Originally from Tideswell in the north of England, he now lives in a village near Lucca in Northern Italy with his wife, Caroline, a dog, and a herd of cats.

I dedicate my contribution to this book to my wife, Caroline, and our four-legged companions, who have tolerated my air of irritable distraction these past few months. Okay, actually for several years now.

Andrew Watt is an independent consultant and computer book author with an interest and expertise in various XML technologies. Currently, he is focusing primarily on the use of XML in Microsoft technologies. He is a Microsoft Most Valuable Professional for Microsoft InfoPath 2003.

I dedicate my contribution to this book to the memory of my late father, George Alec Watt, a very special human being.

Credits

Acquisitions Editor

Jim Minatel

Development Editor

Kezia Endsley

Technical Editor

Brian Sletten

Editorial Manager

Mary Beth Wakefield

Vice President & Executive Group Publisher

Richard Swadley

Vice President and Publisher

Joseph B. Wikert

Project Coordinator

Erin Smith

Graphics and Production Specialists

Karl Brandt

Lauren Goddard

Jennifer Heleine

Amanda Spagnuolo

Julie Trippetti

Quality Control Technicians

Susan Moritz

Carl William Pierce

Brian Walls

Proofreading and Indexing

TECHBOOKS Production Services

Acknowledgments

Danny Ayers: Many thanks first of all to Andrew for getting this book started and more generally for his encouragement and role model of good-humored determination. Thanks to Jim Minatel for all the effort that went into making this project happen and for his diplomacy when I needed to be nagged out of procrastination. Many thanks to Kezia Endsley for taking care of the translation from Broad Derbyshire to U.S. English and to Brian Sletten for keeping a keen eye on technical matters (and remembering my birthday!).

I am extremely grateful to all the people who have helped me personally with various issues throughout the book. Unfortunately, if I were to thank them individually this would read like an Oscars ceremony screed. Worse, I’d also be bound to forget someone, and that just wouldn’t be nice. I can at least show a little gratitude in my ongoing appreciation of their work, some of which will hopefully have been reflected in this book. More generally, I’d like to thank the developers behind the Web, RSS, Atom, and related technologies for providing such a rich seam of material to draw on and helping my own learning through mailing-list discussions and blog conversations. The material is alive out there! Finally, I’d like to thank the reader for showing an interest in a field that I personally believe has a lot to offer everyone and is certain to play a significant role in the shaping of at least the Web landscape over the next few years. Be inquisitive; be creative.

Andrew Watt: I thank Jim Minatel, acquisitions editor, for patience above and beyond the call of duty as the writing of this book took much longer than we had all originally anticipated. I also thank Kezia Endsley for helpful and patient editing and Brian Sletten for his constructive and assiduous technical assessment.

Contents

Acknowledgments v

Foreword by Dare Obasanjo xxvii

Foreword by Greg Reinacker xxix

Introduction xxxi

Part I: Understanding the Issues and Taking Control 1

Chapter 1: Managing the Flow of Information: A Crucial Skill 3

New Vistas of Information Flow 4

The Information Well and Information Flow 4

The Information Well 4

The Information Flow 5

The Information Flood 5

Managing Information 5

What Do You Want to Do with Information? 6

Browse and Discard 6

Read 6

Study and Keep 6

Taking Control of Information 7

Determining What Is Important to You 7

Avoiding Irrelevant Information 7

Determining the Quality of Information 8

Information Flows Other Than the Web 8

Books 8

Magazines 9

Newspapers 9

Broadcast Media 9

The Web and Information Feeds 10

New Information Opportunities 10

New Information Problems 10

The Need to Keep Up-to-Date 11

Distractions 11

Summary 11

Exercise 11

Chapter 2: Where Did Information Feeds Start? 13

The Nature of the Web 13

HTTP 14

HTML 14

XML 14

Polling the Web 14

Precursors to RSS 17

MCF and HotSauce 17

Netscape Channels 17

The Microsoft Channel Definition Format 18

RSS: An Acronym with Multiple Meanings 18

RSS 0.9 19

RSS 0.91 19

RSS 1.0 19

RSS 0.92, 0.93, and 0.94 19

RSS 2.0 19

Use of RSS and Atom Versions 20

Summary 21

Exercises 21

Chapter 3: The Content Provider Viewpoint 23

Why Give Your Content Away? 23

Selling Your Content 24

Creating Community 25

Content to Include in a Feed 25

The Importance of Item Titles 25

Internal Links 25

One Feed or Several? 26

Structuring Topics 26

Blogging Tools 26

Wikis 28

Publicizing Your Information Feed 28

Deciding on a Target Audience 28

Registering with Online Sites 28

How Information Feeds Can Affect Your Site’s Visibility 29

Advertisements and Information Feeds 29

Power to the User? 30

Filtering Out Advertisements 30

Summary 31

Exercise 31

Chapter 4: The Content Recipient Viewpoint 33

Access to Information 34

Convenience 34

Timeliness of Access 35

Timeliness and Data Type 35

Timeliness and Data Source 35

Newsreaders and Aggregators 36

Aggregating for Intranet Use 36

Security and Aggregators 36

Directories 37

Finding Information about Interesting Feeds 38

The Known Sites Approach 38

The Blogroll Approach 39

The Directory Approach 41

Filtering Information Feeds 42

Filtering Blogs 42

Summary 42

Chapter 5: Storing, Retrieving, and Exporting Information 43

Storing Information 44

Storing URLs 44

Storing Content 44

Storing Static Files 44

Relational Databases 45

RDF Triple Stores 46

Two Examples of Longer-Term Storage 46

Onfolio 2.0 46

OneNote 2003 50

Retrieving Information 52

Search in Onfolio 2.0 53

Search in OneNote 2003 54

Exporting Information 55

Exporting in Onfolio 2.0 55

Exporting in OneNote 2003 55

Summary 56

Part II: The Technologies 57

Chapter 6: Essentials of XML 59

What Is XML? 60

XML Declaration 62

XML Names 62

XML Elements 62

XML Attributes 63

XML Comments 64

Predefined Entities 64

Character References 65

XML Namespaces 66

HTML, XHTML, and Feed Autodiscovery 68

Summary 70

Exercises 70

Chapter 7: Atom 0.3 71

Introducing Atom 0.3 72

The Atom 0.3 Specification 72

The Atom 0.3 Namespace 72

Atom 0.3 Document Structure 72

The feed Element 73

The title Element 74

The link Element 74

The author Element 74

The id Element 74

The generator Element 75

The copyright Element 75

The info Element 75

The modified Element 75

The tagline Element 75

The entry Element 76

Using Modules with Atom 0.3 78

Summary 79

Exercises 79

Chapter 8: RSS 0.91 and RSS 0.92 81

What Is RSS 0.91? 82

The RSS 0.91 Document Structure 82

The rss Element 83

The channel Element 83

Required Child Elements of channel 83

Optional Child Elements of channel 84

The image Element 85

The textInput Element 85

The item Element 86

Introducing RSS 0.92 87

The RSS 0.92 Document Structure 87

New Child Elements of the item Element 87

The cloud Element 88

Summary 88

Exercises 88

Chapter 9: RSS 1.0 89

What Is RSS 1.0? 89

RSS 1.0 Is RDF 90

RSS 1.0 Uses XML Namespaces 90

RSS 1.0 Uses Modules 91

The RSS 1.0 Document Structure 91

The channel Element 93

The items Element 93

The image Element 94

The item Element 94

The textinput Element 95

Some Real-World RSS 1.0 95

Summary 98

Exercise 99

Chapter 10: RSS 1.0 Modules 101

RSS Modules 101

The RSS 1.0 Official Modules 102

RDF Parser Compatibility 103

Module Compatibility 103

The Content Module 103

The content:items Element 104

The content:item Element 104

The Dublin Core Module 105

The Syndication Module 107

Including Other Modules in RSS 1.0 Feed Documents 108

Adding the Namespace Declaration 108

The Admin Module 108

The FOAF Module 108

Summary 109

Chapter 11: RDF: The Resource Description Framework 111

What Is RDF? 112

Simple Metadata 112

Simple Facts Expressed in RDF 112

The RDF Triple 113

Using URIs in RDF 113

Directed Graphs 114

How RDF and XML Are Related 115

What RDF Is Used For 115

RDF and RSS 1.0 116

RDF Vocabularies 117

Dublin Core 117

FOAF 117

RDF Toolkits 118

Jena 118

Redland 118

RDFLib 118

rdfdata.org 118

Summary 118

Chapter 12: RSS 2.0: Really Simple Syndication 121

What Is RSS 2.0? 121

XML Namespaces in RSS 2.0 122

New Elements in RSS 2.0 122

The RSS 2.0 Document Structure 122

The rss Element 122

The channel Element 123

The image Element 124

The cloud Element 125

The textinput Element 125

The item Element 126

An Example RSS 2.0 Document 126

RSS 2.0 Extensions 127

The blogChannel RSS Module 127

Summary 128

Chapter 13: Looking Forward to Atom 1.0 129

Why Another Specification? 130

Aiming for Clarity 130

Archiving Feeds 130
