Dive Into Python

Chapter 11. HTTP Web Services

11.1. Diving in

You've learned about HTML processing and XML processing, and along the way you saw how to download a web page and how to parse XML from a URL, but let's dive into the more general topic of HTTP web services.

Simply stated, HTTP web services are programmatic ways of sending and receiving data from remote servers using the operations of HTTP directly. If you want to get data from the server, use a straight HTTP GET; if you want to send new data to the server, use HTTP POST. (Some more advanced HTTP web service APIs also define ways of modifying existing data and deleting data, using HTTP PUT and HTTP DELETE.) In other words, the “verbs” built into the HTTP protocol (GET, POST, PUT, and DELETE) map directly to application-level operations for receiving, sending, modifying, and deleting data.
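The verb-to-operation mapping can be made concrete with a small sketch. It is written for Python 3's urllib.request rather than the Python 2 urllib2 module used elsewhere in this chapter, and the endpoint URL, the XML bodies, and the build_request helper are all hypothetical names chosen for illustration. The requests are only constructed, never sent.

```python
# Python 3 sketch (assumption: the chapter's own code targets Python 2's urllib2).
from urllib.request import Request

BASE = "http://example.com/items"  # hypothetical endpoint, for illustration only


def build_request(verb, url, body=None):
    """Build (but do not send) a request that uses the given HTTP verb."""
    return Request(url, data=body, method=verb)


# One request per application-level operation named in the text.
requests_by_operation = {
    "receive": build_request("GET", BASE + "/1"),
    "send": build_request("POST", BASE, b"<item>new</item>"),
    "modify": build_request("PUT", BASE + "/1", b"<item>changed</item>"),
    "delete": build_request("DELETE", BASE + "/1"),
}

for operation, req in requests_by_operation.items():
    print(operation, "->", req.get_method(), req.full_url)
```

Passing any of these objects to urllib.request.urlopen would perform the actual network call; that step is left out here so the sketch runs offline.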

The main advantage of this approach is simplicity, and its simplicity has proven popular with a lot of different sites. Data -- usually XML data -- can be built and stored statically, or generated dynamically by a server-side script, and all major languages include an HTTP library for downloading it. Debugging is also easier, because you can load up the web service in any web browser and see the raw data. Modern browsers will even nicely format and pretty-print XML data for you, to allow you to quickly navigate through it.
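If you are inspecting a response from a script rather than a browser, similar pretty-printing is available in code. The following is a minimal Python 3 sketch using xml.dom.minidom from the standard library; the XML payload and the pretty helper are made up for illustration and stand in for whatever a real service would return.

```python
from xml.dom import minidom

# Hypothetical payload standing in for the raw data a web service might return.
raw = (b"<feed><entry><title>Hello</title></entry>"
       b"<entry><title>World</title></entry></feed>")


def pretty(xml_bytes):
    """Return an indented, human-readable rendering of an XML document."""
    return minidom.parseString(xml_bytes).toprettyxml(indent="  ")


formatted = pretty(raw)
print(formatted)
```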

Examples of pure XML-over-HTTP web services:

* Amazon API allows you to retrieve product information from the Amazon.com online store.
* National Weather Service (United States) and Hong Kong Observatory (Hong Kong) offer weather alerts as a web service.
* Atom API for managing web-based content.
* Syndicated feeds from weblogs and news sites bring you up-to-the-minute news from a variety of sites.

In later chapters, you'll explore APIs which use HTTP as a transport for sending and receiving data, but don't map application semantics to the underlying HTTP semantics. (They tunnel everything over HTTP POST.) But this chapter will concentrate on using HTTP GET to get data from a remote server, and you'll explore several HTTP features you can use to get the maximum benefit out of pure HTTP web services.

Here is a more advanced version of the openanything module that you saw in the previous chapter:

Example 11.1. openanything.py

If you have not already done so, you can download this and other examples used in this book.

import sys
import urllib2, urlparse, gzip
from StringIO import StringIO

USER_AGENT = 'OpenAnything/1.0 +http://diveintopython.org/http_web_services/'

class SmartRedirectHandler(urllib2.HTTPRedirectHandler):
    def http_error_301(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_301(
            self, req, fp, code, msg, headers)
        # record the redirect status code on the returned object
        result.status = code
        return result

    def http_error_302(self, req, fp, code, msg, headers):
        result = urllib2.HTTPRedirectHandler.http_error_302(
            self, req, fp, code, msg, headers)
        # record the redirect status code on the returned object
        result.status = code
        return result

class DefaultErrorHandler(urllib2.HTTPDefaultErrorHandler):
    def http_error_default(self, req, fp, code, msg, headers):
        # build the HTTPError but return it instead of raising it
        result = urllib2.HTTPError(
            req.get_full_url(), code, msg, headers, fp)
        result.status = code
        return result

def openAnything(source, etag=None, lastmodified=None,
                 agent=USER_AGENT):
    '''URL, filename, or string --> stream

    This function lets you define parsers that take any input source
    (URL, pathname to local or network file, or actual data as a string)
    and deal with it in a uniform manner.  Returned object is guaranteed
    to have all the basic stdio read methods (read, readline, readlines).
    Just .close() the object when you're done with it.

    If the etag argument is supplied, it will be used as the value of an
    If-None-Match request header.

    If the lastmodified argument is supplied, it must be a formatted
    date/time string in GMT (as returned in the Last-Modified header of
    a previous request).  The formatted date/time will be used
    as the value of an If-Modified-Since request header.

    If the agent argument is supplied, it will be used as the value of a
    User-Agent request header.
    '''
    if hasattr(source, 'read'):
        return source

    if source == '-':
        return sys.stdin

    if urlparse.urlparse(source)[0] == 'http':
        # open URL with urllib2
        request = urllib2.Request(source)
        request.add_header('User-Agent', agent)
        if etag:
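
The docstring of openAnything describes three optional request headers: If-None-Match (from etag), If-Modified-Since (from lastmodified), and User-Agent (from agent). As a rough illustration of that description only, here is a minimal Python 3 sketch that assembles such a request with urllib.request. It is not the book's listing: the build_conditional_request helper, the example URL, and the ETag and date values are hypothetical, and the request is built without being sent.

```python
# Python 3 sketch (assumption: the chapter's listing uses Python 2's urllib2).
from urllib.request import Request

USER_AGENT = "OpenAnything/1.0 +http://diveintopython.org/http_web_services/"


def build_conditional_request(url, etag=None, lastmodified=None,
                              agent=USER_AGENT):
    """Build (but do not send) a GET request carrying the optional headers."""
    request = Request(url)
    request.add_header("User-Agent", agent)
    if etag:
        # ask the server to skip the body if the resource still has this ETag
        request.add_header("If-None-Match", etag)
    if lastmodified:
        # ask the server to skip the body if nothing changed since this date
        request.add_header("If-Modified-Since", lastmodified)
    return request


req = build_conditional_request(
    "http://example.com/feed.xml",                  # hypothetical URL
    etag='"abc123"',                                # hypothetical ETag value
    lastmodified="Thu, 15 Apr 2004 20:42:41 GMT",   # hypothetical date
)
for name, value in req.header_items():
    print(name, ":", value)
```

Note that urllib.request stores header names in capitalized form (for example "If-none-match"), so lookups with get_header should use that spelling.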
