Chapter 10. Scripts and Streams
10.1. Abstracting input sources
One of Python's greatest strengths is its dynamic binding, and one powerful
use of dynamic binding is the file-like object.
Many functions which require an input source could simply take a filename,
go open the file for reading, read it, and close it when they're done. But they
don't. Instead, they take a file-like object.
In the simplest case, a file-like object is any object with a read method with
an optional size parameter, which returns a string. When called with no size
parameter, it reads everything there is to read from the input source and
returns all the data as a single string. When called with a size parameter, it
reads that much from the input source and returns that much data; when
called again, it picks up where it left off and returns the next chunk of data.
This is how reading from real files works; the difference is that you're not
limiting yourself to real files. The input source could be anything: a file on
disk, a web page, even a hard-coded string. As long as you pass a file-like
object to the function, and the function simply calls the object's read method,
the function can handle any kind of input source without specific code to
handle each kind.
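The read contract described above can be sketched in a few lines. This is an illustration only, written for Python 3 rather than the Python 2 used in this chapter's examples; StringSource and read_in_chunks are hypothetical names invented for the sketch, not part of any library.

```python
class StringSource:
    """Hypothetical minimal file-like object backed by an in-memory string."""

    def __init__(self, text):
        self._text = text
        self._pos = 0  # index of the next unread character

    def read(self, size=-1):
        # No size (or a negative one): return everything that is left.
        if size is None or size < 0:
            size = len(self._text) - self._pos
        chunk = self._text[self._pos:self._pos + size]
        self._pos += len(chunk)  # remember where to pick up next time
        return chunk


def read_in_chunks(source, size):
    """Collect chunks from any object that has a read(size) method."""
    chunks = []
    while True:
        chunk = source.read(size)
        if not chunk:  # an empty string signals the end of the input
            break
        chunks.append(chunk)
    return chunks


source = StringSource("<grammar/>")
first = source.read(4)   # the first four characters
rest = source.read()     # everything after them
chunks = read_in_chunks(StringSource("abcdefg"), 3)
```

Because read_in_chunks only ever calls read, it should work unchanged with an open file, a network connection, or any other object that honors the same contract.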
In case you were wondering how this relates to XML processing,
minidom.parse is one such function which can take a file-like object.
Example 10.1. Parsing XML from a file
>>> from xml.dom import minidom
>>> fsock = open('binary.xml')       # (1)
>>> xmldoc = minidom.parse(fsock)    # (2)
>>> fsock.close()                    # (3)
>>> print xmldoc.toxml()             # (4)
<?xml version="1.0" ?>
<grammar>
<ref id="bit">
<p>0</p>
<p>1</p>
</ref>
<ref id="byte">
<p><xref id="bit"/><xref id="bit"/><xref id="bit"/><xref id="bit"/>\
<xref id="bit"/><xref id="bit"/><xref id="bit"/><xref id="bit"/></p>
</ref>
</grammar>
(1) First, you open the file on disk. This gives you a file object.
(2) You pass the file object to minidom.parse, which calls the read
method of fsock and reads the XML document from the file on disk.
(3) Be sure to call the close method of the file object after you're done
with it. minidom.parse will not do this for you.
(4) Calling the toxml() method on the returned XML document returns the
entire document as a string, which print then displays.
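Since minidom.parse leaves the file object open, one way to make sure it gets closed is a with statement. The sketch below assumes Python 3 (the session above is Python 2), and the sample document and the sample.xml filename are made up for the demonstration; it is one possible approach, not the chapter's own code.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical sample document, written to a temporary file for the demo.
SAMPLE = b'<grammar><ref id="bit"><p>0</p><p>1</p></ref></grammar>'

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.xml")
    with open(path, "wb") as out:
        out.write(SAMPLE)

    # The with statement closes fsock on exit, even if parse() raises.
    with open(path, "rb") as fsock:
        xmldoc = minidom.parse(fsock)

closed_after_parse = fsock.closed
root_name = xmldoc.documentElement.tagName
choices = [p.firstChild.data for p in xmldoc.getElementsByTagName("p")]
```

The parsed document stays usable after the file is closed, because minidom.parse reads the whole input before it returns.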
Well, that all seems like a colossal waste of time. After all, you've already
seen that minidom.parse can simply take the filename and do all the opening
and closing nonsense automatically. And it's true that if you know you're just
going to be parsing a local file, you can pass the filename and
minidom.parse is smart enough to Do The Right Thing™. But notice how
similar -- and easy -- it is to parse an XML document straight from the
Internet.
Example 10.2. Parsing XML from a URL
>>> import urllib
>>> usock = urllib.urlopen('http://slashdot.org/slashdot.rdf')    # (1)
>>> xmldoc = minidom.parse(usock)                                 # (2)
>>> usock.close()                                                 # (3)
>>> print xmldoc.toxml()                                          # (4)
<?xml version="1.0" ?>
<rdf:RDF xmlns="http://my.netscape.com/rdf/simple/0.9/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<channel>