HTML & CSS: The Complete Reference - P16 (PDF document)

Character Entities

Keyboard characters such as < and > have special meanings in (X)HTML because they are part of HTML tags, so they must be encoded when they appear as content. Other characters, such as accented characters from other languages and special symbols, can be difficult to type, depending on the keyboard being used. Character entities address both needs: escaping special-purpose characters and inserting a wide range of characters and symbols.

The general format of a character entity is

&code;

where code may be one of the following:

• A decimal form like &#203;

• A hex form like &#x00CB; or, stripped of leading zeros, simply &#xCB;

• A named value if available, such as &Euml;

NOTE When using a hex form, either a lowercase or uppercase x may be used, as well as upper- and lowercase values for the digits A–F, so &#XCB; and &#xCB; and &#xcb; and so on are all equivalent. Case insensitivity is not, however, guaranteed for named entities, and the wrong case may result in errors or wrong characters. Good style would suggest lowercase for the hex symbol and uppercase for the digits.
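
The equivalence of these forms can be checked outside a browser. The following is a minimal sketch, assuming Python's standard `html` module as a stand-in decoder (it is not part of the original text, and a browser's parser may differ in edge cases). It decodes each spelling of U+00CB and compares the results.

```python
import html

# Several spellings of the same character, U+00CB
# (Latin capital letter E with diaeresis).
forms = ["&#203;", "&#x00CB;", "&#xCB;", "&#XCB;", "&#xcb;", "&Euml;"]
decoded = [html.unescape(form) for form in forms]

# True when every form decodes to the same single character.
all_same = all(ch == "\u00cb" for ch in decoded)

# Named entities are case sensitive: &euml; is the lowercase letter.
lowercase = html.unescape("&euml;")
```

Here `all_same` is true, while `lowercase` holds U+00EB rather than U+00CB, which illustrates why the case of a named entity matters.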

As an example,

Numeric entity decimal: &#163;

Numeric entity hex: &#xA3;

Named entity: &pound;

would each render as the pound sign (£).

Encoding Quirks and Considerations

Encoding characters is quite important if you want to validate your markup. For example,

consider when you have nontrivial query strings in (X)HTML links like so:

Does this <a href="http://www.pint.com/program?p1=foo&p2=bar">link</a>

validate?

The markup will not validate.

For this line to validate, you must encode the special characters in the link like so:

Does this <a href="http://www.pint.com/program?p1=foo&amp;p2=bar">link</a>

validate?

Do not, however, take this as advice to change ampersands in typed URLs everywhere you

encounter them, such as within e-mails or the browser’s location bar. Typically, a browser

will exchange an entity for its correct value, but this change may not take place in other

environments.
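
When links are generated by a program, the encoding step can be automated. The sketch below is an illustration only: it assumes Python's standard `html.escape` as the encoder, and the variable names are hypothetical. It encodes the URL from the example for use inside an attribute and then decodes it again to show that the original URL is recovered.

```python
import html

# The raw URL, as it would be typed in a location bar or an e-mail.
url = "http://www.pint.com/program?p1=foo&p2=bar"

# html.escape rewrites &, < and > as entities; quote=True also
# encodes double and single quotes, which is safest in attributes.
href = html.escape(url, quote=True)
link = f'<a href="{href}">link</a>'

# Decoding the attribute value gives back the URL that is requested.
round_trip = html.unescape(href)
```

Only the copy placed in the markup is encoded; `round_trip` equals the original `url`, which is the form to use everywhere outside (X)HTML.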

Commonly, you will also have trouble when using characters that are part of (X)HTML itself, particularly the less than (<) and greater than (>) symbols and, of course, the ampersand that starts entities. Consider this contrived example with a mathematical expression:

A silly math statement ahead x<y>z is dangerous to validation.

For the greatest safety, the markup should have had the special characters encoded like so:

A silly math statement ahead x&lt;y&gt;z is not dangerous to

validation.

We note that this example is fairly contrived and often just an extra space will allow the

validator (and browser) to tokenize the text correctly. For example,

A silly math statement ahead x < y > z is dangerous to validation?

will likely validate. The loose enforcement of special character handling is both a blessing

and a curse. It leads to sloppy usage and surprising bugs.
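
Rather than relying on spacing, text can be encoded before it is placed in markup. This is a small sketch, assuming Python's standard `html.escape`; the names `raw` and `safe` are illustrative.

```python
import html

raw = "A silly math statement ahead x<y>z is dangerous to validation."

# quote=False leaves quotation marks alone, which is enough for
# text content (as opposed to attribute values).
safe = html.escape(raw, quote=False)
```

The value of `safe` contains `x&lt;y&gt;z`, so a parser no longer has to guess whether `<y>` was meant as a tag.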

Sloppy syntax is troubling because interpretation may vary browser to browser.

Consider the case sensitivity of named entities in browsers. Named entities are supposed to be case sensitive. For example, &agrave; and &Agrave; are two different characters.

Now given this fact, what should a browser do when faced with

&POUND; and &pound;

Apparently it treats the first as text and the second as an entity.

But does that hold for all characters? Apparently not: some entities like &copy; are generally case insensitive, while others like &trade; may vary by browser, and others like &yen; will always be case sensitive.
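
One way to probe this behavior without a browser is to run the same references through a decoder. The sketch below assumes Python's standard `html.unescape`, which uses Python's own table of HTML5 named references; it shows what that one decoder does and is not a statement about any particular browser.

```python
import html

samples = [
    "&pound;", "&POUND;",
    "&copy;", "&COPY;",
    "&trade;", "&TRADE;",
    "&yen;", "&YEN;",
]

# Map each reference to its decoded form; an unrecognized name is
# returned unchanged, i.e. it is treated as plain text.
results = {s: html.unescape(s) for s in samples}

# References that were NOT recognized by this decoder.
left_as_text = [s for s, out in results.items() if out == s]
```

With this decoder, `left_as_text` contains only `&POUND;` and `&YEN;`: the uppercase spellings of copy and trade are accepted, while the uppercase spellings of pound and yen are not.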

Initial drafts of HTML5 attempted to formalize which named entities should be case insensitive; these drafts focused on the commonly used and supported entities. The current list of named entities that should be case insensitive is shown in Table A-1.

Best practice, however, would be not to rely on case insensitivity of named entities, because support is still inconsistent. In general, lax syntax enforcement and permissive interpretation of entities in browsers just leads to all sorts of small quirks. Consider

&QUOTE; and &quote;

Under Internet Explorer, the rendering engine even in a strict mode will “fix” this

problem and effectively convert this into

&QUOT;E; and &quot;e;

while other browsers will correctly leave this mistake alone.

While it turns out that SGML (and thus traditional HTML) does allow the final

semicolon to be left off in an entity in some cases, the preceding example clearly indicates it

does not allow for that latitude in the middle of words. Just as when dealing with markup

and CSS, it is best to get syntax right rather than rely on some variable fix-up applied by a

browser’s rendering engine.
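
This kind of fix-up is not unique to one rendering engine. As a hedged illustration, Python's standard `html.unescape` also accepts certain legacy names without their final semicolon and will match them as a prefix of a longer word, so it produces the same "fixed" result for the misspelled entity. The sketch shows that decoder's behavior only.

```python
import html

# The misspelled entity: "quot" is matched as a prefix and the
# remaining "E;" / "e;" is left behind as text.
fixed_up = html.unescape("&QUOTE; and &quote;")

# A legacy name with the final semicolon left off is still decoded.
no_semicolon = html.unescape("&copy 2010")

# Writing the entity correctly avoids depending on either fix-up.
correct = html.unescape("&quot;e;")
```

Here `fixed_up` is `"E; and "e;` and `no_semicolon` begins with the copyright sign, which shows how a permissive decoder can silently change a typo into different visible text.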

There will be instances when you may get the syntax correct but the browser may not be

able to render the characters meaningfully. The reasons for nonsupport can vary and may

be because a particular font is missing or the operating environment or browser is unable to

render the character. Generally, browsers will present these failures as boxes or diamonds.

Named Entity   HTML5 Alias   Numbered Entity   Unicode Entity   Intended Rendering   Description

&amp;          &AMP;         &#38;             &#x0026;         &                    Ampersand

&gt;           &GT;          &#62;             &#x003E;         >                    Greater than

&lt;           &LT;          &#60;             &#x003C;         <                    Less than

&quot;         &QUOT;        &#34;             &#x0022;         "                    Double quotes

&reg;          &REG;         &#174;            &#x00AE;         ®                    Registration mark

&trade;        &TRADE;       &#8482;           &#x2122;         ™                    Trademark symbol

TABLE A-1 Entities Considered Case Insensitive in HTML5
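
The rows of Table A-1 can be sanity-checked with a few lines of code. This is a sketch under stated assumptions: it uses Python's standard `html.unescape` as the decoder, and the hex column is written with four digits, which is one of several equivalent spellings. Each row lists the named entity, the uppercase alias, the decimal form, the hex form, and the expected character.

```python
import html

rows = [
    ("&amp;", "&AMP;", "&#38;", "&#x0026;", "&"),
    ("&gt;", "&GT;", "&#62;", "&#x003E;", ">"),
    ("&lt;", "&LT;", "&#60;", "&#x003C;", "<"),
    ("&quot;", "&QUOT;", "&#34;", "&#x0022;", '"'),
    ("&reg;", "&REG;", "&#174;", "&#x00AE;", "\u00ae"),
    ("&trade;", "&TRADE;", "&#8482;", "&#x2122;", "\u2122"),
]

# Collect any (entity, decoded) pair that disagrees with its row.
mismatches = [
    (form, html.unescape(form))
    for *forms, expected in rows
    for form in forms
    if html.unescape(form) != expected
]
```

An empty `mismatches` list means all four spellings in every row decode to the same character under this decoder.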
