Tài liệu MySQL Administrator''''s Bible- P14 docx

Query Analysis and Index Tuning 18

The reason this is a full table scan is that there are no suitable indexes to use. We can use the

INFORMATION_SCHEMA table STATISTICS to show all the indexes on the rental table:

mysql> SELECT COLUMN_NAME, INDEX_NAME, SEQ_IN_INDEX AS pos

-> FROM INFORMATION_SCHEMA.STATISTICS

-> WHERE TABLE_SCHEMA=’sakila’ AND TABLE_NAME=’rental’;

+--------------+---------------------+-----+

| COLUMN_NAME | INDEX_NAME | pos |

+--------------+---------------------+-----+

| rental_id | PRIMARY | 1 |

| rental_date | rental_date | 1 |

| inventory_id | rental_date | 2 |

| customer_id | rental_date | 3 |

| inventory_id | idx_fk_inventory_id | 1 |

| customer_id | idx_fk_customer_id | 1 |

| staff_id | idx_fk_staff_id | 1 |

+--------------+---------------------+-----+

7 rows in set (0.11 sec)

There is no index that includes the return_date field, so add an index to optimize this query:

mysql> USE sakila;

Database changed

mysql> ALTER TABLE rental ADD INDEX (return_date);

Query OK, 16044 rows affected (12.08 sec)

Records: 16044 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT return_date FROM rental\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: rental

type: index

possible_keys: NULL

key: return_date

key_len: 9

ref: NULL

rows: 16249

Extra: Using index

1 row in set (0.00 sec)

Now the type is index, which means a full scan of an index is being done. The index being

scanned is the return_date index (key), which we just created, with a length (key_len) of 9.

Is there a way to further optimize this query?

Looking at Table 18-1, data access strategy types below index involve using only parts of an

index. The query we are analyzing returns every value of the return_date field. Therefore,

there is no way to avoid accessing every value in the return_date index. mysqld needs to

617

Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.

Part IV Extending Your Skills

access a value in order to return it, and every value is returned, so every value must be accessed.

This need to access every value is also shown by the lack of Using where in the Extra field.

Index consequences

n Chapter 6, we explained how indexes work. Indexes can make data retrieval faster because

they are ordered subsets of data, and can be searched faster than the entire set of data, which

may be ordered differently than an index. There is a cost to maintaining indexes. Data changes are

slower because the data needs to be inserted into the table and any appropriate indexes need to be

updated. An index needs uses disk space, memory, and processing power to stay up to date.

When analyzing queries, remember that there are tradeoffs for actions. Many times, adding an index

will make an application run faster because the query runs faster. However, there are times when

adding an index makes an application run more slowly, because although the SELECT query runs

faster, the INSERT, UPDATE, and DELETE queries run more slowly.

It helps to be familiar with the nature of all the queries against the database. If you find that selecting

a field from a table that stores user session information is slow, adding an index may make the

application slower because there are many changes to user session information. From time to time,

you may want to reexamine indexes to ensure that they are being used. An index that is not being

used is a waste of resources.

Optimizing away Using filesort

The Extra value Using filesort is not desirable; it means that mysqld has to pass through

the data an extra time in order to sort it. If the Extra value Using filesort shows up in a

subquery, it is best to optimize the query by eliminating the subquery. In queries that do not

involve subqueries, the Extra value Using filesort may occur in the EXPLAIN plan for

queries that use ORDER BY, DISTINCT, and GROUP BY.

ON the theWEBSITE More information on how to create and use subqueries can be found on the

accompanying website for this book at www.wiley.com/go/mysqladminbible.

For example, the following EXPLAIN plan is for a query to find the customer name and active

status based on an e-mail lookup, sorted by last name:

mysql> EXPLAIN SELECT first_name, last_name, active

-> FROM customer WHERE email=’[email protected]’

-> ORDER BY last_name\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

618

Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.

Query Analysis and Index Tuning 18

table: customer

type: ALL

possible_keys: NULL

key: NULL

key_len: NULL

ref: NULL

rows: 541

Extra: Using where; Using filesort

1 row in set (0.00 sec)

In order to optimize away the Using filesort, you need to have an index that

mysqld can use instead of sorting. In most cases, mysqld can only use one index, so

you will need to have an index that handles both the sorting and the filter of WHERE

email=’[email protected]’:

mysql> ALTER TABLE customer ADD INDEX (email, last_name);

Query OK, 599 rows affected (0.56 sec)

Records: 599 Duplicates: 0 Warnings: 0

mysql> EXPLAIN SELECT first_name, last_name, active

-> FROM customer WHERE email=’[email protected]’

-> ORDER BY last_name\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: customer

type: ref

possible_keys: email

key: email

key_len: 203

ref: const

rows: 1

Extra: Using index condition; Using where

1 row in set (0.00 sec)

You have removed the undesirable Extra value Using filesort, and added the desirable

Using index condition. You have also gone from a data access strategy (type) of full table

scan (ALL) to one of looking up a nonunique index value (ref).

Often, first instincts may not fully optimize a query. For example, your first instinct in optimizing this query might have been to add an index on only the email field. This would have

optimized the data access strategy, but the query would still have an Extra value of Using

filesort. Having one index for both fields allows mysqld to use that index to optimize

the data access strategy and the filesort. It is always a good idea to test as many optimization

solutions as possible see the sidebar ‘‘Testing ideas.’’

619

Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.

Part IV Extending Your Skills

Testing ideas

n the example from the section ‘‘Optimizing away Using filesort,’’ you might have tried to see if

mysqld would use an index on last_name only; if that was your first instinct, you can try out

the following commands to see if the index would work:

ALTER TABLE customer DROP KEY email;

ALTER TABLE customer ADD INDEX (last_name);

EXPLAIN SELECT first_name, last_name, active

FROM customer WHERE email=’[email protected]’

ORDER BY last_name\G

Sometimes, the first idea you have to optimize a query will not actually optimize the query. In this

case, the index on last_name does not help because mysqld needs to filter for the WHERE clause

first, before ordering. If mysqld was to use the index on last_name, it would have to go through

the entire index, and for each row in the index, look up the email field from the data to see if it

matched. If there were a match, the last_name would be put in the result set, and the first_name

and active field would be looked up and also put in the result set. Those lookups are a lot of extra

work, and the query optimizer rightfully uses a full table scan, even with an index on last_name.

There will be other times when the best solution for optimization is not the best solution overall for

the application. In this example, an index was added on (email, last_name) and the EXPLAIN

plan showed a key length (key_len) of 203. That is a very large key to keep up to date, and if it

slows down the application, it may be more beneficial to use an index with a shorter length, even

if it means mysqld has to do a filesort.

Optimizing away Range checked for each record

As shown in Table 18-2, the Extra value Range checked for each record is faster than

a full table scan (type: ALL) but slower than a full index scan (type: index). To optimize

queries with this Extra value, create or modify an index so that the query optimizer has a good

index to use. Often, optimizing queries to get rid of Range checked for each record results

in a data access strategy (type) of range, ref or eq_ref.

Optimizing away Using temporary

Unlike in previous discussions, optimizing away an Extra value of Using temporary cannot

be done by adding an index. Using temporary is undesirable, as it means that a temporary

table must be used to store intermediate results. There are several ways to optimize this,

depending on why a temporary table is used:

■ If ORDER BY and GROUP BY are both present, and use different fields and/or ordering,

the way to optimize this is to get rid of either the ORDER BY or the GROUP BY. This may

620

Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.

Query Analysis and Index Tuning 18

be done by splitting the query into two queries. It may be possible to combine the two

queries by using UNION so that intermediate results do not need to be stored in a temporary table.

■ The presence of ORDER BY and DISTINCT may cause a temporary table to be

used. The way to optimize this is to get rid of either the ORDER BY or the DISTINCT. This

may be done by splitting the query into two queries. It may be possible to combine the

two queries by using UNION so that intermediate results do not need to be stored in a

temporary table.

■ If the SQL_CALC_FOUND_ROWS keyword is used, the number of rows is stored in a temporary table, which can be retrieved by issuing SELECT FOUND ROWS(). To optimize, get

rid of SQL_CALC_FOUND_ROWS. Depending on what you are counting, you might count

results periodically and have an estimate for a time period (i.e., run a query every 10 minutes to put the number into table and read the table, doing one count every 10 minutes

instead of one count every time the query is issued).

■ The SQL_SMALL_RESULT keyword is used in a SELECT statement with DISTINCT or

GROUP BY. The SQL_SMALL_RESULT keyword is a hint to the optimizer that the result

is small, and thus it should use a temporary table instead of a filesort. To optimize, get rid

of SQL_SMALL_RESULT. If you need the SQL_SMALL_RESULT keyword because a temporary table is more desirable than a filesort, then you cannot optimize Using temporary

away.

If you use optimizer hints, be sure to run periodic testing. Only through periodic testing can you determine whether a temporary table or a filesort is better for your particular

situation.

■ ORDER BY or GROUP BY is used on a field that is not the first table in the join queue (the

first row returned in the EXPLAIN plan). One way to optimize this query is to change or

eliminate the ORDER BY clause. Another way would be to change the filter so that the table

order changes.

For example, the following query uses the customer table first in the join queue, but is

sorting based on rental_date, a field in the rental table:

mysql> EXPLAIN SELECT first_name, last_name FROM rental

-> INNER JOIN customer USING (customer_id)

-> ORDER BY rental_date\G

*************************** 1. row ***************************

id: 1

select_type: SIMPLE

table: customer

type: ALL

possible_keys: PRIMARY

key: NULL

key_len: NULL

ref: NULL

rows: 591

Extra: Using temporary; Using filesort

621

Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.

Thư viện tri thức trực tuyến

Tài liệu MySQL Administrator''''s Bible- P14 docx

Nội dung xem thử

Mô tả chi tiết

Tài liệu tương tự (6)

Tài liệu MySQL Administrator''''s Bible- P7 pdf

Tài liệu MySQL Administrator''''s Bible- P11 pptx

Tài liệu MySQL Administrator''''s Bible- P12 doc

Tài liệu MySQL Administrator''''s Bible- P10 doc

Tài liệu MySQL Administrator''''s Bible- P9 doc

Tài liệu MySQL Administrator''''s Bible- P13 pdf