Thư viện tri thức trực tuyến
Kho tài liệu với 50,000+ tài liệu học thuật
© 2023 Siêu thị PDF - Kho tài liệu học thuật hàng đầu Việt Nam

Tài liệu MySQL Administrator''''s Bible- P14 docx
Nội dung xem thử
Mô tả chi tiết
Query Analysis and Index Tuning 18
The reason this is a full table scan is that there are no suitable indexes to use. We can use the
INFORMATION_SCHEMA table STATISTICS to show all the indexes on the rental table:
mysql> SELECT COLUMN_NAME, INDEX_NAME, SEQ_IN_INDEX AS pos
-> FROM INFORMATION_SCHEMA.STATISTICS
-> WHERE TABLE_SCHEMA=’sakila’ AND TABLE_NAME=’rental’;
+--------------+---------------------+-----+
| COLUMN_NAME | INDEX_NAME | pos |
+--------------+---------------------+-----+
| rental_id | PRIMARY | 1 |
| rental_date | rental_date | 1 |
| inventory_id | rental_date | 2 |
| customer_id | rental_date | 3 |
| inventory_id | idx_fk_inventory_id | 1 |
| customer_id | idx_fk_customer_id | 1 |
| staff_id | idx_fk_staff_id | 1 |
+--------------+---------------------+-----+
7 rows in set (0.11 sec)
There is no index that includes the return_date field, so add an index to optimize this query:
mysql> USE sakila;
Database changed
mysql> ALTER TABLE rental ADD INDEX (return_date);
Query OK, 16044 rows affected (12.08 sec)
Records: 16044 Duplicates: 0 Warnings: 0
mysql> EXPLAIN SELECT return_date FROM rental\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: rental
type: index
possible_keys: NULL
key: return_date
key_len: 9
ref: NULL
rows: 16249
Extra: Using index
1 row in set (0.00 sec)
Now the type is index, which means a full scan of an index is being done. The index being
scanned is the return_date index (key), which we just created, with a length (key_len) of 9.
Is there a way to further optimize this query?
Looking at Table 18-1, data access strategy types below index involve using only parts of an
index. The query we are analyzing returns every value of the return_date field. Therefore,
there is no way to avoid accessing every value in the return_date index. mysqld needs to
617
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Part IV Extending Your Skills
access a value in order to return it, and every value is returned, so every value must be accessed.
This need to access every value is also shown by the lack of Using where in the Extra field.
Index consequences
I
n Chapter 6, we explained how indexes work. Indexes can make data retrieval faster because
they are ordered subsets of data, and can be searched faster than the entire set of data, which
may be ordered differently than an index. There is a cost to maintaining indexes. Data changes are
slower because the data needs to be inserted into the table and any appropriate indexes need to be
updated. An index needs uses disk space, memory, and processing power to stay up to date.
When analyzing queries, remember that there are tradeoffs for actions. Many times, adding an index
will make an application run faster because the query runs faster. However, there are times when
adding an index makes an application run more slowly, because although the SELECT query runs
faster, the INSERT, UPDATE, and DELETE queries run more slowly.
It helps to be familiar with the nature of all the queries against the database. If you find that selecting
a field from a table that stores user session information is slow, adding an index may make the
application slower because there are many changes to user session information. From time to time,
you may want to reexamine indexes to ensure that they are being used. An index that is not being
used is a waste of resources.
Optimizing away Using filesort
The Extra value Using filesort is not desirable; it means that mysqld has to pass through
the data an extra time in order to sort it. If the Extra value Using filesort shows up in a
subquery, it is best to optimize the query by eliminating the subquery. In queries that do not
involve subqueries, the Extra value Using filesort may occur in the EXPLAIN plan for
queries that use ORDER BY, DISTINCT, and GROUP BY.
ON the theWEBSITE More information on how to create and use subqueries can be found on the
accompanying website for this book at www.wiley.com/go/mysqladminbible.
For example, the following EXPLAIN plan is for a query to find the customer name and active
status based on an e-mail lookup, sorted by last name:
mysql> EXPLAIN SELECT first_name, last_name, active
-> FROM customer WHERE email=’[email protected]’
-> ORDER BY last_name\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
618
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Query Analysis and Index Tuning 18
table: customer
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 541
Extra: Using where; Using filesort
1 row in set (0.00 sec)
In order to optimize away the Using filesort, you need to have an index that
mysqld can use instead of sorting. In most cases, mysqld can only use one index, so
you will need to have an index that handles both the sorting and the filter of WHERE
email=’[email protected]’:
mysql> ALTER TABLE customer ADD INDEX (email, last_name);
Query OK, 599 rows affected (0.56 sec)
Records: 599 Duplicates: 0 Warnings: 0
mysql> EXPLAIN SELECT first_name, last_name, active
-> FROM customer WHERE email=’[email protected]’
-> ORDER BY last_name\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: ref
possible_keys: email
key: email
key_len: 203
ref: const
rows: 1
Extra: Using index condition; Using where
1 row in set (0.00 sec)
You have removed the undesirable Extra value Using filesort, and added the desirable
Using index condition. You have also gone from a data access strategy (type) of full table
scan (ALL) to one of looking up a nonunique index value (ref).
Often, first instincts may not fully optimize a query. For example, your first instinct in optimizing this query might have been to add an index on only the email field. This would have
optimized the data access strategy, but the query would still have an Extra value of Using
filesort. Having one index for both fields allows mysqld to use that index to optimize
the data access strategy and the filesort. It is always a good idea to test as many optimization
solutions as possible see the sidebar ‘‘Testing ideas.’’
619
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Part IV Extending Your Skills
Testing ideas
I
n the example from the section ‘‘Optimizing away Using filesort,’’ you might have tried to see if
mysqld would use an index on last_name only; if that was your first instinct, you can try out
the following commands to see if the index would work:
ALTER TABLE customer DROP KEY email;
ALTER TABLE customer ADD INDEX (last_name);
EXPLAIN SELECT first_name, last_name, active
FROM customer WHERE email=’[email protected]’
ORDER BY last_name\G
Sometimes, the first idea you have to optimize a query will not actually optimize the query. In this
case, the index on last_name does not help because mysqld needs to filter for the WHERE clause
first, before ordering. If mysqld was to use the index on last_name, it would have to go through
the entire index, and for each row in the index, look up the email field from the data to see if it
matched. If there were a match, the last_name would be put in the result set, and the first_name
and active field would be looked up and also put in the result set. Those lookups are a lot of extra
work, and the query optimizer rightfully uses a full table scan, even with an index on last_name.
There will be other times when the best solution for optimization is not the best solution overall for
the application. In this example, an index was added on (email, last_name) and the EXPLAIN
plan showed a key length (key_len) of 203. That is a very large key to keep up to date, and if it
slows down the application, it may be more beneficial to use an index with a shorter length, even
if it means mysqld has to do a filesort.
Optimizing away Range checked for each record
As shown in Table 18-2, the Extra value Range checked for each record is faster than
a full table scan (type: ALL) but slower than a full index scan (type: index). To optimize
queries with this Extra value, create or modify an index so that the query optimizer has a good
index to use. Often, optimizing queries to get rid of Range checked for each record results
in a data access strategy (type) of range, ref or eq_ref.
Optimizing away Using temporary
Unlike in previous discussions, optimizing away an Extra value of Using temporary cannot
be done by adding an index. Using temporary is undesirable, as it means that a temporary
table must be used to store intermediate results. There are several ways to optimize this,
depending on why a temporary table is used:
■ If ORDER BY and GROUP BY are both present, and use different fields and/or ordering,
the way to optimize this is to get rid of either the ORDER BY or the GROUP BY. This may
620
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Query Analysis and Index Tuning 18
be done by splitting the query into two queries. It may be possible to combine the two
queries by using UNION so that intermediate results do not need to be stored in a temporary table.
■ The presence of ORDER BY and DISTINCT may cause a temporary table to be
used. The way to optimize this is to get rid of either the ORDER BY or the DISTINCT. This
may be done by splitting the query into two queries. It may be possible to combine the
two queries by using UNION so that intermediate results do not need to be stored in a
temporary table.
■ If the SQL_CALC_FOUND_ROWS keyword is used, the number of rows is stored in a temporary table, which can be retrieved by issuing SELECT FOUND ROWS(). To optimize, get
rid of SQL_CALC_FOUND_ROWS. Depending on what you are counting, you might count
results periodically and have an estimate for a time period (i.e., run a query every 10 minutes to put the number into table and read the table, doing one count every 10 minutes
instead of one count every time the query is issued).
■ The SQL_SMALL_RESULT keyword is used in a SELECT statement with DISTINCT or
GROUP BY. The SQL_SMALL_RESULT keyword is a hint to the optimizer that the result
is small, and thus it should use a temporary table instead of a filesort. To optimize, get rid
of SQL_SMALL_RESULT. If you need the SQL_SMALL_RESULT keyword because a temporary table is more desirable than a filesort, then you cannot optimize Using temporary
away.
If you use optimizer hints, be sure to run periodic testing. Only through periodic testing can you determine whether a temporary table or a filesort is better for your particular
situation.
■ ORDER BY or GROUP BY is used on a field that is not the first table in the join queue (the
first row returned in the EXPLAIN plan). One way to optimize this query is to change or
eliminate the ORDER BY clause. Another way would be to change the filter so that the table
order changes.
For example, the following query uses the customer table first in the join queue, but is
sorting based on rental_date, a field in the rental table:
mysql> EXPLAIN SELECT first_name, last_name FROM rental
-> INNER JOIN customer USING (customer_id)
-> ORDER BY rental_date\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 591
Extra: Using temporary; Using filesort
621
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.