Speed is almost always a concern when
working with databases. Many things contribute to your database's
performance: the operating system; hardware considerations, such as
memory and disk space; and the design of the applications accessing
the database. Your database's design also plays an important role in
performance.
What you need: SQL Server 2000
| |
In this article, I'll
discuss some of the things you can do to improve your SQL Server
database's performance. I'll focus on SQL Server 2000's ability to
create indexes on views, as well as how to configure your indexes
properly using the Index Tuning Wizard (ITW). I'll also discuss how
you can make sure your queries are taking full advantage of indexes
and statistical information.
You should take performance
into account from the beginning of your database's design phase.
However, you can make modifications after your database goes into
production that affect performance greatly. Indexes are database
objects that optimize the searching and sorting of data. Properly
configured indexes can make a huge difference in the time it takes
to get data out of or insert data into your database. The ITW can
aid you in determining how you should implement indexes in your
database for maximum performance gain.
You can use the ITW to
recommend the best configuration of indexes for a database based on
a given workload. A workload is a SQL script or trace that you save
to an external file. The ITW's recommendations are only as good as
the workload, so it's important that you prepare the proper
information first. Probably the easiest and most thorough way to
create a workload for the ITW is by saving a trace created by the
SQL Profiler. New in SQL Server 7.0, the SQL Profiler records server
activity using the filters and criteria that you provide. Make sure
your trace records typical database activity when you create a trace
with the SQL Profiler to use with the ITW. In other words, select a
time when database usage isn't particularly heavy or light. How long
a period of time you need to run the trace for varies depending on
your specific system. In some cases, you might have to run a trace
for an hour to capture an accurate representation of system
activity. In other cases, it might take days to record all the
standard variations of activities that occur in your
database.
Invoke the ITW Once
you have a workload file ready for use, invoke the ITW by selecting
a server in the Enterprise Manager treeview. Select Wizards from the
Tools menu, then drop down the Management node in the treeview and
select Index Tuning Wizard. The welcome screen for the ITW appears.
The ITW's second screen allows you to specify which server and
database you want to analyze. You have two additional choices on
this screen: Keep All Existing Indexes and Perform Thorough
Analyses. Clearing the Keep All Existing Indexes option lets the ITW
make the optimal indexing recommendations. However, the ITW might
suggest deleting or altering some of your existing indexes. Leave
this option checked if you don't want your existing indexes altered.
Selecting the Perform Thorough Analyses option results in the ITW
performing the most extensive analysis possible. Although this might
give the best results, it's likely that performing a thorough
analysis will take a while to complete. While it's running, a
thorough analysis is also likely to cause a strain on your server.
For these reasons, it's a good idea to either run the analysis on a
test server or during off-peak hours on your production server if
you choose to run a thorough analysis.
The third screen in the ITW
has you specify the workload you want to work with. If you're using
a file created by the SQL Profiler, select the My Workload File
option button and use the File dialog box to locate your saved trace
file. You can also set some advanced options on this screen by
clicking on the Advanced Options command button. These options
include specifying the maximum amount of disk space the recommended
indexes should use, and the maximum number of queries in your
workload file to sample. You specify which table or tables you want
to receive index recommendations for on the fourth screen. Selecting
only certain tables saves time, as well as helps you focus in on
specific problem areas. However, if you're looking for the ITW to
offer suggestions on how to optimize your database as a whole, you
probably want to select all the tables in your database. The next
screen in the ITW shows you the index recommendations based on the
criteria you provided (
see Figure 1 ). You have the
option of implementing the recommendations immediately, scheduling
the execution for later, or saving the execution script to an
external file.
| |
|
| Figure 1 |
Optimize With the Index Tuning Wizard |
The ITW neither offers
recommendations on primary keys and other unique indexes, nor
provides index recommendations on system tables. Other limitations
of the ITW include the inability to review more than 32,767 queries
in a given workload and the inability to offer suggestions for a
database created using SQL Server 6.5 or earlier. Note that the ITW
makes its recommendations by looking at a sample of your data. For
this reason, you might find that the ITW gives you different
recommendations if you run it more than once using the same
workload. If the ITW can't provide you with any recommendations, it
might be for either of two reasons. First, the ITW might not be able
to determine any index configuration offering improved performance
from the one you already have in place, or second, there might not
be enough data in the tables being sampled to ascertain an
appropriate recommendation.
View Indexes SQL
Server 2000 Enterprise Edition offers the ability to index views, in
addition to indexing tables. Say you're interested in looking at the
total units sold for each order in the Pubs database. This SQL code
creates a view in the Pubs database called Quantity_ Totals that
shows you that information along with a count of how many items were
contained in the order:
Use Pubs
GO
CREATE VIEW Quantity_Totals
with SCHEMABINDING
AS
SELECT ord_num, Total_Quantity =
Sum(qty), Total_Items = Count_Big(*)
FROM dbo.sales
GROUP BY ord_num
The COUNT_BIG function is
new in SQL Server 2000. It works the same way as the COUNT function;
the difference is its return value has a bigint data type, as
opposed to the int data type, which is the value COUNT returns. Any
view that includes a GROUP BY clause must also contain the COUNT_BIG
function to be eligible to contain an index. You must create a view
that contains an index using the SCHEMABINDING option, which is new
in SQL Server 2000. The view is bound to the schema of its
underlying tables when you specify this option.
The resultset returned by
the view isn't stored permanently in the database when a view
doesn't contain an index. You might have a view that references many
rows of data or involves complex processing, such as performing
aggregate grouping and multiple joins. The overhead required for SQL
Server to re-create the resultset for the view every time it's
referenced could be substantial.
View indexes operate in
much the same way as indexes on tables. Like tables, views can have
one clustered index and multiple nonclustered indexes. However, you
must create a unique, clustered index for a view first before you
have the option to create any nonclustered indexes. Creating a
clustered index on a view stores the resultset for the view in the
database permanently. Although the stored resultset reflects the
data at the time you created the index, any modifications to the
underlying data are reflected in the resultset automatically. Create
an index on a view using the CREATE INDEX statement as you would
create an index on a table:
CREATE UNIQUE CLUSTERED INDEX
PRIMARY_IDX on
Quantity_Totals(ord_num)
This code creates a unique
clustered index on the Quantity_Totals view created in the previous
example. In addition to improving the performance of views,
implementing view indexes can have unexpected performance increases
as well. Once you create and store an index for a view in the
database, the Query Optimizer might choose to use that index to help
boost the performance of other queries, even those that might not
reference the view in the FROM clause directly. For example, this
SQL statement selects the sum of all quantities sold grouped by
order number:
SELECT ord_num, Sum(qty)
FROM sales
GROUP BY ord_num
When this code is executed,
the Query Optimizer realizes that the summation for the quantities
exists in the index that SQL Server created for the Quantity_Totals
view. In this case, the Query Optimizer might determine that using
the index created for the view would result in the most efficient
execution of this query. You must first meet many terms and
conditions before you begin attempting to add indexes to your views
(
see Table
1
and Resources).
Fine-Tune Your Queries In addition to a database's physical design,
the structure of the queries you use to save and retrieve data can
also have a significant performance impact. Hardware problems such
as insufficient memory or disk space can contribute to slow queries.
However, poor query performance can be due to the Query Optimizer's
inability to take advantage of indexes or statistical
information.
|
|
| Tune-Up
Checklist
|
Statistical information
consists of information regarding the distribution of the data in a
column. The Query Optimizer uses statistical information to help
determine the optimal way to execute a query. SQL Server creates
statistical information for the column(s) in the index automatically
when you create an index on a column(s). By default, the Auto Create
Statistics database option in SQL Server is set to a value of True.
This causes SQL Server to create statistical information
automatically on additional columns in your tables as well. In
addition to creating statistical information, SQL Server updates
information periodically. How often it updates the information is
determined by how often the data in the column or index changes and
the volume of data in the column or index. Most often, SQL Server
does an efficient job of both creating and updating the statistical
information for your tables without any manual intervention.
However, sometimes you might find that the necessary statistical
information for your tables either fails to be created or doesn't
get updated often enough.
You can use the DBCC
SHOW_STATISTICS Transact-SQL (T-SQL) command to see what statistical
information exists currently for the columns in an index. This
command takes two parameters. The first parameter is the name of the
table you want to view the statistical information for; the second
is the index's name. Executing this line of SQL in the Query
Analyzer shows you the statistical information available for an
index in the authors table of the Pubs database called
aunmind:
DBCC SHOW_STATISTICS (authors, aunmind)
The information returned by
the DBCC SHOW_STATISTICS command includes the date and time SQL
Server last updated the statistical information and the number of
rows SQL Server sampled for the statistical information.
| |
|
| Figure 2 |
Work With Statistical Information. |
You can use the Query
Analyzer to view information graphically for columns that aren't
part of an index. To do this, open the Query Analyzer and select
Manage Statistics from the Tools menu. Select the database and table
or view for the statistical information you want to manage in the
Manage Statistics dialog box (
see
Figure 3 ). The Query Analyzer populates
any existing statistics in the listbox at the bottom of the screen.
Selecting an existing statistic and clicking on either the Delete or
Update command buttons allows you to remove or modify the
statistical information. The Create Statistic dialog box appears
when you click on the New command button, giving you a chance to
create a new statistic. When creating a new statistic, you have the
opportunity to select the column or columns to include, the
percentage of the data in these columns you want SQL Server to
sample to create the statistical information, and whether you want
SQL Server to update this statistic as needed automatically. It's a
good idea to allow SQL Server to perform the automatic updates,
because SQL Server usually does an efficient job of this and doing
it manually can become a cumbersome task. In addition to creating
and updating statistics using the graphical interface provided by
the Query Analyzer, you can also create or update statistics by
using the CREATE STATISTICS and UPDATE STATISTICS T-SQL commands.
View the Execution Plan One way to determine whether a query is taking
full advantage of indexes and statistical information is to view the
execution plan for a query in the Query Analyzer. Open the Query
Analyzer and load your query to do this. Select Show Execution Plan
from the Query menu and execute your query. Underneath the query
results you see a tab labeled Execution Plan. Click on this tab to
view the execution plan for the query. You can see a projected
execution plan without actually running the query. Selecting Display
Expected Execution Plan from the Query menu creates the diagram
illustrating the estimated execution plan for the query. This
diagram shows you what steps the Query Optimizer took, or will take,
to execute the query. You can see which steps take the longest and
are the most resource-intensive by viewing the details for each
step. You can also see whether the Query Optimizer is using indexes.
The Query Optimizer shows a table's caption in red if it comes
across a table with missing or outdated statistical information.
More information about the step appears in a popup window if you
hover over the table diagram with your mouse (
see
Figure 4 ). The SQL Profiler is a great
tool to help you figure out which queries are causing performance
problems and need to be analyzed further. The SQL Profiler can
identify slow queries and SQL statements by recording your server's
specific activities.
| |
|
| Figure 3 |
Spot Outdated Statistical Information. |
You have many things to
consider when working to optimize your database's performance.
Making sure you have the proper indexes and statistical information
alone isn't enough to guarantee your database is optimized fully.
However, it is an important step that can lead to a significant
improvement in your database's response time. You can fine-tune your
queries to take full advantage of indexes and statistical
information with SQL Server's new abilities and a little programming
time.
Mary V. Hooke is a contributing editor for Visual Basic
Programmer's Journal. In addition to writing for VBPJ,
Mary is the president of Distinctive Development Ltd., a firm that
develops custom Internet and intranet applications in the New York
City area. Reach Mary by e-mail at mailto:mhooke@DistinctiveDevelopment.com
|