DIAL-UP
HIGHSPEED
ISDN
HOSTING
UNIX
Windows
Lotus
Dedicated
Personal Web Space
WEB AUTHORING
SOFTWARE
SPAM
EXTERNAL LINKS
Magma.ca
Webmail
myaccount.magma.ca


SQL Tips

Note to the reader:

The following tips are only a small selection of what to do when "physically" building your database. There are many other performance related issues that a designer must be aware of, but are beyond the scope of this document.

DATABASE DESIGN

Before building a database related product, special care must be given to the core logical design of the database (i.e. using proper relationships with normalization techniques). Bad physical database design generally results in poor performance.

The following points apply mostly to the physical implementation of the database; nonetheless, following and understanding these will help in any situation or state of your database.

Use 'create [dbo].tblName()' and 'create procedure [dbo].usp_storedProcName'

By creating table and stored procedure objects owned by dbo instead of the database user you save a permission lookup for every transaction. To do so make sure you specify [dbo.] when creating every table and stored procedure.

Name all stored procedures with a prefix of usp_ instead of sp_

When the SQL Server looks for a stored procedure starting with sp_ it will look through the system built-in stored procedures (through the master db) first before looking in the current db. This is a costly lookup that can easily be avoided by naming your stored procedure with usp_.

Analyze your SQL statements with SQL Query Analyzer to maximize query efficiency

The SQL Query Analyzer is a great tool for determining the efficiency of your queries, and if your tables are indexed properly. The most useful feature is the option to display the 'Estimated Execution Plan' and 'Show Execution Plan' . The Estimated Execution Plan tool will perform the analysis of the execution plan of your query without executing it. Once you are done fine tuning you can use the Show Execution Plan tool to give you the actual plan that it is using. These two tools can be found in the Query menu. Another great feature of Query Analyzer is the ability to manage all indexes for your database. Found under the 'Tools' menu you can click on 'Manage Indexes' and here you can select which table's indexes you wish to administer.

Aside from the execution plan analyzer the Query Analyzer can also display various performance-related information when tuning queries. The options that you can set are listed below:

SET SHOWPLAN_TEXT ON: Returns estimated (not actual, as the query is not run) detailed information on how the query will run.

SET SHOWPLAN_ALL ON: Returns estimated detailed information on how the query will run, plus additional information, such as the estimated number of rows, I/O, CPU, and the average size of a the query.

SET STATISTICS IO ON: Shows the number of scans, logical reads, and physical reads performed during the query. Returns actual data based on a query that has run.

SET STATISTICS TIME ON: Shows the amount of time (in milliseconds) needed to parse, compile, and execute a query. Returns actual data based on a query that has run.

SET STATISTICS PROFILE ON: Shows a recordset that represents a profile of the query execution. Returns actual data based on a query that has run.

Know when and how to use indexes on tables

Indexes should be considered on all columns frequently used by WHERE, ORDER BY, GROUP BY, TOP, DISTINCT clauses, and in the MIN() and MAX() functions. However, too many indexes can be as bad as too few. Understanding how indexes are used by the engine is crucial.

There are two type of indexes: clustered and non-clustered. A clustered index forces the rows to be stored in an ordered fashion. This makes for quick data retrieval but more costly on inserting. A non-clustered index is less effective but still very useful. It is best for queries that return few rows (including just one) and where the index has good selectivity (at least 95%). Non-clustered indexes are stored separately from the table.

The best way to determine whether an index is truly improving the performance of a query is to run the query through Query Analyzer (using the "Show Execution Plan" option). This will show the performance cost of the query.

Use a proper Fill Factor in your indices to help maintain good performance when there are numerous inserts

The fill factor leaves empty space in your indexes so when you insert more rows throughout the day no page splits are required. This permits faster inserts and the index will remain optimized. However, fill factors can be harmful to query efficiency. If a chosen Fill Factor is too large then the time spent to traverse the fill will hamper lookup times within the index. The best thing to do is consider the number of inserts versus the number of selects and find a good medium.

Fragmentation

As with any system involving a great deal of data insertion and deletion fragmentation will occur and will eventually hamper performance. This is the situation with an index. If your tables get a large number of inserts and deletions you should set up a maintenance schedule that will involve removing the fragmentation on all indices. The specific command to use is DBCC DBREINDEX.

Use proper data types and lengths of columns to store your data

Knowing how to choose the proper data types and lengths is critical in the construction of your tables. Here are some good practices:

  • in general, always specify the narrowest columns you can. Smaller column sizes means the server will be able to store less data and the database I/O will be much faster
  • if you need to store large strings of data (< 8000 chars) use a VARCHAR data type instead of a TEXT data type. The TEXT data type involves a pointer being stored in the table and there will always be an overhead of having to follow that pointer to retrieve the data
  • don't use NCHAR or NVARCHAR types unless you absolutely need to store 16 bit character (Unicode) data and will take up twice as much space as CHAR and VARCHAR types, hence increasing I/O
  • if the text data in a column varies a great deal in length consider using the VARCHAR type instead of CHAR. Although VARCHAR has more overhead than CHAR the amount of space saved can greatly reduce I/O
  • on the other hand, if the text data in a column doesn't vary a great deal, use the CHAR data type to eliminate the overhead that VARCHAR entails. Remember to choose the smallest possible size for the CHAR typed column
  • choose proper numerical data types. For example, if you are going to be storing the number 1 through 10 in a column a data type of TINYINT is much more appropriate than an INT
  • don't use FLOAT or REAL data types for primary keys to avoid the unnecessary overhead
  • always specify NULL or NOT NULL for every column. For best performance, and to reduce bugs, columns should ideally be set to NOT NULL
Use Views to simplify your life

A View is a feature allowing you to simplify your queries by using a virtual table which represents the data from the joined tables in a much more manageable fashion, instead of joining many at once in every query. Views will not enhance the efficiency of your queries as no execution plans are generated and stored, but they will enhance development time. Try to avoid nesting Views as it becomes more difficult to troubleshoot problems. The best way to implement a view is to incorporate stored procedures (which are already highly optimized).

Don't use Diagrams

This is a very nice graphical feature that Microsoft added to its SQL Server but it is a major drain of server resources when it renders the table relationships. If you wish to use Diagrams, please use them on your own development server.

QUERY DESIGN

So far we have discussed performance tips relating to the physical design of the database. Now we can discuss a few of the more important performance issues relating to how queries are developed.

Always include a WHERE clause in your SELECT statement

Without a WHERE clause the SQL server will perform a table scan of the entire table and return all rows. In some cases you will want all rows returned, so a WHERE clause is not appropriate, but in general always include one.

Avoid using SQL Server cursors

In general, cursors use a lot of server resources and reduce the scalability of your applications. If you need row-by-row operations, you might want to consider handling them client-side, or by using temporary tables, derived tables, a sub-query, or by a CASE statement

If using UNION, consider UNION ALL when duplicate rows will never exist

UNION does the equivalent to SELECT DISTINCT on the joined tables' result set. If you know there will not be any duplication of rows created by the UNION, then using the UNION ALL statement will never look for duplicate rows, hence the query will run much faster than UNION.

Carefully evaluate whether the DISTINCT clause is needed

The DISTINCT clause slows down virtually every query that contains it. Where duplicate results are not an issue you should not include it

In queries do not return column data that you do not need

For example you should never use SELECT * FROM tableName if you aren't going to be using the data from every column from tableName

Use EXISTS instead of IN

If you have the choice between using the IN or the EXISTS clause in your statement, you will generally want to use the EXISTS clause as it is more efficient and will perform faster

Instead of temporary tables, use derived tables instead

A derived table is the result of using a SELECT statement in the FROM clause of an existing select statement. Derived tables, compared to temporary tables, can dramatically reduce I/O and improve performance due to existing entirely in memory and generally having a tighter result set to select from than a physical table.

If your application uses many wildcard (LIKE %) text searches, consider a Full Text Index

SQL Server's full-text search option can significantly speed up wildcard searches of text data stored in your tables. Full-text indexes also provide a great deal of functionality with regards to how text can be searched for.