Wednesday, February 15, 2012

SQL Server Best Practices (DOs, DONTs)

Introduction

Although every application is different, in general, the procedures which can be followed to resolve performance problems residing in or around SQL Server are generally the same.

This article suggests several approaches which can be taken to tackle such problems, although it is preferable to try and avoid them in the first place by following a series of best practices and by avoiding some common pitfalls. 

Best Practices
This section lists some best practices which should be applied when writing SQL, and also to ADO/ADO.NET code interfacing with databases. It is broken up into three sections:

Vital - Critical performance issues can result where these guidelines are disregarded. Occasionally there may be situations where these rules cannot be enforced, but these must be viewed as "rare exceptions".

Important - Implementing these guidelines can help avoid or alleviate serious performance issues.

Recommended - Although not always relevant or practical, it is often worth evaluating these guidelines.

Brief justification is given for each rule, however, the underlying issues are often complex and they should not be viewed as conclusive.

Vital
  • DO NOT use "inline" SQL Statements – use Stored Procedures in all points of contact between the database and application layer(s). Stored Procedures have compiled and cached execution plans.
  • DO NOT use dynamic SQL statements. Dynamic SQL must be compiled every time it is executed, wasting resources (and Stored Procedures containing dynamic SQL will also recompile, which can cause object-level locking issues).
  • DO NOT use "Table Valued" functions. These will always process the entire query contained within, and generate extra overhead through table variable population – often much data is discarded upon joining to the results of a function.
  • DO NOT use Cursors against "permanent" tables. The locking strategies employed by cursors can cause rapid lock escalation and some severe contention issues. Use the "while loop" construct, or cursors based on temporary objects instead.
  • DO consider indexing strategy carefully when creating or amending schema. Poorly indexed tables are inefficient – the query optimizer will not be able to efficiently retrieve the rows that it requires.
  • DO include a "WHERE" clause. This is refered to as "SARGability" – Search ARGuments allow SQL Server to prune the result set at the earliest opportunity.
  • DO use table variables instead of standard temporary tables. Temporary tables can cause recompiles due to statistical updates and context switches, leading to problems with object-level locking. Table variables are also stored in heap memory when possible.
  • DO NOT drop temporary tables explicitly. SQL Server will automatically clean these up when they fall "out of scope" (e.g. execution chain has completed). An explicit DROP TABLE operation on a temporary table will cause a recompile event to occur.
  • DO set the "NOCOUNT" option on for all stored procedures which do not need to be aware of the number of records affected. This prevents SQL Server returning extra statistics to the client.
  • DO use disconnected ADO record set objects. Retaining an open record set can cause ADO to run through cursors on the SQL Server when retrieving data.
  • DO close all connection and record set objects as soon as they are no longer required. Retaining an unnecessary open connection can cause errors (only one record set can be open for one connection at a time in some modes of operation), and can also tie up valuable system resources.
  • DO NOT use the OR operator in JOIN predicates and WHERE clauses. This operator can cause poor index use (or table scans), as SQL attempts to match up and merge alternative sets of data. Consider use of UNION [ALL] if this sort of behaviour is definitely required.
  • DO NOT use the IN operator. Make use of "EXISTS" instead, as this allows SQL Server to look up the data in the condition clause, rather than forcing it to spool through a set of values. This is particularly relevant when the clause is a co-related sub-query.
  • DO NOT use distributed transactions unless there is no other option. Locking data in a local (explicit) transaction has serious enough implications, without extending the locking to another database which is physically separated from the current context.
Important
  • DO NOT use scalar functions in queries which process a large number of rows. Due to the way SQL Server processes this sort of operation, long query durations can result (although typically without heavy resource use – a common cause of slow running queries).
  • DO NOT use sub-queries. The entire sub-query result set is often evaluated, even if many of the rows are eliminated immediately by the way the sub-query is joined to the main query.
  • DO prefix the identifiers of all objects with the name of their owner to ensure SQL Server immediately selects the correct context. E.g. Prefix all Stored procedures, Tables, Functions and Views owned by the database owner with "dbo".
  • DO NOT use LEFT JOIN if possible – due to the nature of this operation, SQL Server is often unable to make efficient use of the available indexes. Consider using UNION [ALL] and two INNER JOINed queries instead.
  • DO NOT use the DISTINCT operator. In many cases duplicate rows can be eliminated through restructuring of a query – there are very few circumstances where DISTINCT is actually necessary.
  • DO use UNION ALL when possible, instead of UNION – adding the modifier "ALL" prevents SQL Server from performing a DISTINCT operation on the data sets during merging.
  • DO minimise the length of any explicit transactions. The longer a transaction, and the more objects involved, the worse the impact which it will have on concurrency.
  • DO reference indexed columns in join predicates and where clauses. This will allow SQL Server to leverage indexes and minimise the size of the data set being processed earlier rather than later.
  • DO NOT apply the ORDER BY clause to a set of data unless it is absolutely required (e.g. for front-end presentation). Never sort data into a table where the clustered index sequence differs from the ORDER BY. Sorting data is expensive, and sorting against the clustered index is pointless as SQL Server will automatically re-sequence the data upon insertion.
  • DO NOT use triggers to maintain data in regularly accessed tables. While useful for change-logging, firing triggers at (or from) busy tables can cause serious contention issues.
Recommended
  • DO use table variables to break up large queries, avoid repetition of work throughout a Stored Procedure and minimize contention through locking. Preparing data ahead of time can allow you to complete ACID safe work in a transaction without taking out locks on source data.
  • DO use locking hints to avoid lock escalation when the affected record count is known to be small, or to ensure other scenarios (dirty reads, retained locks, etc). Be careful though, as inappropriate use of locking hints can cause excessive contention (or insufficient contention, resulting in an error when data being read with NOLOCK is modified during the read operation).
  • DO NOT join together datasets where the clustered index contains similar columns in a different order; this can cause a bookmark lookup or heavy sort operation to occur. Remember that clustered indexes define the order of the data in a table.
  • DO keep code as simple as possible. Several short, simple SQL statements will almost always operate more efficiently than one huge query. This also improves maintainability.
  • DO NOT use RPC unless there is no other option. While useful, RPC is phenomenally expensive when compared to a normal query, because both the query execution request and result set (even if only return value or output parameters) are crossing a physical network boundary.

Tuesday, February 7, 2012

Introduction to SQL Server Profiler

Introduction

Microsoft SQL Server Profiler is a graphical user interface to SQL Trace for monitoring T-SQL Statements of Database Engine. We can save and reuse the state at a later point of time.
  • We can do the following using SQL Server Profiler
    • Create a trace
    • Watch the trace results as the trace runs
    • Store the trace results in a table
    • Start, stop, pause, and modify the trace results as necessary
    • Replay the trace results
  • Use SQL Server Profiler to monitor only the events in which you are interested.
Menu Path: Start | All Programs | Microsoft SQL Server 2005 | Performance Tools | SQL Server Profiler.
The following screen will come:

Screenshot - pic1.jpg
Figure 1.0
Click on <Connect> Button. New Properties Screen will come:
Screenshot - properties.jpg
Figure 1.1
It has two selection tabs:
  • General: It is used for general setting for Trace Database Engine.
  • Event: It is used to add or remove some selected event for monitor.
In General Section (as given in Figure 1.1), it is divided into four sections.
Section 1: In this section, you have to just specify the name of your trace, Trace provider name and server name are predefined and based upon your SQL Server.
And it is not editable.
Section 2: It is the template section. You can choose different type of Templates based upon your requirements. It is the configuration for trace. By default, it is "Standard (Default)" templates. Others templates are T-SQL, T-SQL Duration, T-SQL Reply, T-SQL SPs, etc. You can create your own custom Templates by selecting different Events and Event Class. It is saved as ".tdf" Extension.
Section 3: This section is related to save your trace. Either as File (.trc) or in a database. as table. While clicking on Save to file check box, File save dialog box should open and you can save that file (with .trc extension).
If you check the "Save to Table", it will connect with your server and ask you to which database you want to save that trace table information.

Screenshot - savetotable.jpg
Figure 1.2
Section 4: You can stop your trace on a particular time. Check the "Enable trace stop time" checkbox and give the time at which you want to stop track, SQL Server will automatically stop trace on that time.
Now Move To "Event Section" Tab.
Now we need to know some definition with respect to SQL Server Profiler.

What is an Event?

An Event is an action or operation that is performed in your SQL Server 2005 Database Engine.
Some examples of Events are:
    • Transact-SQL SELECT, INSERT, UPDATE, and DELETE statements.
    • User login and logout
    • Execution of Stored procedures
    • Operation with cursor
SQL Server profiler is able to trace all of these events and all these events are categories on a particular Event class.

What is an Event Class?

Event class is a type of event that can be traced.
Some examples are:
  • SQL: BatchCompleted
  • SQL: Batch Starting
  • Audit Login
  • Audit Logout
  • Lock: Acquired
  • Lock: Released
Now you can select events from this screen:

Screenshot - events.jpg
Figure 1.3 
In section 1, we can select the proper check box based upon our requirement, section 2 will display the details of Selected events and Events class. If you check in the check box of section 3, you will get all the list of Events and Columns in Section 1.
Section 4 is something like customization. Just click on the "Column Filter Button". In this section, you can specify some condition (like or Not like).

Screenshot - Filter1.jpg
Figure 1.4 
By clicking on "Organize Column" button, you can change the sequence of order of selected events.
Now Click on the "Run" Button, then Trace window will come:
Screenshot - trace1.jpg
Screenshot - trace2.jpg
Figure 1.5
Using these windows, you will get the detailed time duration of a query and all other events information that you have selected.
You can save this result and use it in future. Or you can extract a particular query from the trace, just right click and click on "Extract Event Data". And save this as a SQL Script.

Reply in SQL Server Profiler

SQL Server profiler has a Reply facility which has the ability to save a trace and replay it later.
Replay is useful to troubleshoot an application. Trace replay supports debugging by using Toggle Breakpoint and the Run to Cursor options on the SQL Server Profiler Replay menu.
Anything changed in SQL Server Management Studio will be traced by the SQL Profiler. So it can basically be used for database performance check. We also have "SQL Server Performance Monitor" to monitor the System and Server performance too.