SQL World: September 2011

Friday, September 16, 2011

10 reasons why SQL Server 2008 is going to rock

10. Plug-in model for SSMS. SSMS 2005 also had a plug-in model, but it was not published, so the few developers that braved that environment were flying blind. Apparently for 2008, the plug-in model will be published and a thousand add-ins will bloom.

9. Inline variable assignment. I often wondered why, as a language, SQL languishes behind the times. I mean, it has barely any modern syntactic sugar. Well, in this version, they are at least scratching the the tip of the iceberg.

Instead of:

DECLARE @myVar int

SET @myVar = 5

you can do it in one line:

DECLARE @myVar int = 5

Sweet.

8. C like math syntax. SET @i += 5. Enough said. They finally let a C# developer on the SQL team.

7. Auditing. It's a 10 dollar word for storing changes to your data for later review, debugging or in response to regulatory laws. It's a thankless and a mundane task and no one is ever excited by the prospect of writing triggers to handle it. SQL Server 2008 introduces automatic auditing, so we can now check one thing off our to do list.

6. Compression. You may think that this feature is a waste of time, but it's not what it sounds like. The release will offer row-level and page-level compression. The compression mostly takes place on the metadata. For instance, page compression will store common data for affected rows in a single place.

The metadata storage for variable length fields is going to be completely crazy: they are pushing things into bits (instead of bytes). For instance, length of the varchar will be stored in 3 bits.

Anyway, I don't really care about space savings - storage is cheap. What I do care about is that the feature promised (key word here "promises") to reduce I/O and RAM utilization, while increasing CPU utilization. Every single performance problem I ever dealt with had to do with I/O overloading. Will see how this plays out. I am skeptical until I see some real world production benchmarks.

5. Filtered Indexes. This is another feature that sounds great - will have to see how it plays out. Anyway, it allows you to create an index while specifying what rows are not to be in the index. For example, index all rows where Status != null. Theoretically, it'll get rid of all the dead weight in the index, allowing for faster queries.

4. Resource governor. All I can say is FINALLY. Sybase has had it since version 12 (that's last millennium, people). Basically it allows the DBA to specify how much resources (e.g. CPU/RAM) each user is entitled to. At the very least, it'll prevent people, with sparse SQL knowledge from shooting off a query with a Cartesian product and bringing down the box.

Actually Sybase is still ahead of MS on this feature. Its ASE server allows you to prioritize one user over another - a feature that I found immensely useful.

3. Plan freezing. This is a solution to my personal pet peeve. Sometimes SQL Server decides to change its plan on you (in response to data changes, etc...). If you've achieved your optimal query plan, now you can stick with it. Yeah, I know, hints are evil, but there are situations when you want to take a hammer to SQL Server - well, this is the chill pill.

2. Processing of delimited strings. This is awesome and I could have used this feature...well, always. Currently, we pass in delimited strings in the following manner:

exec sp_MySproc 'murphy,35;galen,31;samuels,27;colton,42'

Then the stored proc needs to parse the string into a usable form - a mindless task.

In 2008, Microsoft introduced Table Value Parameters (TVP).

CREATE TYPE PeepsType AS TABLE (Name varchar(20), Age int)

DECLARE @myPeeps PeepsType

INSERT @myPeeps SELECT 'murphy', 35

INSERT @myPeeps SELECT 'galen', 31

INSERT @myPeeps SELECT 'samuels', 27

INSERT @myPeeps SELECT 'colton', 42

exec sp_MySproc2 @myPeeps

And the sproc would look like this:

CREATE PROCEDURE sp_MySproc2(@myPeeps PeepsType READONLY) ...

The advantage here is that you can treat the Table Type as a regular table, use it in joins, etc. Say goodbye to all those string parsing routines.

1. Intellisense in the SQL Server Management Studio (SSMS). This has been previously possible in SQL Server 2000 and 2005 with use of 3rd party add-ins like SQL Prompt ($195). But these tools are a horrible hack at best (e.g. they hook into the editor window and try to interpret what the application is doing).

Built-in intellisense is huge - it means new people can easily learn the database schema as they go.

Built In Functions

Built In Functions

As in other languages like C and C++, SQL also provides Built in functions. Built in functions in SQL are classified into following two categories.

1. Scalar Functions

2. Aggregate Functions

3. Ranking Functions

I. Scalar Functions : The built in functions in SQL Server that work on a single value are called as scalar functions. Scalar functions are again classified into following categories based on data type on which they work.

a) Numeric Functions

b) String Functions

c) Date Functions

d) Conversion Functions

e) System Functions

a) Numeric Functions : The scalar functions of SQL that work on numeric type of data are called as numeric functions. Numeric functions in SQL are as follows.

Abs(n) : Returns absolute value of the given n i.e. n value without any sign.
Cos(n) : Returns cosine value of given n in radians.
Ceiling(n) : Returns smallest integer greater than or equal to n.
Floor(n) : Returns Largest integer less than or equal to n.
Radians(n) : Converts the given n in degrees to radians.
Degrees(n) : Converts the given n in radians to degrees.
Exp(n) : Returns e raised to the power of n. e is exponential whose value is constant and is 2.713
Log(n) : Returns natural logarithm of n. i.e. base e logarithm.
Log10(n) : Returns base 10 logarithm of n.
Power(m,n) : Returns m raised to the power of n.
Square(n) : Returns Square of given n.

Sqrt(n) : Returns Square Root of given n.
Sin(n) : Returns sine value of given n in radians.
Tan(n) : Returns tangent value of given n in radians.
Sign(n) : Return -1 if n is negative, 1 if n is positive and 0 if n is zero.
Pi() : Return mathematical constant pi value i.e. 3.14
Round(n,p,t) : Rounds the given in n to p decimal places. If t is specified as 1 then it truncates n to p decimal places instead of round the value.

b) String Functions : The built-in functions of SQL that work on string type of data are called as string functions. String functions in SQL are as follows.

ASCII(char) : Converts the specified character to the equivalent ascii code.
Char(int) : Converts ascii code to the equivalent character.
CharIndex(s1,s2) : returns the starting position where the partial string s1 first occurs in the string s2. Returns zero if s1 does not occur in s2.
Lower(S1) : Converts all uppercase letters of the string s1 to lower case letters.
Upper(s1) : Converts all lowercase letters of the string s1 to upper case letters.
Ltrim(s1) : removes leading blank spaces in the string s1.
Rtrim(S1) : Removes trailing blank spaces in the string s1.
Reverse(S1) : Reverses the string S1.
Right(S1,Len) : returns last Len characters from the string S1.
Left(S1,Len) : returns first len characters from the string S1.
Soundex(a) : Returns how the given string is pronounced in a four character soundex code.
Stuff(s1,a,len,s2) : Replaces the partial string s1 with the partial string s2 starting at position a, replacing len characters of s1.
Substring(S1,a,len) : creates a partial string from string s1 starting at the position a with a length of len.

c) Date Functions : The built-in functions of SQL that work on date type of data are called as date functions and date functions in SQL server are as follows.

GetDate() : Returns the current system date and time.
DatePart(item,D) : returns the specified part item of a date D as an integer.
DateName(item,D) : returns the specified part item of the date D as a string.
DateDiff(item,D1,D2) : calculates the difference between the two date parts D1 and D2 and returns the result as an integer in units specified by the value item.
DateAdd(item,n,D) : Adds the number n of units specified by the value item to the given date D.

d) Conversion Functions : The Built-In functions of SQL Server that are used to convert one type of data to another are called as conversion functions and conversion functions in SQL Server are as follows.

Cast(a as type[(length)]) : converts an expression a into the specified data type type.
Convert(type[(length}],a) : Equivalent to cast, but the arguments are specified differently.

e) System Functions : The SQL Server system functions provide extensive information about database objects.

Object_ID(objname) : returns the identifier of the database object objname.
Object_name(obj_id) : returns the name of the database object with id obj_id.
Col_Name(Obj_Id,S_Id) : returns the name of the column belonging to the object obj_id with the identifier s_id.
Col_Length(obj,Col) : returns the length of the column belonging to a database object obj.
Datalength(s1) : calculates the length of the result of the expression s1.
DB_Id(Db_name) : returns the id of the database db_name.
DB_Name(Db_id) : returns the name of the database with id Db_id.
User_Id(username) : returns the id of specified user name.
user_name(uid) : returns the name of the user with id uid.
Current_TimeStamp : returns the current date and time.

II. Ranking functions

1. Row_number: Displays row numbers

Select row_number() over (order by orderdate desc) as rownum,

Salesorder_id,customer_id,orderdate

From Sales.SalesOrderHeader

Order by OrderDate Desc

Select row_number() over (order by orderdate desc) as rownum,

Salesorder_id,customer_id,orderdate

From Sales.SalesOrderHeader

Where rownum between 101 and 120

Order by OrderDate Desc

Create proc info1 @a int, @b int

Select * from (select row_number() over (order by orderdate desc) as rownum,

Salesorderid,customerid,orderdate,from sales.salesorderheader) 0

Where rownum between @a and @a+@b-1

Order by orderdate desc

Other Ranking functions

Select row_number() over (order by city) as rownum,

Rank() over( order by city) as rank,

Dense_Rank() over (order by city as DenseRank,

Ntile(3) over( order by city) as ntile_3,

Ntile(4) over(order by city) as ntile_4,

City, stateprovincecode

From person.address as address

Inner join person.stateprovince as stateprov

On address.stateprovinceid=stateprov.stateprovinceID where countryregioncode=‘US’ and StateprovinceCode=‘AZ’

Thursday, September 8, 2011

Cloud Computing

1 Cloud Computing Deployment Models

2 What does a Shift Towards Cloud Computing Mean?

3 References

WIKI Cloud computing is a technology that uses the internet and central remote servers to maintain data and applications. Cloud computing allows consumers and businesses to use applications without installation and access their personal files at any computer with internet access. This technology allows for much more efficient computing by centralizing storage, memory, processing and bandwidth.

A simple example of cloud computing is Yahoo email or Gmail etc. You dont need a software or a server to use them. All a consumer would need is just an internet connection and you can start sending emails. The server and email management software is all on the cloud ( internet) and is totally managed by the cloud service provider Yahoo , Google etc. The consumer gets to use the software alone and enjoy the benefits. The analogy is , 'If you only need milk , would you buy a cow ?' All the users or consumers need is to get the benefits of using the software or hardware of the computer like sending emails etc. Just to get this benefit (milk) why should a consumer buy a (cow) software /hardware ?

Cloud computing is broken down into three segments: "applications," "platforms," and "infrastructure." Each segment serves a different purpose and offers different products for businesses and individuals around the world. In June 2009, a study conducted by VersionOne found that 41% of senior IT professionals actually don't know what cloud computing is and two-thirds of senior finance professionals are confused by the concept,^[1] highlighting the young nature of the technology. In Sept 2009, an Aberdeen Group study found that disciplined companies achieved on average an 18% reduction in their IT budget from cloud computing and a 16% reduction in data center power costs.^[2]

Cloud Computing Deployment Models

Public Cloud

Public cloud or external cloud describes cloud computing in the traditional mainstream sense, whereby resources are dynamically provisioned on a fine-grained, self-service basis over the Internet, via web applications/web services, from an off-site third-party provider who bills on a fine-grained utility computing basis. In simple, For use by multiple organizations on a shared basis and hosted and managed by a third party service provider.it is very useful. Its free of cost.

Talk to a broker of cloud solutions, Get independent advice here: http://www.dcx.com.au

Community Cloud

A community cloud may be established where several organizations have similar requirements and seek to share infrastructure so as to realize some of the benefits of cloud computing. The costs are spread over fewer users than a public cloud (but more than a single tenant). This option offers a higher level of privacy, security, and/or policy compliance. In addition, it can be economically attractive as the resources (storage, workstations) utilized and shared in the community are already exploited and have reached their return of investment. Examples of community clouds include Google's Web App hosting service.

It was dark when I woke. This is a ray of sunshine.

Um, are you really just giving this info out for nothing?

What does a Shift Towards Cloud Computing Mean?

A paradigm shift to cloud computing will affect many different sub-categories in computer industry such as software companies, internet service providers (ISPs) and hardware manufacturers. While it is relatively easy to see how the main software and internet companies will be affected by such a shift, it is more difficult to predict how companies in the internet and hardware sectors will be affected.

Who Gains?

Consulting/Software/Hardware and Services companies that could gain from a shift towards cloud computing include:

Software Producers that could gain from a shift towards cloud computing include:

Internet-based companies that could gain from a shift towards cloud computing include:

Internet Infrastructure providers that could gain from a shift towards cloud computing include:

Consulting companies that could gain from a shift towards cloud computing include:

Healthcare providers that have embarked their solutions onto the clouds include:

eKlinik Healthcare Cloud Computing

Who Loses Out?

Traditional software producers that could have some catching up to do if cloud computing ultimately wins out include:

Monday, September 5, 2011

When we will go for clustered and Non-clustered Indexes

1. Indexes.

Microsoft SQL Server index is a structure associated with a table that speeds retrieval of the rows in the table. An index contains keys built from one or more columns in the table. These keys are stored in a structure that allows SQL Server to find the row or rows associated with the key values quickly and efficiently.

If a table is created with no indexes, the data rows are not stored in any particular order. This structure is called a heap.

The two types of SQL Server indexes are:

Clustered

Clustered indexes sort and store the data rows in the table based on their key values. Because the data rows are stored in sorted order on the clustered index key, clustered indexes are efficient for finding rows. There can only be one clustered index per table, because the data rows themselves can only be sorted in one order. The data rows themselves form the lowest level of the clustered index.

The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. If a table has no clustered index, its data rows are stored in a heap.

Nonclustered

Nonclustered indexes have a structure that is completely separate from the data rows. The lowest rows of a nonclustered index contain the nonclustered index key values and each key value entry has pointers to the data rows containing the key value. The data rows are not stored in order based on the nonclustered key.

The pointer from an index row in a nonclustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or are clustered. For a heap, a row locator is a pointer to the row. For a table with a clustered index, the row locator is the clustered index key

2. When we will go for clustered and Non-clustered Indexes.

Indexes assist when a query:

· Searches for rows that match a specific search key value (an exact match query). An exact match comparison is one in which the query uses the WHERE statement to specify a column entry with a given value. For example:

WHERE emp_id = 'VPA30890F'

· Searches for rows with search key values in a range of values (a range query). A range query is one in which the query specifies any entry whose value is between two values. For example:

WHERE job_lvl BETWEEN 9 and 12

or,

WHERE job_lvl >= 9 and job_lvl <= 12

· Searches for rows in a table T1 that match, based on a join predicate, a row in another table T2 (an index nested loops join).

· Produces sorted query output without an explicit sort operation, in particular for sorted dynamic cursors.

· Scans rows in a sorted order to permit an order-based operation, such as merge join and stream aggregation, without an explicit sort operation.

· Scans all rows in a table with better performance than a table scan, due to the reduced column set and overall data volume to be scanned (a covering index for the query at hand).

· Searches for duplicates of new search key values in insert and update operations, to enforce PRIMARY KEY and UNIQUE constraints.

· Searches for matching rows between two tables for which a FOREIGN KEY constraint is defined.

Queries using LIKE comparisons can benefit from an index if the pattern starts with a specific character string, for example 'abc%', but not if the pattern starts with a wildcard search, for example '%xyz'.

Consider using nonclustered indexes for:

· Columns that contain a high number of distinct values, such as a combination of last name and first name (if a clustered index is used for other columns). If there are very few distinct values, such as only 1 and 0, no index should be created.

· Queries that do not return large result sets.

· Columns frequently involved in search conditions of a query (WHERE clause) that return exact matches.

· Decision Support System applications for which joins and grouping are frequently required. Create multiple nonclustered indexes on columns involved in join and grouping operations, and a clustered index on any foreign key columns.

· Covered queries

Consider using a clustered index for:

· Columns that contain a limited number of distinct values, such as a state column that contains only 50 distinct state codes. However, if there are very few distinct values, such as only 1 and 0, no index should be created.

· Queries that return a range of values using operators such as BETWEEN, >, >=, <, and <=.

· Columns that are accessed sequentially.

· Queries that return large result sets.

· Columns that are frequently accessed by queries involving join or GROUP BY clauses; typically these are foreign key columns. An index on the column(s) specified in the ORDER BY or GROUP BY clause eliminates the need for SQL Server to sort the data because the rows are already sorted. This improves query performance.

· OLTP-type applications where very fast single row lookup is required, typically by means of the primary key. Create a clustered index on the primary key.

Clustered indexes are not a good choice for:

· Columns that undergo frequent changes because this results in the entire row moving (because SQL Server must keep the row’s data values in physical order). This is an important consideration in high-volume transaction processing systems where data tends to be volatile.

· Covered queries. The more columns within the search key, the greater the chance for the data in the indexed column to change, resulting in additional I/O.

Index Optimization tips

Every index increases the time in takes to perform INSERTS, UPDATES and DELETES, so the number of indexes should not be very much. Try to use maximum 4-5 indexes on one table, not more. If you have read-only table, then the number of indexes may be increased.
Keep your indexes as narrow as possible. This reduces the size of the index and reduces the number of reads required to read the index.
Try to create indexes on columns that have integer values rather than character values.
If you create a composite (multi-column) index, the orders of the columns in the key are very important. Try to order the columns in the key as to enhance selectivity, with the most selective columns to the left most of the key.
If you want to join several tables, try to create surrogate integer keys for this purpose and create indexes on their columns.
Create surrogate integer primary key (identity for example) if your table will not have many insert operations.
Clustered indexes are more preferable than no clustered, if you need to select by a range of values or you need to sort results set with GROUP BY or ORDER BY.
If your application will be performing the same query over and over on the same table, consider creating a covering index on the table.
You can use the SQL Server Profiler Create Trace Wizard with "Identify Scans of Large Tables" trace to determine which tables in your database may need indexes. This trace will show which tables are being scanned by queries instead of using an index.
You can use sp_MSforeachtable undocumented stored procedure to rebuild all indexes in your database. Try to schedule it to execute during CPU idle time and slow production periods.
sp_MSforeachtable @command1="print '?' DBCC DBREINDEX ('?')"

SQL World

Pages