SqlServer-Query-Optimization-Tips

SqlServer Query Optimization Tips


1.     Tip 1: Always use WHERE Clause in SELECT Queries while we don’t need all the rows to be     returned. This will help to narrow the return rows else it will perform a whole table scan and waste the Sql server resources with increasing the network traffic. While scanning the whole it will lock the Table which may prevent other users to access the table.
Tip 2: It is seen many times developers use codes like  
SELECT * FROM OrderTable WHERE LOWER(UserName)='telsa'
Instead of writing it like the below
SELECT * FROM OrderTable WHERE UserName='telsa'
Infact both the queries does the same work but the 2nd one is better and retrieves rows more speedly than the first query. Because Sql Server is not case sensitive
Tip 3: While running a query, the operators used with the WHERE clause directly affect the performance. The operators shown below are in their decreasing order of their performance.
1.     =
2.     >,>=,<, <=
3.     LIKE
4.     <> 
Tip 4 : When we are writing queries containing NOT IN, then this is going to offer poor performance as the optimizer need to use nested table scan to perform this activity. This can be avoided by using EXISTS or NOT EXISTS.
When there is a choice to use IN or EXIST, we should go with EXIST clause for better performance.
Tip 5: It is always best practice to use the Index seek while the columns are covered by an index, this will force the Query Optimizer to use the index while using IN or OR clauses as a part of our WHERE clause. 
SELECT * FROM OrderTable WHERE Status = 1 AND OrderID IN (406,530,956)
Takes more time than 
SELECT * FROM OrderTable (INDEX=IX_OrderID) WHERE Status = 1 AND OrderID IN (406,530,956)
Tip 6: While we use IN, in the sql query it better to use one or more leading characters in the clause instead of using the wildcard character at the starting.
SELECT * FROM CustomerTable WHERE CustomerName LIKE 'm%'
SELECT * FROM CustomerTable WHERE CustomerName LIKE '%m'
In the first query the Query optimizer is having the ability to use an index to perform the query and there by reducing the load on sql server. But in the second query, no suitable index can be created while running the query.
Tip 7: While there is case to use IN or BETWEEN clause in the query, it is always advisable to use BETWEEN for better result.
SELECT * FROM CustomerTable WHERE CustomerID BETWEEN (5000 AND 5005)
Performs better than
SELECT * FROM CustomerTable WHERE CustomerID IN (5000,5001,5002,5003,5004,5005)
Tip 8: Always avoid the use of SUBSTRING function in the query.
SELECT * FROM CustomerTable WHERE CustomerName LIKE 'n%'
Is much better than writing
SELECT * FROM CustomerTable WHERE SUBSTRING(CustomerName,1,1)='n'
Tip 9 : The queries having WHERE clause connected by AND operators are evaluated from left to right in the order they are written. So certain things should be taken care of like
o    Provide the least likely true expressions first in the AND. By doing this if the AND expression is false at the initial stage the clause will end immediately. So it will save execution time
o    If all the parts of the AND expression are equally like being false then better to put the Complex expression first. So if the complex works are false then less works to be done.
Tip 10: Its sometimes better to combine queries using UNION ALL instead of using many OR clauses.
SELECT CustomerID, FirstName, LastName FROM CustomerTable
WHERE City = 'Wichita' or ZIP = '67201' or State= 'Kansas'
The above query to use and index, it is required to have indexes on all the 3 columns.
The same query can be written as
SELECT CustomerID, FirstName, LastName FROM CustomerTable WHERE City = 'Wichita'
UNION ALL
SELECT CustomerID, FirstName, LastName FROM CustomerTable WHERE ZIP = '67201'
UNION ALL
SELECT CustomerID, FirstName, LastName FROM CustomerTable WHERE State= 'Kansas'
Both the queries will provide same results but if there is only an index on City and no indexes on the zip or state, then the first query will not use the index and a table scan is performed. But the 2nd one will use the index as the part of the query.
Tip 11:  While the select statement contains a HAVING clause, its better to make the WHERE clause to do most of the works (removing the undesired rows) for the Query instead of letting the HAVING clause to do the works.
 e.g. in a SELECT statement with GROUP BY and HAVING clause, things happens like first WHERE clause will select appropriate rows then GROUP BY divide them to group of rows and finally the HAVING clause have less works to perform, which will boost the performance.
Tip 12: Let’s take 2 situations
   A query that takes 30 seconds to run, and then displays all of the required results.
     A query that takes 60 seconds to run, but displays the first screen full of records in less than 1 second.
By looking at the above 2 situations a developer may choose to follow the 1st option, as it uses less resources and faster in performance. But actually the 2nd one is more acceptable by a DBA. An application may provide immediate feedback to the user, but actually this may not be happening at the background.

We can use a hint like
SELECT * FROM CustomerTable WHERE City = 'Wichita' OPTION(FAST n)
where n = number of rows that we want to display as fast as possible. This hint helps to return the specified number of rows as fast as possible without bothering about the time taken by the overall query.

ROLLUP and CUBE.


Let me give you one example where we can use ROLLUP and CUBE.

 I am developing the Employee base application and administrator wants the report which shows Employee Payment. You will say that’s really easy and can write the query as shown below,

create table EMP
( id int not null identity(1,1),
Employeeno varchar(max) NOT NULL,
PAYMENT_DATE varchar(max) NOT NULL,
Amount DECIMAL(7,2) NOT NULL);

INSERT INTO EMP VALUES ( 6, '1980-12-08', 100);
INSERT INTO EMP VALUES ( 44, ' 1981-05-05', 75);
INSERT INTO EMP VALUES ( 104, ' 1984-12-08',50);
INSERT INTO EMP VALUES ( 44, '1984-11-12', 75);
INSERT INTO EMP VALUES ( 8, ' 1980-12-08', 25);
INSERT INTO EMP VALUES ( 44, ' 1980-12-08',25);
INSERT INTO EMP VALUES ( 27, ' 1984-11-12 ', 75);
INSERT INTO EMP VALUES ( 44, ' 1982-12-30', 30);

Fig – (1) Group By clause.
SELECT Employeeno,PAYMENT_DATE,SUM(Amount) as total
FROM EMP GROUP BY Employeeno,PAYMENT_DATE
Which will returns the result as shown Fig – (1)

Here ROLLUP and CUBE comes into the picture and help us. The shown Fig – (2) result is generated using ROLLUP. ROLLUP adds new row for each column used in GROUP BY clause.


CASE WHEN GROUPING(Employeeno)=1 THEN 'TOTAL PAYEE'
ELSE Employeeno END Employeeno,
CASE WHEN GROUPING(PAYMENT_DATE)=1 THEN 'SUB TOTAL'
ELSE PAYMENT_DATE END PAYMENT_DATE,
SUM(Amount)
FROM EMP GROUP BY Employeeno,PAYMENT_DATE WITH ROLLUP

In fig – 2 we have TOTAL PAYEE by employeeno, lets assume you want the TOTAL PAYEE by PAYMENT_DATE also. Here you have to use CUBE as shown below See Result In fig – 3 ,

SELECT
CASE WHEN GROUPING(Employeeno)=1 THEN 'TOTAL PAYEE'
ELSE Employeeno END Employeeno,
CASE WHEN GROUPING(PAYMENT_DATE)=1 THEN 'SUB TOTAL'
ELSE PAYMENT_DATE END PAYMENT_DATE,
SUM(Amount)
FROM EMP GROUP BY Employeeno,PAYMENT_DATE WITH CUBE



Interview Question

Interview Questions 

What is an Index in SQL Server?

An index is a physical structure containing pointers to the data. Indexes are created in an existing table to locate the data more efficiently and quickly (Read more Here)

What is the Difference between Clustered and a Non-clustered Index in SQL Server?

The difference is that, Clustered index is unique for any given table and we can have only one clustered index on a table. The leaf level of a clustered index is the actual data and the data is resorted in case of clustered index. Whereas in case of non-clustered index the leaf level is actually a pointer to the data in rows so we can have as many non-clustered indexes as we can on the db. (Read more Here)

What is Collation?

Collation refers to a set of rules that determine how the data is sorted and compared (Read more Here)

What are Primary Keys and Foreign Keys in SQL Server?

Primary key is used to identify records in table uniquely and it won’t allow null values and duplication of data. Only one primary key is allowed for each table ()
Foreign key in one table points to primary key in another table. Foreign key is used to prevent the actions that would destroy the link between two tables. 

What’s the Difference between a Primary Key and a Unique Key? 

Primary key is used to identify records in table uniquely and it won’t allow null values and duplication of data. Only one primary key is allowed for each table.
Unique key is same as primary key only the difference is unique key allow null values.

First Char is in Uppercase


                  To Find  string is completely in uppercase?
                                                  or
if First Char is in Uppercase

Im my previous Project i need to find that string first char are upper or not and according that need to make that row active or in active
Let me Show you How to do that


---Create Table

create table duplicatetest(
id int identity(1,1),
name varchar(max),
Active int )



  ---Insert Some Record

          insert into duplicatetest(name,Active) values('krishna patil',1)
          insert into duplicatetest(name,Active) values('Mitesh patil',1)
          insert into duplicatetest(name,Active) values('Rahul Joshi',1)
          insert into duplicatetest(name,Active) values('krishna patil',1)
     insert into duplicatetest(name,Active) values('sagar Patel',1)
      insert into duplicatetest(name,Active) values('Mitesh patil',1)

To check first char is in Upper use following Query

update duplicatetest set active=0 where ASCII(substring(name,1,1))=ASCII(upper(substring(name,1,1) ))


difference between DDL, DML and DCL commands

What are the difference between DDL, DML and DCL commands?

DDL

Data Definition Language (DDL) statements are used to define the database structure or schema. Some examples:
CREATE - to create objects in the database
ALTER - alters the structure of the database
DROP - delete objects from the database
TRUNCATE - remove all records from a table, including all spaces allocated for the records are removed
COMMENT - add comments to the data dictionary
RENAME - rename an object

DML

Data Manipulation Language (DML) statements are used for managing data within schema objects. Some examples:
SELECT - retrieve data from the a database
INSERT - insert data into a table
UPDATE - updates existing data within a table
DELETE - deletes all records from a table, the space for the records remain
MERGE - UPSERT operation (insert or update)
CALL - call a PL/SQL or Java subprogram
EXPLAIN PLAN - explain access path to data
LOCK TABLE - control concurrency
DCL

Data Control Language (DCL) statements. Some examples:
GRANT - gives user's access privileges to database
REVOKE - withdraw access privileges given with the GRANT command
TCL

Transaction Control (TCL) statements are used to manage the changes made by DML statements. It allows statements to be grouped together into logical transactions.
COMMIT - save work done
SAVEPOINT - identify a point in a transaction to which you can later roll back
ROLLBACK - restore database to original since the last COMMIT
SET TRANSACTION - Change transaction options like isolation level and what rollback segment to use

Query To Get Count of Duplicate Records

 Query To Get Count of Duplicate Records

Here I will explain How to find duplicate records in SQL server.


---Create Table


create table duplicatetest
(
id int identity(1,1),
name varchar(max)
)

                            ---Insert Some Record

                        insert into duplicatetest(name) values('krishna patil')
                        insert into duplicatetest(name) values('Mitesh patil')
                        insert into duplicatetest(name) values('Rahul Joshi')
                        insert into duplicatetest(name) values('krishna patil')
                        insert into duplicatetest(name) values('Sagar Patel')
                        insert into duplicatetest(name) values('Mitesh patil')


                        select * from duplicatetest


Now I want to know the duplicate records count that means how many times each duplicate record exists in datatable for that I written a query like this

SELECT Name, COUNT(*) as duplicate
FROM duplicatetest
GROUP BY Name
HAVING COUNT(*) > 1 Order By Name



After run above query we will get each record duplicate count that would be like this





PATINDEX Example

                   Let me Show Use of  PATINDEX in Sql


PATINDEX Example 

DECLARE @myString varchar(100);
DECLARE @largestInt int;

SET @myString = 'R2D2456778'

SET @mystring = @myString;
select  left(@myString, PATINDEX('%[a-z]%',substring(@myString,1,1)))