Feeds:
Posts
Comments

Archive for August, 2011

RAID

RAID stands for Redundant Array of Inexpensive (or Independent) Disks. It is a collection of disk drives working together to optimize fault tolerance and performance. There are various RAID levels, but only the RAID levels significant to SQL Server are described here.

RAID0 (simple striping): Simplest configuration of disks that stripe the data. RAID0 does not provide any redundancy or fault tolerance. Data striping refers to sequentially writing data in a round-robin style up to a certain stripe size, a multiple of a disk sector (usually 512 bytes). Data striping yields good performance because multiple disks are concurrently servicing the I/O requests. The positive points for RAID0 are the cost, performance, and storage efficiency. The negative impact of no redundancy can outweigh its positive points.

RAID1 (simple mirroring): This configuration creates an exact copy or mirror of all the data on two or more disks. This RAID level gives good redundancy and fault tolerance, but poor storage efficiency. To fully take advantage RAID1 redundancy, it is recommended to use independent disk controllers (referred to as duplexing or splitting). Duplexing or splitting removes single-point failures and allows multiple redundant paths.

RAID5 (striping with parity): RAID5 uses block-level striping where parity is distributed among the disks. RAID5 is the most popular RAID level used in the industry. This RAID level gives fault tolerance and storage efficiency. However, RAID5 gives a larger negative impact for all write operations, especially sequential writes.

RAID10 (stripe of mirrors): RAID10 is essentially many sets of RAID1 or mirrored drives in a RAID0 configuration. This configuration combines the best attributes of striping and mirroring: high performance and good fault tolerance. For these reasons, we recommend using this RAID level. However, the high performance and reliability level is the trade-off for storage capacity.

Note that in all levels of RAID configuration, the storage efficiency is always limited to a multiple of the smallest drive size.

Summarized I/O Activity of RAID Levels and Their Recommended Use

RAID Levels

RAID0

RAID1

RAID5

RAID10

Reliability

Lowest.

Lack of fault tolerance results in data loss.

Very good.

Even better with duplexing.

Good.

Can tolerate single machine fault.

Excellent.

Storage Efficiency

100%

50%

>50%, <100%

(#drives -1/#drives)

50%

Random Read

Excellent

Fair

Worst of the RAID levels but better than a single drive.

Excellent.

Excellent.

Random Write

Excellent.

Fair.

Worse than a single drive but better than some RAID levels.

Fair.

Generally better with larger stripe sizes.

Very good.

Sequential Read

Excellent.

Fair.

Comparable to a single drive.

Very good.

Generally, better with smaller stripe sizes

Excellent.

Sequential Write

Excellent.

Good.

Better than other RAID levels.

Fair.

Very good.

Cost

Lowest.

Moderate.

Relatively high cost due to redundant drives; however, no expensive controller required

Moderate.

High.

Recommended use

Good for non-critical data or stagnantly updated data that gets backed up regularly or any data requiring fast write performance at very low cost. Great for testing.

Good for data that requires high fault tolerance at relatively low hardware cost (redundancy using parity requires more expensive hardware). Best for log files.

Very good for Read only data.

Data requiring high performance for both read and write and excellent reliability while trading off storage efficiency and cost.

Reference : http://technet.microsoft.com/en-us/library/cc966414.aspx#_RAID

Happy learning!

Read Full Post »

Here, take a look at Itzik Ben-Gan’s flow chart of Logical Query Processing. It is also explained in “Inside Microsoft Sql server 2008 t-sql Querying” book.

Logical Query Processing

Logical Query Processing Phases

Interesting Observation of ON Clause on LEFT JOIN – How ON Clause Effects Resultset in LEFT JOIN described here by Pinal.

Microsoft SQL Server Query Processor Internals and Architecture

A very useful article on Microsoft SQL Server Query Processor Internals and Architecture on MSDN site. Know how SQL Server processes your queries, and how and when it compiles them or re-compiles them, will allow you to write better applications.

Read Full Post »

A great post on Horizontal partitioning in SQL 2008 by Vishal. This is step by step instruction to create partition in SQL.

Difference Between Vertical Partition and Horizontal Partition

Vertical partitioning
• Case: Large Patient table. Lots of columns, including picture.
• You could: Have a table for critical columns.
• Leave all the less used and BLOB columns in a second table.

Horizontal partitioning
• Case: Sales table for last few years.
• Problem: Weekly full backup of 1 billion rows.
• Problem: INSERTs blocked by queries on historical data.
• You could: Create multiple tables with partition views (PV).
• Uses constraints to implement the partitioning rules.
• You could: Create partitioned tables (PT) with read-only file groups.
• Uses a partition function and a partition scheme.
• Saves in management, backup. Could have different SLAs per year.

For more on partitioning Click here.

Read Full Post »

To auditing the data we are using triggers to keep history of old and new values and saving in different table as per our need. For that we have to use inserted, updated and deleted syntax in triggers.

But SQL 2005 has introduced one awesome feature “Output” clause. We can use “Output” clause also with insert, update and delete statements. I have never used it before. I don’t want to miss the chance to use it right now.

Below is code for Sample DB & tables.

-- Create database
CREATE DATABASE SampleDB
GO

USE SampleDB
GO

-- Create table

CREATE TABLE MainTable
(
 EmpId INT,
 EmpName VARCHAR(10)
)

GO

-- Using Output clause with Insert statement.
-- Output clause displays data after insertion of records

INSERT INTO MainTable
Output	Inserted.EmpId,
		Inserted.EmpName
Values
	(1,'David'),
	(2,'Jon')

GO

-- Output clause used to Insert records in AuditTable

-- Create table
CREATE TABLE AuditTable
(
  EmpId  INT,
  EmpName VARCHAR(10),
  ActionDate Date,
  AuditType Varchar(10)
)

GO

INSERT INTO MainTable
output   
	inserted.EmpId,
    inserted.EmpName,
    GETDATE(),
    'Insert'
INTO AuditTable
SELECT 3,
   'Peter'
   
INSERT INTO MainTable
output   
	inserted.EmpId,
    inserted.EmpName,
    GETDATE(),
    'Insert'
INTO AuditTable
SELECT 4,
   'Chris'
   
GO

-- Check data in both tables
SELECT *
FROM  MainTable

GO

SELECT *
FROM  AuditTable

GO

-- Output clause used with Update statement
-- Insert new Emp Names in AuditTable

UPDATE MainTable
SET  EmpName = 'Bell'
output inserted.EmpId,
   inserted.EmpName,
   GETDATE(),
   'Update'
INTO AuditTable
WHERE EmpId = 1

-- Insert old Emp Names in AuditTable. In other words History data.
UPDATE MainTable
SET  EmpName = 'Andy'
output deleted.EmpId,
   deleted.EmpName,
   GETDATE(),
   'Update'
INTO AuditTable
WHERE EmpId = 2

-- Check records in AuditTable
SELECT *
FROM  MainTable

GO

SELECT *
FROM  AuditTable

GO

-- Output clause used with Delete statement

-- Old Emp Names are being insert in AuditTable

DELETE FROM MainTable
output deleted.EmpId,
   deleted.EmpName,
   GETDATE(),
   'Delete'
INTO AuditTable
WHERE EmpId = 3

DELETE FROM MainTable
output deleted.EmpId,
   deleted.EmpName,
   GETDATE(),
   'Delete'
INTO AuditTable
WHERE EmpId = 4

-- Chek the records in AuditTable & MainTable
SELECT *
FROM  MainTable

GO

SELECT *
FROM  AuditTable

GO
--

Try out, all is yours. 🙂

Read Full Post »


/*
Create the sample table, and insert a few test records for demonstration purposes.
*/

CREATE TABLE TestRecords
(
	RecordId INT IDENTITY(1,1),
	PID INT,
	EntryDate DATETIME,
	Remarks TEXT,
	CategoryID Varchar(5)
)
GO

INSERT INTO TestRecords
(PID, EntryDate, Remarks,CategoryID)
VALUES
	(1, '10/1/2008', 'Joined for Cat 1)','C1'),
	(1, '9/28/2008', 'Joined for Cat 1)','C2'),
	(2, '10/15/2008', 'Joined for Cat 2)','C1'),
	(2, '10/15/2009', 'Joined for Cat 3)','C2'),
	(3, '12/5/2008', 'Joined for Cat 3)','C2')
GO

/*
Use the Common Table Expression [CTE] to get the list of ranked
entries from the system, partition by to group by PID
*/
WITH cteRank AS
(
	SELECT
		TR.RecordId,
		TR.PID,
		TR.EntryDate,
		TR.Remarks,
		CategoryID,
		Rank() OVER (PARTITION BY TR.PID
		ORDER BY TR.EntryDate DESC) AS EntryRank
	FROM 
		TestRecords TR
)
/* Now select results */
SELECT
	cteRank.RecordId,
	cteRank.PID,
	cteRank.EntryDate,
	cteRank.Remarks
FROM
	cteRank
WHERE
	cteRank.EntryRank= 1
ORDER BY
	PID
GO

Read Full Post »

Find missing records


Create table T1 
(Col1 Varchar(2),Col2 int)

--Insert records in sequence. I have used Row_Number() to create sequence.

Insert into T1  
select top 15 'AA', ROW_NUMBER() over (order by object_id) as RID From sys.objects
select top 10 'BB', ROW_NUMBER() over (order by object_id) as RID From sys.objects

--Check the maximum records for col1
select Col1,MAX(col2) from t1 Group by Col1

--Delete any two records for particular Col1 value
Delete From T1 Where Col1 = 'AA' and col2 in (3,7)

--Execute the below query to get the deleted records for particular Col1 Value
Select 
	RID 
From
	(select * from T1 where Col1='AA') T1 Right Outer Join 
	(select top 10 ROW_NUMBER() over (order by object_id) as RID From sys.objects) T2 On T1.Col2=T2.RID
Where Col2 is null

By using above query you can make it dynamic to find out missing records for different value in the same query.


🙂

Read Full Post »


WITH DimTime as
(
SELECT 
	Cast ('2000-01-01' as DateTime) Date --Start Date
UNION ALL -- usage of recursion
SELECT 
	Date + 1
FROM 
	DimTime
WHERE 
	Date + 1  <= '2020-12-31' --End date
)

SELECT
	Row_Number() OVER (ORDER BY Date) as ROWID
	,Date
	,YEAR (date) as Year
	,DatePart ( qq, date) as Quarter
	,MONTH (date) as MonthOfYear
	,DateDiff(mm,DateAdd(qq,DateDiff(qq,0,date),0),date)+1 as MonthOfQuarter
	,DatePart (wk, Date) as WeekOfYear
	,datediff(wk,dateadd(qq,datediff(qq,0,date),0),date)+1 as WeekOfQuarter
	,datediff(wk,dateadd(mm,datediff(mm,0,date),0),date)+1 as WeekOfMonth
	,DatePart (dy, date) as DayOfYear
    ,CASE 
		WHEN DatePart(dy, MinAdDate) Between 1 and 30 Then '1 - 30' 
		WHEN DatePart(dy, MinAdDate) Between 31 and 60 Then '31 - 60' 
		WHEN DatePart(dy, MinAdDate) Between 61 and 90 Then '61 - 90' 
		WHEN DatePart(dy, MinAdDate) Between 91 and 120 Then '91 - 120' 
		WHEN DatePart(dy, MinAdDate) Between 121 and 150 Then '121 - 150' 
		WHEN DatePart(dy, MinAdDate) Between 151 and 180 Then '151 - 180' 
		ELSE '181+'														
	END as DayRangeOfYear
	,datediff(dd,dateadd(qq,datediff(qq,0,date),0),date)+1 as DayOfQuarter
	,DAY (date) as DayOfMonth
	,DatePart (dw, date) as DayOfWeek
	,DateName (mm, date) as MonthName
	,LEFT ( DateName (mm, date), 3) MonthAbbreviation
	,DateName (dw, date) as DayName
	,LEFT (DateName (dw, date), 3) as DayAbbreviation
	,CONVERT(VarChar(10),date,112) as YYYYMMDD
	,CONVERT(VarChar(10),date,111) as [YYYY/MM/DD]
	,CONVERT(VarChar(11),date,100) as [mon dd yyyy]
	,CONVERT(VarChar(11),date,120) as [yyyy-mm-dd]
FROM
	DimTime
OPTION (MAXRECURSION 0) -- Set Max Recursion

Read Full Post »