Saturday, 10 May 2008

DDD Scotland: Fourth Session

Making the leap into advanced SQL

Tony Rogerson

Sub Queries

Sub queries (in the select statement) execute once per each row

Derived tables

select *
from (
select type, name
from sys.objects
where type='U'
) as Syst ( type, name)

This can be used to overcome issues with unions etc.

  • Nested Functions in sql 2000 causes the function to be called twice.

  • Logically derived tables are a virtual table.

  • There are no statistics held on them.

  • Its self contained.

Common Table Expressions

Can be used for recursion

;with cteName (name, type, rowpos)
as (
select name,type, rowpos = RowNumber() over (partition by type order by name)
from sys.objects

Unlike derived tables this does let you self join e.g.

declare @HowMany int
set @HowMay = 10
with loopIT(i, descr)
AS (
select i=1, descr=CAST('Anchor' as varchar(100))
union all
select i=i+1, descr = CAST('recuresed' as varchar(100))
Where i<@howmany

select i, descr
from LoopID

Like derived table there are no stats, and they are expanded - but they do allow self join unlike derived tables. By default you can only recurse 100 times - you can extend this with MAXRECURSION.

 Temporary tables/Table Variables

#Tables hold stats - you can tell it to refresh them and then the plan is more accurate.

Table Variables are only meant for < 1000 rows - so you get no stats but you do get the option to recompile - option (recompile). However it is not very accurate

The Sql plan uses the order of the column to compute selects not the most optimal index as it would use on a #table.

Table variable is not in memory, even local variables are stored on disk.

Hash tables do cause a compile lock so this causes an overhead, where as table variables dont - so they are better!

Cumulative Totals

You can use sub queries for this

select *,
tytd = (select sum(accounts) where xxx),
mytd = (select sum(accounts) where xxx)
from accounts a

The one pass approach uses the case expression

select *,
tytp = sum(case when xxx then amount else 0),
mytp = sum(case when xxx then amount else 0)
from accounts a

You can use pivots tables also, and CTE tables


You shouldn't use Duration - it is has a high duration but a low number of reads that means there is a problem somewhere. Reads, CPU are the two columns you should read.

  • Sub query = 6.5k reads

  • 1 pass = 417 reads

  • union all = 1200 reads

  • and the pivot 1201 reads

Running Totals

CTE can do it by adding a row number, then self join look for the previous row and sum as you go along. However the query gets copied into each query + it doesnt filter rows correctly so this costs a lot in reads

Better approach is to copy it into a temporary table, index it then make your CTE read from the temporary table. goes from 20k reads to 7k.

No comments: