Saturday, 10 May 2008

DDD Scotland: Fourth Session

Making the leap into advanced SQL

Tony Rogerson

Sub Queries

Sub queries (in the select statement) execute once per each row

Derived tables

select *
from (
select type, name
from sys.objects
where type='U'
) as Syst ( type, name)



This can be used to overcome issues with unions etc.




  • Nested Functions in sql 2000 causes the function to be called twice.


  • Logically derived tables are a virtual table.


  • There are no statistics held on them.


  • Its self contained.



Common Table Expressions



Can be used for recursion




;with cteName (name, type, rowpos)
as (
select name,type, rowpos = RowNumber() over (partition by type order by name)
from sys.objects
)



Unlike derived tables this does let you self join e.g.




declare @HowMany int
set @HowMay = 10
with loopIT(i, descr)
AS (
select i=1, descr=CAST('Anchor' as varchar(100))
union all
select i=i+1, descr = CAST('recuresed' as varchar(100))
from LOOPIT
Where i<@howmany
)

select i, descr
from LoopID



Like derived table there are no stats, and they are expanded - but they do allow self join unlike derived tables. By default you can only recurse 100 times - you can extend this with MAXRECURSION.



 Temporary tables/Table Variables



#Tables hold stats - you can tell it to refresh them and then the plan is more accurate.



Table Variables are only meant for < 1000 rows - so you get no stats but you do get the option to recompile - option (recompile). However it is not very accurate



The Sql plan uses the order of the column to compute selects not the most optimal index as it would use on a #table.



Table variable is not in memory, even local variables are stored on disk.



Hash tables do cause a compile lock so this causes an overhead, where as table variables dont - so they are better!



Cumulative Totals



You can use sub queries for this




select *,
tytd = (select sum(accounts) where xxx),
mytd = (select sum(accounts) where xxx)
from accounts a



The one pass approach uses the case expression




select *,
tytp = sum(case when xxx then amount else 0),
mytp = sum(case when xxx then amount else 0)
from accounts a



You can use pivots tables also, and CTE tables



Performance



You shouldn't use Duration - it is has a high duration but a low number of reads that means there is a problem somewhere. Reads, CPU are the two columns you should read.




  • Sub query = 6.5k reads


  • 1 pass = 417 reads


  • union all = 1200 reads


  • and the pivot 1201 reads



Running Totals



CTE can do it by adding a row number, then self join look for the previous row and sum as you go along. However the query gets copied into each query + it doesnt filter rows correctly so this costs a lot in reads



Better approach is to copy it into a temporary table, index it then make your CTE read from the temporary table. goes from 20k reads to 7k.

No comments: