Making the leap into advanced SQL
Sub Queries
Sub queries (in the select statement) execute once per each row
Derived tables
select *
from (
select type, name
from sys.objects
where type='U'
) as Syst ( type, name)
This can be used to overcome issues with unions etc.
- Nested Functions in sql 2000 causes the function to be called twice.
- Logically derived tables are a virtual table.
- There are no statistics held on them.
- Its self contained.
Common Table Expressions
Can be used for recursion
;with cteName (name, type, rowpos)
as (
select name,type, rowpos = RowNumber() over (partition by type order by name)
from sys.objects
)
Unlike derived tables this does let you self join e.g.
declare @HowMany int
set @HowMay = 10
with loopIT(i, descr)
AS (
select i=1, descr=CAST('Anchor' as varchar(100))
union all
select i=i+1, descr = CAST('recuresed' as varchar(100))
from LOOPIT
Where i<@howmany
)
select i, descr
from LoopID
Like derived table there are no stats, and they are expanded - but they do allow self join unlike derived tables. By default you can only recurse 100 times - you can extend this with MAXRECURSION.
Temporary tables/Table Variables
#Tables hold stats - you can tell it to refresh them and then the plan is more accurate.
Table Variables are only meant for < 1000 rows - so you get no stats but you do get the option to recompile - option (recompile). However it is not very accurate
The Sql plan uses the order of the column to compute selects not the most optimal index as it would use on a #table.
Table variable is not in memory, even local variables are stored on disk.
Hash tables do cause a compile lock so this causes an overhead, where as table variables dont - so they are better!
Cumulative Totals
You can use sub queries for this
select *,
tytd = (select sum(accounts) where xxx),
mytd = (select sum(accounts) where xxx)
from accounts a
The one pass approach uses the case expression
select *,
tytp = sum(case when xxx then amount else 0),
mytp = sum(case when xxx then amount else 0)
from accounts a
You can use pivots tables also, and CTE tables
Performance
You shouldn't use Duration - it is has a high duration but a low number of reads that means there is a problem somewhere. Reads, CPU are the two columns you should read.
- Sub query = 6.5k reads
- 1 pass = 417 reads
- union all = 1200 reads
- and the pivot 1201 reads
Running Totals
CTE can do it by adding a row number, then self join look for the previous row and sum as you go along. However the query gets copied into each query + it doesnt filter rows correctly so this costs a lot in reads
Better approach is to copy it into a temporary table, index it then make your CTE read from the temporary table. goes from 20k reads to 7k.
No comments:
Post a Comment