Monday, March 12, 2012

Locally Partitioned Tables for OLAP

Forgive me as I'm approaching a problem as someone familliar with
Oracle, so may be trying to solve it using techniques not applicable
for SQL Server.
I wonder if SQL Server allows me to partition tables. There's lots of
talk about distributed partitioning, but none about locally
partitioned tables. I wonder if SQL Server's query engine works in a
way that it isn't needed.
The reasons I'm looking for local partitioning are thus:
I can load the new fact data into one partition. It may not already
be empty, for example if I'm doing monthly loads into an annual
partition, but its going to be somewhat smaller than the whole fact
table. The index generation hit should be smaller with the partition.
I wonder if DTS/SQL Server supports bulk insert for extra speed.
My queries should take advantage of the partition. If I create
parallel partitions in Analysis Services, things should become faster
to maintain. For example - I only reprocess the cube for the partition
I've just loaded.
When data becomes too old to be interesting, I can just extract the
partition to a backup medium somewhere and drop it - easy as that. Say
goodbye to the partition in both SQL Server and in the Analysis
Services cube without any reprocessing hit on the rest of the data.
(Rolling sums become interesting on the boundary condition, or do
they, as I'm not reprocessing back there? hmmm. At my last employer I
made sure I kept data around long enough to satisfy the rolling sums,
so we queried to 3 years ago, and I kept 4 years online to satisfy
that)
Also, what editions of SQL Server support this kind of technology. Is
this an Enterprise Server type thing? At what point of fine graindness
do partitions become a heavy hit? Assuming I can automate all this
well, does it become sensible to use monthly partitions, or does the
cross partition query performance (12 of them for a Rolling 12 Month
or a Year To Date) drag things down?
Thanks
- RichardIn the current release of SQL Server, you can use partitioned views to
accomplish what you need. You'll need a partitioning column - enforced with
a check constraint - as part of the primary key in each member table. In
your case, it would be the relevant date. You then construct a view that
includes all members, hooked together with UNION ALL's. As you need to
add/drop a member table, simply alter the view to eliminate the old and add
the new from the definition. Then drop the un-needed table.
I don't believe you'll need the enterprise edition, since you're looking at
local partitioned views.
--
Tom
---
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Columnist, SQL Server Professional
Toronto, ON Canada
www.pinnaclepublishing.com
"Richard Corfield" <richard.corfield@.gmail.com> wrote in message
news:c58fd37c.0410190647.cd2e11c@.posting.google.com...
Forgive me as I'm approaching a problem as someone familliar with
Oracle, so may be trying to solve it using techniques not applicable
for SQL Server.
I wonder if SQL Server allows me to partition tables. There's lots of
talk about distributed partitioning, but none about locally
partitioned tables. I wonder if SQL Server's query engine works in a
way that it isn't needed.
The reasons I'm looking for local partitioning are thus:
I can load the new fact data into one partition. It may not already
be empty, for example if I'm doing monthly loads into an annual
partition, but its going to be somewhat smaller than the whole fact
table. The index generation hit should be smaller with the partition.
I wonder if DTS/SQL Server supports bulk insert for extra speed.
My queries should take advantage of the partition. If I create
parallel partitions in Analysis Services, things should become faster
to maintain. For example - I only reprocess the cube for the partition
I've just loaded.
When data becomes too old to be interesting, I can just extract the
partition to a backup medium somewhere and drop it - easy as that. Say
goodbye to the partition in both SQL Server and in the Analysis
Services cube without any reprocessing hit on the rest of the data.
(Rolling sums become interesting on the boundary condition, or do
they, as I'm not reprocessing back there? hmmm. At my last employer I
made sure I kept data around long enough to satisfy the rolling sums,
so we queried to 3 years ago, and I kept 4 years online to satisfy
that)
Also, what editions of SQL Server support this kind of technology. Is
this an Enterprise Server type thing? At what point of fine graindness
do partitions become a heavy hit? Assuming I can automate all this
well, does it become sensible to use monthly partitions, or does the
cross partition query performance (12 of them for a Rolling 12 Month
or a Year To Date) drag things down?
Thanks
- Richard

No comments:

Post a Comment