Free Online Courses for Software Developers - MrBool

Aggregating and Pivoting Data in SQL Server

In this article we discuss a class of aggregations that behave like standard SQL aggregations but produce results with a horizontal layout.

Horizontal aggregation is a class of functions that returns aggregated columns in a horizontal layout. Most data analysis algorithms require datasets with a horizontal layout as input, with several records and one variable or dimension per column. Managing large datasets without DBMS support can be a difficult job. Trying different subsets of data points and dimensions is more convenient, faster, and easier to do inside a relational database with SQL queries than outside it with alternative tools. Horizontal aggregation can be performed with an operator that is easily implemented inside a query processor, much like select, project, and join. The PIVOT operator on tabular data, which exchanges rows and columns, enables data transformations useful in data modeling, data analysis, and data presentation. SQL has many existing functions and operators for aggregation. The most commonly used aggregation is the sum of a column; other aggregation operators return the average, maximum, minimum, or row count over groups of rows.

Introduction:

This article introduces a class of aggregations that behave like standard SQL aggregations but produce results with a horizontal layout. In contrast, we call standard SQL aggregations vertical aggregations because they produce results with a vertical layout. Horizontal aggregations require only a small syntax extension to the aggregate functions called in a SELECT statement. Alternatively, horizontal aggregation can be described in terms of PIVOT and UNPIVOT, two operators on tabular data that exchange rows and columns. Haixun Wang [14] implemented ATLaS, a system for developing complete data-intensive applications in SQL by writing new aggregates and table functions in SQL; it includes query rewriting, optimization techniques, and a data stream management module. Carlos Ordonez [1] introduced techniques to efficiently compute fundamental statistical models inside a DBMS by exploiting User-Defined Functions (UDFs).

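In SQL Server, pivoting can be achieved with CASE expressions inside aggregate functions combined with GROUP BY, the same way as in other RDBMSs. This was the only option before SQL Server 2005 introduced the PIVOT operator, and it still works in SQL Server 2005 and 2008. The syntax is shown in the example below.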

Listing 1: Sample showing the query structure

-- Count unpaid bills per client in four age buckets
SELECT ClintID,
       SUM(CASE WHEN DATEDIFF(day, Billdate, GETDATE()) > 1
                 AND DATEDIFF(day, Billdate, GETDATE()) <= 15 THEN 1 ELSE 0 END) AS Days15,
       SUM(CASE WHEN DATEDIFF(day, Billdate, GETDATE()) > 15
                 AND DATEDIFF(day, Billdate, GETDATE()) <= 30 THEN 1 ELSE 0 END) AS Days30,
       SUM(CASE WHEN DATEDIFF(day, Billdate, GETDATE()) > 30
                 AND DATEDIFF(day, Billdate, GETDATE()) <= 45 THEN 1 ELSE 0 END) AS Days45,
       SUM(CASE WHEN DATEDIFF(day, Billdate, GETDATE()) > 45 THEN 1 ELSE 0 END) AS Morethan45
FROM dbo.Bill
WHERE paiddflag = 0
GROUP BY ClintID
ORDER BY ClintID
GO
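
To see the shape of the result that the Listing 1 technique produces, here is a minimal, self-contained sketch in Python using the standard sqlite3 module. SQLite stands in for SQL Server only so that the example can be run anywhere. It has no DATEDIFF or GETDATE, so the age in days is computed with julianday against a fixed reference date. The sample rows, the reference date, and the function name aging_report are assumptions made for illustration and are not part of the original example.

```python
import sqlite3

# Hypothetical sample data: (ClintID, Billdate, paiddflag). Not from the article.
BILLS = [
    (1, "2024-02-25", 0),  # 5 days old on the reference date
    (1, "2024-02-10", 0),  # 20 days old
    (1, "2024-01-01", 0),  # 60 days old
    (2, "2024-01-25", 0),  # 36 days old
    (2, "2024-02-20", 1),  # already paid, so the WHERE clause excludes it
]

# Stands in for GETDATE() so that the result is repeatable.
REFERENCE_DATE = "2024-03-01"

# Same CASE / GROUP BY pattern as Listing 1; only the date arithmetic differs.
AGING_SQL = """
SELECT ClintID,
       SUM(CASE WHEN age > 1  AND age <= 15 THEN 1 ELSE 0 END) AS Days15,
       SUM(CASE WHEN age > 15 AND age <= 30 THEN 1 ELSE 0 END) AS Days30,
       SUM(CASE WHEN age > 30 AND age <= 45 THEN 1 ELSE 0 END) AS Days45,
       SUM(CASE WHEN age > 45 THEN 1 ELSE 0 END)               AS Morethan45
FROM (SELECT ClintID,
             CAST(julianday(:today) - julianday(Billdate) AS INTEGER) AS age
      FROM Bill
      WHERE paiddflag = 0) AS b
GROUP BY ClintID
ORDER BY ClintID
"""


def aging_report(bills, today):
    """Load the bills into an in-memory table and return one row per client."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Bill (ClintID INTEGER, Billdate TEXT, paiddflag INTEGER)"
    )
    conn.executemany("INSERT INTO Bill VALUES (?, ?, ?)", bills)
    rows = conn.execute(AGING_SQL, {"today": today}).fetchall()
    conn.close()
    return rows
```

With the sample data, aging_report(BILLS, REFERENCE_DATE) returns [(1, 1, 1, 0, 1), (2, 0, 0, 1, 0)]: one row per client, with each age bucket as its own column.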

Execution plan for the query in Listing 1 (abridged):

Listing 2: Sample showing execution plan

|--Sort(ORDER BY:([DCIPHR].[dbo].[Bill].[CLINTID] ASC))
   |--Hash Match(Aggregate, HASH:([DCIPHR].[dbo].[Bill].[CLINTID])
                 DEFINE:([Expr1003]=SUM(CASE WHEN datediff(day,[DCIPHR].[dbo].[Bill].[INVICE_DT],getdate())>(1)
                     AND datediff(day,[DCIPHR].[dbo].[Bill].[Bill_DATE],getdate())<=(15) THEN (1) ELSE (0) END)))
      |--Clustered Index Scan(OBJECT:([DCIPHR].[dbo].[Bill].[PKBill]),
                              WHERE:([DCIPHR].[dbo].[Bill].[PaidFlag]=(0)))

As the plan above shows, the query processor performs a clustered index scan, then a hash match (for the aggregation), and finally a sort. Let us now run the query using the PIVOT operator introduced in SQL Server 2005. We assume here that you are familiar with the general syntax of PIVOT. For detailed information you can refer to SQL Server Books Online (BOL).

Listing 3: Sample showing the PIVOT operator

-- Classify each unpaid invoice into an age bucket (1 to 4), then count per bucket
SELECT ClintId,
       [1] AS Days15,
       [2] AS days30,
       [3] AS days45,
       [4] AS morethan45
FROM (
    SELECT ClintID,
           (CASE
                WHEN DATEDIFF(day, Invice_Dt, GETDATE()) > 1
                 AND DATEDIFF(day, Invice_Dt, GETDATE()) <= 15 THEN 1
                WHEN DATEDIFF(day, Invice_Dt, GETDATE()) > 15
                 AND DATEDIFF(day, Invice_Dt, GETDATE()) <= 30 THEN 2
                WHEN DATEDIFF(day, Invice_Dt, GETDATE()) > 30
                 AND DATEDIFF(day, Invice_Dt, GETDATE()) <= 45 THEN 3
                WHEN DATEDIFF(day, Invice_Dt, GETDATE()) > 45 THEN 4
            END) AS days
    FROM DBO.Invice
    WHERE paiddflag = 0
) p
PIVOT (
    COUNT(days)
    FOR Days IN ([1], [2], [3], [4])
) AS pvt
GO

Let us now analyze the execution plan for this query. The following is an abridged version of the plan.

Listing 4: Sample showing different execution plan

|--Compute Scalar(DEFINE:([Exp104]=CONVERT_IMPLICIT(int,[globalagg1010],0),
                          [Exp105]=CONVERT_IMPLICIT(int,[globalagg10012],0)))
   |--Stream Aggregate(GROUP BY:([DCIPHR].[dbo].[Invice].[CLENTID])
                       DEFINE:([globalagg1010]=SUM([partialagg1009])))
      |--Sort(ORDER BY:([DCIPHR].[dbo].[Invice].[CLENTID] ASC))
         |--Hash Match(Aggregate, HASH:([DCIPHR].[dbo].[Invice].[CLENTID])
                       DEFINE:([partialagg1009]=COUNT(CASE WHEN [Expr1003]=(1) THEN [Expr1003] ELSE NULL END)))
            |--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN datediff(day,[DCIPHR].[dbo].[Invice].[Invice_Dt],getdate())>(1)
               |--Clustered Index Scan(OBJECT:([DCIPHR].[dbo].[Invice].[PK_Invice])

From the above execution plan we can see that some extra steps were performed when we used the query with the PIVOT operator. We ran the test with 400806 rows on a machine with 4 GB RAM and a Pentium Core i3 CPU, running Windows 2003 Server SP1. Both queries came back with results in about one second. In fact, the query with the PIVOT operator took a little more time.

Restrictions of the PIVOT/UNPIVOT operators:

Even though the PIVOT operator has several advantages and is very useful, it has some limitations, which are listed below:

  • The values of the pivoting column can be specified only through an “IN” expression.
  • All the values must be known when the query is written, so the operator is very static. If the column values are not known in advance, we have to fall back on the stored procedure approach and build the query dynamically. That means writing a stored procedure that takes in a query and the row-column list and performs the pivoting dynamically. This can become a performance issue, and it also exposes the code to SQL injection issues (though there are ways to mitigate them).

Now let us convert column data into row data. Make sure that the PRDUCT table has been created and populated with data. The traditional SQL for this conversion uses the UNION ALL operator, as shown below. It works on SQL Server 2000 and every later version.

Listing 5: Sample showing unpivoting with UNION ALL

-- One SELECT per source column; UNION ALL stacks them as (id, attribute, value) rows
SELECT PRDUCTID, 'STYLE' AS ATTRIBUTE, STYLE AS ATTRIBUTEVALUE
FROM DBO.PRDUCT WHERE PRDUCTID IN (1,2)
UNION ALL
SELECT PRDUCTID, 'COLOR', COLOR
FROM DBO.PRDUCT WHERE PRDUCTID IN (1,2)
UNION ALL
SELECT PRDUCTID, 'FLAMMABLE', FLAMMABLE
FROM DBO.PRDUCT WHERE PRDUCTID IN (1,2)
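
The second restriction listed above (the pivot values must be known when the query is written) is usually worked around by generating the query text at run time. The following is a minimal sketch of that idea in Python with the standard sqlite3 module, not the stored procedure the text describes. It assumes a hypothetical ProductAttribute table holding rows in the (id, attribute, value) shape that Listing 5 produces; the table name, sample rows, and function names are all illustrative. The values being compared are passed as bound parameters, and the generated column aliases are quoted, which is one way to reduce the injection risk mentioned above. In T-SQL the analogous tools would be sp_executesql and QUOTENAME.

```python
import sqlite3

# Hypothetical rows in (ProductID, Attribute, AttributeValue) form.
ROWS = [
    (1, "STYLE", "Classic"),
    (1, "COLOR", "Red"),
    (2, "STYLE", "Modern"),
    (2, "COLOR", "Blue"),
    (2, "FLAMMABLE", "Yes"),
]


def make_connection(rows):
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ProductAttribute "
        "(ProductID INTEGER, Attribute TEXT, AttributeValue TEXT)"
    )
    conn.executemany("INSERT INTO ProductAttribute VALUES (?, ?, ?)", rows)
    return conn


def quote_identifier(name):
    """Quote a value for use as a column alias, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def dynamic_pivot(conn):
    """Discover the attribute names, then build one CASE column per name."""
    attributes = [
        row[0]
        for row in conn.execute(
            "SELECT DISTINCT Attribute FROM ProductAttribute ORDER BY Attribute"
        )
    ]
    select_list = ["ProductID"] + [
        "MAX(CASE WHEN Attribute = ? THEN AttributeValue END) AS "
        + quote_identifier(name)
        for name in attributes
    ]
    sql = (
        "SELECT " + ", ".join(select_list)
        + " FROM ProductAttribute GROUP BY ProductID ORDER BY ProductID"
    )
    # The attribute names are bound as parameters, never spliced in as literals.
    cursor = conn.execute(sql, attributes)
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()
```

Calling dynamic_pivot(make_connection(ROWS)) returns the header list and one row per product. A product with no value for an attribute gets NULL (None in Python) in that column, because MAX over no matching values is NULL.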

PIVOT and UNPIVOT in SQL:

It is possible to express pivoting in standard SQL, though the syntax is cumbersome and performance is generally poor. One method uses scalar subqueries in the projection list: each pivoted column is created through a separate (but nearly identical) subquery. On database systems that do not support PIVOT, users can employ this technique to perform pivoting operations.
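
The scalar subquery technique can be sketched as follows. This is a minimal illustration in Python with the standard sqlite3 module, chosen only so the SQL can be run without a server. The ProductAttribute table, its sample rows, and the function name are assumptions made for the example. Note how each pivoted column repeats almost the same subquery, differing only in the attribute name.

```python
import sqlite3

# Hypothetical rows in (ProductID, Attribute, AttributeValue) form.
ROWS = [
    (1, "STYLE", "Classic"),
    (1, "COLOR", "Red"),
    (2, "STYLE", "Modern"),
    (2, "COLOR", "Blue"),
    (2, "FLAMMABLE", "Yes"),
]

# One scalar subquery per pivoted column, correlated on ProductID.
PIVOT_SQL = """
SELECT p.ProductID,
       (SELECT a.AttributeValue FROM ProductAttribute AS a
         WHERE a.ProductID = p.ProductID AND a.Attribute = 'STYLE')     AS Style,
       (SELECT a.AttributeValue FROM ProductAttribute AS a
         WHERE a.ProductID = p.ProductID AND a.Attribute = 'COLOR')     AS Color,
       (SELECT a.AttributeValue FROM ProductAttribute AS a
         WHERE a.ProductID = p.ProductID AND a.Attribute = 'FLAMMABLE') AS Flammable
FROM (SELECT DISTINCT ProductID FROM ProductAttribute) AS p
ORDER BY p.ProductID
"""


def pivot_with_subqueries(rows):
    """Load the rows into an in-memory table and pivot them with subqueries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ProductAttribute "
        "(ProductID INTEGER, Attribute TEXT, AttributeValue TEXT)"
    )
    conn.executemany("INSERT INTO ProductAttribute VALUES (?, ?, ?)", rows)
    result = conn.execute(PIVOT_SQL).fetchall()
    conn.close()
    return result
```

For the sample rows, pivot_with_subqueries(ROWS) returns [(1, 'Classic', 'Red', None), (2, 'Modern', 'Blue', 'Yes')]. A subquery that matches no row yields NULL, which is why product 1 has no Flammable value.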

Possible PIVOT Syntax:

Unfortunately, this approach has limitations that restrict the power of pivoting. Each column has redundant syntax, which becomes cumbersome as the number of pivoted columns increases. This syntax is also potentially difficult to optimize: the query optimizer is presented with a number of subqueries, which makes it harder to identify that the whole operation represents a “Pivot” on a single table. In practice, this is not an easy operation, making pivot-specific optimizations very difficult. The common problem is that the intent of the query is difficult to infer from the syntax or the common relational algebra representation. Therefore, we propose the following syntax for PIVOT as an additional option under the table reference rule of the ANSI SQL grammar. This syntax is easier to read and better captures the intent of the desired operation. Repetition is eliminated, making queries easier to read, write, and maintain. This approach also enables additional query optimization techniques.

Listing 6: Sample showing PIVOT Syntax

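[Figure not reproduced: inductive definition of PIVOT. Base case: no pivoted rows. Inductive case (one or more pivoted value(s) z): PIVOT(D, cv, i in W + {z}, Y) on J is built from PIVOT(D, cv, i in W, Y) on J, a LeftOuterJoin(D = D'), and GroupBy(D', wi = Y(cv')) on a selection over J'.]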

Conclusion:

We introduced two data manipulation operators, Pivot and Unpivot, for use inside the Relational Database Management System. These improve many existing user scenarios and enable several new ones. Further, this article outlined the basic syntactic, semantic, and implementation issues involved in adding this functionality to an existing Relational Database Management System based on algebraic, cost-based optimization and algebraic data flow execution. Pivot is an extension of Group By with unique restrictions and optimization opportunities, which makes it very convenient to implement incrementally on top of existing grouping implementations. Finally, we presented a number of axioms of algebraic transforms useful in an implementation of Pivot and Unpivot.



Website: www.techalpine.com Have 16 years of experience as a technical architect and software consultant in enterprise application and product development. Have interest in new technology and innovation area along with technical...
