SQL Pivot – Static or Dynamic Columns

Pivot Arrow This is going to be another one of those posts that will surely help some people, but it is far more likely to help me later 😉 I have culled these techniques from a variety of sources across the Internet, but I’m not entirely sure who taught me what in order to give proper credit.

If you want to follow along, I went to the Microsoft Sql Product Sample Site and downloaded “Adventure Works 2014 Full Database Backup.zip” and unzipped and restored it onto my database server. That gives me the OLTP version (rather than the OLAP) of the 2014 Adventure Works database. To start off, let’s take a look at the following query:

USE AdventureWorks2014
GO

SELECT c.AccountNumber, DATENAME(month, h.OrderDate) as OrderMonthName, SUM(d.LineTotal) as TotalSales
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, DATENAME(month, h.OrderDate)
ORDER BY c.AccountNumber

This returns results like these:
Initial Month-based Results

However, what if you wanted to easily see how each of those customers’ sales looked month over month in comparison to other customers? Typically, a data analyst might take a result set like that and use a tool like Microsoft Excel to create a pivot table. That would let your rows be per customer and your columns would be per month. Reporting wonks could also use a tool like SSRS to pivot the data for presentation to users. However, if you are using Sql Server 2005 or later, you could also use the PIVOT command.

If your data that you’d like to pivot on has a finite and known quantity, this is very easy. I’m going to change the query above to show how to do that:

USE AdventureWorks2014
GO

;WITH SalesData AS (
	SELECT c.AccountNumber, DATENAME(month, h.OrderDate) as OrderMonthName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
	FROM Sales.SalesOrderHeader h 
	INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
	INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
	WHERE YEAR(h.OrderDate) = 2011
	GROUP BY c.AccountNumber, DATENAME(month, h.OrderDate))

SELECT *
FROM SalesData
PIVOT (SUM(TotalSales) FOR OrderMonthName IN (January, February, March, April, May, June, July, August, September, October, November, December)) pvt

A few peculiar things to notice. First, the values that become the new columns are not quoted. They are treated like literals. I didn’t have to say ‘January’, ‘February’, etc. Secondly, notice that I had to name the pivot table (as pvt here). Even though I don’t reference it, if I omit it, the Sql statement won’t parse and execute. Look at the results below. Notice a third strange thing. Even though I selected out all of the columns (using SELECT *), it did not actually return all columns like you would normally see in a join situation. It showed the AccountNumber column from the SalesData CTE, but then it ignored the OrderMonthNames and TotalSales columns from SalesData. The PIVOT was smart enough to know that those values were being represented in other ways in our new dataset.

Pivoted Month Results

If you want/need to get rid of the NULLS in the results, you can do it the same way that you would do it in a regular query. I could change the “SELECT *” to be “SELECT AccountNumber, ISNULL(January, 0.00) AS January, ISNULL(February, 0.00) AS February” and just continue that pattern for each month.

Simple enough and that gets us pretty far along the way. But, what if I wanted to instead see customer sales by product category? I could see every product category that exists in the system right now and do a static pivot. However, everytime someone updated the system with a new product category, we would have to modify our query (potentially many queries across the enterprise). Is there a way to do that?

No, not directly.

However, we can use Dynamic Sql to dynamically determine our static columns, if that makes sense 😉 Let’s take a look at the query needed to just return the standard results, grouped by customer and product category.

USE AdventureWorks2014
GO

SELECT c.AccountNumber, pc.Name + '-' + ps.Name AS CategoryName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Production.Product p ON d.ProductID = p.ProductID
INNER JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, pc.Name, ps.Name

This returns the following results:
Initial Dynamic Pivot Results

There are 8 unique category – subcategory combinations in these results, but it doesn’t take much imagination to see how new products can be added all the time. Probably several new subcategories a year. How can we handle that? The key is those columns listed out in the PIVOT operator definition. Let’s see how we can build that dynamically using a temp table to make working with the data easier.

USE AdventureWorks2014
GO

SELECT c.AccountNumber, pc.Name + '-' + ps.Name AS CategoryName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
INTO #SalesData
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Production.Product p ON d.ProductID = p.ProductID
INNER JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, pc.Name, ps.Name

DECLARE @PivotColumns NVARCHAR(MAX), @SqlStatement NVARCHAR(MAX)

SELECT @PivotColumns= ISNULL(@PivotColumns + ',','') + QUOTENAME(CategoryName) FROM (SELECT DISTINCT CategoryName FROM #SalesData) AS Categories

SET @SqlStatement = 
	N'SELECT AccountNumber, ' + @PivotColumns + '
    FROM #SalesData
    PIVOT(SUM(TotalSales) 
          FOR CategoryName IN (' + @PivotColumns + ')) AS pvt'

EXEC sp_executesql @SqlStatement

DROP TABLE #SalesData

The bulk of the work here is just being done in this line

SELECT @PivotColumns= ISNULL(@PivotColumns + ',','') + QUOTENAME(CategoryName) FROM (SELECT DISTINCT CategoryName FROM #SalesData) AS Categories

That just select a distinct list of categories and builds a comma delimited string out of them. The value in that variable looks like this:
[Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],[Components-Road Frames]

Because of the spacing in the categories, that’s why we used QUOTENAME. That makes sure the name will work in SQL statements. Ultimately, the executed query looks like this:

SELECT AccountNumber, [Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],[Components-Road Frames]
FROM #SalesData
PIVOT(SUM(TotalSales) FOR CategoryName IN ([Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],[Components-Road Frames])) AS pvt

You can see how that is now very much like our initial static example with the months. The query still has the NULL problem, though, so to fix that you can create another variable so the SELECTed columns can contain ISNULL but the PIVOTed columns will not. That looks like this:

USE AdventureWorks2014
GO

SELECT c.AccountNumber, pc.Name + '-' + ps.Name AS CategoryName, CAST(SUM(d.LineTotal) AS NUMERIC(12,2)) as TotalSales
INTO #SalesData
FROM Sales.SalesOrderHeader h 
INNER JOIN Sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderID
INNER JOIN Production.Product p ON d.ProductID = p.ProductID
INNER JOIN Production.ProductSubcategory ps ON p.ProductSubcategoryID = ps.ProductSubcategoryID
INNER JOIN Production.ProductCategory pc ON ps.ProductCategoryID = pc.ProductCategoryID
INNER JOIN Sales.Customer c ON h.CustomerID = c.CustomerID
WHERE YEAR(h.OrderDate) = 2011
GROUP BY c.AccountNumber, pc.Name, ps.Name


DECLARE @PivotColumns NVARCHAR(MAX), @SelectColumns NVARCHAR(MAX), @SqlStatement NVARCHAR(MAX)

SELECT @PivotColumns= ISNULL(@PivotColumns + ',','') + QUOTENAME(CategoryName) FROM (SELECT DISTINCT CategoryName FROM #SalesData) AS Categories
SELECT @SelectColumns = ISNULL(@SelectColumns + ',','') + CategoryName FROM (SELECT DISTINCT 'ISNULL(' + QUOTENAME(CategoryName) + ', ''0.00'') AS ' + QUOTENAME(CategoryName) AS CategoryName FROM #SalesData) AS Categories

SET @SqlStatement = 
	N'SELECT AccountNumber, ' + @SelectColumns + '
    FROM #SalesData
    PIVOT(SUM(TotalSales) 
          FOR CategoryName IN (' + @PivotColumns + ')) AS pvt'

EXEC sp_executesql @SqlStatement

DROP TABLE #SalesData

Basically, I just made the distinct list of category names to be a distinct list of category names wrapped in ISNULL. Then I used the @SelectColumns variable in the select statement and the @PivotColumns variable in the pivot list. The resulting Sql looks like this:

SELECT AccountNumber, ISNULL([Accessories-Helmets], '0.00') AS [Accessories-Helmets],ISNULL([Components-Mountain Frames], '0.00') AS [Components-Mountain Frames],ISNULL([Clothing-Caps], '0.00') AS [Clothing-Caps],ISNULL([Components-Road Frames], '0.00') AS [Components-Road Frames],ISNULL([Bikes-Road Bikes], '0.00') AS [Bikes-Road Bikes],ISNULL([Clothing-Socks], '0.00') AS [Clothing-Socks],ISNULL([Clothing-Jerseys], '0.00') AS [Clothing-Jerseys],ISNULL([Bikes-Mountain Bikes], '0.00') AS [Bikes-Mountain Bikes]
    FROM #SalesData
    PIVOT(SUM(TotalSales) 
          FOR CategoryName IN ([Bikes-Road Bikes],[Clothing-Socks],[Clothing-Jerseys],[Clothing-Caps],[Components-Mountain Frames],[Accessories-Helmets],[Bikes-Mountain Bikes],
[Components-Road Frames])) AS pvt

And the resulting data looks like this:
Dynamic Pivot Final Results

Hopefully you find this helpful (I’m sure future me will be back here). If you have any questions, let me know in the comments.

Pete on Software

SQL Pivot – Static or Dynamic Columns

Pete

Leave a Reply

Pete on Software

Related posts:

Pete

Leave a Reply