Pivoting Rows into Columns¶
Let us understand how we can pivot rows into columns in Postgres.
Actual results
order_date |
order_status |
count |
---|---|---|
2013-07-25 00:00:00 |
CANCELED |
1 |
2013-07-25 00:00:00 |
CLOSED |
20 |
2013-07-25 00:00:00 |
COMPLETE |
42 |
2013-07-25 00:00:00 |
ON_HOLD |
5 |
2013-07-25 00:00:00 |
PAYMENT_REVIEW |
3 |
2013-07-25 00:00:00 |
PENDING |
13 |
2013-07-25 00:00:00 |
PENDING_PAYMENT |
41 |
2013-07-25 00:00:00 |
PROCESSING |
16 |
2013-07-25 00:00:00 |
SUSPECTED_FRAUD |
2 |
2013-07-26 00:00:00 |
CANCELED |
3 |
2013-07-26 00:00:00 |
CLOSED |
29 |
2013-07-26 00:00:00 |
COMPLETE |
87 |
2013-07-26 00:00:00 |
ON_HOLD |
19 |
2013-07-26 00:00:00 |
PAYMENT_REVIEW |
6 |
2013-07-26 00:00:00 |
PENDING |
31 |
2013-07-26 00:00:00 |
PENDING_PAYMENT |
59 |
2013-07-26 00:00:00 |
PROCESSING |
30 |
2013-07-26 00:00:00 |
SUSPECTED_FRAUD |
5 |
Pivoted results
order_date |
CANCELED |
CLOSED |
COMPLETE |
ON_HOLD |
PAYMENT_REVIEW |
PENDING |
PENDING_PAYMENT |
PROCESSING |
SUSPECTED_FRAUD |
---|---|---|---|---|---|---|---|---|---|
2013-07-25 |
1 |
20 |
42 |
5 |
3 |
13 |
41 |
16 |
2 |
2013-07-26 |
3 |
29 |
87 |
19 |
6 |
31 |
59 |
30 |
5 |
We need to use
crosstab
as part ofFROM
clause to pivot the data. We need to pass the main query tocrosstab
function.We need to install
tablefunc
as Postgres superuser to expose functions like crosstab -CREATE EXTENSION tablefunc;
Note
If you are using environment provided by us, you don’t need to install tablefunc
. If you are using your own environment run this command by logging in as superuser into postgres server to install tablefunc
.
CREATE EXTENSION tablefunc;
However, in some cases you might have to run scripts in postgres. Follow official instructions by searching around.
%load_ext sql
%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db
%%sql
SELECT order_date,
order_status,
count(1)
FROM orders
GROUP BY order_date,
order_status
ORDER BY order_date,
order_status
LIMIT 18
%%sql
SELECT * FROM crosstab(
'SELECT order_date,
order_status,
count(1) AS order_count
FROM orders
GROUP BY order_date,
order_status',
'SELECT DISTINCT order_status FROM orders ORDER BY 1'
) AS (
order_date DATE,
"CANCELED" INT,
"CLOSED" INT,
"COMPLETE" INT,
"ON_HOLD" INT,
"PAYMENT_REVIEW" INT,
"PENDING" INT,
"PENDING_PAYMENT" INT,
"PROCESSING" INT,
"SUSPECTED_FRAUD" INT
)
LIMIT 10