{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Selecting or Projecting Data\n", "\n", "Let us understand different aspects of projecting data. We primarily using `SELECT` to project the data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "remove-cell" ] }, "outputs": [ { "data": { "text/html": [ "\n" ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%HTML\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* We can project all columns using `*` or some columns using column names.\n", "* We can provide aliases to a column or expression using `AS` in `SELECT` clause.\n", "* `DISTINCT` can be used to get the distinct records from selected columns. We can also use `DISTINCT *` to get unique records using all the columns.\n", "* As part of `SELECT` clause we can have aggregate functions such as `count`, `sum` etc." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "The sql extension is already loaded. To reload it, use:\n", " %reload_ext sql\n" ] } ], "source": [ "%load_ext sql" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db\n" ] } ], "source": [ "%env DATABASE_URL=postgresql://itversity_retail_user:retail_password@localhost:5432/itversity_retail_db" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "10 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
order_idorder_dateorder_customer_idorder_status
12013-07-25 00:00:0011599CLOSED
22013-07-25 00:00:00256PENDING_PAYMENT
32013-07-25 00:00:0012111COMPLETE
42013-07-25 00:00:008827CLOSED
52013-07-25 00:00:0011318COMPLETE
62013-07-25 00:00:007130COMPLETE
72013-07-25 00:00:004530COMPLETE
82013-07-25 00:00:002911PROCESSING
92013-07-25 00:00:005657PENDING_PAYMENT
102013-07-25 00:00:005648PENDING_PAYMENT
" ], "text/plain": [ "[(1, datetime.datetime(2013, 7, 25, 0, 0), 11599, 'CLOSED'),\n", " (2, datetime.datetime(2013, 7, 25, 0, 0), 256, 'PENDING_PAYMENT'),\n", " (3, datetime.datetime(2013, 7, 25, 0, 0), 12111, 'COMPLETE'),\n", " (4, datetime.datetime(2013, 7, 25, 0, 0), 8827, 'CLOSED'),\n", " (5, datetime.datetime(2013, 7, 25, 0, 0), 11318, 'COMPLETE'),\n", " (6, datetime.datetime(2013, 7, 25, 0, 0), 7130, 'COMPLETE'),\n", " (7, datetime.datetime(2013, 7, 25, 0, 0), 4530, 'COMPLETE'),\n", " (8, datetime.datetime(2013, 7, 25, 0, 0), 2911, 'PROCESSING'),\n", " (9, datetime.datetime(2013, 7, 25, 0, 0), 5657, 'PENDING_PAYMENT'),\n", " (10, datetime.datetime(2013, 7, 25, 0, 0), 5648, 'PENDING_PAYMENT')]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql SELECT * FROM orders LIMIT 10" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "4 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
table_catalogtable_schematable_namecolumn_nameordinal_positioncolumn_defaultis_nullabledata_typecharacter_maximum_lengthcharacter_octet_lengthnumeric_precisionnumeric_precision_radixnumeric_scaledatetime_precisioninterval_typeinterval_precisioncharacter_set_catalogcharacter_set_schemacharacter_set_namecollation_catalogcollation_schemacollation_namedomain_catalogdomain_schemadomain_nameudt_catalogudt_schemaudt_namescope_catalogscope_schemascope_namemaximum_cardinalitydtd_identifieris_self_referencingis_identityidentity_generationidentity_startidentity_incrementidentity_maximumidentity_minimumidentity_cycleis_generatedgeneration_expressionis_updatable
itversity_retail_dbpublicordersorder_id1NoneNOintegerNoneNone3220NoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneitversity_retail_dbpg_catalogint4NoneNoneNoneNone1NONONoneNoneNoneNoneNoneNONEVERNoneYES
itversity_retail_dbpublicordersorder_date2NoneNOtimestamp without time zoneNoneNoneNoneNoneNone6NoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneitversity_retail_dbpg_catalogtimestampNoneNoneNoneNone2NONONoneNoneNoneNoneNoneNONEVERNoneYES
itversity_retail_dbpublicordersorder_customer_id3NoneNOintegerNoneNone3220NoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneitversity_retail_dbpg_catalogint4NoneNoneNoneNone3NONONoneNoneNoneNoneNoneNONEVERNoneYES
itversity_retail_dbpublicordersorder_status4NoneNOcharacter varying45180NoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneNoneitversity_retail_dbpg_catalogvarcharNoneNoneNoneNone4NONONoneNoneNoneNoneNoneNONEVERNoneYES
" ], "text/plain": [ "[('itversity_retail_db', 'public', 'orders', 'order_id', 1, None, 'NO', 'integer', None, None, 32, 2, 0, None, None, None, None, None, None, None, None, None, None, None, None, 'itversity_retail_db', 'pg_catalog', 'int4', None, None, None, None, '1', 'NO', 'NO', None, None, None, None, None, 'NO', 'NEVER', None, 'YES'),\n", " ('itversity_retail_db', 'public', 'orders', 'order_date', 2, None, 'NO', 'timestamp without time zone', None, None, None, None, None, 6, None, None, None, None, None, None, None, None, None, None, None, 'itversity_retail_db', 'pg_catalog', 'timestamp', None, None, None, None, '2', 'NO', 'NO', None, None, None, None, None, 'NO', 'NEVER', None, 'YES'),\n", " ('itversity_retail_db', 'public', 'orders', 'order_customer_id', 3, None, 'NO', 'integer', None, None, 32, 2, 0, None, None, None, None, None, None, None, None, None, None, None, None, 'itversity_retail_db', 'pg_catalog', 'int4', None, None, None, None, '3', 'NO', 'NO', None, None, None, None, None, 'NO', 'NEVER', None, 'YES'),\n", " ('itversity_retail_db', 'public', 'orders', 'order_status', 4, None, 'NO', 'character varying', 45, 180, None, None, None, None, None, None, None, None, None, None, None, None, None, None, None, 'itversity_retail_db', 'pg_catalog', 'varchar', None, None, None, None, '4', 'NO', 'NO', None, None, None, None, None, 'NO', 'NEVER', None, 'YES')]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "\n", "SELECT * FROM information_schema.columns \n", "WHERE table_catalog = 'itversity_retail_db' \n", " AND table_name = 'orders'" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "10 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
order_customer_idorder_dateorder_status
115992013-07-25 00:00:00CLOSED
2562013-07-25 00:00:00PENDING_PAYMENT
121112013-07-25 00:00:00COMPLETE
88272013-07-25 00:00:00CLOSED
113182013-07-25 00:00:00COMPLETE
71302013-07-25 00:00:00COMPLETE
45302013-07-25 00:00:00COMPLETE
29112013-07-25 00:00:00PROCESSING
56572013-07-25 00:00:00PENDING_PAYMENT
56482013-07-25 00:00:00PENDING_PAYMENT
" ], "text/plain": [ "[(11599, datetime.datetime(2013, 7, 25, 0, 0), 'CLOSED'),\n", " (256, datetime.datetime(2013, 7, 25, 0, 0), 'PENDING_PAYMENT'),\n", " (12111, datetime.datetime(2013, 7, 25, 0, 0), 'COMPLETE'),\n", " (8827, datetime.datetime(2013, 7, 25, 0, 0), 'CLOSED'),\n", " (11318, datetime.datetime(2013, 7, 25, 0, 0), 'COMPLETE'),\n", " (7130, datetime.datetime(2013, 7, 25, 0, 0), 'COMPLETE'),\n", " (4530, datetime.datetime(2013, 7, 25, 0, 0), 'COMPLETE'),\n", " (2911, datetime.datetime(2013, 7, 25, 0, 0), 'PROCESSING'),\n", " (5657, datetime.datetime(2013, 7, 25, 0, 0), 'PENDING_PAYMENT'),\n", " (5648, datetime.datetime(2013, 7, 25, 0, 0), 'PENDING_PAYMENT')]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "\n", "SELECT order_customer_id, order_date, order_status \n", "FROM orders \n", "LIMIT 10" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "10 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
order_customer_idto_charorder_status
115992013-07CLOSED
2562013-07PENDING_PAYMENT
121112013-07COMPLETE
88272013-07CLOSED
113182013-07COMPLETE
71302013-07COMPLETE
45302013-07COMPLETE
29112013-07PROCESSING
56572013-07PENDING_PAYMENT
56482013-07PENDING_PAYMENT
" ], "text/plain": [ "[(11599, '2013-07', 'CLOSED'),\n", " (256, '2013-07', 'PENDING_PAYMENT'),\n", " (12111, '2013-07', 'COMPLETE'),\n", " (8827, '2013-07', 'CLOSED'),\n", " (11318, '2013-07', 'COMPLETE'),\n", " (7130, '2013-07', 'COMPLETE'),\n", " (4530, '2013-07', 'COMPLETE'),\n", " (2911, '2013-07', 'PROCESSING'),\n", " (5657, '2013-07', 'PENDING_PAYMENT'),\n", " (5648, '2013-07', 'PENDING_PAYMENT')]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "\n", "SELECT order_customer_id, \n", " to_char(order_date, 'yyyy-MM'), \n", " order_status \n", "FROM orders \n", "LIMIT 10" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "10 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
order_customer_idorder_monthorder_status
115992013-07CLOSED
2562013-07PENDING_PAYMENT
121112013-07COMPLETE
88272013-07CLOSED
113182013-07COMPLETE
71302013-07COMPLETE
45302013-07COMPLETE
29112013-07PROCESSING
56572013-07PENDING_PAYMENT
56482013-07PENDING_PAYMENT
" ], "text/plain": [ "[(11599, '2013-07', 'CLOSED'),\n", " (256, '2013-07', 'PENDING_PAYMENT'),\n", " (12111, '2013-07', 'COMPLETE'),\n", " (8827, '2013-07', 'CLOSED'),\n", " (11318, '2013-07', 'COMPLETE'),\n", " (7130, '2013-07', 'COMPLETE'),\n", " (4530, '2013-07', 'COMPLETE'),\n", " (2911, '2013-07', 'PROCESSING'),\n", " (5657, '2013-07', 'PENDING_PAYMENT'),\n", " (5648, '2013-07', 'PENDING_PAYMENT')]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "\n", "SELECT order_customer_id, \n", " to_char(order_date, 'yyyy-MM') AS order_month, \n", " order_status \n", "FROM orders \n", "LIMIT 10" ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "13 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
order_month
2014-01
2014-05
2013-12
2013-11
2014-04
2014-07
2014-03
2013-08
2013-10
2013-07
2014-02
2013-09
2014-06
" ], "text/plain": [ "[('2014-01',),\n", " ('2014-05',),\n", " ('2013-12',),\n", " ('2013-11',),\n", " ('2014-04',),\n", " ('2014-07',),\n", " ('2014-03',),\n", " ('2013-08',),\n", " ('2013-10',),\n", " ('2013-07',),\n", " ('2014-02',),\n", " ('2013-09',),\n", " ('2014-06',)]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "\n", "SELECT DISTINCT to_char(order_date, 'yyyy-MM') AS order_month \n", "FROM orders" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "1 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
count
68883
" ], "text/plain": [ "[(68883,)]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql SELECT count(1) FROM orders" ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " * postgresql://itversity_retail_user:***@localhost:5432/itversity_retail_db\n", "1 rows affected.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", "
distinct_month_count
13
" ], "text/plain": [ "[(13,)]" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "\n", "SELECT count(DISTINCT to_char(order_date, 'yyyy-MM')) AS distinct_month_count \n", "FROM orders" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.12" } }, "nbformat": 4, "nbformat_minor": 4 }