Registers: CURRENT_DATE, CURRENT_TIME: automatically get the current timing when running the script
Sub-queries and Nested Selects
Sub-query
A query inside another query, there are column expressions and Table Expressions (Derived Tables)
Column Expressions: Substitue column name with a sub-query
Table Expressions: Substitue TABLE name with a sub-query
Multiple Tables
Ways to access multiple tables in the same query:
Sub-queries
Implicit JOIN
The result is a full join (cartesian join), every row in the 1st table is joined with every row in 2nd tabel, the result set have more rows than both tables
We can limit the result set:
JOIN operators: INNER JOIN, OUTER JOIN, …
Types of Joins:
Inner join: returns only intercept (matches) of 2 or more tables
Outer join
Left Outer Join: Takes all rows from the left table and the matching rows from right table
- Right Outer Join: Takes all rows from the right table and the matching rows from left table
- Full Outer Join: All rows from 2 tables
Access Database using Python
fromdbmoduleimportconnect# create connection object
connection_db=connect('databasename','username','pswd')# Create a cursor object
cursor=connect_db.cursor()# Run queries
cursor.execute('select * from mytabel')results=cursor.fetchall()# Free resources
cursor.close()connection_db.close()
Use pandas
importpandasaspdimportdb_module# create connection object
connection_db=db_module.connection(...)df=pandas.read_sql('SELECT * FROM TABLE_NAME',connection_db)
Writing sql queries in jupyter notebook:
%%sql# the whole block will be interpreted as a sql query
%sql# the single line will be interpreted as a sql query
Real World Databases
May need to use string and exact letter for column names
Column names with spaces: spaces are mapped to underscores, similar for other characters
Access list of tables and their properties from query system catalog