SQL Alchemy

10 Feb 2019

seriously? are we halfway into february ALREADY?! i feel like new year’s was like, just last week?! this is insane.

also i have been neglecting blogging - only 3 posts since the start of december! oh no

ok let’s actually do something useful today - SQL Alchemy!

credits to here for the below code: https://towardsdatascience.com/sqlalchemy-python-tutorial-79a577141a91

and because i am using Presto at work: https://github.com/dropbox/PyHive

# SQLAlchemy with Presto & LDAP
import sqlalchemy as db
engine = db.create_engine(
    connect_args={'auth': 'LDAP'},

# connect and get metadata
connection = engine.connect()
metadata = db.MetaData()

# autoload a table called census
census = db.Table('census', metadata, autoload=True, autoload_with=engine)

# print column names

# print full census table metadata

# equivalent to 'SELECT * FROM census'
query = db.select([census])

# execute the query and returns an object
ResultProxy = connection.execute(query)

# get the results from the object
ResultSet = ResultProxy.fetchall()

# see top 3 results

# convert to dataframe
df = pd.DataFrame(ResultSet)
df.columns = ResultSet[0].keys()

and the below is a single example of the SQL code in SQLAlchemy. for more examples definitely check out https://towardsdatascience.com/sqlalchemy-python-tutorial-79a577141a91

and this is the actual repo: https://github.com/vinaykudari/hacking-datascience/tree/master/notebooks/sqlalchemy

sample code

SELECT state, sex
FROM census
WHERE state IN (Texas, New York)

SQLAlchemy :
db.select([census.columns.state, census.columns.sex]).where(census.columns.state.in_(['Texas', 'New York']))

get raw sql

and if you need to get the raw SQL out (SO IMPORTANT!!!)

thanks to code from here: https://stackoverflow.com/questions/2128717/sqlalchemy-printing-raw-sql-from-create

# example for CREATE table
# compile with engine

# compile without engine

unit test

testing…!! also from the same link

from sqlalchemy import create_engine
from sqlalchemy.schema import CreateTable
from model import Foo

sql_url = "sqlite:///:memory:"    
db_engine = create_engine(sql_url)

table_sql = CreateTable(Foo.table).compile(db_engine)
self.assertTrue("CREATE TABLE foos" in str(table_sql))

dbschema changes

not yet sure how this comes into play but keeping the link here for rainy days lol


creating views


nested queries



‘with’ query


now after reading all this i have an inkling that this does not solve my problem at all…