**This will only work on Windows PC Devices and Microsoft Outlook email client**
Why you should use Python to Automate SQL Queries in Trino (Presto)
In the world of big data, SQL is a critical tool for managing and analyzing large datasets. Many data analysts find themselves in the same cycle:
Run SQL, Copy and Paste Results into Excel, Format Report, and Repeat.
This process of writing and executing SQL queries can be time-consuming and tedious. That’s where automation comes in, and Python is the perfect language for automating SQL tasks. In this article, we’ll show you how to use Python to run SQL in Trino (formerly Presto), a distributed SQL query engine that’s particularly well-suited to handling big data.
We recommend running this in a Jupyter Notebook, so you can easily run each code block, section by section.
First, import the prestodb package
import prestodb
Next, Gather details about your database and credentials.
Next, you’ll need to connect to Presto using Python. You can do this using the prestodb package which allows you to automate SQL queries by connecting to Presto via Python. You’ll need to specify the Presto coordinator URL and any necessary authentication details.
Need Help Finding Database Details?
Assuming you already have a successful connection in another SQL platform, you can find details on your host url, port in your connection settings.
Finally, you’re ready to run!
Input a test SQL query in the code above and run! It should return your SQL output as a dataframe.
import pandas as pd query = f''' select * from table ''' conn=prestodb.dbapi.connect( host='HOST URL HERE', port= 1234, user='YOUR_USERNAME_HERE' catalog='CATALOG HERE', schema='SCHEMA HERE', http_scheme='https', auth=prestodb.auth.BasicAuthentication('YOUR_USERNAME_HERE','YOUR_PASSWORD_HERE'),) cur = conn.cursor() cur.execute(query) conn.commit() query_results = cur.fetchall() #returns result as dataframe df = pd.DataFrame.from_records(query_results, columns = [i[0] for i in cur.description]) print(df)
Leave a Reply