In this article, we will explore how to connect Python to Snowflake . We will walk you through the necessary steps to set up the connection, Whether you’re a data engineer or data scientist this guide will provide you with the knowledge you need to get started with Python on Snowflake.
Step 1. Installing snowflake.connector in Python
For installing the snowflake connector on Python you need to have Python version 3.7 or later.
After checking the version, you will need to run the following command on your venv.
pip install snowflake-connector-python
If you need to install a different version you can run the following command
pip install snowflake-connector-python==<version>
We will need to install an aditional library for working on Pandas, this library will help us to use commands as fetch_pandas_all() and write_pandas, those commands will help us to read/insert elements from Snowflake into a pandas dataframe.
pip install "snowflake-connector-python[pandas]"
Step 2. Using Default Authenticator
The first thing you need to do is to import the snowflake connector and pandas library.
import snowflake.connector import pandas as pd
Then you can connect to snowflake by providing the following information.
conn = sf.connect(user='user', password='password', account='zk85032.south-central-us.azure', role='ACCOUNTADMIN', warehouse='COMPUTE_WH', database='SNOWFLAKE_SAMPLE_DATA', schema='TPCH_SF1' )
If you want to go into details, for this connection string you need to provide the following credentials.
First the user and the password, you can create one but don’t forget to give access rights for being able to work.
Then you have to provide the account name, if you are not sure about the account name, you can go to your snowflake web console and “Sign out”. Then go to “Sign in” into your snowflake account and use part of the url that you have there as follows.
For Role and warehouse you can “Sign in” into snowsight and check the roles and warehouses available for that account. If you don’t have any role or warehouse with that account, you will have to assign or create it.
Then we are going to create a cursor and a query which we are going to run.
cursor = conn.cursor() str_query = f""" SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER """ cursor.execute(str_query)
If connection has no errors you will get a message similar to the one that I got referencing snowflake.connector.cursor.
If you want to go an step further and pull data from snowflake into pandas, you can use the following query.
df = cursor.fetch_pandas_all() display(df)
This will put the information from the query that we wrote into a pandas dataframe.
As a final step, we are going to close the cursor.
Advanced connector management
Working with a context manager
A context manager usually takes care of setting up some resource as opening a connection, it automatically handles the clean up when we are done with it.
We can avoid some issues that can occur while working with the snowflake connector by using a context manager, on this way we would not need to close the cursor after executing the query.
As you can see in this script, this part do the same as the query we used before but the main difference is that by using the context manager, we are avoiding having some issues related with closing the cursor if the query fails.
with sf.connect(user='user', password='password', account='zk85032.south-central-us.azure', role='ACCOUNTADMIN', warehouse='COMPUTE_WH', database='SNOWFLAKE_SAMPLE_DATA', schema='TPCH_SF1' ) as conn: cursor = conn.cursor() str_query = f""" SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER """ cursor.execute(str_query) df = cursor.fetch_pandas_all() display(df)
Sometimes an error in the connection can occur, for that we can implement the following script.
import snowflake.connector as sf import pandas as pd import time tries = 3 for i in range(tries): try: with sf.connect(user='user', password='password', account='zk85032.south-central-us.azure', role='ACCOUNTADMIN', warehouse='COMPUTE_WH', database='SNOWFLAKE_SAMPLE_DATA', schema='TPCH_SF1' ) as conn: cursor = conn.cursor() str_query = f""" SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER """ cursor.execute(str_query) df = cursor.fetch_pandas_all() except Exception as e: error = f"Failed to read data from snowflake table " print(error) if i < tries - 1: time.sleep(5) continue break
This script includes a number of attempts to reconnect to snowflake, this in case of a network failure or other kind of issue that can happen during Python/Snowflake connection.
I recommend using this script as it is going to handle some errors that can be present during the connection.
Also it is not a good practice to display the credentials in the main script, for managing this you will need to use configuration files for hidding your credentials.
By using a configuration file you can ignore that file on git, on that way you are not going to expose your credentials by sending them into the repository.
I presented my credentials exposed due educational purposes as this practice is not recommended.