5 min read

How to connect Python to a local SQL Server

1 Introduction

Recently, I have been dealing with a wide variety of topics in the field of SQL. You can find my blog posts here: Michael Fuchs SQL

I recommend you to read it as a preparation if you are not familiar with SQL yet.

In this post I want to show how to access a SQL database (here SQL Server from Microsoft) using Python and run appropriate queries.

2 Import the Libraries and Preparations

To make SQL queries via Python we need the library pyodbc. This can be installed as follows:

pip install pyodbc

import pandas as pd
import pyodbc

In preparation for this post, I created a database with tables via SQL Server Management Studio (SSMS). How to do that I have described in detail in this post: Create a Database

3 Connection to the DB

A local SQL server can be reached with the following syntax. Of course, the name of the respective database must be adjusted.

cnxn = pyodbc.connect(driver='{SQL Server}', server='(local)', database='MyDB',               
               trusted_connection='yes')

cursor = cnxn.cursor()

Or this version written a little nicer:

conn_str = (
    r'driver={SQL Server};'
    r'server=(local);'
    r'database=MyDB;'
    r'trusted_connection=yes;'
    )

cnxn = pyodbc.connect(conn_str)

cursor = cnxn.cursor()

4 Exploration of the respective DB

The first thing I do when I connect to a new database via Python is to see which tables are stored here. I have described here which possibilities there are for this: Data Wrangling / Get an Overview of the Data

query = "SELECT * FROM SYSOBJECTS WHERE xtype = 'U';"

df_existing_tables = pd.read_sql(query, cnxn)
df_existing_tables

Or a bit nicer:

query_mod = '''
            SELECT * 
                FROM SYSOBJECTS 
                WHERE xtype = 'U'
            ;
            '''

df_existing_tables = pd.read_sql(query_mod, cnxn)
df_existing_tables

As we can see, two tables are stored in the ‘MyDB’ database.

Now I would like to see which columns are stored in the table ‘Customer’:

query = "EXEC sp_columns 'Customer';"

df_existing_columns = pd.read_sql(query, cnxn)
df_existing_columns

5 Loading tables from DB

Of course, now I have the option here to load the stored tables in the database in Python. This goes as follows:

5.1 Complete Dataframe

query = '''
        SELECT * 
            FROM Customer
        ;
            '''

df_Customer = pd.read_sql(query, cnxn)
df_Customer

5.2 Selected Data

Of course, I can also specify my data query:

with a WHERE Statement

query = '''
        SELECT * 
            FROM Customer
            WHERE ID_Customer < 3
        ;
            '''

df_Customer_selected = pd.read_sql(query, cnxn)
df_Customer_selected

with a newly generated column

query = '''
        SELECT First_Name, Last_Name,
               CONCAT(First_Name, ' ', Last_Name) AS Full_Name
            FROM Customer
        ;
        '''

df_Customer_modified = pd.read_sql(query, cnxn)
df_Customer_modified

with joined data

query = '''
        SELECT t1.ID_Customer,
               t1.First_Name,
               t1.Last_Name,
               t2.Nationality
            FROM Customer AS t1
            LEFT JOIN Customer_metadata AS t2
                ON t1.ID_Customer = t2.ID_Customer
        ;
        '''

df_Customer_joined = pd.read_sql(query, cnxn)
df_Customer_joined

and so on …

I can run all sorts of SQL queries here and load them into Python as I need the data.

6 Data Manipulation in SQL Server using Python

From Python, I can also manipulate the contents of existing tables on a SQL server.

6.1 Insert Values into SQL Server Table

cursor.execute('''
               INSERT INTO Customer VALUES ('No', 'Name')
               ;
               ''')
cnxn.commit()
query = '''
        SELECT * 
            FROM Customer
        ;
            '''

df_Customer_manipulated = pd.read_sql(query, cnxn)
df_Customer_manipulated

Worked. A new row was added to the existing record.

Let’s take a look at the SQL Server itself:

6.2 Delete Records in SQL Server

Now we will delete this column again.

cursor.execute('''
               DELETE FROM Customer
                    WHERE First_Name = 'No'
               ;
               ''')
cnxn.commit()
query = '''
        SELECT * 
            FROM Customer
        ;
            '''

df_Customer_original = pd.read_sql(query, cnxn)
df_Customer_original

Let’s take another look at SQL Server itself:

7 Inserting a Python Dataframe into SQL Server

Now I may want to write a new or edited record to a database. This is also possible from here.

Here is an example data set that I have created in Python:

df = pd.DataFrame({'Name': ['Maria', 'Marc', 'Julia'],
                   'Age': [32,22,62],
                   'Height': [162, 184, 170],
                   'Gender': ['female', 'male', 'female']})
df

In order to write data to a new table on the SQL Server, I need to create it on the server first.

cursor.execute('''
               CREATE TABLE NewDF
                    (Name VARCHAR(100) NOT NULL,
                    Age INT NOT NULL,
                    Height INT NOT NULL,
                    Gender VARCHAR(100) NOT NULL)
               ;
               ''')
cnxn.commit()

Let’s take a quick look via SQL Server Object Explorer at the newly created table:

It is now listed as we can see. Let’s try to run a query on this table.

query = '''
        SELECT * 
            FROM NewDF
        ;
            '''

df_NewDF = pd.read_sql(query, cnxn)
df_NewDF

Fits, there are no data stored yet but we want to insert them in the next step:

for index, row in df.iterrows():
     cursor.execute("INSERT INTO dbo.NewDF (Name,Age,Height,Gender) values(?,?,?,?)", row.Name, row.Age, row.Height, row.Gender)
cnxn.commit()

or written a bit more clearly:

for index, row in df.iterrows():
     cursor.execute('''
                    INSERT INTO dbo.NewDF 
                    (Name,Age,Height,Gender) 
                    values(?,?,?,?)
                    ''', 
                    row.Name, 
                    row.Age, 
                    row.Height, 
                    row.Gender)
cnxn.commit()

Let’s start the query again:

query = '''
        SELECT * 
            FROM NewDF
        ;
            '''

df_NewDF = pd.read_sql(query, cnxn)
df_NewDF

Worked. The data from our table created in Python is now stored on the SQL server!

8 Conclusion

How to make queries with SQL is part of the basic knowledge if you want to work in the field of Data Science.

How to connect to a SQL Server via Python and execute queries on it I have shown in this post.

Here is the SQL syntax I used to create the database and the tables:

SET LANGUAGE ENGLISH


CREATE DATABASE MyDB;

USE MyDB;


CREATE TABLE Customer
    (ID_Customer INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
    First_Name VARCHAR(100) NOT NULL,
    Last_Name VARCHAR(100) NOT NULL)
    ;


INSERT INTO Customer VALUES ('Max', 'Steel')
INSERT INTO Customer VALUES ('Jessy', 'Williams')
INSERT INTO Customer VALUES ('Marc', 'Pike')
INSERT INTO Customer VALUES ('Emily', 'Taylor')

SELECT * FROM Customer




CREATE TABLE Customer_metadata
    (ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
     ID_Customer INT NOT NULL FOREIGN KEY REFERENCES Customer(ID_Customer),
     Nationality VARCHAR(100) NOT NULL)
    ;


INSERT INTO Customer_metadata VALUES (1, 'German')
INSERT INTO Customer_metadata VALUES (2, 'French')
INSERT INTO Customer_metadata VALUES (3, 'English')
INSERT INTO Customer_metadata VALUES (4, 'Spanish')


SELECT * FROM Customer_metadata



SELECT * FROM NewDF