Skip to content

Latest commit

 

History

History
148 lines (112 loc) · 5.5 KB

File metadata and controls

148 lines (112 loc) · 5.5 KB
title Insert Python dataframe into SQL table
titleSuffix SQL machine learning
description How to insert data from a dataframe into SQL table.
author WilliamDAssafMSFT
ms.author wiassaf
ms.date 09/16/2021
ms.service sql
ms.subservice machine-learning
ms.topic how-to
monikerRange >=sql-server-2017||>=sql-server-linux-ver15||=azuresqldb-mi-current||=azuresqldb-current

Insert Python dataframe into SQL table

[!INCLUDESQL Server SQL DB SQL MI]

This article describes how to insert a pandas dataframe into a SQL database using the pyodbc package in Python.

Prerequisites

::: moniker range=">=sql-server-2017||>=sql-server-linux-ver15"

::: moniker range="=azuresqldb-current"

::: moniker range="=azuresqldb-mi-current"

Install Python packages

  1. In Azure Data Studio, open a new notebook and connect to the Python 3 kernel.

  2. Select Manage Packages.

    :::image type="content" source="../media/python-dataframe-sql-server/manage-packages.png" alt-text="Manage packages":::

  3. In the Manage Packages pane, select the Add new tab.

  4. For each of the following packages, enter the package name, click Search, then click Install.

    • pyodbc
    • pandas

Create a sample CSV file

Copy the following text and save it to a file named department.csv.

DepartmentID,Name,GroupName,
1,Engineering,Research and Development,
2,Tool Design,Research and Development,
3,Sales,Sales and Marketing,
4,Marketing,Sales and Marketing,
5,Purchasing,Inventory Management,
6,Research and Development,Research and Development,
7,Production,Manufacturing,
8,Production Control,Manufacturing,
9,Human Resources,Executive General and Administration,
10,Finance,Executive General and Administration,
11,Information Services,Executive General and Administration,
12,Document Control,Quality Assurance,
13,Quality Assurance,Quality Assurance,
14,Facilities and Maintenance,Executive General and Administration,
15,Shipping and Receiving,Inventory Management,
16,Executive,Executive General and Administration

Create a new database table

  1. Follow the steps in Connect to a SQL Server to connect to the AdventureWorks database.

  2. Create a table named HumanResources.DepartmentTest. The SQL table will be used for the dataframe insertion.

    CREATE TABLE [HumanResources].[DepartmentTest](
    [DepartmentID] [smallint] NOT NULL,
    [Name] [dbo].[Name] NOT NULL,
    [GroupName] [dbo].[Name] NOT NULL
    )
    GO

Load a dataframe from the CSV file

Use the Python pandas package to create a dataframe, load the CSV file, and then load the dataframe into the new SQL table, HumanResources.DepartmentTest.

  1. Connect to the Python 3 kernel.

  2. Paste the following code into a code cell, updating the code with the correct values for server, database, username, password, and the location of the CSV file.

    import pyodbc
    import pandas as pd
    # insert data from csv file into dataframe.
    # working directory for csv file: type "pwd" in Azure Data Studio or Linux
    # working directory in Windows c:\users\username
    df = pd.read_csv("c:\\user\\username\department.csv")
    # Some other example server values are
    # server = 'localhost\sqlexpress' # for a named instance
    # server = 'myserver,port' # to specify an alternate port
    server = 'yourservername' 
    database = 'AdventureWorks' 
    username = 'username' 
    password = 'yourpassword' 
    cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
    cursor = cnxn.cursor()
    # Insert Dataframe into SQL Server:
    for index, row in df.iterrows():
         cursor.execute("INSERT INTO HumanResources.DepartmentTest (DepartmentID,Name,GroupName) values(?,?,?)", row.DepartmentID, row.Name, row.GroupName)
    cnxn.commit()
    cursor.close()
  3. Run the cell.

Confirm data in the database

Connect to the SQL kernel and AdventureWorks database and run the following SQL statement to confirm the table was successfully loaded with data from the dataframe.

SELECT count(*) from HumanResources.DepartmentTest;

Results

(No column name)
16

Next steps