title In-database Python analytics for SQL developers | Microsoft Docs
description Learn how to embed Python code in SQL Server stored procedures and T-SQL functions.
ms.prod sql
ms.technology machine-learning
ms.date 10/29/2018
ms.topic tutorial
author HeidiSteen
ms.author heidist
manager cgronlun

Tutorial: In-database Python analytics for SQL developers

[!INCLUDEappliesto-ss-xxxx-xxxx-xxx-md-winonly]

In this tutorial for SQL programmers, you gain hands-on experience using the Python language to build and deploy a machine learning solution by wrapping Python code in stored procedures.

This tutorial uses a subset of a well-known public dataset based on trips in New York City taxis. To make the sample code run faster, we created a representative 1% sample of the data. You'll use this data to build a binary classification model that predicts whether a particular trip is likely to get a tip, based on columns such as the time of day, distance, and pick-up location.
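To make the idea of a binary classification target concrete, here is a minimal sketch in plain Python. It is not the tutorial's code: the field names, the sample rows, and the coefficients are all made up for illustration, and a real model would learn its coefficients from the data.

```python
import math

# Hypothetical trip records; the field names are illustrative, not the dataset's schema.
trips = [
    {"hour": 8, "distance_miles": 1.2, "tip_amount": 0.0},
    {"hour": 19, "distance_miles": 5.4, "tip_amount": 3.5},
    {"hour": 23, "distance_miles": 9.8, "tip_amount": 0.0},
]

def label_tipped(trip):
    """Binary target: 1 if the rider left any tip, otherwise 0."""
    return 1 if trip["tip_amount"] > 0 else 0

def predict_tip_probability(trip, weights, bias):
    """Logistic model: squash a weighted sum of the features into the range (0, 1)."""
    z = bias + sum(weights[name] * trip[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

# Made-up coefficients for illustration only; a real model learns these from data.
weights = {"hour": 0.02, "distance_miles": 0.3}
bias = -1.0

labels = [label_tipped(t) for t in trips]
probabilities = [predict_tip_probability(t, weights, bias) for t in trips]

# Turn each probability into a yes/no prediction with a 0.5 threshold.
predictions = [1 if p >= 0.5 else 0 for p in probabilities]
```

Comparing `predictions` with `labels` is the basic check behind any accuracy measure for a model of this kind.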

Note

This tutorial is available in both R and Python. For the R version, see In-database analytics for R developers.

Overview

Building a machine learning solution is a complex process that can involve multiple tools and the coordination of subject matter experts across several phases:

  • obtaining and cleaning data
  • exploring the data and building features useful for modeling
  • training and tuning the model
  • deploying the model to production

The actual code is best developed and tested in a dedicated development environment. After the script is fully tested, you can deploy it to [!INCLUDEssNoVersion] using [!INCLUDEtsql] stored procedures in the familiar environment of [!INCLUDEssManStudio].

Whether you are a SQL programmer new to Python, or a Python developer new to SQL, this multi-part tutorial introduces a typical workflow for conducting in-database analytics with Python and SQL Server.

After the model has been saved to the database, you can call it for predictions from [!INCLUDEtsql] by using stored procedures.
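The save-then-predict pattern can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the tutorial's implementation: it assumes the model is serialized with the standard `pickle` module into bytes (the kind of value a binary column such as `varbinary(max)` could hold), uses a dictionary as a stand-in for the database table, and reduces the hypothetical model to a set of made-up coefficients.

```python
import math
import pickle

# Hypothetical trained model, reduced to plain coefficients for illustration.
model = {"bias": -1.0, "weights": {"trip_distance": 0.3}}

# Serialize the model to bytes so it can be stored as a single binary value.
model_blob = pickle.dumps(model)

# A dict stands in for the database table that would hold the model (assumption).
model_table = {"tip_model": model_blob}

def predict_from_stored_model(name, features):
    """Load the named model from the stand-in table and score one row."""
    stored = pickle.loads(model_table[name])
    z = stored["bias"] + sum(w * features[k] for k, w in stored["weights"].items())
    # Logistic function: convert the weighted sum into a probability.
    return 1.0 / (1.0 + math.exp(-z))

probability = predict_from_stored_model("tip_model", {"trip_distance": 5.0})
```

Storing the serialized model next to the data means the scoring code only needs the model's name and the input row, which is the role a prediction stored procedure would play.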

Prerequisites

All tasks can be done using [!INCLUDEtsql] stored procedures in [!INCLUDEssManStudio].

This tutorial assumes familiarity with basic database operations such as creating databases and tables, importing data, and writing SQL queries. It does not assume you know Python, so all Python code is provided.

Next steps

[!div class="nextstepaction"] Explore and visualize the data using Python