| title | Tutorial for in-database analytics using R and SQL Server Machine Learning \| Microsoft Docs |
|---|---|
| description | Learn how to embed R programming language code in SQL Server stored procedures and T-SQL functions. |
| ms.prod | sql |
| ms.technology | machine-learning |
| ms.date | 10/29/2018 |
| ms.topic | tutorial |
| author | HeidiSteen |
| ms.author | heidist |
| manager | cgronlun |
[!INCLUDEappliesto-ss-xxxx-xxxx-xxx-md-winonly]
In this tutorial for SQL programmers, you gain hands-on experience using the R language to build and deploy a machine learning solution by wrapping R code in stored procedures.
This tutorial uses a well-known public dataset based on trips in New York City taxis. To make the sample code run faster, we created a representative 1% sample of the data. You'll use this data to build a binary classification model that predicts whether a particular trip is likely to result in a tip, based on columns such as the time of day, distance, and pick-up location.
Note
This tutorial is available in both R and Python. For the Python version, see In-database analytics for Python developers.
Building a machine learning solution is a complex process that can involve multiple tools and the coordination of subject matter experts across several phases:
- obtaining and cleaning data
- exploring the data and building features useful for modeling
- training and tuning the model
- deploying the model to production
Development and testing of the actual code is best performed using a dedicated development environment. However, after the script is fully tested, you can easily deploy it to [!INCLUDEssNoVersion] using [!INCLUDEtsql] stored procedures in the familiar environment of [!INCLUDEssManStudio].
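As a rough illustration of what "wrapping R code in a stored procedure" looks like, the following minimal sketch embeds a one-line R script in a procedure by using the system stored procedure `sp_execute_external_script`. The procedure name `dbo.RSummarySketch` and the inline sample values are hypothetical and are not part of the tutorial. The sketch assumes that R integration is installed and that external scripts are enabled on the instance.

```sql
-- Hypothetical procedure for illustration only; not part of the tutorial code.
-- Assumes R integration is installed and external scripts are enabled.
CREATE PROCEDURE dbo.RSummarySketch
AS
BEGIN
    EXEC sp_execute_external_script
        @language = N'R',
        @script = N'
            # InputDataSet is the default data frame built from @input_data_1.
            # OutputDataSet is the default data frame returned to SQL Server.
            OutputDataSet <- data.frame(mean_value = mean(InputDataSet$value))
        ',
        -- Inline sample rows stand in for a real table.
        @input_data_1 = N'SELECT CAST(v AS float) AS value FROM (VALUES (1), (2), (3)) AS t(v)'
    WITH RESULT SETS ((mean_value float));
END;
GO

-- Call the procedure like any other T-SQL stored procedure.
EXEC dbo.RSummarySketch;
```

The R script runs outside the database engine process, and the result returns to the caller as an ordinary result set, so client applications need no knowledge of R.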
Whether you are a SQL programmer new to R, or an R developer new to SQL, this multi-part tutorial introduces a typical workflow for conducting in-database analytics with R and SQL Server.
- Lesson 3: Train and save an R model using functions and stored procedures
- Lesson 4: Predict potential outcomes using an R model in a stored procedure

  After the model has been saved to the database, call the model for predictions from [!INCLUDEtsql] by using stored procedures.
All tasks can be done using [!INCLUDEtsql] stored procedures in [!INCLUDEssManStudio].
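To give a sense of the prediction pattern described for Lesson 4, here is a hedged sketch of a stored procedure that loads a serialized model from a table and scores new rows. All object names (`dbo.example_models`, `dbo.example_trips`, `dbo.PredictTipSketch`) and column names are hypothetical placeholders, and the sketch assumes the model was fitted with base R `glm` and stored with `serialize`. The lessons themselves may use different tables and functions.

```sql
-- Hypothetical names throughout; a sketch of the pattern, not the tutorial code.
-- Assumes dbo.example_models(name nvarchar(100), model varbinary(max)) holds a
-- model saved with serialize(), and dbo.example_trips holds the rows to score.
CREATE PROCEDURE dbo.PredictTipSketch
    @model_name nvarchar(100)
AS
BEGIN
    -- Load the serialized model from the database.
    DECLARE @model varbinary(max) =
        (SELECT TOP (1) model FROM dbo.example_models WHERE name = @model_name);

    EXEC sp_execute_external_script
        @language = N'R',
        @script = N'
            # Rebuild the R model object from its binary form.
            mod <- unserialize(as.raw(model))
            # Score the input rows; for a logistic glm this yields probabilities.
            OutputDataSet <- data.frame(
                score = predict(mod, newdata = InputDataSet, type = "response"))
        ',
        @input_data_1 = N'SELECT passenger_count, trip_distance FROM dbo.example_trips',
        -- Pass the model into the R script as a named parameter.
        @params = N'@model varbinary(max)',
        @model = @model
    WITH RESULT SETS ((score float));
END;
```

Storing the model as `varbinary(max)` keeps training and scoring decoupled: the model can be retrained and replaced in the table without changing the procedure that serves predictions.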
This tutorial assumes familiarity with basic database operations such as creating databases and tables, importing data, and writing SQL queries. It does not assume that you know R, so all R code is provided.
[!div class="nextstepaction"] Set up the NYC Taxi database