| title | Python tutorial: Ski rentals |
|---|---|
| description | In part one of this four-part tutorial series, you'll set up the prerequisites and import a sample database into SQL Server, in preparation for building a linear regression model in Python to predict ski rentals in SQL Server Machine Learning Services. |
| ms.prod | sql |
| ms.technology | machine-learning |
| ms.date | 09/03/2019 |
| ms.topic | tutorial |
| author | dphansen |
| ms.author | davidph |
| ms.custom | seo-lt-2019 |
| monikerRange | >=sql-server-2017||>=sql-server-linux-ver15||=sqlallproducts-allversions |
[!INCLUDEappliesto-ss-xxxx-xxxx-xxx-md]
In this four-part tutorial series, you'll use Python and linear regression in SQL Server Machine Learning Services to predict the number of ski rentals. The tutorial uses a Python notebook in Azure Data Studio, but you can also use your own Python integrated development environment (IDE).
Imagine you own a ski rental business and you want to predict the number of rentals that you'll have on a future date. This information will help you get your stock, staff, and facilities ready.
In the first part of this series, you'll get set up with the prerequisites. In parts two and three, you'll develop some Python scripts in a Jupyter notebook to prepare your data and train a machine learning model. Then, in part four, you'll run those Python scripts inside SQL Server using T-SQL stored procedures.
In this article, you'll learn how to:
[!div class="checklist"]
- Import a sample database into SQL Server
In part two, you'll learn how to load the data from SQL Server into a Python data frame, and prepare the data in Python.
In part three, you'll learn how to train a linear regression model in Python.
In part four, you'll learn how to store the model to SQL Server, and then create stored procedures from the Python scripts you developed in parts two and three. The stored procedures will run in SQL Server to make predictions based on new data.
- SQL Server Machine Learning Services - To install Machine Learning Services, see the Windows installation guide or the Linux installation guide.
- Python IDE - This tutorial uses a Python notebook in Azure Data Studio. For more information, see How to use notebooks in Azure Data Studio.
You can also use your own Python IDE, such as a Jupyter notebook or Visual Studio Code with the Python extension and the mssql extension.
- revoscalepy package - The revoscalepy package is included in SQL Server Machine Learning Services. To use the package on a client computer, see Set up a data science client for Python development for options to install this package locally.
If you're using a Python notebook in Azure Data Studio, follow these additional steps to use revoscalepy:
- Open Azure Data Studio
- From the File menu, select Preferences and then Settings
- Expand Extensions and select Notebook configuration
- Under Python Path, enter the path where you installed the libraries (for example, C:\path-to-python-for-mls)
- Make sure Use Existing Python is checked
- Restart Azure Data Studio
If you're using a different Python IDE, follow similar steps for your IDE.
- SQL query tool - This tutorial assumes you're using Azure Data Studio. You can also use SQL Server Management Studio (SSMS).
- Additional Python packages - The examples in this tutorial series use Python packages that you may not have installed yet. Use the following pip commands to install them if necessary. The scikit-learn package provides the sklearn module. The pickle module is part of the Python standard library and doesn't need to be installed.
pip install pandas
pip install scikit-learn
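As a quick check of the prerequisites above, the following minimal sketch reports which Python interpreter is running and whether the modules used in this series can be located. It is an illustration rather than part of the tutorial: the helper name `find_missing_modules` and the `REQUIRED_MODULES` list are hypothetical, and the check only confirms that a module can be found, not that it imports without errors.

```python
import importlib.util
import sys

# Import names to look for (assumed list for this tutorial series).
# The import name can differ from the pip name: scikit-learn is imported
# as "sklearn", and "pickle" ships with the Python standard library.
REQUIRED_MODULES = ["pandas", "sklearn", "pickle", "revoscalepy"]


def find_missing_modules(module_names):
    """Return the names that the current interpreter cannot locate."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# Show which interpreter is in use, so you can compare it with the
# Python Path configured in your notebook or IDE.
print("Python interpreter:", sys.executable)

missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All required modules were found.")
```

If revoscalepy is reported as not found, the notebook is probably not using the Python installation that contains the Machine Learning Services libraries.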
The sample dataset used in this tutorial has been saved to a .bak database backup file for you to download and use.
- Download the file TutorialDB.bak.
- Follow the directions in Restore a database from a backup file in Azure Data Studio, using these details:
- Import from the TutorialDB.bak file you downloaded
- Name the target database "TutorialDB"
- You can verify that the dataset exists after you have restored the database by querying the dbo.rental_data table:
USE TutorialDB;
SELECT * FROM [dbo].[rental_data];
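If you'd rather run the same verification from Python, the sketch below counts the rows in dbo.rental_data. It rests on several assumptions that are not part of this tutorial: the pyodbc package is installed, "ODBC Driver 17 for SQL Server" is available on your machine, the server is reachable as localhost, and Windows authentication is in use. The function names are hypothetical; adjust the connection details to match your environment.

```python
def build_connection_string(server="localhost", database="TutorialDB"):
    """Build an ODBC connection string that uses Windows authentication.

    The driver name and the default server are assumptions; change them
    to match your own installation.
    """
    return (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={server};"
        f"DATABASE={database};"
        "Trusted_Connection=yes;"
    )


def count_rental_rows(connection_string):
    """Return the number of rows in the dbo.rental_data table."""
    # Imported here so that build_connection_string works even when
    # pyodbc is not installed.
    import pyodbc

    connection = pyodbc.connect(connection_string)
    try:
        cursor = connection.cursor()
        cursor.execute("SELECT COUNT(*) FROM [dbo].[rental_data];")
        return cursor.fetchone()[0]
    finally:
        # Always release the connection, even if the query fails.
        connection.close()
```

Calling `count_rental_rows(build_connection_string())` should return a row count greater than zero if the restore succeeded. An error that mentions an invalid object name suggests the table is missing from the target database.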
In part one of this tutorial series, you completed these steps:
- Installed the prerequisites
- Imported a sample database into SQL Server
To prepare the data from the TutorialDB database, follow part two of this tutorial series:
[!div class="nextstepaction"] Python Tutorial: Prepare data to train a linear regression model