Commit 70e47a0

Merge pull request #7810 from MicrosoftDocs/master
10/23 AM Publish
2 parents 38f35b2 + 6270fe0 commit 70e47a0

27 files changed

Lines changed: 413 additions & 331 deletions

.openpublishing.redirection.json

Lines changed: 5 additions & 0 deletions
@@ -30,6 +30,11 @@
  "redirect_url": "/sql/advanced-analytics/tutorials/demo-data-nyctaxi-in-sql",
  "redirect_document_id": false
  },
+ {
+ "source_path": "docs/advanced-analytics/tutorials/sqldev-py2-import-data-to-sql-server-using-powershell.md",
+ "redirect_url": "/sql/advanced-analytics/tutorials/demo-data-nyctaxi-in-sql",
+ "redirect_document_id": false
+ },
  {
  "source_path": "docs/advanced-analytics/r/configuration-sql-server-r-services.md",
  "redirect_url": "/sql/advanced-analytics/r/managing-and-monitoring-r-solutions",

docs/advanced-analytics/toc.yml

Lines changed: 2 additions & 2 deletions
@@ -72,6 +72,8 @@
  items:
  - name: Data
  items:
+ - name: Airline data set
+ href: tutorials/demo-data-airlinedemo-in-sql.md
  - name: Iris data set
  href: tutorials/demo-data-iris-in-sql.md
  - name: NYC Taxi data set
@@ -84,8 +86,6 @@
  - name: Learn in-database analytics
  href: tutorials/sqldev-in-database-python-for-sql-developers.md
  items:
- - name: Import data
- href: tutorials/sqldev-py2-import-data-to-sql-server-using-powershell.md
  - name: Explore and visualize data
  href: tutorials/sqldev-py3-explore-and-visualize-the-data.md
  - name: Create data features
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+ ---
+ title: Airline flight arrival and delay demo data set for SQL Server Python and R tutorials | Microsoft Docs
+ Description: Create a database containing the Airline dataset from R and Python. This dataset is used in exercises showing how to wrap R language or Python code in a SQL Server stored procedure.
+ ms.prod: sql
+ ms.technology: machine-learning
+
+ ms.date: 10/22/2018
+ ms.topic: tutorial
+ author: HeidiSteen
+ ms.author: heidist
+ manager: cgronlun
+ ---
+ # Airline flight arrival demo data for SQL Server Python and R tutorials
+ [!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md-winonly](../../includes/appliesto-ss-xxxx-xxxx-xxx-md-winonly.md)]
+
+ In this exercise, create a SQL Server database to store imported data from R or Python built-in Airline demo data sets. R and Python distributions provide equivalent data, which you can import to a SQL Server database using Management Studio.
+
+ To complete this exercise, you should have [SQL Server Management Studio](https://docs.microsoft.com/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-2017) or another tool that can run T-SQL queries.
+
+ Tutorials and quickstarts using this data set include the following:
+
+ + [Create a Python model using revoscalepy](use-python-revoscalepy-to-create-model.md)
+
+ ## Create the database
+
+ 1. Start SQL Server Management Studio, connect to a database engine instance that has R or Python integration.
+
+ 2. In Object Explorer, right-click **Databases** and create a new database called **flightdata**.
+
+ 3. Right-click **flightdata**, click **Tasks**, click **Import Flat File**.
+
+ 4. Open the AirlineDemoData.csv file provided in the R or Python distribution, depending on which language you installed.
+
+ For R, look for **AirlineDemoSmall.csv** at C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\R_SERVICES\library\RevoScaleR\SampleData
+
+ For Python, look for **AirlineDemoSmall.csv** at C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\Lib\site-packages\revoscalepy\data\sample_data
+
+ When you select the file, default values are filled in for table name and schema.
+
+ ![Import flat file wizard showing airline demo defaults](media/import-airlinedemosmall.png)
+
+ Click through the remaining pages, accepting the defaults, to import the data.
+
+
+ ## Query the data
+
+ As a validation step, run a query to confirm the data was uploaded.
+
+ 1. In Object Explorer, under Databases, right-click the **flightdata** database, and start a new query.
+
+ 2. Run some simple queries:
+
+ ```sql
+ SELECT TOP(10) * FROM AirlineDemoSmall;
+ SELECT COUNT(*) FROM AirlineDemoSmall;
+ ```
+
+ ## Next steps
+
+ In the following lesson, you will create a linear regression model based on this data.
+
+ + [Create a Python model using revoscalepy](use-python-revoscalepy-to-create-model.md)
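
Reviewer note on the validation step in the new airline article: `SELECT COUNT(*) FROM AirlineDemoSmall;` only confirms that rows exist, not that every row was imported. One possible cross-check, which is not part of this commit, is to count the data rows in the source CSV and compare that number with the query result. The sketch below uses only Python's standard `csv` module. The function name `count_data_rows` is hypothetical, and the inline sample (including its column names and values) is an assumed stand-in for the real file.

```python
import csv
import io


def count_data_rows(csv_file):
    """Return the number of non-empty data rows in an open CSV file, excluding the header."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row; harmless if the file is empty
    return sum(1 for row in reader if row)  # blank lines parse as [] and are ignored


# Hypothetical three-row sample standing in for AirlineDemoSmall.csv;
# the column names and values here are assumptions for illustration only.
sample = io.StringIO(
    '"ArrDelay","CRSDepTime","DayOfWeek"\n'
    '6,9.666666,"Monday"\n'
    '-8,19.916666,"Monday"\n'
    '-2,13.75,"Tuesday"\n'
)
row_count = count_data_rows(sample)
print(row_count)
```

To run it against the real file, open one of the paths given in the article with `open(path, newline="")` and pass the file object to the function. A mismatch with `COUNT(*)` would suggest the import skipped or rejected rows, although the wizard may treat malformed lines differently from this simple count.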

docs/advanced-analytics/tutorials/demo-data-iris-in-sql.md

Lines changed: 6 additions & 6 deletions
@@ -1,6 +1,6 @@
  ---
- title: Iris demo data set for SQL Server | Microsoft Docs
- Description: Create a database containing the Iris dataset and a table for storing models. This dataset is used in exercises showing how to wrap Python code in a SQL Server stored procedure.
+ title: Iris demo data set for SQL Server Python and R tutorials | Microsoft Docs
+ Description: Create a database containing the Iris dataset and a table for storing models. This dataset is used in exercises showing how to wrap R language or Python code in a SQL Server stored procedure.
  ms.prod: sql
  ms.technology: machine-learning

@@ -10,18 +10,18 @@ author: HeidiSteen
  ms.author: heidist
  manager: cgronlun
  ---
- # Iris demo data for SQL Server
+ # Iris demo data for SQL Server Python and R tutorials
  [!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md-winonly](../../includes/appliesto-ss-xxxx-xxxx-xxx-md-winonly.md)]

- In this exercise, prepare a SQL Server database containing tables for the [Iris flower data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) and model storage. Iris data is included in both the R and Python distributions installed by SQL Server. It's used in machine learning tutorials for SQL Server.
+ In this exercise, create a SQL Server database to store data from the [Iris flower data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) and models based on the same data. Iris data is included in both the R and Python distributions installed by SQL Server, and is used in machine learning tutorials for SQL Server.

  To complete this exercise, you should have [SQL Server Management Studio](https://docs.microsoft.com/sql/ssms/download-sql-server-management-studio-ssms?view=sql-server-2017) or another tool that can run T-SQL queries.

  Tutorials and quickstarts using this data set include the following:

  + [Use a Python model in SQL Server for training and scoring](train-score-using-python-in-tsql.md)

- ## Prepare the database and tables
+ ## Create the database

  1. Start SQL Server Management Studio, and open a new **Query** window.

@@ -134,7 +134,7 @@ You can obtain built-in Iris data from either R or Python. You can use Python or
  > To modify the stored procedure later, you don't need to drop and recreate it. Use the [ALTER PROCEDURE](https://docs.microsoft.com/sql/t-sql/statements/alter-procedure-transact-sql) statement.


- ## Query data for verification
+ ## Query the data

  As a validation step, run a query to confirm the data was uploaded.

docs/advanced-analytics/tutorials/demo-data-nyctaxi-in-sql.md

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
  ---
  title: Download NYC Taxi demo data and scripts for embedded R and Python (SQL Server Machine Learning) | Microsoft Docs
- description: Instructions for downloading New York City taxi sample data and creating a database. Data is used in SQL Server tutorials showing how to embed R and Python in SQL Server stored procedures and T-SQL functions.
+ description: Instructions for downloading New York City taxi sample data and creating a database. Data is used in SQL Server Python and R language tutorials showing how to embed script in SQL Server stored procedures and T-SQL functions.
  ms.prod: sql
  ms.technology: machine-learning

@@ -10,7 +10,7 @@ author: HeidiSteen
  ms.author: heidist
  manager: cgronlun
  ---
- # NYC Taxi demo data for SQL Server
+ # NYC Taxi demo data for SQL Server Python and R tutorials
  [!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md-winonly](../../includes/appliesto-ss-xxxx-xxxx-xxx-md-winonly.md)]

  This article explains how to set up a sample database consisting of public data from the [New York City Taxi and Limousine Commission](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml). This data is used in several R and Python tutorials for in-database analytics in SQL Server. The sample data is one percent of the public data set. On your system, the database backup file is slightly over 90 MB, providing 1.7 million rows in the primary data table.
@@ -21,7 +21,7 @@ Tutorials and quickstarts using this data set include the following:

  + [Use a Python model in SQL Server for training and scoring](train-score-using-python-in-tsql.md)

- ## Download demo database
+ ## Download files

  The sample database is a backup file hosted by Microsoft. File download begins immediately when you click the link.

@@ -61,7 +61,7 @@ The following table summarizes the objects created in the NYC Taxi demo database
  |**PredictTipSingleMode** |stored procedure| Created by the PredictTipSingleMode.sql script. Calls the trained model to create predictions using the model. This stored procedure accepts a new observation as input, with individual feature values passed as in-line parameters, and returns a value that predicts the outcome for the new observation. This stored procedure is used in [Operationalize the R model](sqldev-operationalize-the-model.md).|
  |**TrainTipPredictionModel** |stored procedure|Created by the TrainTipPredictionModel.sql script. Trains a logistic regression model by calling an R package. The model predicts the value of the tipped column, and is trained using a randomly selected 70% of the data. The output of the stored procedure is the trained model, which is saved in the table nyc_taxi_models. This stored procedure is used in [Train and save a model](sqldev-train-and-save-a-model-using-t-sql.md).|

- ## Query data for verification
+ ## Query the data

  As a validation step, run a query to confirm the data was uploaded.

docs/advanced-analytics/tutorials/machine-learning-services-tutorials.md

Lines changed: 1 addition & 3 deletions
@@ -18,8 +18,6 @@ This article provides a comprehensive list of the tutorials, demos, and sample a

  + [R tutorials](../tutorials/sql-server-r-tutorials.md)

- For more information about requirements and how to get set up, see [Prerequisites](#bkmk_prerequisites).
-
  ## Samples and solutions

  + [Samples](#bkmk_samples)
@@ -53,7 +51,7 @@ The Microsoft Data Science Team has provided solution templates that can be used

  For more information, see [Machine Learning Templates with SQL Server 2016 R Services](https://blogs.technet.microsoft.com/machinelearning/2016/03/23/machine-learning-templates-with-sql-server-2016-r-services/).

- ## More resources and reading
+ ## Recommended reading

  + [Why did we build it?](https://blogs.msdn.microsoft.com/sqlserverstorageengine/2017/01/10/sql-server-r-services-why-did-we-build-it/)


docs/advanced-analytics/tutorials/sqldev-in-database-python-for-sql-developers.md

Lines changed: 4 additions & 11 deletions
@@ -32,30 +32,23 @@ The data is from the well-known NYC Taxi data set. To make this walkthrough quic

  All tasks can be done using [!INCLUDE[tsql](../../includes/tsql-md.md)] stored procedures in the familiar environment of [!INCLUDE[ssManStudio](../../includes/ssmanstudio-md.md)]

- - [Step 1: Download the sample data](demo-data-nyctaxi-in-sql.md)

- Download the sample dataset and all script files to a local computer.
-
- - [Step 2: Import data to SQL Server using PowerShell](sqldev-py2-import-data-to-sql-server-using-powershell.md)
-
- Execute a PowerShell script that creates a database and a table on the specified instance, and loads the sample data to the table.
-
- - [Step 3: Explore and visualize the data using Python](sqldev-py3-explore-and-visualize-the-data.md)
+ - [Explore and visualize the data using Python](sqldev-py3-explore-and-visualize-the-data.md)

  Perform basic data exploration and visualization, by calling Python from [!INCLUDE[tsql](../../includes/tsql-md.md)] stored procedures.

- - [Step 4: Create data features using Python in T-SQL](sqldev-py5-train-and-save-a-model-using-t-sql.md)
+ - [Create data features using Python in T-SQL](sqldev-py5-train-and-save-a-model-using-t-sql.md)

  Create new data features using custom SQL functions.

- - [Step 5: Train and save a Python model using T-SQL](sqldev-py5-train-and-save-a-model-using-t-sql.md)
+ - [Train and save a Python model using T-SQL](sqldev-py5-train-and-save-a-model-using-t-sql.md)

  Build and save the machine learning model, using Python in stored procedures.

  This walkthrough demonstrates how to perform a binary classification task; you could also use the data to build models for regression or multiclass classification.


- - [Step 6: Operationalize the Python model](sqldev-py6-operationalize-the-model.md)
+ - [ Operationalize the Python model](sqldev-py6-operationalize-the-model.md)

  After the model has been saved to the database, call the model for prediction using [!INCLUDE[tsql](../../includes/tsql-md.md)].

docs/advanced-analytics/tutorials/sqldev-py2-import-data-to-sql-server-using-powershell.md

Lines changed: 0 additions & 100 deletions
This file was deleted.
