Commit ac1a540

Links, clarifications
1 parent 2a9dbc6 commit ac1a540

4 files changed

Lines changed: 58 additions & 41 deletions

docs/advanced-analytics/r/how-to-do-realtime-scoring.md

Lines changed: 9 additions & 8 deletions
@@ -21,9 +21,9 @@ The following table summarizes the scoring frameworks for forecasting and predic
2121

2222
| Methodology | Interface | Library requirements | Processing speeds |
2323
|-----------------------|-------------------|----------------------|----------------------|
24-
| Extensibility framework | R: [rxPredict](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxpredict) <br/>Python: [rx_predict](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-predict) | None. Models can be based on any R or Python function | Hundreds of milliseconds. <br/>Loading a runtime environment has a fixed cost, averaging three to six hundred milliseconds, before any new data is scored. |
25-
| Real-time scoring CLR extension | [sp_rxPredict](https://docs.microsoft.com//sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql) on a serialized model | R: RevoScaleR, MicrosoftML <br/>Python: revoscalepy, microsoftml | Tens of milliseconds, on average. |
26-
| Native scoring C++ extension| [PREDICT T-SQL function](https://docs.microsoft.com/sql/t-sql/queries/predict-transact-sql) on a serialized model | R: RevoScaleR <br/>Python: revoscalepy | Less than 20 milliseconds, on average. |
24+
| Extensibility framework | [rxPredict (R)](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxpredict) <br/>[rx_predict (Python)](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-predict) | None. Models can be based on any R or Python function | Hundreds of milliseconds. <br/>Loading a runtime environment has a fixed cost, averaging three to six hundred milliseconds, before any new data is scored. |
25+
| [Real-time scoring CLR extension](../real-time-scoring.md) | [sp_rxPredict](https://docs.microsoft.com//sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql) on a serialized model | R: RevoScaleR, MicrosoftML <br/>Python: revoscalepy, microsoftml | Tens of milliseconds, on average. |
26+
| [Native scoring C++ extension](../sql-native-scoring.md) | [PREDICT T-SQL function](https://docs.microsoft.com/sql/t-sql/queries/predict-transact-sql) on a serialized model | R: RevoScaleR <br/>Python: revoscalepy | Less than 20 milliseconds, on average. |
2727

2828
Speed of processing and not substance of the output is the differentiating feature. Assuming the same functions and inputs, the scored output should not vary based on the approach you use.
2929

@@ -39,12 +39,13 @@ _Scoring_ is a two-step process. First, you specify an already trained model to
3939

4040
Taking a step back, the overall process of preparing the model and then generating scores can be summarized this way:
4141

42-
1. Create a model using a supported algorithm.
43-
2. Serialize the model using a special binary format.
44-
3. Make the model available to SQL Server. Typically this means storing the serialized model in a SQL Server table.
45-
4. Call the function or stored procedure, specifying the model and input data as parameters.
42+
1. Create a model using a supported algorithm. Support varies by the scoring methodology you choose.
43+
2. Train the model.
44+
3. Serialize the model using a special binary format.
45+
4. Save the model to SQL Server. Typically this means storing the serialized model in a SQL Server table.
46+
5. Call the function or stored procedure, specifying the model and data inputs as parameters.
4647

47-
When the input includes many rows of data, it is usually faster to insert the prediction values into a table as part of the scoring process. Generating a single score is more typical in a scenario where you get input values from a form or user request, and return the score to a client application. To improve performance when generating successive scores, SQL Server might cache the model so that it can be reloaded into memory.
48+
When the input includes many rows of data, it is usually faster to insert the prediction values into a table as part of the scoring process. Generating a single score is more typical in a scenario where you get input values from a form or user request, and return the score to a client application. To improve performance when generating successive scores, SQL Server might cache the model so that it can be reloaded into memory.
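
The overall process described above (create and train a model, serialize it to a binary format, save it to a table, then score new inputs) can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: `sqlite3` stands in for SQL Server, `pickle` stands in for the RevoScaleR/revoscalepy serialization functions, and the table names `models`, `new_data`, and `scores` are hypothetical. It shows both batch scoring into a table and a single score returned to a caller.

```python
import pickle
import sqlite3


def train_linear_model(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def score(model, x):
    """Generate a prediction for one input value."""
    return model["slope"] * x + model["intercept"]


# Stand-in database (assumption: sqlite3 instead of SQL Server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (name TEXT PRIMARY KEY, model BLOB)")
conn.execute("CREATE TABLE new_data (x REAL)")
conn.execute("CREATE TABLE scores (x REAL, score REAL)")

# Create and train the model.
model = train_linear_model([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Serialize the model to a binary format (pickle is a stand-in here).
blob = pickle.dumps(model)

# Save the serialized model in a table.
conn.execute("INSERT INTO models VALUES (?, ?)", ("demo", blob))

# Load the stored model once and reuse it for every score.
stored = conn.execute(
    "SELECT model FROM models WHERE name = ?", ("demo",)
).fetchone()[0]
loaded = pickle.loads(stored)

# Batch scoring: write predictions for many rows straight into a table.
conn.executemany("INSERT INTO new_data VALUES (?)", [(5.0,), (6.0,)])
rows = conn.execute("SELECT x FROM new_data").fetchall()
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(x, score(loaded, x)) for (x,) in rows],
)

# Single-row scoring: return one score to the calling application.
single = score(loaded, 10.0)
print(single)
```

Loading the model once and reusing it across calls mirrors the caching behavior mentioned above, although how SQL Server actually caches models is not shown here.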
4849

4950
## Compare methods
5051

docs/advanced-analytics/real-time-scoring.md

Lines changed: 6 additions & 4 deletions
@@ -14,20 +14,20 @@ manager: cgronlun
1414
# Real-time scoring with sp_rxPredict in SQL Server machine learning
1515
[!INCLUDE[appliesto-ss-xxxx-xxxx-xxx-md-winonly](../includes/appliesto-ss-xxxx-xxxx-xxx-md-winonly.md)]
1616

17-
Real-time scoring uses the CLR extension capabilities in SQL Server for high-performance predictions or scores in forecasting workloads. Because real-time scoring is language-agnostic, it executes with no dependencies on R or Python run times. Assuming a model created from Microsoft functions, trained, and serialized to a binary format in SQL Server, you can use real-time scoring to generate predicted outcomes on new data inputs on SQL Server instances that do not have the R or Python add-on features installed.
17+
Real-time scoring uses the [sp_rxPredict](https://docs.microsoft.com//sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql) system stored procedure and the CLR extension capabilities in SQL Server for high-performance predictions or scores in forecasting workloads. Real-time scoring is language-agnostic and executes with no dependencies on R or Python run times. Assuming a model created and trained using Microsoft functions, and then serialized to a binary format in SQL Server, you can use real-time scoring to generate predicted outcomes on new data inputs on SQL Server instances that do not have the R or Python add-on installed.
1818

1919
## How real-time scoring works
2020

21-
Real-time scoring is supported in both SQL Server 2017 and SQL Server 2016, on specific model types based on RevoScaleR or MicrosoftML functions such as [rxLinMod (RevoScaleR)](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxlinmod) or [rxNeuralNet (MicrosoftML)](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/rxneuralnet). It uses native C++ libraries to generate scores, based on user input provided to a machine learning model stored in a special binary format.
21+
Real-time scoring is supported in both SQL Server 2017 and SQL Server 2016, on [supported model types](#bkmk_py_supported_algos) for linear and logistic regression and decision tree modeling. It uses native C++ libraries to generate scores, based on user input provided to a machine learning model stored in a special binary format.
2222

2323
Because a trained model can be used for scoring without having to call an external language runtime, the overhead of multiple processes is reduced. This supports much faster prediction performance for production scoring scenarios. Because the data never leaves SQL Server, results can be generated and inserted into a new table without any data translation between R and SQL.
2424

2525
Real-time scoring is a multi-step process:
2626

2727
1. The stored procedure that does scoring must be enabled on a per-database basis.
2828
2. You load the pre-trained model in binary format.
29-
3. You provide new input data, either tabular or single rows, as input to the model.
30-
4. To generate scores, call the sp_rxPredict stored procedure.
29+
3. You provide new input data to be scored, either tabular or single rows, as input to the model.
30+
4. To generate scores, call the [sp_rxPredict](https://docs.microsoft.com//sql/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql) stored procedure.
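
As a rough sketch of the last two steps, the helper below only assembles the Transact-SQL text for a scoring call; it does not connect to a server. The parameter names `@model` and `@inputData` follow the sp_rxPredict reference linked above and should be checked against your version. The function name `build_sp_rxpredict_call`, the table `dbo.models`, and the columns `model` and `model_name` are hypothetical.

```python
def build_sp_rxpredict_call(model_table, model_name, input_query):
    """Return T-SQL text that loads a stored model and scores a query.

    model_table is inserted as-is, so it must come from trusted code,
    never from user input. The column names 'model' and 'model_name'
    are assumptions made for this illustration.
    """

    def quote(text):
        # Double single quotes so the text is safe inside a T-SQL literal.
        return text.replace("'", "''")

    return (
        "DECLARE @model varbinary(max) = (\n"
        f"    SELECT model FROM {model_table} WHERE model_name = N'{quote(model_name)}'\n"
        ");\n"
        "EXEC sp_rxPredict @model = @model,\n"
        f"    @inputData = N'{quote(input_query)}';"
    )


sql = build_sp_rxpredict_call(
    "dbo.models",
    "rxLogit trained",
    "SELECT * FROM dbo.new_data WHERE city = 'Oslo'",
)
print(sql)
```

In practice you would send this text through a database driver of your choice; prefer driver-level parameters for the model name where the driver supports them.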
3131

3232
> [!TIP]
3333
> For an example of real-time scoring in action, see [End to End Loan ChargeOff Prediction Built Using Azure HDInsight Spark Clusters and SQL Server 2016 R Service](https://blogs.msdn.microsoft.com/rserver/2017/06/29/end-to-end-loan-chargeoff-prediction-built-using-azure-hdinsight-spark-clusters-and-sql-server-2016-r-service/)
@@ -42,6 +42,8 @@ Real-time scoring is a multi-step process:
4242

4343
+ Serialize the model using [rxSerialize](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxserializemodel) for R, and [rx_serialize_model](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-serialize-model) for Python. These serialization functions have been optimized to support fast scoring.
4444

45+
+ Save the model to the database engine instance from which you want to call it. This instance is not required to have the R or Python runtime extension.
46+
4547
> [!Note]
4648
> Real-time scoring is currently optimized for fast predictions on smaller data sets, ranging from a few rows to hundreds of thousands of rows. On big datasets, using [rxPredict](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxpredict) might be faster.
4749

docs/advanced-analytics/sql-native-scoring.md

Lines changed: 2 additions & 2 deletions
@@ -14,9 +14,9 @@ manager: cgronlun
1414
# Native scoring using the PREDICT T-SQL function
1515
[!INCLUDE[appliesto-ss-asdb-xxxx-xxx-md](../includes/appliesto-ss-asdb-xxxx-xxx-md.md)]
1616

17-
Native scoring leverages the native C++ extension capabilities in SQL Server 2017 to generate prediction values or *scores* for new data inputs in near-real-time. This methodology offers the fastest processing speed of forecasting and prediction workloads, but comes with platform and library requirements: only functions from RevoScaleR and revoscalepy have C++ implementations.
17+
Native scoring uses the [PREDICT T-SQL function](https://docs.microsoft.com/sql/t-sql/queries/predict-transact-sql) and the native C++ extension capabilities in SQL Server 2017 to generate prediction values or *scores* for new data inputs in near-real-time. This methodology offers the fastest possible processing speed of forecasting and prediction workloads, but comes with platform and library requirements: only functions from RevoScaleR and revoscalepy have C++ implementations.
1818

19-
Native scoring requires that you have an already trained model. In SQL Server 2017 Windows or Linux, or in Azure SQL Database, you can use the PREDICT function in Transact-SQL to invoke native scoring. The PREDICT function takes a pre-trained model and generates scores over data inputs you provide.
19+
Native scoring requires that you have an already trained model. In SQL Server 2017 Windows or Linux, or in Azure SQL Database, you can call the PREDICT function in Transact-SQL to invoke native scoring against new data that you provide as an input parameter. The PREDICT function returns scores over data inputs you provide.
2020

2121
## How native scoring works
2222

docs/relational-databases/system-stored-procedures/sp-rxpredict-transact-sql.md

Lines changed: 41 additions & 27 deletions
@@ -25,13 +25,11 @@ manager: cgronlun
2525
# sp_rxPredict
2626
[!INCLUDE[tsql-appliesto-ss2016-xxxx-xxxx-xxx-md](../../includes/tsql-appliesto-ss2016-xxxx-xxxx-xxx-md.md)]
2727

28-
Generates a predicted value for a given input based on a machine learning model stored in a binary format in a SQL Server database.
28+
Generates a predicted value for a given input consisting of a machine learning model stored in a binary format in a SQL Server database.
2929

30-
Provides scoring on R and Python machine learning models in near real-time. `sp_rxPredict` is a stored procedure provided as a wrapper for the `rxPredict` R function in [RevoScaleR](https://docs.microsoft.com/r-server/r-reference/revoscaler/revoscaler) and [MicrosoftML](https://docs.microsoft.com/r-server/r-reference/microsoftml/microsoftml-package), and the [rx_predict](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-predict) Python function in [revoscalepy](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/revoscalepy-package) and [microsoftml](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/microsoftml-package). It is written in C+ and is optimized specifically for scoring operations.
30+
Provides scoring on R and Python machine learning models in near real-time. `sp_rxPredict` is a stored procedure provided as a wrapper for the `rxPredict` R function in [RevoScaleR](https://docs.microsoft.com/r-server/r-reference/revoscaler/revoscaler) and [MicrosoftML](https://docs.microsoft.com/r-server/r-reference/microsoftml/microsoftml-package), and the [rx_predict](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-predict) Python function in [revoscalepy](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/revoscalepy-package) and [microsoftml](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/microsoftml-package). It is written in C++ and is optimized specifically for scoring operations.
3131

32-
**This topic applies to**:
33-
- [SQL Server 2017 Machine Learning Services (In-Database) with R](https://docs.microsoft.com/sql/advanced-analytics/install/sql-machine-learning-services-windows-install)
34-
- [SQL Server 2016 R Services](https://docs.microsoft.com/sql/advanced-analytics/install/sql-r-services-windows-install), with [upgraded R components](https://docs.microsoft.com/sql/advanced-analytics/r/use-sqlbindr-exe-to-upgrade-an-instance-of-sql-server) providing the MicrosoftML library
32+
Although the model must be created using R or Python, once it is serialized and stored in a binary format on a target database engine instance, it can be consumed from that database engine instance even when R or Python integration is not installed. For more information, see [Real-time scoring with sp_rxPredict](https://docs.microsoft.com/sql/advanced-analytics/real-time-scoring).
3533

3634

3735
## Syntax
@@ -60,32 +58,25 @@ Additional score columns, such as confidence interval, can be returned if the al
6058
To enable use of the stored procedure, SQLCLR must be enabled on the instance.
6159

6260
> [!NOTE]
63-
> Consider the security implications before you enable this option.
61+
> There are security implications to enabling this option. Use an alternative implementation, such as the [Transact-SQL PREDICT](https://docs.microsoft.com/sql/t-sql/queries/predict-transact-sql?view=sql-server-2017) function, if SQLCLR cannot be enabled on your server.
6462
6563
The user needs `EXECUTE` permission on the database.
6664

6765

68-
### Supported platforms
69-
70-
- SQL Server 2017 Machine Learning Services (includes R Server 9.2)
71-
- SQL Server 2017 Machine Learning Server (Standalone)
72-
- SQL Server R Services 2016, with an upgrade of the R Services instance to R Server 9.1.0 or later
73-
74-
7566
### Supported algorithms
7667

77-
+ RevoScaleR models
68+
To create and train a model, use one of the supported algorithms for R or Python, provided by [SQL Server 2016 R Services](https://docs.microsoft.com/sql/advanced-analytics/r/sql-server-r-services?view=sql-server-2017), [SQL Server 2016 R Server (Standalone)](https://docs.microsoft.com/sql/advanced-analytics/r/r-server-standalone?view=sql-server-2016), [SQL Server 2017 Machine Learning Services (R or Python)](https://docs.microsoft.com//sql/advanced-analytics/what-is-sql-server-machine-learning?view=sql-server-2017), or [SQL Server 2017 Server (Standalone) (R or Python)](https://docs.microsoft.com/sql/advanced-analytics/r/r-server-standalone?view=sql-server-2017).
7869

7970

80-
+ [rxLinMod](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxlinmod) \*
81-
+ [rxLogit](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxlogit) \*
82-
+ [rxBTrees](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxbtrees) \*
83-
+ [rxDtree](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxdtree) \*
84-
+ [rxdForest](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxdforest) \*
85-
86-
Models marked with \* also support native scoring with the PREDICT function.
71+
#### R: RevoScaleR models
8772

88-
+ MicrosoftML models
73+
+ [rxLinMod](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxlinmod)
74+
+ [rxLogit](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxlogit)
75+
+ [rxBTrees](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxbtrees)
76+
+ [rxDtree](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxdtree)
77+
+ [rxdForest](https://docs.microsoft.com/machine-learning-server/r-reference/revoscaler/rxdforest)
78+
79+
#### R: MicrosoftML models
8980

9081
+ [rxFastTrees](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/rxfasttrees)
9182
+ [rxFastForest](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/rxfastforest)
@@ -94,23 +85,46 @@ The user needs `EXECUTE` permission on the database.
9485
+ [rxNeuralNet](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/rxneuralnet)
9586
+ [rxFastLinear](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/rxfastlinear)
9687

97-
+ Transformations supplied by MicrosoftML
88+
#### R: Transformations supplied by MicrosoftML
9889

9990
+ [featurizeText](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/rxfasttrees)
10091
+ [concat](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/concat)
10192
+ [categorical](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/categorical)
10293
+ [categoricalHash](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/categoricalHash)
10394
+ [selectFeatures](https://docs.microsoft.com/machine-learning-server/r-reference/microsoftml/selectFeatures)
10495

96+
#### Python: revoscalepy models
97+
98+
+ [rx_lin_mod](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-lin-mod)
99+
+ [rx_logit](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-logit)
100+
+ [rx_btrees](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-btrees)
101+
+ [rx_dtree](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-dtree)
102+
+ [rx_dforest](https://docs.microsoft.com/machine-learning-server/python-reference/revoscalepy/rx-dforest)
103+
104+
105+
#### Python: microsoftml models
106+
107+
+ [rx_fast_trees](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-fast-trees)
108+
+ [rx_fast_forest](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-fast-forest)
109+
+ [rx_logistic_regression](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-logistic-regression)
110+
+ [rx_oneclass_svm](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-oneclass-svm)
111+
+ [rx_neural_network](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-neural-network)
112+
+ [rx_fast_linear](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-fast-linear)
113+
114+
#### Python: Transformations supplied by microsoftml
115+
116+
+ [featurize_text](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/rx-fast-trees)
117+
+ [concat](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/concat)
118+
+ [categorical](https://docs.microsoft.com/machine-learning-server/python-reference/microsoftml/categorical)
119+
+ [categorical_hash](https://docs.microsoft.com/machine-learning-server/python-referencee/microsoftml/categorical-hash)
120+
105121
### Unsupported model types
106122

107123
The following model types are not supported:
108124

109-
+ Models containing other, unsupported types of R transformations
110125
+ Models using the `rxGlm` or `rxNaiveBayes` algorithms in RevoScaleR
111-
+ PMML models
112-
+ Models created using other R libraries from CRAN or other repositories
113-
+ Models containing any other kind of R transformation other than those listed here
126+
+ PMML models in R
127+
+ Models created using other third-party libraries
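
One way to apply these lists is to check a training function name before you serialize a model for `sp_rxPredict`. The sketch below hard-codes only the Python model functions named in the lists above; the dictionary `SUPPORTED_PYTHON_MODELS` and the helper `supporting_library` are hypothetical names for illustration, not part of any Microsoft package.

```python
# Python model functions listed above as supported by sp_rxPredict.
SUPPORTED_PYTHON_MODELS = {
    "revoscalepy": {
        "rx_lin_mod",
        "rx_logit",
        "rx_btrees",
        "rx_dtree",
        "rx_dforest",
    },
    "microsoftml": {
        "rx_fast_trees",
        "rx_fast_forest",
        "rx_logistic_regression",
        "rx_oneclass_svm",
        "rx_neural_network",
        "rx_fast_linear",
    },
}


def supporting_library(function_name):
    """Return the library that lists function_name, or None if unlisted."""
    for library, names in SUPPORTED_PYTHON_MODELS.items():
        if function_name in names:
            return library
    return None


print(supporting_library("rx_logit"))
print(supporting_library("some_third_party_model"))
```

A `None` result means only that the name does not appear in the lists above; treat the published reference as the authority on what is supported.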
114128

115129
## Examples
116130
