Skip to content

Latest commit

 

History

History
448 lines (247 loc) · 20.3 KB

File metadata and controls

448 lines (247 loc) · 20.3 KB
title Updated - Advanced Analytics for SQL Server docs | Microsoft Docs
description Display snippets of updated content for recently changed in documentation, for Advanced Analytics for Microsoft SQL Server.
manager craigg
author MightyPen
ms.author genemi
ms.topic article
ms.custom UpdArt.exe
ms.suite sql
ms.technology release-landing
ms.prod sql
ms.prod_service sql-non-specified
ms.component advanced-analytics
ms.date 04/28/2018

New and Recently Updated: Advanced Analytics for SQL Server

Nearly every day Microsoft updates some of its existing articles on its Docs.Microsoft.com documentation website. This article displays excerpts from recently updated articles. Links to new articles might also be listed.

This article is generated by a program that is rerun periodically. Occasionally an excerpt can appear with imperfect formatting, or as markdown from the source article. Images are never displayed here.

Recent updates are reported for the following date range and subject:

  • Date range of updates:   2018-02-03   -to-   2018-04-28
  • Subject area:   Advanced Analytics for SQL Server.

 

New Articles Created Recently

The following links jump to new articles that have been added recently.

  1. Install SQL Server 2017 Machine Learning Services (In-Database) on Windows
  2. Install SQL Server 2017 Machine Learning Server (Standalone) on Windows
  3. Install SQL Server machine learning components from the command line
  4. Install SQL Server machine learning components without internet access
  5. Install SQL Server 2016 R Services (In-Database)
  6. Install SQL Server 2016 R Server (Standalone)
  7. Set up Python client tools for use with SQL Server Machine Learning
  8. Use Python model in SQL for training and scoring
  9. Wrap Python code in a stored procedure
  10. What is SQL Server Machine Learning Services?
  11. Extended events for monitoring PREDICT statements

 

Updated Articles with Excerpts

This section displays the excerpts of updates gathered from articles that have recently experienced a large update.

The excerpts displayed here appear separated from their proper semantic context. Also, sometimes an excerpt is separated from important markdown syntax that surrounds it in the actual article. Therefore these excerpts are for general guidance only. The excerpts only enable you to know whether your interests warrant taking the time to click and visit the actual article.

For these and other reasons, do not copy code from these excerpts, and do not take as exact truth any text excerpt. Instead, visit the actual article.

 

This compact list provides links to all the updated articles that are listed in the Excerpts section.

  1. Viewing R or Python packages installed on SQL Server
  2. Install pre-trained machine learning models on SQL Server
  3. Set up a data-science client for R development on SQL Server
  4. SQL Server Machine Learning and R Services (In-Database)
  5. Run Python using T-SQL
  6. Use Python with revoscalepy to create a model

 

 

Updated: 2018-04-19           (Next)

 

Python

This example returns the list of folders included in the Python sys.path variable. The list includes the current directory, and the standard library path.

EXEC sp_execute_external_script
  @language =N'Python',
  @script=N'import sys; print("\n".join(sys.path))'

Results

C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\python35.zip
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\DLLs
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\Sphinx-1.5.4-py3.5.egg
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\win32
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\win32\lib
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\Pythonwin
C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\PYTHON_SERVICES\lib\site-packages\setuptools-27.2.0-py3.5.egg

For more information about the variable sys.path and how it is used to set the interpreter's search path for modules, see the Python documentation

 

 


Updated: 2018-04-12           (Previous | Next)

 

  1. Launch the separate Windows-based installer for either R Server or Machine Learning Server.

  2. Select the languages that you wish to update, and select the Pre-trained Models option.

    [!TIP] If you have previously run the installer to update R Server (Standalone), and just want to add the pre-trained models, leave all previous selections as is, and select just the Pre**-trained Models** option. Do not deselect any previously selected options; if you do so, the installer removes the components.

    We recommend that you accept default settings for the model locations.

  3. Click Continue.

  4. Accept all other prompts, including license agreements.

After installation is complete, you must perform some additional steps to register the pre-trained models.

  1. Open a Windows command prompt as administrator.

  2. Navigate to the setup bootstrap folder for R Server (Standalone), which also contains the Microsoft R installer.

  3. Run RSetup.exe and indicate the component to install, the version, and the folder containing the model source files, using this syntax:

    RSetup.exe /install /component MLM /version <version> /language 1033 /destdir "~\R_SERVER\library\MicrosoftML\mxLibs\x64"

    The following values are supported for the version parameter:

 

 


Updated: 2018-04-12           (Previous | Next)

 

R tools

When you install R with SQL Server, you get the same R tools that are installed with any base installation of R, such as RGui, Rterm, and so forth. Therefore technically, you have all the tools you need to develop and test R code.

The following standard R tools are included in a base installation of R, and therefore are installed by default.

  • RTerm: A command-line terminal for running R scripts

  • RGui.exe: A simple interactive editor for R. The command-line arguments are the same for RGui.exe and RTerm.

  • RScript: A command-line tool for running R scripts in batch mode.

To locate these tools, determine the R library that was installed when you set up SQL Server or the standalone machine learning feature. For example, in a default installation, the R tools are located in these folders:

  • SQL Server 2016 R Services: ~\Program Files\Microsoft SQL Server\MSSQL13.<instancename>\R_SERVICES\bin\x64
  • Microsoft R Server Standalone: ~\Program Files\Microsoft R\R_SERVER\bin\x64
  • SQL Server 2017 Machine Learning Services: ~\Program Files\Microsoft SQL Server\MSSQL14.<instancename>\R_SERVICES\bin\x64
  • Machine Learning Server (Standalone): ~\Program Files\Microsoft\ML Server\R_SERVER\bin\x64

If you need help with the R tools, just open RGui, click Help, and select one of the options

Microsoft R Client

Microsoft R Client is a free download that gives you access to the RevoScaleR packages for development use. By installing R Client, you can create R solutions that can be run in all supported compute contexts, including SQL Server in-database analytics, and distributed R computing on Hadoop, Spark, or Linux using Machine Learning Server.

If you have already installed a different R development environment, such as RStudio, be sure to reconfigure the environment to use the libraries and executables provided by Microsoft R Client. By doing so you can use all the features of the RevoScaleR package, although performance will be limited.

 

 


Updated: 2018-04-12           (Previous | Next)

 

Choose the best language for the task. R is best for statistical computations that are difficult to implement using SQL. For set-based operations over data, leverage the power of {Included-Content-Goes-Here} to achieve maximum performance. Use the in-memory database engine for very fast computations over columns.

Step 4: Optimize your solution

When the model is ready to scale on enterprise data, the data scientist often works with the DBA or SQL developer to optimize processes such as:

  • Feature engineering
  • Data ingestion and data transformation
  • Scoring

Traditionally data scientists using R have had problems with both performance and scale, especially when using large dataset. That is because the common runtime implementation is single-threaded and can accommodate only those data sets that fit into the available memory on the local computer. Integration with SQl Server Machine Learning Services provides multiple features for better performance, with more data:

  • RevoScaleR: This R package contains implementations of some of the most popular R functions, redesigned to provide parallelism and scale. The package also includes functions that further boost performance and scale by pushing computations to the {Included-Content-Goes-Here} computer, which typically has far greater memory and computational power.

  • revoscalepy. This Python library, available in SQL Server 2017, implements the most popular functions in RevoScaleR, such as remote compute contexts, and many algorithms that support distributed processing.

Resources

  • [Performance Case Study]
  • [R and Data Optimization]

Step 5: Deploy and Consume

 

 


Updated: 2018-04-11           (Previous | Next)

 

  1. Note that the index values aren't output, even if you use the index to get specific values from the data.frame.

    Results

    ResultValue
    0.5
    2

Output values into data.frame using an index

Let's see how conversion to a data.frame works with our two series containing the results of simple math operations. The first has an index of sequential values generated by Python. The second uses an arbitrary index of string values.

  1. This example gets a value from the series that uses an integer index.
    EXECUTE sp_execute_external_script
    @language = N'Python',
    @script = N'
    import pandas as pd
    a = 1
    b = 2
    c = a/b
    d = a*b
    s = pandas.Series([c,d])
    print(s)
    df = pd.DataFrame(s, index=[1])
    OutputDataSet = df
    '
    WITH RESULT SETS (( ResultValue float ))

Remember that the auto-generated index starts at 0. Try using an out of range index value and see what happens.

  1. Now let's get a single value from the other data frame that has a string index.
    EXECUTE sp_execute_external_script
    @language = N'Python',
    @script = N'
    import pandas as pd
    a = 1
    b = 2
    c = a/b
    s = pandas.Series(c, index =["simple math example 1", "simple math example 2"])
    print(s)
    df = pd.DataFrame(s, index=["simple math example 1"])
    OutputDataSet = df
    '
    WITH RESULT SETS (( ResultValue float ))

Results

ResultValue
0.5

If you try to use a numeric index to get a value from this series, you get an error.

 

 


Updated: 2018-04-11           (Previous)

 

If you installed a pre-release version of SQL Server 2017, you should update to at least the RTM version. Later service releases have continued to expand and improve Python functionality. Some features of this tutorial might not work in early pre-release versions.

  • This example uses a predefined Python environment, named PYTEST_SQL_SERVER. The environment has been configured to contain revoscalepy and other required libraries.

    If you do not have an environment configured to run Python, you must do so separately. A discussion of how to create or modify Python environments is out of scope for this tutorial. For more information about how to set up a Python client that contains the correct libraries, see Install Python client and Link Python to tools.

Remote compute contexts and revoscalepy

This sample demonstrates the process of creating a Python model in a remote compute context, which lets you work from a client, but choose a remote environment, such as SQL Server, Spark, or Machine Learning Server, where the operations are actually performed. Using compute contexts makes it easier to write code once and deploy it to any supported environment.

To execute Python code in SQL Server requires the revoscalepy package. This is a special Python package provided by Microsoft, similar to the RevoScaleR package for the R language. The revoscalepy package supports the creation of compute contexts, and provides the infrastructure for passing data and models between a local workstation and a remote server. The revoscalepy function that supports in-database code execution is RxInSqlServer.

Similar articles about new or updated articles

This section lists very similar articles for recently updated articles in other subject areas, within our public GitHub.com repository: MicrosoftDocs/sql-docs.

Subject areas that do have new or recently updated articles

Subject areas that do not have any new or recently updated articles