title Setup and Configuration | Microsoft Docs
ms.custom
ms.date 03/30/2017
ms.prod sql-server-2016
ms.reviewer
ms.suite
ms.technology
r-services
ms.tgt_pltfrm
ms.topic article
author jeannt
ms.author jeannt
manager jhubbard

Set up Python Machine Learning Services (In-Database)

You install the components required for using Python by running the [!INCLUDEssNoVersion] setup wizard.

You can also use command-line options for SQL Server setup and specify the Python options, as described here.

Here is the overall process:

  • In setup, be sure to install the database engine. An instance of SQL Server is required to run Python scripts in-database.
  • Choose the Machine Learning Services feature, and select Python as the language.
  • After installation is complete, reconfigure the instance to allow execution of scripts that use an external executable.
  • You might need to make additional configuration changes to the secured account used to run Python. Each configuration change requires a restart of the Trusted Launchpad service.

Prerequisites

  • SQL Server vNext is required. Python integration is not supported on previous versions of SQL Server.

  • Additional prerequisites, such as .NET Core, are installed as part of the Python component setup.

  • We recommend that you do not install Machine Learning Services using both the R and Python options. To manage resources, it is probably easier to have separate instances used for R and Python.

  • The Shared Features section contains a separate installation option, Machine Learning Server (Standalone). We recommend that you do not install this on the same computer as a SQL Server instance that uses R Services or Python Services. Instead, install the Machine Learning Server (Standalone) on a separate computer that you will use for developing and testing your R solutions.

  • You cannot install Machine Learning with Python Services on a failover cluster, because the security mechanism used to isolate Python processes is not compatible with a Windows Server failover cluster environment. As a workaround, you can use replication to copy the necessary tables to a standalone SQL Server instance that uses Python Services, or you can install Machine Learning with Python Services on a standalone computer that uses Always On and is part of an availability group.

  • Side-by-side installation with other versions of Python is possible, because the SQL Server instance uses its own copy of the Anaconda distribution. However, running code using Python on the SQL Server computer outside of SQL Server can lead to various problems:

    • You use a different library and different executable and get different results than when running in SQL Server.
    • Python scripts running in external libraries cannot be managed by SQL Server, leading to resource contention.

Important

After setup is complete, be sure to complete the additional post-configuration steps as described here. You must enable SQL Server to use the Python executable and ensure that SQL Server can run Python jobs on your behalf.

Step 1: Install Machine Learning Services (In-Database) on SQL Server

  1. Run the setup wizard for SQL Server vNext.

    [!NOTE] Need to perform an unattended installation? See this section for command-line arguments.

  2. On the Installation tab, click New SQL Server stand-alone installation or add features to an existing installation.

  3. On the Feature Selection page, select these options:

    • Database Engine Services

      To use Python with SQL Server, you must install an instance of the database engine. You can use either a default or named instance.

    • Machine Learning Services (In-Database)

      This option installs the database services that support Python script execution.

    • Python

      Check this option to get the Python 3.5 executable and select libraries from the Anaconda distribution. We recommend that you install only one language per instance.

      [!NOTE] Do not select the option in Shared Features for Microsoft R Server (Standalone). Use this option in a separate installation if you need to add the Machine Learning components to a different computer that is used for R development, such as your data scientist's laptop.

      [image:ml-svcs-features-python-highlight.png]

  4. On the page, Consent to Install Python, click Accept.

    You must accept this license agreement to download the Python executable and the Python packages from Anaconda.

    [image:ml-svcs-license-python.PNG]

    [!NOTE]
    If the computer you are using does not have Internet access, you can pause setup at this point to download the installers separately as described here: Installing R Components without Internet Access

    Click Accept, wait until the Next button becomes active, and then click Next.

  5. On the Ready to Install page, verify that these selections are included, and click Install.

    • Database Engine Services
    • Machine Learning Services (In-Database)
    • Python
  6. When installation is complete, restart the computer.

Step 2: Enable Python script execution

  1. Open [!INCLUDEssManStudioFull]. If it is not already installed, you can rerun the SQL Server setup wizard to open a download link and install it.

  2. Connect to the instance where you installed Machine Learning Services, and run the following command:

    sp_configure    
    

    The value for the property, external scripts enabled, should be 0 at this point. That is because the feature is turned off by default, to reduce the surface area, and must be explicitly enabled by an administrator.

  3. To enable the external scripting feature that supports Python, run the following statement.

    EXEC sp_configure  'external scripts enabled', 1  
    RECONFIGURE WITH OVERRIDE 
    

Note that this is exactly the same process that is used to enable R, as the underlying extensibility feature supports both languages.

  4. Restart the SQL Server service for the [!INCLUDEssNoVersion] instance. Restarting the SQL Server service also automatically restarts the related [!INCLUDErsql_launchpad] service.

    You can restart the service using the Services panel in Control Panel, or by using SQL Server Configuration Manager.
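
As a rough sketch of this step, the batch below enables the option and then reads it back from the sys.configurations catalog view. It assumes you are connected with a login that is allowed to change server configuration. The in-use value typically stays at 0 until the SQL Server service has been restarted.

```sql
-- Enable the external scripting feature (requires rights to change server configuration).
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;
GO

-- Read the setting back:
--   value        = the configured value
--   value_in_use = the value the running instance currently applies
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'external scripts enabled';
GO
```

If the in-use value is still 0 after the restart, rerun the sp_configure statement and confirm that it completed without errors.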

Step 3: Verify that script execution works locally

Take a moment to verify that all components used for Python script execution are running.

  1. In SQL Server Management Studio, open a new Query window, and run the following command:

    exec sp_configure  'external scripts enabled'    
    

    The run_value should now be set to 1.

  2. Open the Services panel or SQL Server Configuration Manager and verify that the Launchpad service for your instance is running. If the Launchpad is not running, restart the service.

If you have installed multiple instances of SQL Server, using either R or Python, each instance has its own Launchpad service.

  3. If Launchpad is running, you should be able to run simple Python scripts like the following in [!INCLUDEssManStudioFull]:

    EXEC sp_execute_external_script  @language = N'Python',  
    @script = N'OutputDataSet = InputDataSet',    
    @input_data_1 = N'SELECT 1 AS col'  
    GO  
    

    Results

    col
    1

Troubleshooting

If this command is successful, you can run Python commands from SQL Server Management Studio, Visual Studio Code, or any other client that can send T-SQL statements to the server.
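
Once the basic check passes, a slightly larger script can confirm that data round-trips through Python as expected. The following is a minimal sketch, not an official sample: the column name doubled and the two-row input query are made up for illustration. It relies on the input query arriving in Python as a pandas data frame named InputDataSet, and on WITH RESULT SETS to name and type the returned columns.

```sql
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
# InputDataSet is the pandas data frame built from @input_data_1.
df = InputDataSet
# Add a computed column (illustrative name) to show that Python code ran.
df["doubled"] = df["col"] * 2
# Whatever is assigned to OutputDataSet is returned to SQL Server.
OutputDataSet = df
',
    @input_data_1 = N'SELECT 1 AS col UNION ALL SELECT 2'
WITH RESULT SETS ((col INT NOT NULL, doubled INT NOT NULL));
GO
```

The Python code inside the string must start at the left margin, because Python treats leading whitespace as significant.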

To get started with some simple examples, and learn the basics of how R works with SQL Server, see Using R Code in Transact-SQL.

If you get an error, see this article for a list of some common problems:

Step 4: Enable Implied Authentication for Launchpad Accounts

During setup, a number of new Windows user accounts are created for the purpose of running tasks under the security token of the [!INCLUDErsql_launchpad_md] service. When a user sends a Python or R script from an external client, [!INCLUDEssNoVersion] will activate an available worker account, map it to the identity of the calling user, and run the script on behalf of the user.

This is called implied authentication, and is a service of the database engine that supports secure execution of external scripts in SQL Server 2016 and SQL Server vNext.

You can view these accounts in the Windows user group, SQLRUserGroup. By default, 20 worker accounts are created, which is typically more than enough for running external script jobs.

Important

The worker group is named SQLRUserGroup regardless of the type of script you are running. There is a single group for each instance.

If you need to run R or Python scripts from a remote data science client and are using Windows authentication, these worker accounts must be given permission to log in to the [!INCLUDEssNoVersion] instance on your behalf.

  1. In [!INCLUDEssManStudioFull], in Object Explorer, expand Security, right-click Logins, and select New Login.
  2. In the Login - New dialog box, click Search.
  3. Click Object Types and select Groups. Deselect everything else.
  4. In Enter the object name to select, type SQLRUserGroup and click Check Names.
  5. The name of the local group associated with the instance's Launchpad service should resolve to something like instancename\SQLRUserGroup. Click OK.
  6. By default, the login is assigned to the public role and has permission to connect to the database engine.
  7. Click OK.
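
If you prefer a script to the dialog box, the same login can be created in T-SQL. This is a sketch under one assumption: instancename\SQLRUserGroup is only the placeholder form shown in step 5, so substitute the exact group name that resolved for your instance.

```sql
-- Create a Windows login for the Launchpad worker group.
-- [instancename\SQLRUserGroup] is a placeholder; use the name resolved in step 5.
CREATE LOGIN [instancename\SQLRUserGroup] FROM WINDOWS;
GO

-- Confirm that the login now exists on the server.
SELECT name, type_desc
FROM sys.server_principals
WHERE name LIKE N'%SQLRUserGroup%';
GO
```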

Note

If you use a SQL login for running scripts in a SQL Server compute context, this extra step is not required.

Step 5: Give Non-Admin Users Script Permissions

If you installed [!INCLUDEssNoVersion] yourself and are running scripts in your own instance, you are typically executing scripts as an administrator and thus have implicit permission over various operations and all data in the database, as well as the ability to install new packages as needed.

However, in an enterprise scenario, most users, including those accessing the database using SQL logins, do not have such elevated permissions. Therefore, for each user that will be running external scripts, you must grant the user permissions to run scripts in each database where R or Python will be used.

USE <database_name>  
GO  
GRANT EXECUTE ANY EXTERNAL SCRIPT  TO [UserName]  

Note that permissions are agnostic to the supported script language. Currently there are not separate permission levels for R vs. Python script. For this reason, and to enable easier resource management, we recommend that you install R and Python on separate instances.
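
As a worked example of the grant above, the following sketch maps a login to a database user, grants the permission, and then checks it from the user's own security context. The names ML_Samples and MySQLLogin are placeholders, and the sketch assumes the server login already exists.

```sql
USE ML_Samples;
GO

-- Map the existing server login to a user in this database (placeholder names).
CREATE USER [MySQLLogin] FOR LOGIN [MySQLLogin];
GO

-- Allow that user to run external scripts in this database.
GRANT EXECUTE ANY EXTERNAL SCRIPT TO [MySQLLogin];
GO

-- Verify the grant by impersonating the user and listing its database permissions.
EXECUTE AS USER = 'MySQLLogin';
SELECT permission_name
FROM fn_my_permissions(NULL, 'DATABASE')
WHERE permission_name = 'EXECUTE ANY EXTERNAL SCRIPT';
REVERT;
GO
```

The final query returns one row when the permission is in effect and no rows otherwise.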

Additional Configuration Options

This section describes additional changes that you can make in the configuration of the instance, or of your data science client, to support Python script execution.

Modify the number of worker accounts

If you expect many users to be running scripts concurrently, you can increase the number of worker accounts that are assigned to the Launchpad service. For more information, see Modify the User Account Pool for SQL Server R Services.

Give your users read, write, or DDL permissions as needed in additional databases

While running scripts, the user account or SQL login might need to read data from other databases, create new tables to store results, and write data into tables.

For each user account or SQL login that will be executing R or Python scripts, be sure that the account or login has db_datareader, db_datawriter, or db_ddladmin permissions on the specific database.

For example, the following [!INCLUDEtsql] statement gives the SQL login MySQLLogin the rights to run T-SQL queries in the ML_Samples database. To run this statement, the SQL login must already exist in the security context of the server.

USE ML_Samples  
GO  
EXEC sp_addrolemember 'db_datareader', 'MySQLLogin'  

For more information about the permissions included in each role, see Database-Level Roles.
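
The sp_addrolemember procedure is deprecated, so on current versions you might prefer ALTER ROLE. The sketch below uses the same placeholder names as the example above and assumes a database user named MySQLLogin already exists in ML_Samples. Add only the roles the user actually needs.

```sql
USE ML_Samples;
GO

-- Read data from tables and views.
ALTER ROLE db_datareader ADD MEMBER [MySQLLogin];

-- Write results back to tables (only if the scripts save data).
ALTER ROLE db_datawriter ADD MEMBER [MySQLLogin];

-- Create new tables for results (only if the scripts create objects).
ALTER ROLE db_ddladmin ADD MEMBER [MySQLLogin];
GO
```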

Create an ODBC data source for the instance on your data science client

If you create a machine learning solution on a data science client computer and need to run code using the SQL Server computer as the compute context, you can use either a SQL login, or integrated Windows authentication.

  • For SQL logins: Ensure that the login has appropriate permissions on the database where you will be reading data. You can do this by granting CONNECT and SELECT permissions, or by adding the login to the db_datareader role. To create objects, add the login to the db_ddladmin role. To save data to tables, add the login to the db_datawriter role.

  • For Windows authentication: You must configure an ODBC data source on the data science client that specifies the instance name and other connection information. For more information, see Using the ODBC Data Source Administrator.

Optimize the Server for Python Use

The default settings for [!INCLUDEssNoVersion] setup are intended to balance server resources across the variety of services supported by the database engine, which might include ETL processes, reporting, auditing, and applications that use [!INCLUDEssNoVersion] data. Therefore, under the default settings you might find that resources for running Python are sometimes restricted or throttled, particularly in memory-intensive operations.

To ensure that Python jobs are prioritized and resourced appropriately, we recommend that you use Resource Governor to configure an external resource pool specific for Python.
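
A minimal sketch of such a configuration follows. The pool name python_pool, the group name python_group, and the percentage limits are illustrative assumptions, not recommended values, and the sketch assumes an edition that includes Resource Governor.

```sql
-- Create an external resource pool that caps the CPU and memory
-- available to external script processes (illustrative limits).
CREATE EXTERNAL RESOURCE POOL python_pool
WITH (MAX_CPU_PERCENT = 50, MAX_MEMORY_PERCENT = 40);
GO

-- Create a workload group that uses the default internal pool for the
-- SQL side of a request and the new external pool for the script side.
CREATE WORKLOAD GROUP python_group
USING "default", EXTERNAL python_pool;
GO

-- Apply the changes.
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
```

Sessions are routed to the new workload group only after you register a classifier function that returns its name. Until then, requests continue to use the default group and pools.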

You might also wish to change the amount of memory allocated to the [!INCLUDEssNoVersion] database engine, or increase the number of accounts running under the [!INCLUDErsql_launchpad] service.

If you are using Standard Edition and do not have Resource Governor, you can use DMVs and Extended Events, as well as Windows event monitoring, to help you manage server resources. For more information, see Monitoring and Managing R Services.
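
As a starting point for DMV-based monitoring, the sketch below lists the external script requests that are currently running and the cumulative execution counters. It assumes the external script DMVs are present on your instance and that your login has permission to view server state.

```sql
-- External script requests that are currently running.
SELECT r.session_id,
       r.language,
       r.external_user_name,
       s.login_name
FROM sys.dm_external_script_requests AS r
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id;
GO

-- Cumulative counters for external script executions on this instance.
SELECT *
FROM sys.dm_external_script_execution_stats;
GO
```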

See Also