Skip to content

Latest commit

 

History

History
289 lines (177 loc) · 19.9 KB

File metadata and controls

289 lines (177 loc) · 19.9 KB
title Setup and configuration for Python Machine Learning Services | Microsoft Docs
ms.custom
ms.date 07/31/2017
ms.prod sql-server-2016
ms.reviewer
ms.suite
ms.technology
r-services
ms.tgt_pltfrm
ms.topic article
author jeannt
ms.author jeannt
manager jhubbard
ms.workload On Demand

Set up Python Machine Learning Services (In-Database)

You install the components required for Python by running the [!INCLUDEssNoVersion] setup wizard, and following the interactive prompts as described in this topic.

Machine learning options in SQL Server setup

Choose the Machine Learning Services feature, and select Python as the language.

The Shared Features section contains a separate installation option, Machine Learning Server (Standalone). This option supports operationalization of Python code on a server that does not have SQL Server, or that does not require use of SQL Server compute contexts. Thus, we recommend that you do not install this on the same computer as a SQL Server instance. Instead, install Machine Learning Server (Standalone) on a separate computer.

After the installation is complete, reconfigure the instance to allow execution of scripts that use an external executable. You might need to make additional changes to the server to support machine learning workloads. Configuration changes generally require a restart of the instance, or a restart of the Launchpad service.

Prerequisites

  • SQL Server 2017 is required. Python integration is not supported on previous versions of SQL Server.

  • Be sure to install the database engine. An instance of SQL Server is required to run Python scripts in-database.

  • Prerequisites are installed as part of the Python component setup.

  • You cannot install machine learning with Python services on a failover cluster. The security mechanism used for isolating Python processes is not compatible with a Windows Server failover cluster environment.

    As a workaround, you can use replication to copy necessary tables to a standalone SQL Server instance that uses Python services. Alternatively, you can install machine learning with Python services on a standalone computer that uses the AlwaysOn setting, and is part of an availability group.

  • Side-by-side installation with other versions of Python is possible, because the SQL Server instance uses its own copy of the Anaconda distribution. However, running code that uses Python on the SQL Server computer outside SQL Server can lead to various problems:

    • You use a different library and different executable, and get different results, than you do when you are running in SQL Server.
    • Python scripts running in external libraries cannot be managed by SQL Server, leading to resource contention.

Important

After setup is complete, be sure to complete the additional post-configuration steps described in this topic. These include enabling SQL Server to use external scripts, and adding accounts required for SQL Server to run Python jobs on your behalf.

Unattended installation

To perform an unattended installation, use the command-line options for SQL Server setup and the arguments specific to Python. For more information, see Unattended installs of SQL Server with Python Machine Learning Services.

Step 1: Install Machine Learning Services (In-Database) on SQL Server

  1. Run the setup wizard for SQL Server 2017.

  2. On the Installation tab, select New SQL Server stand-alone installation or add features to an existing installation.

  3. On the Feature Selection page, select these options:

    • Database Engine Services

      To use Python with SQL Server, you must install an instance of the database engine. You can use either a default or a named instance.

    • Machine Learning Services (In-Database)

      This option installs the database services that support Python script execution.

    • Python Check this option to get the Python 3.5 executable and select libraries from the Anaconda distribution. Install only one language per instance.

      [!NOTE] Do not select the option in Shared Features for Microsoft R Server (Standalone). Use this option in a separate installation if you need to add the machine learning components to a different computer that is used for R development. For example, this might be useful for your data scientist's laptop.

      Setup options for Python

  4. On the Consent to Install Python page, select Accept.

    This license agreement is required to download the Python executable, Python packages from Anaconda.

    Agreement to Python license

    [!NOTE] If the computer you are using does not have internet access, you can pause setup at this point to download the installers separately. For more information, see Installing components without internet access.

    Select Accept, wait until the Next button becomes active, and then select Next.

  5. On the Ready to Install page, verify that these selections are included, and select Install.

    • Database Engine Services
    • Machine Learning Services (In-Database)
    • Python

    These selections represent the minimum configuration required to use Python with SQL Server.

    Ready to install Python

    Optionally, make a note of the location of the folder under the path ..\Setup Bootstrap\Log where the configuration files are stored. When setup is complete, you can review the installed components in the Summary file.

  6. When installation is complete, restart the computer.

Step 2: Enable Python script execution

  1. Open [!INCLUDEssManStudioFull]. If it is not already installed, you can run the SQL Server setup wizard again to open a download link and install it.

  2. Connect to the instance where you installed Machine Learning Services, and run the following command:

    sp_configure

    The value for the property, external scripts enabled, should be 0 at this point. That is because the feature is turned off by default. The feature must be explicitly enabled by an administrator before you can run R or Python scripts.

  3. To enable the external scripting feature that supports Python, run the following statement:

    EXEC sp_configure  'external scripts enabled', 1
    RECONFIGURE WITH OVERRIDE

    This is exactly the same process that is used to enable R, because the underlying extensibility feature supports both languages.

  4. Restart the SQL Server service for the [!INCLUDEssNoVersion] instance. Restarting the SQL Server service also automatically restarts the related [!INCLUDErsql_launchpad] service.

    You can restart the service by using the Services panel in Control Panel, or by using SQL Server Configuration Manager.

Step 3: Verify that the external script execution feature is running

Take a moment to verify that all components used to launch the Python script are running.

  1. In SQL Server Management Studio, open a new query window, and run the following command:

    EXEC sp_configure  'external scripts enabled'

    The run_value should now be set to 1.

  2. Open the Services panel or SQL Server Configuration Manager, and verify that the Launchpad service for your instance is running. If the Launchpad is not running, restart the service.

    If you have installed multiple instances of SQL Server, any instance that has either R or Python enabled has its own Launchpad service.

    However, if you install R and Python on a single instance, only one Launchpad is installed. A separate, language-specific launcher DLL is added for each language. For more information, see Components to support Python integration.

  3. If Launchpad is running, you should be able to run simple Python scripts like the following in [!INCLUDEssManStudioFull]:

    EXEC sp_execute_external_script  @language =N'Python',
    @script=N'OutputDataSet=InputDataSet',
    @input_data_1 = N'SELECT 1 AS col'

    Results

       1

Note

Columns or headings used in the Python script are not returned, by design. To add column names for your output, you must specify the schema for the return data set. Do this by using the WITH RESULTS parameter of the stored procedure, naming the columns and specifying the SQL data type. For example, you can add the following line to generate an arbitrary column name: WITH RESULT SETS ((Col1 AS int))

Step 4: Additional configuration

If the previous command was successful, you can run Python commands from SQL Server Management Studio, Visual Studio Code, or any other client that can send T-SQL statements to the server.

If you got an error when running the command, review the following list. You might need to make additional appropriate configurations to the service or database.

Note

Not all the listed changes are required, and none might be required. Requirements depend on your security schema, where you installed SQL Server, and how you expect users to connect to the database and run external scripts.

Enable implied authentication for Launchpad account group

During setup, a number of new Windows user accounts are created for the purpose of running tasks under the security token of the [!INCLUDErsql_launchpad_md] service. When a user sends a Python or R script from an external client, [!INCLUDEssNoVersion] activates an available worker account. Then it maps it to the identity of the calling user, and runs the script on behalf of the user.

This is called implied authentication, and is a service of the database engine. This service supports secure execution of external scripts in SQL Server 2016 and SQL Server 2017.

You can view these accounts in the Windows user group, SQLRUserGroup. By default, 20 worker accounts are created, which is usually more than enough for running external script jobs.

Important

The worker group is named SQLRUserGroup regardless of the type of script you are running. There is a single group for each instance.

If you need to run R scripts from a remote data science client, and you are using Windows authentication, there are additional considerations. These worker accounts must be given permission to sign in to the [!INCLUDEssNoVersion] instance on your behalf.

  1. In [!INCLUDEssManStudioFull], in Object Explorer, expand Security. Then right-click Logins, and select New Login.
  2. In the Login - New dialog box, select Search.
  3. Select Object Types, and select Groups. Clear everything else.
  4. In Enter the object name to select, type SQLRUserGroup, and select Check Names.
  5. The name of the local group associated with the instance's Launchpad service should resolve to something like instancename\SQLRUserGroup. Select OK.
  6. By default, the sign-in is assigned to the public role, and has permission to connect to the database engine.
  7. Select OK.

Note

If you use a SQL sign-in for running scripts in a SQL Server compute context, this extra step is not required.

Give users permission to run external scripts

If you installed [!INCLUDEssNoVersion] yourself, and you are running Python scripts in your own instance, you typically execute scripts as an administrator. Thus, you have implicit permission over various operations and all data in the database.

Most users, however, do not have such elevated permissions. For example, users in an organization who use SQL sign-ins to access the database generally do not have elevated permissions. Therefore, for each user who is using Python, you must grant users of Machine Learning Services the permission to run external scripts in each database where Python is used. Here's how:

USE <database_name>
GO
GRANT EXECUTE ANY EXTERNAL SCRIPT  TO [UserName]

Note

Permissions are not specific to the supported script language. In other words, there are not separate permission levels for R script versus Python script. If you need to maintain separate permissions for these languages, install R and Python on separate instances.

Give your users read, write, or data definition language (DDL) permissions to databases

While a user is running scripts, the user account or SQL sign-in might need to read data from other databases. The user account or SQL sign-in might also need to create new tables to store results, and write data into tables.

For each user account or SQL sign-in that is running R or Python scripts, ensure that it has db_datareader, db_datawriter, or db_ddladmin permissions on the specific database.

For example, the following [!INCLUDEtsql] statement gives the SQL sign-in MySQLLogin the rights to run T-SQL queries in the ML_Samples database. To run this statement, the SQL sign-in must already exist in the security context of the server.

USE ML_Samples
GO
EXEC sp_addrolemember 'db_datareader', 'MySQLLogin'

For more information about the permissions included in each role, see Database-level roles.

Ensure that the SQL Server installation supports remote connections

If you cannot connect from a remote computer, check whether the firewall allows access to SQL Server. In a default installation, remote connections might be disabled, or the specific port used by SQL Server might be blocked by the firewall. For more information, see Configure Windows Firewall for database engine access.

Create an ODBC data source for the instance on your data science client

You might create a machine learning solution on a data science client computer. If you need to run code by using the SQL Server computer as the compute context, you have two options. You can use a SQL sign-in or an integrated Windows authentication.

  • For SQL sign-ins: Ensure that the sign-in has appropriate permissions on the database where you are reading data. You can do this by adding Connect to and SELECT permissions, or by adding the sign-in to the db_datareader role. Sign-ins that create objects need DDL_admin rights. Sign-ins that must save data to tables should be added to the db_datawriter role.

  • For Windows authentication: You might need to create an ODBC data source on the data science client that specifies the instance name and other connection information. For more information, see ODBC data source administrator.

Additional optimizations

Now that you have everything working, you might also want to optimize the server to support machine learning, or install pretrained models.

Add more worker accounts

If you expect many users to be running scripts concurrently, you can increase the number of worker accounts that are assigned to the Launchpad service. For more information, see Modify the user account pool for SQL Server R Services.

Optimize the server for script execution

The default settings for [!INCLUDEssNoVersion] setup optimize the balance of the server for a variety of services. These services include ETL processes, reporting, auditing, and applications that use [!INCLUDEssNoVersion] data.

If you use the default settings, you might find that resources for running external scripts are restricted or throttled, particularly in memory-intensive operations. If machine learning is a priority, change the default database settings to ensure that external script jobs are prioritized and resourced appropriately. These changes can include:

  • Reducing the amount of memory allocated to the [!INCLUDEssNoVersion] database engine.
  • Increasing the number of accounts running under the [!INCLUDErsql_launchpad] service. This does not increase the number of resources, but does increase the number of scripts that can run concurrently.

If you have SQL Server Enterprise Edition, use resource governor to configure an external resource pool for Python. For more information, see the following articles:

If you are using SQL Server Standard Edition and do not have resource governor, you can use dynamic management views and extended events to help you manage server resources. You can also use Windows event monitoring for this purpose. For more information, see Monitoring and managing R Services.

Upgrade the machine learning components

When you install Machine Learning Services by using SQL Server 2017, you get the version of the components at the time the release was published. Each time you patch or upgrade the SQL Server instance, the machine learning components are upgraded as well.

You can upgrade the machine learning components on a faster schedule than is supported by SQL Server releases, by installing Microsoft Machine Learning Server. When you do so, you also get any new features supported in the latest release of Machine Learning Server, such as:

For information about how to upgrade an instance, see Upgrade R components through binding.

Note

The current release version contains the latest version of all machine learning components. Therefore, although upgrades via Microsoft Machine Learning Server are supported for SQL Server 2017, the upgrade that is currently available applies only to SQL Server 2016 instances.

Tutorials

See the following tutorials for some examples of how you can use Python with SQL Server to build and deploy machine learning solutions:

Using Python in T-SQL

Create a Python model using revoscalepy