| title | Install SQL Server Machine Learning Services (R, Python, Java) on Linux | Microsoft Docs |
|---|---|
| description | This article describes how to install SQL Server Machine Learning Services (R, Python, Java) on Red Hat and Ubuntu. |
| author | HeidiSteen |
| ms.author | heidist |
| manager | cgronlun |
| ms.date | 09/24/2018 |
| ms.topic | conceptual |
| ms.prod | sql |
| ms.component | |
| ms.suite | sql |
| ms.custom | sql-linux |
| ms.technology | machine-learning |
| monikerRange | >=sql-server-ver15||>=sql-server-linux-ver15||=sqlallproducts-allversions |
SQL Server Machine Learning Services runs on Linux operating systems starting in this preview release of SQL Server 2019. Follow the steps in this article to install the Java programming extension, or the machine learning extensions for R and Python.
Machine learning and programming extensions are an add-on to the database engine. Although you can install the database engine and Machine Learning Services concurrently, it's a best practice to install and configure the SQL Server database engine first so that you can resolve any issues before adding more components.
Packages for the R, Python, and Java extensions are located in the SQL Server Linux source repositories. If you already configured source repositories for the database engine install, you can run the mssql-mlservices package install commands using the same repo registration.
- The Linux operating system must be supported by SQL Server, running on premises or in a Docker container.
- SQL Server 2019 Database Engine instance on:
- For R only, install Microsoft R Open as a prerequisite to the mssql-mlservices R package providing the combination of R features you require.
If you are using R on SQL Server for Linux, Microsoft's base distribution of R is a prerequisite of the mssql-mlservices R packages.
The following commands register the repository providing MRO. When the repository is registered, the commands for installing the mssql-mlservices R packages will automatically install MRO as a package dependency.
# Set the location of the package repo at the "prod" directory containing the distribution.
# This example specifies 16.04. Replace with 18.04 if you want that version.
wget https://packages.microsoft.com/config/ubuntu/16.04/packages-microsoft-prod.deb
# Register the repo
dpkg -i packages-microsoft-prod.deb

# Set the location of the package repo at the "prod" directory
rpm -Uvh https://packages.microsoft.com/config/rhel/7/packages-microsoft-prod.rpm

# Set the location of the package repo
zypper ar -f https://packages.microsoft.com/sles/12/prod packages-microsoft-com

On an internet-connected device, packages are downloaded and installed independently of the database engine using the package installer for each operating system. The following table describes all available packages, but you only need one R or Python package to get a specific combination of features.
Package names are "root" names. Fully-qualified package names include version information that varies by package installer. See the commands in the following sections for package references that work in an install command.
| Package name | Applies to | Description |
|---|---|---|
| mssql-server-extensibility | All | Extensibility framework used to run R, Python, or Java code. |
| mssql-server-extensibility-java | Java | Java extension for loading a Java execution environment. There are no additional libraries or packages for Java. |
| microsoft-openmpi | All | Message passing interface used by the Revo* libraries for parallelization on Linux. |
| mssql-mlservices-python | Python | Open-source distribution of Anaconda and Python. |
| mssql-mlservices-mlm-py | Python | Full install. Provides revoscalepy, microsoftml, pre-trained models for image featurization and text sentiment analysis. |
| mssql-mlservices-mml-py | Python | Partial install. Provides revoscalepy, microsoftml. Excludes pre-trained models. |
| mssql-mlservices-packages-py | Python | Partial install. Provides revoscalepy. Excludes pre-trained models and microsoftml. |
| mssql-mlservices-mlm-r | R | Full install. Provides RevoScaleR, MicrosoftML, sqlRUtils, olapR, pre-trained models for image featurization and text sentiment analysis. |
| mssql-mlservices-mml-r | R | Partial install. Provides RevoScaleR, MicrosoftML, sqlRUtils, olapR. Excludes pre-trained models. |
| mssql-mlservices-packages-r | R | Partial install. Provides RevoScaleR, sqlRUtils, olapR. Excludes pre-trained models and MicrosoftML. |
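The table's naming scheme can be captured in a small helper. The following is a minimal sketch, not part of the product: the function name `mlservices_package` and the level keywords `full`, `ml`, and `minimal` are labels made up for illustration. It only prints the package root name and installs nothing.

```shell
# Map a language (r | python) and a feature level (full | ml | minimal)
# to the package root name listed in the table above.
# The function name and level keywords are illustrative, not official options.
mlservices_package() {
  local language="$1" level="$2" suffix
  case "$language" in
    r)      suffix="r" ;;
    python) suffix="py" ;;
    *) echo "unknown language: $language" >&2; return 1 ;;
  esac
  case "$level" in
    full)    echo "mssql-mlservices-mlm-$suffix" ;;       # libraries plus pre-trained models
    ml)      echo "mssql-mlservices-mml-$suffix" ;;       # libraries, no pre-trained models
    minimal) echo "mssql-mlservices-packages-$suffix" ;;  # core packages only
    *) echo "unknown level: $level" >&2; return 1 ;;
  esac
}

mlservices_package r full       # prints mssql-mlservices-mlm-r
mlservices_package python ml    # prints mssql-mlservices-mml-py
```

The output is still a root name; the fully-qualified file name depends on the package installer, as noted above.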
Install the extensibility framework, followed by any one R package, plus any one Python package, and Java if you want that capability.
Tip
If possible, run yum clean all to refresh packages on the system prior to installation.
Includes the extensibility framework, R, Python, Java, with machine learning libraries and pre-trained models for both R and Python. For R and Python, if you want something in between full and minimum install - such as machine learning libraries but without the pre-trained models - substitute the R or Python package providing the feature combination you want.
# Install as root or sudo
# Add everything
sudo yum install microsoft-openmpi-3.0.0-x86_64.rpm
sudo yum install mssql-server-extensibility-15.0.1000.xxxx-y.x86_64.rpm
sudo yum install mssql-server-extensibility-java-15.0.1000.xxxx-y.x86_64.rpm
sudo yum install mssql-mlservices-mlm-py-9.4.5.x86_64.rpm
sudo yum install mssql-mlservices-mlm-r-9.4.5.x86_64.rpm
sudo yum install mssql-mlservices-python-9.4.5.x86_64.rpm
Includes the extensibility framework, and core libraries and extensions. Excludes pre-trained models and machine learning libraries for R and Python.
# Install as root or sudo
# Minimum install of R, Python, Java
sudo yum install microsoft-openmpi-3.0.0-x86_64.rpm
sudo yum install mssql-server-extensibility-15.0.1000.xxxx-y.x86_64.rpm
sudo yum install mssql-server-extensibility-java-15.0.1000.xxxx-y.x86_64.rpm
sudo yum install mssql-mlservices-packages-py-9.4.5.x86_64.rpm
sudo yum install mssql-mlservices-packages-r-9.4.5.x86_64.rpm
sudo yum install mssql-mlservices-python-9.4.5.x86_64.rpm
Install the extensibility framework, followed by any one R package, plus any one Python package, and Java if you want that capability.
Tip
If possible, run apt-get update to refresh packages on the system prior to installation. Additionally, some docker images of Ubuntu might not have the https apt transport option. To install it, use apt-get install apt-transport-https.
Includes the extensibility framework, R, Python, Java, with machine learning libraries and pre-trained models for both R and Python. For R and Python, if you want something in between full and minimum install - such as machine learning libraries but without the pre-trained models - substitute the R or Python package providing the feature combination you want.
# Install as root or sudo
# Add everything
sudo apt-get install microsoft-openmpi_3.0.0-1_amd64.deb
sudo apt-get install mssql-server-extensibility-15.0.1000.xxxx-y_amd64.deb
sudo apt-get install mssql-server-extensibility-java-15.0.1000.xxxx-y_amd64.deb
sudo apt-get install mssql-mlservices-mlm-r_9.4.5.19_amd64.deb
sudo apt-get install mssql-mlservices-mlm-py_9.4.5.19_amd64.deb
sudo apt-get install mssql-mlservices-python_9.4.5.19_amd64.deb
Includes the extensibility framework, and core libraries and extensions. Excludes pre-trained models and machine learning libraries for R and Python.
# Install as root or sudo
# Minimum install of R, Python, Java
sudo apt-get install microsoft-openmpi_3.0.0-1_amd64.deb
sudo apt-get install mssql-server-extensibility-15.0.1000.xxxx-y_amd64.deb
sudo apt-get install mssql-server-extensibility-java-15.0.1000.xxxx-y_amd64.deb
sudo apt-get install mssql-mlservices-packages-r_9.4.5.19_amd64.deb
sudo apt-get install mssql-mlservices-packages-py_9.4.5.19_amd64.deb
sudo apt-get install mssql-mlservices-python_9.4.5.19_amd64.deb
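The Red Hat and Ubuntu commands above differ mainly in the package manager and the file-name pattern. As a rough sketch, the helper below prints (but does not run) an install command for one package. The function name `install_command` is hypothetical, and the file-name patterns are inferred from the mssql-mlservices examples in this article, so treat them as an assumption and check the names of the files you actually have.

```shell
# Print (do not run) the install command for one package file.
# Arguments: package manager (yum | apt-get | zypper), root name, version.
# The file-name patterns below are inferred from this article's examples
# and are an assumption, not a documented naming rule.
install_command() {
  local manager="$1" root="$2" version="$3" file
  case "$manager" in
    yum|zypper) file="${root}-${version}.x86_64.rpm" ;;
    apt-get)    file="${root}_${version}_amd64.deb" ;;
    *) echo "unsupported package manager: $manager" >&2; return 1 ;;
  esac
  echo "sudo $manager install $file"
}

install_command yum mssql-mlservices-mlm-r 9.4.5
install_command apt-get mssql-mlservices-mlm-r 9.4.5.19
```

Printing the command first makes it easy to review the exact file name before anything is installed.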
Install the extensibility framework, followed by any one R package, plus any one Python package, and Java if you want that capability.
Includes the extensibility framework, R, Python, Java, with machine learning libraries and pre-trained models for both R and Python. For R and Python, if you want something in between full and minimum install - such as machine learning libraries but without the pre-trained models - substitute the R or Python package providing the feature combination you want.
# Install as root or sudo
# Add everything
sudo zypper install microsoft-openmpi-3.0.0-x86_64.rpm
sudo zypper install mssql-server-extensibility-15.0.1000.xxxx-y.x86_64.rpm
sudo zypper install mssql-server-extensibility-java-15.0.1000.xxxx-y.x86_64.rpm
sudo zypper install mssql-mlservices-python-9.4.5.x86_64.rpm
sudo zypper install mssql-mlservices-mlm-r-9.4.5.x86_64.rpm
sudo zypper install mssql-mlservices-mlm-py-9.4.5.x86_64.rpm
Includes the extensibility framework, and core libraries and extensions. Excludes pre-trained models and machine learning libraries for R and Python.
# Install as root or sudo
# Minimum install of R, Python, Java
sudo zypper install microsoft-openmpi-3.0.0-x86_64.rpm
sudo zypper install mssql-server-extensibility-15.0.1000.xxxx-y.x86_64.rpm
sudo zypper install mssql-server-extensibility-java-15.0.1000.xxxx-y.x86_64.rpm
sudo zypper install mssql-mlservices-packages-py-9.4.5.x86_64.rpm
sudo zypper install mssql-mlservices-packages-r-9.4.5.x86_64.rpm
sudo zypper install mssql-mlservices-python-9.4.5.x86_64.rpm
Additional configuration is primarily through the mssql-conf tool.
- Accept the licensing agreements for open-source R and Python. There are several ways to do this. If you previously accepted SQL Server licensing and are now adding the R or Python extensions, the following command is your consent to their terms:
# Run as SUDO or root
# Use set + EULA
sudo /opt/mssql/bin/mssql-conf set EULA accepteulaml Y
Alternatively, if you have not yet accepted the SQL Server database engine licensing agreement, setup detects the mssql-mlservices packages and prompts for EULA acceptance when mssql-conf setup is run. For more information about EULA parameters, see Configure SQL Server with the mssql-conf tool.
- Enable external script execution.
EXEC sp_configure 'external scripts enabled', 1
RECONFIGURE WITH OVERRIDE
You should now be able to run external scripts on SQL Server with no restart required.
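If you prefer to script this step from the shell, one possible approach is to generate the T-SQL and pass it to sqlcmd. The sketch below assumes that sqlcmd is installed and that you have a sysadmin login; the function name `external_scripts_sql` and the `SA_PASSWORD` variable are hypothetical. The extra `SELECT` against `sys.configurations` is an addition for checking the setting and is not part of the step above.

```shell
# Emit the T-SQL that enables external scripts, followed by a query
# that reports the value currently in use.
external_scripts_sql() {
  cat <<'SQL'
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;
SELECT name, value_in_use FROM sys.configurations WHERE name = 'external scripts enabled';
SQL
}

# Assumed invocation (not run here); SA_PASSWORD is a placeholder variable:
# sqlcmd -S localhost -U SA -P "$SA_PASSWORD" -Q "$(external_scripts_sql)"
external_scripts_sql
```

A `value_in_use` of 1 in the result would indicate that external script execution is enabled.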
Execute the following SQL command to test R execution in SQL Server. If the script does not run, try a service restart: sudo systemctl restart mssql-server.
EXEC sp_execute_external_script
@language =N'R',
@script=N'
OutputDataSet <- InputDataSet',
@input_data_1 =N'SELECT 1 AS hello'
WITH RESULT SETS (([hello] int not null));
GO
Execute the following SQL command to test Python execution in SQL Server.
EXEC sp_execute_external_script
@language =N'Python',
@script=N'
OutputDataSet = InputDataSet;
',
@input_data_1 =N'SELECT 1 AS hello'
WITH RESULT SETS (([hello] int not null));
GO
You can install and configure the database engine and Machine Learning Services in one procedure by appending the extensibility, R, or Python packages and EULA parameters on a command that installs the database engine. The following example is a template illustration using the Yum package manager (replace package names with actual names that include version and file extensions valid for your package manager):
sudo yum install -y mssql-server mssql-server-extensibility mssql-server-extensibility-java mssql-mlservices-mlm-py mssql-mlservices-mlm-r
Using the unattended install for the Database Engine, add the packages for mssql-mlservices and EULAs.
Recall that Setup or the mssql-conf tool prompts for license agreement acceptance. For open-source R and Python components, accepting the mssql-mlservices EULA supplement is required for uninterrupted installation of the mssql-mlservices packages. This is in addition to the SQL Server EULA.
All possible permutations of EULA acceptance are documented in Configure SQL Server on Linux with the mssql-conf tool.
Locate the Machine Learning Services and extensibility package downloads in the Release notes. Follow the Offline installation instructions using the packages you obtained.
You can install other R and Python packages and use them in scripts that execute on SQL Server 2019.
1. Start an R session.
sudo /opt/mssql/mlservices/bin/R/R
2. Install an R package called glue to test package installation.
install.packages("glue",lib="/opt/mssql/mlservices/libraries/RServer")
Alternatively, you can install an R package from the command line.
sudo /opt/mssql/mlservices/bin/R/R CMD INSTALL -l /opt/mssql/mlservices/libraries/RServer glue_1.1.1.tar.gz
3. Import the R package in sp_execute_external_script.
EXEC sp_execute_external_script
@language = N'R',
@script = N'library(glue)'
1. Install a Python package called httpie using pip.
sudo /opt/mssql/mlservices/bin/python/python -m pip install httpie
2. Import the Python package in sp_execute_external_script.
EXEC sp_execute_external_script
@language = N'Python',
@script = N'import httpie'
The following limitations exist in this CTP release.
- Implied authentication is currently not available in Machine Learning Services on Linux, which means you cannot connect back to the server from an in-progress R or Python script to access data or other resources.
- CREATE EXTERNAL LIBRARY (for storing R packages in the database) is currently not available on Linux and does not support Python.
There is parity between Linux and Windows for Resource governance for external resource pools, but the statistics for sys.dm_resource_governor_external_resource_pools currently have different units on Linux. Units will align in an upcoming CTP.
| Column name | Description | Value on Linux |
|---|---|---|
| peak_memory_kb | The maximum amount of memory used for the resource pool. | On Linux, this statistic is sourced from the CGroups memory subsystem, where the value is memory.max_usage_in_bytes |
| write_io_count | The total write IOs issued since the Resource Governor statistics were reset. | On Linux, this statistic is sourced from the CGroups blkio subsystem, where the value on the write row is blkio.throttle.io_serviced |
| read_io_count | The total read IOs issued since the Resource Governor statistics were reset. | On Linux, this statistic is sourced from the CGroups blkio subsystem, where the value on the read row is blkio.throttle.io_serviced |
| total_cpu_kernel_ms | The cumulative CPU kernel time in milliseconds since the Resource Governor statistics were reset. | On Linux, this statistic is sourced from the CGroups cpuacct subsystem, where the value on the system row is cpuacct.stat |
| total_cpu_user_ms | The cumulative CPU user time in milliseconds since the Resource Governor statistics were reset. | On Linux, this statistic is sourced from the CGroups cpuacct subsystem, where the value on the user row is cpuacct.stat |
| active_processes_count | The number of external processes running at the moment of the request. | On Linux, this statistic is sourced from the CGroups pids subsystem, where the value is pids.current |
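Because these values come straight from CGroups files, it can help to see how such a file is read. The sketch below uses hypothetical helper names (`cpuacct_row`, `bytes_to_kb`) and made-up sample input rather than a live cgroup. It shows how a single row could be pulled from text in the `cpuacct.stat` layout, and how a byte count such as `memory.max_usage_in_bytes` might be converted to kilobytes.

```shell
# Extract one row ("user" or "system") from text in cpuacct.stat format,
# where each line is "<row> <value>". Reads from standard input.
cpuacct_row() {
  awk -v row="$1" '$1 == row { print $2 }'
}

# Convert a byte count, such as memory.max_usage_in_bytes, to whole kilobytes.
bytes_to_kb() {
  echo $(( $1 / 1024 ))
}

# Sample input made up for illustration; not read from a live cgroup.
printf 'user 1200\nsystem 345\n' | cpuacct_row system   # prints 345
bytes_to_kb 4194304                                      # prints 4096
```

On a real system, the input would come from the cgroup files named in the table instead of the sample text.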
R developers can get started with some simple examples, and learn the basics of how R works with SQL Server. For your next step, see the following links:
Python developers can learn how to use Python with SQL Server by following these tutorials:
To view examples of machine learning that are based on real-world scenarios, see Machine learning tutorials.