---
title: "CREATE EXTERNAL DATA SOURCE (Transact-SQL)"
description: CREATE EXTERNAL DATA SOURCE creates an external data source used to establish connectivity and data virtualization from SQL Server and Azure SQL platforms.
author: MikeRayMSFT
ms.author: mikeray
ms.reviewer: randolphwest, hudequei, wiassaf, jovanpop
ms.date: 03/17/2026
ms.service: sql
ms.subservice: t-sql
ms.topic: reference
ms.custom:
- ignite-2025
f1_keywords:
- "CREATE EXTERNAL DATA SOURCE"
- "CREATE_EXTERNAL_DATA_SOURCE"
helpviewer_keywords:
- "External"
- "External, data source"
- "PolyBase, create data source"
dev_langs:
- TSQL
monikerRange: ">=aps-pdw-2016 || =azuresqldb-current || =azure-sqldw-latest || >=sql-server-2016 || >=sql-server-linux-2017 || =azuresqldb-mi-current || =azuresqledge-current || =fabric || =fabric-sqldb"
---
# CREATE EXTERNAL DATA SOURCE (Transact-SQL)
[!INCLUDE [sqlserver2016-asdb-asdbmi-asa-pdw-fabricse-fabricdw-fabricsqldb](../../includes/applies-to-version/sqlserver2016-asdb-asdbmi-asa-pdw-fabricse-fabricdw-fabricsqldb.md)]
Creates an external data source for querying external data, used for PolyBase and data virtualization features.
This article provides the syntax, arguments, remarks, permissions, and examples for whichever SQL product you choose.
[!INCLUDE [select-product](../includes/select-product.md)]
::: moniker range=">=sql-server-2016||>=sql-server-linux-2017"
:::row:::
:::column:::
**_\* SQL Server \*_**
:::column-end:::
:::column:::
[Azure SQL Database](create-external-data-source-transact-sql.md?view=azuresqldb-current&preserve-view=true)
:::column-end:::
:::column:::
[SQL Managed
Instance](create-external-data-source-transact-sql.md?view=azuresqldb-mi-current&preserve-view=true)
:::column-end:::
:::column:::
[Azure Synapse
Analytics](create-external-data-source-transact-sql.md?view=azure-sqldw-latest&preserve-view=true)
:::column-end:::
:::column:::
[Analytics Platform
System (PDW)](create-external-data-source-transact-sql.md?view=aps-pdw-2016-au7&preserve-view=true)
:::column-end:::
:::column:::
[Microsoft Fabric Data Warehouse](create-external-data-source-transact-sql.md?view=fabric&preserve-view=true)
:::column-end:::
:::column:::
[Microsoft Fabric SQL database](create-external-data-source-transact-sql.md?view=fabric-sqldb&preserve-view=true)
:::column-end:::
:::row-end:::
::: moniker-end
::: moniker range="=sql-server-2016"
## Overview: SQL Server 2016
**Applies to**: [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)]
Creates an external data source for PolyBase queries. External data sources are used to establish connectivity and support these primary use cases:
- Data virtualization and data load using [PolyBase in SQL Server](../../relational-databases/polybase/overview.md)
- Bulk load operations using `BULK INSERT` or `OPENROWSET`
> [!NOTE]
> The syntax of `CREATE EXTERNAL DATA SOURCE` varies in different versions of the [SQL Database Engine](../../database-engine/sql-database-engine.md). Use the version selector dropdown list to [choose the appropriate product version](../../sql-server/sql-docs-navigation-guide.md#what-the-applies-to-options-mean).
> To view the features of [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-ver15&preserve-view=true#syntax).
> To view the features of [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-ver16&preserve-view=true#syntax).
:::image type="icon" source="../../includes/media/topic-link-icon.svg" border="false"::: [Transact-SQL syntax conventions](../../t-sql/language-elements/transact-sql-syntax-conventions-transact-sql.md)
## Syntax for SQL Server 2016
```syntaxsql
CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] TYPE = { HADOOP } ]
    [ [ , ] RESOURCE_MANAGER_LOCATION = '<resource_manager>[:<port>]' ] )
[ ; ]
```
## Arguments
#### data_source_name
Specifies the user-defined name for the data source. The name must be unique within the database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
#### LOCATION = '*\<prefix\>://\<path\>*'
Provides the connectivity protocol and path to the external data source.
| External Data Source | Connector location prefix | Location path | Supported locations by product / service | Authentication |
| --- | --- | --- | --- | ---: |
| Cloudera CDH or Hortonworks HDP | `hdfs` | `<Namenode>[:port]` | [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)] to [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Anonymous or basic authentication |
| Azure Storage account (V2) | `wasb[s]` | `<container>@<storage_account>.blob.core.windows.net` | Starting with [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)]<br />Hierarchical namespace **not** supported | Azure Storage account key |
Location path:
- `<Namenode>` = the machine name, name service URI, or IP address of the `Namenode` in the Hadoop cluster. PolyBase must resolve any DNS names used by the Hadoop cluster.
- `port` = the port that the external data source is listening on. In Hadoop, the port can be found using the `fs.defaultFS` configuration parameter. The default is 8020.
- `<container>` = the container of the storage account holding the data. Root containers are read-only; data can't be written back to the container.
- `<storage_account>` = the storage account name of the Azure resource.
- `<server_name>` = the host name.
- `<instance_name>` = the name of the SQL Server named instance. Used if you have SQL Server Browser Service running on the target instance.
Additional notes and guidance when setting the location:
- The [!INCLUDE [ssDEnoversion](../../includes/ssdenoversion-md.md)] doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.
- Use the same external data source for all tables when querying Hadoop to ensure consistent querying semantics.
- `wasbs` is optional but recommended in [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)] for accessing Azure Storage accounts, because data is sent using a secure TLS connection.
- To ensure successful PolyBase queries during a Hadoop `Namenode` fail-over, consider using a virtual IP address for the `Namenode` of the Hadoop cluster. If you don't, execute [ALTER EXTERNAL DATA SOURCE](alter-external-data-source-transact-sql.md) to point to the new location.
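If the `Namenode` address does change after a fail-over, the existing external data source can be repointed rather than dropped and recreated. A minimal sketch, where the data source name and the new address are hypothetical:

```sql
-- Repoint an existing external data source after a Namenode fail-over.
-- MyHadoopCluster and the new IP address are illustrative values.
ALTER EXTERNAL DATA SOURCE MyHadoopCluster
    SET LOCATION = 'hdfs://10.10.10.11:8020';
```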
#### CREDENTIAL = *credential_name*
Specifies a database-scoped credential for authenticating to the external data source.
`CREDENTIAL` is only required if the data has been secured. `CREDENTIAL` isn't required for data sets that allow anonymous access.
To create a database scoped credential, see [CREATE DATABASE SCOPED CREDENTIAL](create-database-scoped-credential-transact-sql.md).
#### TYPE = *[ HADOOP ]*
Specifies the type of the external data source being configured. In SQL Server 2016, this parameter is always required, and should only be specified as `HADOOP`. Supports connections to Cloudera CDH, Hortonworks HDP, or an Azure Storage account. The behavior of this parameter is different in later versions of SQL Server.
For an example of using `TYPE` = `HADOOP` to load data from an Azure Storage account, see [Create external data source to access data in Azure Storage using the wasb:// interface](#d-create-external-data-source-to-access-data-in-azure-storage-using-the-wasb-interface).
#### RESOURCE_MANAGER_LOCATION = '*ResourceManager_URI*[:port]'
Configure this optional value when connecting to Cloudera CDH, Hortonworks HDP, or an Azure Storage account only. For a complete list of supported Hadoop versions, see [PolyBase connectivity configuration](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md).
When the `RESOURCE_MANAGER_LOCATION` is defined, the query optimizer makes a cost-based decision to improve performance. A MapReduce job can be used to push down the computation to Hadoop. Specifying the `RESOURCE_MANAGER_LOCATION` can significantly reduce the volume of data transferred between Hadoop and [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], which can lead to improved query performance.
If the Resource Manager isn't specified, pushing compute to Hadoop is disabled for PolyBase queries. [Create external data source to reference Hadoop with push-down enabled](#b-create-external-data-source-to-reference-hadoop-with-push-down-enabled) provides a concrete example and further guidance.
The `RESOURCE_MANAGER_LOCATION` value isn't validated when you create the external data source. Entering an incorrect value might cause query failure at execution time whenever push-down is attempted, because the provided value can't be resolved.
In order for PolyBase to function correctly with a Hadoop external data source, the ports for the following Hadoop cluster components must be open:
- HDFS ports
- Namenode
- Datanode
- Resource Manager
- Job submission
- Job history
If the port isn't specified, the default value is chosen using the current setting for the 'hadoop connectivity' configuration.
| Hadoop Connectivity | Default Resource Manager Port |
| --- | --- |
| `1` | 50300 |
| `2` | 50300 |
| `3` | 8021 |
| `4` | 8032 |
| `5` | 8050 |
| `6` | 8032 |
| `7` | 8050 |
| `8` | 8032 |
The following table shows the default ports for these components. The actual ports depend on the Hadoop version, and custom configurations might not use the default port assignments.
| **Hadoop cluster component** | **Default Port** |
| --- | --- |
| NameNode | 8020 |
| DataNode (Data transfer, non-privilege IPC port) | 50010 |
| DataNode (Data transfer, privilege IPC port) | 1019 |
| Resource Manager Job Submission (Hortonworks 1.3) | 50300 |
| Resource Manager Job Submission (Cloudera 4.3) | 8021 |
| Resource Manager Job Submission (Hortonworks 2.0 on Windows, Cloudera 5.x on Linux) | 8032 |
| Resource Manager Job Submission (Hortonworks 2.x, 3.0 on Linux, Hortonworks 2.1-3 on Windows) | 8050 |
| Resource Manager Job History | 10020 |
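The 'hadoop connectivity' setting referenced above is changed with `sp_configure`. A sketch, where connectivity level 7 is only an illustrative choice; the PolyBase and [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] services must be restarted for the change to take effect:

```sql
-- Set the Hadoop connectivity level; the value 7 is only an example.
EXECUTE sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
GO
RECONFIGURE;
GO
```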
## Permissions
Requires `CONTROL` permission on database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
## Locking
Takes a shared lock on the `EXTERNAL DATA SOURCE` object.
## Security
PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.
## Examples
> [!IMPORTANT]
> For information on how to install and enable PolyBase, see [Install PolyBase on Windows](../../relational-databases/polybase/polybase-installation.md).
### A. Create external data source to reference Hadoop
To create an external data source that references your Hortonworks HDP or Cloudera CDH Hadoop cluster, specify the machine name or IP address of the Hadoop `Namenode`, and the port.
```sql
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8050'
);
```
### B. Create external data source to reference Hadoop with push-down enabled
Specify the `RESOURCE_MANAGER_LOCATION` option to enable push-down computation to Hadoop for PolyBase queries. Once enabled, PolyBase makes a cost-based decision to determine whether the query computation should be pushed to Hadoop.
```sql
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8020',
RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```
### C. Create external data source to reference Kerberos-secured Hadoop
To verify whether the Hadoop cluster is Kerberos-secured, check the value of the `hadoop.security.authentication` property in Hadoop `core-site.xml`. To reference a Kerberos-secured Hadoop cluster, you must specify a database scoped credential that contains your Kerberos username and password. The database master key is used to encrypt the database scoped credential secret.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
-- Create a database scoped credential with Kerberos user name and password.
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
WITH IDENTITY = '<hadoop_user_name>',
SECRET = '<hadoop_password>';
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8050',
CREDENTIAL = HadoopUser1,
RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```
### D. Create external data source to access data in Azure Storage using the wasb:// interface
In this example, the external data source is an Azure V2 Storage account named `logs`. The storage container is named `daily`. The Azure Storage external data source is for data transfer only. It doesn't support predicate push-down. Hierarchical namespaces aren't supported when accessing data via the `wasb://` interface.
This example shows how to create the database scoped credential for authentication to an Azure V2 Storage account. Specify the Azure Storage account key in the database credential secret. You can specify any string in the database scoped credential identity, because it isn't used during authentication to Azure Storage. When connecting to Azure Storage via `wasb` or `wasbs`, authentication must be done with a storage account key, not with a shared access signature (SAS).
In [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)], `TYPE` should be set to `HADOOP` even when accessing Azure Storage.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
-- Create a database scoped credential with Azure storage account key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = '<my_account>',
SECRET = '<azure_storage_account_key>';
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH (
TYPE = HADOOP,
LOCATION = 'wasbs://daily@logs.blob.core.windows.net/',
CREDENTIAL = AzureStorageCredential
);
```
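Because the existence of the external data source isn't verified at creation time, the connection can be exercised by creating an external table over it, as noted in the `LOCATION` guidance. A minimal sketch, where the file format name, folder path, and column definitions are hypothetical and must match your actual data layout:

```sql
-- Hypothetical validation sketch: the file format, folder, and columns
-- below are illustrative, not part of the example above.
CREATE EXTERNAL FILE FORMAT TextFileFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

CREATE EXTERNAL TABLE dbo.DailyLogs (
    LogDate DATE,
    LogMessage NVARCHAR(4000)
)
WITH (
    LOCATION = '/archive/',
    DATA_SOURCE = MyAzureStorage,
    FILE_FORMAT = TextFileFormat
);
```

If the data source can't be reached, the `CREATE EXTERNAL TABLE` statement fails, which surfaces connectivity problems early.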
## Related content
- [ALTER EXTERNAL DATA SOURCE (Transact-SQL)](alter-external-data-source-transact-sql.md)
- [CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)](create-database-scoped-credential-transact-sql.md)
- [CREATE EXTERNAL FILE FORMAT (Transact-SQL)](create-external-file-format-transact-sql.md)
- [CREATE EXTERNAL TABLE (Transact-SQL)](create-external-table-transact-sql.md)
- [sys.external_data_sources (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md)
- [PolyBase connectivity configuration (Transact-SQL)](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md)
::: moniker-end
::: moniker range="=sql-server-2017||=sql-server-linux-2017"
## Overview: SQL Server 2017
**Applies to**: [!INCLUDE [sssql17-md](../../includes/sssql17-md.md)]
Creates an external data source for PolyBase queries. External data sources are used to establish connectivity and support these primary use cases:
- Data virtualization and data load using [PolyBase in SQL Server](../../relational-databases/polybase/overview.md)
- Bulk load operations using `BULK INSERT` or `OPENROWSET`
::: moniker-end
::: moniker range="=sql-server-linux-2017"
> [!NOTE]
> The syntax of `CREATE EXTERNAL DATA SOURCE` varies in different versions of SQL Server on Linux. Use the version selector dropdown list to choose the appropriate version.
> To view the features of [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-linux-ver15&preserve-view=true#syntax).
> To view the features of [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-linux-ver16&preserve-view=true#syntax).
::: moniker-end
::: moniker range="=sql-server-2017"
> [!NOTE]
> The syntax of `CREATE EXTERNAL DATA SOURCE` varies in different versions of the [SQL Database Engine](../../database-engine/sql-database-engine.md). Use the version selector dropdown list to [choose the appropriate product version](../../sql-server/sql-docs-navigation-guide.md#what-the-applies-to-options-mean).
> To view the features of [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-ver15&preserve-view=true#syntax).
> To view the features of [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-ver16&preserve-view=true#syntax).
::: moniker-end
::: moniker range="=sql-server-2017||=sql-server-linux-2017"
:::image type="icon" source="../../includes/media/topic-link-icon.svg" border="false"::: [Transact-SQL syntax conventions](../../t-sql/language-elements/transact-sql-syntax-conventions-transact-sql.md)
## Syntax for SQL Server 2017
```syntaxsql
CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] TYPE = { HADOOP | BLOB_STORAGE } ]
    [ [ , ] RESOURCE_MANAGER_LOCATION = '<resource_manager>[:<port>]' ] )
[ ; ]
```
## Arguments
#### data_source_name
Specifies the user-defined name for the data source. The name must be unique within the database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
#### LOCATION = '*\<prefix\>://\<path\>*'
Provides the connectivity protocol and path to the external data source.
| External Data Source | Connector location prefix | Location path | Supported locations by product / service | Authentication |
| --- | --- | --- | --- | ---: |
| Cloudera CDH or Hortonworks HDP | `hdfs` | `<Namenode>[:port]` | [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)] to [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] only | Anonymous or basic authentication |
| Azure Storage account (V2) | `wasb[s]` | `<container>@<storage_account>.blob.core.windows.net` | Starting with [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)]<br />Hierarchical namespace **not** supported | Azure Storage account key |
| Bulk Operations | `https` | `<storage_account>.blob.core.windows.net/<container>` | Starting with [!INCLUDE [ssSQL17](../../includes/sssql17-md.md)] | Shared access signature (SAS) |
Location path:
- `<Namenode>` = the machine name, name service URI, or IP address of the `Namenode` in the Hadoop cluster. PolyBase must resolve any DNS names used by the Hadoop cluster.
- `port` = the port that the external data source is listening on. In Hadoop, the port can be found using the `fs.defaultFS` configuration parameter. The default is 8020.
- `<container>` = the container of the storage account holding the data. Root containers are read-only; data can't be written back to the container.
- `<storage_account>` = the storage account name of the Azure resource.
- `<server_name>` = the host name.
- `<instance_name>` = the name of the SQL Server named instance. Used if you have SQL Server Browser Service running on the target instance.
Additional notes and guidance when setting the location:
- The [!INCLUDE [ssDEnoversion](../../includes/ssdenoversion-md.md)] doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.
- Use the same external data source for all tables when querying Hadoop to ensure consistent querying semantics.
- Specify the `Driver={<Name of Driver>}` when connecting via `ODBC`.
- `wasbs` is optional but recommended in [!INCLUDE [sssql17-md](../../includes/sssql17-md.md)] for accessing Azure Storage accounts, because data is sent using a secure TLS connection.
- To ensure successful PolyBase queries during a Hadoop `Namenode` fail-over, consider using a virtual IP address for the `Namenode` of the Hadoop cluster. If you don't, execute [ALTER EXTERNAL DATA SOURCE](alter-external-data-source-transact-sql.md) to point to the new location.
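If the `Namenode` address does change after a fail-over, the existing external data source can be repointed rather than dropped and recreated. A minimal sketch, where the data source name and the new address are hypothetical:

```sql
-- Repoint an existing external data source after a Namenode fail-over.
-- MyHadoopCluster and the new IP address are illustrative values.
ALTER EXTERNAL DATA SOURCE MyHadoopCluster
    SET LOCATION = 'hdfs://10.10.10.11:8020';
```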
#### CREDENTIAL = *credential_name*
Specifies a database-scoped credential for authenticating to the external data source.
Additional notes and guidance when creating a credential:
- `CREDENTIAL` is only required if the data has been secured. `CREDENTIAL` isn't required for data sets that allow anonymous access.
- When the `TYPE` = `BLOB_STORAGE`, the credential must be created using `SHARED ACCESS SIGNATURE` as the identity.
- `TYPE` = `BLOB_STORAGE` is only permitted for bulk operations; you can't create external tables for an external data source with `TYPE` = `BLOB_STORAGE`.
- When connecting to the Azure Storage via `wasb` or `wasbs`, authentication must be done with a storage account key, not with a shared access signature (SAS).
- When `TYPE` = `HADOOP` the credential must be created using the storage account key as the `SECRET`.
There are multiple ways to create a shared access signature:
- You can create a SAS token in the **Azure portal**: navigate to your storage account, then select **Shared access signature** > **Configure permissions** > **Generate SAS and connection string**. For more information, see [Generate a shared access signature](/azure/storage/blobs/blob-containers-portal#generate-a-shared-access-signature).
- You can [create and configure a SAS with Azure Storage Explorer](/azure/vs-azure-tools-storage-explorer-blobs#get-the-sas-for-a-blob-container).
- You can create a SAS token programmatically via PowerShell, Azure CLI, .NET, and REST API. For more information, see [Grant limited access to Azure Storage resources using shared access signatures (SAS)](/azure/storage/common/storage-sas-overview?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json).
- The SAS token should be configured as follows:
- When a SAS token is generated, it includes a question mark ('?') at the beginning of the token. Exclude the leading `?` when configured as the SECRET.
- Use a valid expiration period (all dates are in UTC time).
- Grant at least read permission on the file that should be loaded (for example `srt=o&sp=r`). Multiple shared access signatures can be created for different use cases. Permissions should be granted as follows:
| Action | Permission |
| --- | --- |
| Read data from a file | Read |
| Read data from multiple files and subfolders | Read and List |
For an example of using a `CREDENTIAL` with `SHARED ACCESS SIGNATURE` and `TYPE` = `BLOB_STORAGE`, see [Create an external data source for bulk operations retrieving data from Azure Storage](#e-create-an-external-data-source-for-bulk-operations-retrieving-data-from-azure-storage).
To create a database scoped credential, see [CREATE DATABASE SCOPED CREDENTIAL](create-database-scoped-credential-transact-sql.md).
#### TYPE = *[ HADOOP | BLOB_STORAGE ]*
Specifies the type of the external data source being configured. This parameter isn't always required, and should only be specified when connecting to Cloudera CDH, Hortonworks HDP, an Azure Storage account, or an Azure Data Lake Storage Gen2.
- Use `HADOOP` when the external data source is Cloudera CDH, Hortonworks HDP, an Azure Storage account, or an Azure Data Lake Storage Gen2.
- Use `BLOB_STORAGE` when executing bulk operations from Azure Storage account using [BULK INSERT](bulk-insert-transact-sql.md) or [OPENROWSET BULK](../functions/openrowset-bulk-transact-sql.md). Introduced with [!INCLUDE [sssql17-md](../../includes/sssql17-md.md)]. Use `HADOOP` when intending to `CREATE EXTERNAL TABLE` against Azure Storage.
> [!NOTE]
> `TYPE` should be set to `HADOOP` even when accessing Azure Storage.
For an example of using `TYPE` = `HADOOP` to load data from an Azure Storage account, see [Create external data source to access data in Azure Storage using the wasb:// interface](#d-create-external-data-source-to-access-data-in-azure-storage-using-the-wasb-interface).
#### RESOURCE_MANAGER_LOCATION = '*ResourceManager_URI*[:port]'
Configure this optional value when connecting to Cloudera CDH, Hortonworks HDP, or an Azure Storage account only. For a complete list of supported Hadoop versions, see [PolyBase connectivity configuration](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md).
When the `RESOURCE_MANAGER_LOCATION` is defined, the query optimizer makes a cost-based decision to improve performance. A MapReduce job can be used to push down the computation to Hadoop. Specifying the `RESOURCE_MANAGER_LOCATION` can significantly reduce the volume of data transferred between Hadoop and [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], which can lead to improved query performance.
If the Resource Manager isn't specified, pushing compute to Hadoop is disabled for PolyBase queries. [Create external data source to reference Hadoop with push-down enabled](#b-create-external-data-source-to-reference-hadoop-with-push-down-enabled) provides a concrete example and further guidance.
The `RESOURCE_MANAGER_LOCATION` value isn't validated when you create the external data source. Entering an incorrect value might cause query failure at execution time whenever push-down is attempted, because the provided value can't be resolved.
In order for PolyBase to function correctly with a Hadoop external data source, the ports for the following Hadoop cluster components must be open:
- HDFS ports
- Namenode
- Datanode
- Resource Manager
- Job submission
- Job history
If the port isn't specified, the default value is chosen using the current setting for the 'hadoop connectivity' configuration.
| Hadoop Connectivity | Default Resource Manager Port |
| --- | --- |
| `1` | 50300 |
| `2` | 50300 |
| `3` | 8021 |
| `4` | 8032 |
| `5` | 8050 |
| `6` | 8032 |
| `7` | 8050 |
| `8` | 8032 |
The following table shows the default ports for these components. The actual ports depend on the Hadoop version, and custom configurations might not use the default port assignments.
| **Hadoop cluster component** | **Default Port** |
| --- | --- |
| NameNode | 8020 |
| DataNode (Data transfer, non-privilege IPC port) | 50010 |
| DataNode (Data transfer, privilege IPC port) | 1019 |
| Resource Manager Job Submission (Hortonworks 1.3) | 50300 |
| Resource Manager Job Submission (Cloudera 4.3) | 8021 |
| Resource Manager Job Submission (Hortonworks 2.0 on Windows, Cloudera 5.x on Linux) | 8032 |
| Resource Manager Job Submission (Hortonworks 2.x, 3.0 on Linux, Hortonworks 2.1-3 on Windows) | 8050 |
| Resource Manager Job History | 10020 |
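The 'hadoop connectivity' setting referenced above is changed with `sp_configure`. A sketch, where connectivity level 7 is only an illustrative choice; the PolyBase and [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] services must be restarted for the change to take effect:

```sql
-- Set the Hadoop connectivity level; the value 7 is only an example.
EXECUTE sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
GO
RECONFIGURE;
GO
```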
## Permissions
Requires `CONTROL` permission on database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
## Locking
Takes a shared lock on the `EXTERNAL DATA SOURCE` object.
## Security
PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.
A SAS token isn't supported with `TYPE` = `HADOOP`; it's only supported with `TYPE` = `BLOB_STORAGE`. With `TYPE` = `HADOOP`, use a storage account access key instead. Attempting to create an external data source with `TYPE` = `HADOOP` and a SAS credential fails with the following error:
`Msg 105019, Level 16, State 1 - EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect. Java exception message: Parameters provided to connect to the Azure storage account aren't valid.: Error [Parameters provided to connect to the Azure storage account aren't valid.] occurred while accessing external file.'`
## Examples
> [!IMPORTANT]
> For information on how to install and enable PolyBase, see [Install PolyBase on Windows](../../relational-databases/polybase/polybase-installation.md).
### A. Create external data source to reference Hadoop
To create an external data source that references your Hortonworks HDP or Cloudera CDH Hadoop cluster, specify the machine name or IP address of the Hadoop `Namenode`, and the port.
```sql
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8050'
);
```
### B. Create external data source to reference Hadoop with push-down enabled
Specify the `RESOURCE_MANAGER_LOCATION` option to enable push-down computation to Hadoop for PolyBase queries. Once enabled, PolyBase makes a cost-based decision to determine whether the query computation should be pushed to Hadoop.
```sql
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8020',
RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```
### C. Create external data source to reference Kerberos-secured Hadoop
To verify whether the Hadoop cluster is Kerberos-secured, check the value of the `hadoop.security.authentication` property in Hadoop `core-site.xml`. To reference a Kerberos-secured Hadoop cluster, you must specify a database scoped credential that contains your Kerberos username and password. The database master key is used to encrypt the database scoped credential secret.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
-- Create a database scoped credential with Kerberos user name and password.
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
WITH IDENTITY = '<hadoop_user_name>',
SECRET = '<hadoop_password>';
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8050',
CREDENTIAL = HadoopUser1,
RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```
### D. Create external data source to access data in Azure Storage using the wasb:// interface
In this example, the external data source is an Azure V2 Storage account named `logs`. The storage container is named `daily`. The Azure Storage external data source is for data transfer only. It doesn't support predicate push-down. Hierarchical namespaces aren't supported when accessing data via the `wasb://` interface. When connecting to Azure Storage via `wasb` or `wasbs`, authentication must be done with a storage account key, not with a shared access signature (SAS).
This example shows how to create the database scoped credential for authentication to an Azure V2 Storage account. Specify the Azure Storage account key in the database credential secret. You can specify any string in the database scoped credential identity, because it isn't used during authentication to Azure Storage.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
-- Create a database scoped credential with Azure storage account key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = '<my_account>',
SECRET = '<azure_storage_account_key>';
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH (
TYPE = HADOOP,
LOCATION = 'wasbs://daily@logs.blob.core.windows.net/',
CREDENTIAL = AzureStorageCredential
);
```
## Examples: Bulk operations
> [!IMPORTANT]
> Don't add a trailing **/**, file name, or shared access signature parameters at the end of the `LOCATION` URL when configuring an external data source for bulk operations.
### E. Create an external data source for bulk operations retrieving data from Azure Storage
**Applies to:** [!INCLUDE [sssql17-md](../../includes/sssql17-md.md)] and later versions.
Use the following data source for bulk operations using [BULK INSERT](bulk-insert-transact-sql.md) or [OPENROWSET BULK](../functions/openrowset-bulk-transact-sql.md). The credential must set `SHARED ACCESS SIGNATURE` as the identity, mustn't have the leading `?` in the SAS token, must have at least read permission on the file that should be loaded (for example `srt=o&sp=r`), and the expiration period should be valid (all dates are in UTC time). For more information on shared access signatures, see [Using Shared Access Signatures (SAS)](/azure/storage/common/storage-sas-overview).
```sql
CREATE DATABASE SCOPED CREDENTIAL AccessAzureInvoices
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
-- Remove ? from the beginning of the SAS token
SECRET = '<azure_shared_access_signature>';
CREATE EXTERNAL DATA SOURCE MyAzureInvoices
WITH (
TYPE = BLOB_STORAGE,
LOCATION = 'https://newinvoices.blob.core.windows.net/week3',
CREDENTIAL = AccessAzureInvoices
);
```
To see this example in use, see [BULK INSERT](bulk-insert-transact-sql.md#f-import-data-from-a-file-in-azure-blob-storage).
## Related content
- [ALTER EXTERNAL DATA SOURCE (Transact-SQL)](alter-external-data-source-transact-sql.md)
- [CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)](create-database-scoped-credential-transact-sql.md)
- [CREATE EXTERNAL FILE FORMAT (Transact-SQL)](create-external-file-format-transact-sql.md)
- [CREATE EXTERNAL TABLE (Transact-SQL)](create-external-table-transact-sql.md)
- [sys.external_data_sources (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md)
- [Using Shared Access Signatures (SAS)](/azure/storage/common/storage-sas-overview)
- [PolyBase connectivity configuration (Transact-SQL)](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md)
::: moniker-end
::: moniker range="=sql-server-ver15 || =sql-server-linux-ver15"
## Overview: SQL Server 2019
**Applies to**: [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)]
Creates an external data source for PolyBase queries. External data sources are used to establish connectivity and support these primary use cases:
- Data virtualization and data load using [PolyBase in SQL Server](../../relational-databases/polybase/overview.md)
- Bulk load operations using `BULK INSERT` or `OPENROWSET`
::: moniker-end
::: moniker range="=sql-server-linux-ver15"
> [!NOTE]
> The syntax of `CREATE EXTERNAL DATA SOURCE` varies in different versions of the [SQL Database Engine](../../database-engine/sql-database-engine.md). Use the version selector dropdown list to [choose the appropriate product version](../../sql-server/sql-docs-navigation-guide.md#what-the-applies-to-options-mean).
> To view the features of [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-linux-ver16&preserve-view=true#syntax).
::: moniker-end
::: moniker range="=sql-server-ver15"
> [!NOTE]
> The syntax of `CREATE EXTERNAL DATA SOURCE` varies in different versions of the [SQL Database Engine](../../database-engine/sql-database-engine.md). Use the version selector dropdown list to [choose the appropriate product version](../../sql-server/sql-docs-navigation-guide.md#what-the-applies-to-options-mean).
> To view the features of [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], visit [CREATE EXTERNAL DATA SOURCE](create-external-data-source-transact-sql.md?view=sql-server-ver16&preserve-view=true#syntax).
::: moniker-end
::: moniker range="=sql-server-ver15||=sql-server-linux-ver15"
:::image type="icon" source="../../includes/media/topic-link-icon.svg" border="false"::: [Transact-SQL syntax conventions](../../t-sql/language-elements/transact-sql-syntax-conventions-transact-sql.md)
## Syntax for SQL Server 2019
```syntaxsql
CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CONNECTION_OPTIONS = '<key_value_pairs>' [ , ...n ] ]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] PUSHDOWN = { ON | OFF } ]
    [ [ , ] TYPE = { HADOOP | BLOB_STORAGE } ]
    [ [ , ] RESOURCE_MANAGER_LOCATION = '<resource_manager>[:<port>]' ]
  )
[ ; ]
```
## Arguments
#### data_source_name
Specifies the user-defined name for the data source. The name must be unique within the database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
#### LOCATION = '*\<prefix\>://\<path\>[:port]*'
Provides the connectivity protocol and path to the external data source.
| External Data Source | Connector location prefix | Location path | Supported locations by product / service | Authentication |
| --- | --- | --- | --- | --- |
| Cloudera CDH or Hortonworks HDP | `hdfs` | `<Namenode>[:port]` | [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)] to [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Anonymous or basic authentication |
| Azure Storage account (V2) | `wasb[s]` | `<container>@<storage_account>.blob.core.windows.net` | Starting with [!INCLUDE [sssql16-md](../../includes/sssql16-md.md)]<br />Hierarchical Namespace **not** supported | Azure Storage account key |
| [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] | `sqlserver` | `<server_name>[\<instance_name>][:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | SQL authentication only |
| Oracle | `oracle` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| Teradata | `teradata` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| MongoDB or Cosmos DB API for MongoDB | `mongodb` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| Generic ODBC | `odbc` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] - Windows only | Basic authentication only |
| Bulk Operations | `https` | `<storage_account>.blob.core.windows.net/<container>` | Starting with [!INCLUDE [ssSQL17](../../includes/sssql17-md.md)] | Shared access signature (SAS) |
| Azure Data Lake Storage Gen2 | `abfs[s]` | `abfss://<container>@<storage_account>.dfs.core.windows.net` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] CU11 | Storage account access key |
| [!INCLUDE [ssbigdataclusters-ss-nover](../../includes/ssbigdataclusters-ss-nover.md)] data pool | `sqldatapool` | `sqldatapool://controller-svc/default` | Only supported in [!INCLUDE [ssbigdataclusters-ver15](../../includes/ssbigdataclusters-ver15.md)] | Basic authentication only |
| [!INCLUDE [ssbigdataclusters-ss-nover](../../includes/ssbigdataclusters-ss-nover.md)] storage pool | `sqlhdfs` | `sqlhdfs://controller-svc/default` | Only supported in [!INCLUDE [ssbigdataclusters-ver15](../../includes/ssbigdataclusters-ver15.md)] | Basic authentication only |
Location path:
- `<Namenode>` = the machine name, name service URI, or IP address of the `Namenode` in the Hadoop cluster. PolyBase must resolve any DNS names used by the Hadoop cluster.
- `port` = the port that the external data source is listening on. In Hadoop, the port can be found using the `fs.defaultFS` configuration parameter. The default is 8020.
- `<container>` = the container of the storage account holding the data. Root containers are read-only, so data can't be written back to the container.
- `<storage_account>` = the storage account name of the Azure resource.
- `<server_name>` = the host name.
- `<instance_name>` = the name of the SQL Server named instance. Used if you have SQL Server Browser Service running on the target instance.
Additional notes and guidance when setting the location:
- The [!INCLUDE [ssDEnoversion](../../includes/ssdenoversion-md.md)] doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.
- Use the same external data source for all tables when querying Hadoop to ensure consistent querying semantics.
- You can use the `sqlserver` connector to connect [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] to another [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], or to [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)].
- Specify the `Driver={<Name of Driver>}` when connecting via `ODBC`.
- Using `wasbs` or `abfss` is optional but recommended in [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] for accessing Azure Storage Accounts as data will be sent using a secure TLS/SSL connection.
- The `abfs` or `abfss` APIs are supported when accessing Azure Storage Accounts starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] CU11. For more information, see [the Azure Blob Filesystem driver (ABFS)](/azure/storage/blobs/data-lake-storage-abfs-driver).
- The Hierarchical Namespace option for Azure Storage accounts (V2) using `abfs[s]` is supported via Azure Data Lake Storage Gen2 starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] CU11. The Hierarchical Namespace option is otherwise not supported, and should remain **disabled**.
- To ensure successful PolyBase queries during a Hadoop `Namenode` fail-over, consider using a virtual IP address for the `Namenode` of the Hadoop cluster. If you don't, execute [ALTER EXTERNAL DATA SOURCE](alter-external-data-source-transact-sql.md) to point to the new location.
- The `sqlhdfs` and `sqldatapool` types are supported for connecting between the master instance and storage pool of a big data cluster. For Cloudera CDH or Hortonworks HDP, use `hdfs`. For more information on using `sqlhdfs` for querying [!INCLUDE [ssbigdataclusters-ss-nover](../../includes/ssbigdataclusters-ss-nover.md)] storage pools, see [Query HDFS in SQL Server 2019 Big Data Cluster](../../big-data-cluster/tutorial-query-hdfs-storage-pool.md).
- [!INCLUDE [polybase-java-connector-banner-retirement](../../includes/polybase-java-connector-banner-retirement.md)]
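If the `Namenode` address does change after a fail-over, the existing data source can be repointed without dropping and recreating it. A minimal sketch, where the data source name and the addresses are hypothetical:

```sql
-- Repoint an existing external data source after a Namenode fail-over.
-- The data source name and IP addresses here are hypothetical.
ALTER EXTERNAL DATA SOURCE MyHadoopCluster
    SET LOCATION = 'hdfs://10.10.10.20:8020',
        RESOURCE_MANAGER_LOCATION = '10.10.10.20:8050';
```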
#### CONNECTION_OPTIONS = *key_value_pair*
Specified for [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later versions. Specifies additional options when connecting over `ODBC` to an external data source. To use multiple connection options, separate them with a semicolon.
Applies to generic `ODBC` connections, as well as built-in `ODBC` connectors for [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], Oracle, Teradata, MongoDB, and Azure Cosmos DB API for MongoDB.
The `key_value_pair` is the keyword and the value for a specific connection option. The available keywords and values depend on the external data source type. The name of the driver is required as a minimum, but there are other options such as `APP='<your_application_name>'` or `ApplicationIntent=ReadOnly|ReadWrite` that are also useful to set and can assist with troubleshooting.
Possible key value pairs are specific to the provider for the external data source vendor. For more information for each provider, see [CREATE EXTERNAL DATA SOURCE (Transact-SQL) CONNECTION_OPTIONS](create-external-data-source-connection-options.md).
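As a sketch of combining multiple connection options, the following hypothetical data source sets both an application name and a read-only application intent, separated by a semicolon (the server, credential, and application names are assumptions):

```sql
-- Hypothetical example: multiple CONNECTION_OPTIONS values separated by a semicolon.
CREATE EXTERNAL DATA SOURCE SQLServerReadOnlySource
WITH (
    LOCATION = 'sqlserver://WINSQL2019',
    CONNECTION_OPTIONS = 'APP=polybase_reporting;ApplicationIntent=ReadOnly',
    CREDENTIAL = SQLServerCredentials
);
```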
[!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] Cumulative Update 19 and later versions introduce additional keywords to support Oracle TNS files:
- The keyword `TNSNamesFile` specifies the filepath to the `tnsnames.ora` file located on the Oracle server.
- The keyword `ServerName` specifies the alias inside the `tnsnames.ora` file that's used to replace the host name and the port.
#### PUSHDOWN = ON | OFF
Specified for [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] only. States whether computation can be pushed down to the external data source. It's **ON** by default.
`PUSHDOWN` is supported when connecting to [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], Oracle, Teradata, MongoDB, the Azure Cosmos DB API for MongoDB, or ODBC at the external data source level.
Enabling or disabling push-down at the query level is achieved through the [EXTERNALPUSHDOWN hint](../../relational-databases/polybase/polybase-pushdown-computation.md#force-pushdown).
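For example, a single query can override the data source setting with that hint (the external table name here is hypothetical):

```sql
-- Disable push-down for this query only, regardless of the PUSHDOWN setting.
SELECT COUNT(*)
FROM dbo.MyExternalTable
OPTION (DISABLE EXTERNALPUSHDOWN);
```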
#### CREDENTIAL = *credential_name*
Specifies a database-scoped credential for authenticating to the external data source.
Additional notes and guidance when creating a credential:
- `CREDENTIAL` is only required if the data has been secured. `CREDENTIAL` isn't required for data sets that allow anonymous access.
- When the `TYPE` = `BLOB_STORAGE`, the credential must be created using `SHARED ACCESS SIGNATURE` as the identity.
- `TYPE` = `BLOB_STORAGE` is only permitted for bulk operations; you can't create external tables for an external data source with `TYPE` = `BLOB_STORAGE`.
There are multiple ways to create a shared access signature:
- You can create a SAS token by navigating to the **Azure portal -> \<Your_Storage_Account\> -> Shared access signature -> Configure permissions -> Generate SAS and connection string**. For more information, see [Generate a shared access signature](/azure/storage/blobs/blob-containers-portal#generate-a-shared-access-signature).
- You can [create and configure a SAS with Azure Storage Explorer](/azure/vs-azure-tools-storage-explorer-blobs#get-the-sas-for-a-blob-container).
- You can create a SAS token programmatically via PowerShell, Azure CLI, .NET, and REST API. For more information, see [Grant limited access to Azure Storage resources using shared access signatures (SAS)](/azure/storage/common/storage-sas-overview?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json).
- The SAS token should be configured as follows:
- When a SAS token is generated, it includes a question mark ('?') at the beginning of the token. Exclude the leading `?` when configured as the SECRET.
- Use a valid expiration period (all dates are in UTC time).
- Grant at least read permission on the file that should be loaded (for example `srt=o&sp=r`). Multiple shared access signatures can be created for different use cases. Permissions should be granted as follows:
| Action | Permission |
| --- | --- |
| Read data from a file | Read |
| Read data from multiple files and subfolders | Read and List |
For an example of using a `CREDENTIAL` with `SHARED ACCESS SIGNATURE` and `TYPE` = `BLOB_STORAGE`, see [Create an external data source for bulk operations retrieving data from Azure Storage](#h-create-an-external-data-source-for-bulk-operations-retrieving-data-from-azure-storage).
To create a database scoped credential, see [CREATE DATABASE SCOPED CREDENTIAL](create-database-scoped-credential-transact-sql.md).
#### TYPE = *[ HADOOP | BLOB_STORAGE ]*
Specifies the type of the external data source being configured. This parameter isn't always required, and should only be specified when connecting to Cloudera CDH, Hortonworks HDP, an Azure Storage account, or Azure Data Lake Storage Gen2.
- In [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)], don't specify TYPE unless connecting to Cloudera CDH, Hortonworks HDP, or an Azure Storage account.
- Use `HADOOP` when the external data source is Cloudera CDH, Hortonworks HDP, an Azure Storage account, or Azure Data Lake Storage Gen2.
- Use `BLOB_STORAGE` when executing bulk operations from Azure Storage account using [BULK INSERT](bulk-insert-transact-sql.md), or [OPENROWSET BULK](../functions/openrowset-bulk-transact-sql.md) with [!INCLUDE [sssql17-md](../../includes/sssql17-md.md)]. Use `HADOOP` when intending to CREATE EXTERNAL TABLE against Azure Storage.
- [!INCLUDE [polybase-java-connector-banner-retirement](../../includes/polybase-java-connector-banner-retirement.md)]
For an example of using `TYPE` = `HADOOP` to load data from an Azure Storage account, see [Create external data source to access data in Azure Storage using the wasb:// interface](#e-create-external-data-source-to-access-data-in-azure-storage-using-the-wasb-interface).
#### RESOURCE_MANAGER_LOCATION = '*ResourceManager_URI*[:port]'
In [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)], don't specify RESOURCE_MANAGER_LOCATION unless connecting to Cloudera CDH, Hortonworks HDP, or an Azure Storage account.
Configure this optional value when connecting to Cloudera CDH, Hortonworks HDP, or an Azure Storage account only. For a complete list of supported Hadoop versions, see [PolyBase connectivity configuration](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md).
When the `RESOURCE_MANAGER_LOCATION` is defined, the query optimizer makes a cost-based decision to improve performance. A MapReduce job can be used to push down the computation to Hadoop. Specifying the `RESOURCE_MANAGER_LOCATION` can significantly reduce the volume of data transferred between Hadoop and [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], which can lead to improved query performance.
If the Resource Manager isn't specified, pushing compute to Hadoop is disabled for PolyBase queries. [Create external data source to reference Hadoop with push-down enabled](#c-create-external-data-source-to-reference-hadoop-with-push-down-enabled) provides a concrete example and further guidance.
The RESOURCE_MANAGER_LOCATION value isn't validated when you create the external data source. Entering an incorrect value might cause query failure at execution time whenever push-down is attempted, because the provided value can't be resolved.
In order for PolyBase to function correctly with a Hadoop external data source, the ports for the following Hadoop cluster components must be open:
- HDFS ports
- Namenode
- Datanode
- Resource Manager
- Job submission
- Job history
If the port isn't specified, the default value is chosen using the current setting for the 'hadoop connectivity' configuration.
| Hadoop Connectivity | Default Resource Manager Port |
| --- | --- |
| `1` | 50300 |
| `2` | 50300 |
| `3` | 8021 |
| `4` | 8032 |
| `5` | 8050 |
| `6` | 8032 |
| `7` | 8050 |
| `8` | 8032 |
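The current 'hadoop connectivity' value can be viewed and changed with `sp_configure`; a sketch (after changing the value, restart [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] and the PolyBase services for the change to take effect):

```sql
-- View the current 'hadoop connectivity' setting.
EXEC sp_configure @configname = 'hadoop connectivity';

-- Set the value to 7 (for example, for Hortonworks 2.x on Linux), then apply it.
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
RECONFIGURE;
```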
The following table shows the default ports for these components. There's Hadoop version dependency as well as the possibility of custom configuration that doesn't use the default port assignment.
| **Hadoop cluster component** | **Default Port** |
| --- | --- |
| NameNode | 8020 |
| DataNode (Data transfer, non-privilege IPC port) | 50010 |
| DataNode (Data transfer, privilege IPC port) | 1019 |
| Resource Manager Job Submission (Hortonworks 1.3) | 50300 |
| Resource Manager Job Submission (Cloudera 4.3) | 8021 |
| Resource Manager Job Submission (Hortonworks 2.0 on Windows, Cloudera 5.x on Linux) | 8032 |
| Resource Manager Job Submission (Hortonworks 2.x, 3.0 on Linux, Hortonworks 2.1-3 on Windows) | 8050 |
| Resource Manager Job History | 10020 |
## Permissions
Requires `CONTROL` permission on the database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
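A minimal sketch of granting this permission, with a hypothetical database and user name:

```sql
-- CONTROL on the database covers CREATE EXTERNAL DATA SOURCE.
-- MyUserDB and PolyBaseUser are hypothetical names.
USE MyUserDB;
GO
GRANT CONTROL ON DATABASE::MyUserDB TO PolyBaseUser;
```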
## Locking
Takes a shared lock on the `EXTERNAL DATA SOURCE` object.
## Security
PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.
When you connect to the storage or data pool in a SQL Server 2019 Big Data Cluster, the user's credentials are passed through to the back-end system. Create logins in the data pool itself to enable pass-through authentication.
A SAS token is unsupported with type `HADOOP`; it's only supported with type = `BLOB_STORAGE`. With type `HADOOP`, use a storage account access key instead. Attempting to create an external data source with type `HADOOP` and a SAS credential fails with the following error:
`Msg 105019, Level 16, State 1 - EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_Connect. Java exception message: Parameters provided to connect to the Azure storage account aren't valid.: Error [Parameters provided to connect to the Azure storage account aren't valid.] occurred while accessing external file.'`
## Examples
> [!IMPORTANT]
> For information on how to install and enable PolyBase, see [Install PolyBase on Windows](../../relational-databases/polybase/polybase-installation.md).
### A. Create external data source in SQL Server 2019 to reference Oracle
To create an external data source that references Oracle, ensure you have a database scoped credential. You might optionally also enable or disable push-down of computation against this data source.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in the next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';

-- Create a database scoped credential with the Oracle sign-in name and password.
CREATE DATABASE SCOPED CREDENTIAL OracleProxyAccount
    WITH IDENTITY = 'oracle_username',
         SECRET = 'oracle_password';
CREATE EXTERNAL DATA SOURCE MyOracleServer
WITH (
LOCATION = 'oracle://145.145.145.145:1521',
PUSHDOWN = ON,
CREDENTIAL = OracleProxyAccount
);
```
Optionally, the external data source to Oracle can use proxy authentication to provide fine-grained access control. A proxy user can be configured to have limited access compared to the user being impersonated.
```sql
CREATE DATABASE SCOPED CREDENTIAL [OracleProxyCredential]
WITH IDENTITY = 'oracle_username',
SECRET = 'oracle_password';
CREATE EXTERNAL DATA SOURCE [OracleSalesSrvr]
WITH (
LOCATION = 'oracle://145.145.145.145:1521',
CONNECTION_OPTIONS = 'ImpersonateUser=%CURRENT_USER',
CREDENTIAL = [OracleProxyCredential]
);
```
Alternatively, you can use TNS authentication.
Starting in [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] Cumulative Update 19, `CREATE EXTERNAL DATA SOURCE` now supports the use of TNS files when connecting to Oracle.
The `CONNECTION_OPTIONS` parameter was expanded and now uses `TNSNamesFile` and `ServerName` as variables to browse the `tnsnames.ora` file and establish connection with the server.
In the following example, at runtime [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] searches the location specified by `TNSNamesFile` for the `tnsnames.ora` file, and looks up the host and network port specified by `ServerName`.
```sql
CREATE EXTERNAL DATA SOURCE [external_data_source_name]
WITH (
LOCATION = N'oracle://XE',
CREDENTIAL = [OracleCredentialTest],
CONNECTION_OPTIONS = N'TNSNamesFile=C:\Temp\tnsnames.ora;ServerName=XE'
);
```
For additional examples to other data sources such as MongoDB, see [Configure PolyBase to access external data in MongoDB](../../relational-databases/polybase/polybase-configure-mongodb.md).
### B. Create external data source to reference Hadoop
To create an external data source to reference your Hortonworks HDP or Cloudera CDH Hadoop cluster, specify the machine name, or IP address of the Hadoop `Namenode` and port.
```sql
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8050'
);
```
### C. Create external data source to reference Hadoop with push-down enabled
Specify the `RESOURCE_MANAGER_LOCATION` option to enable push-down computation to Hadoop for PolyBase queries. Once enabled, PolyBase makes a cost-based decision to determine whether the query computation should be pushed to Hadoop.
```sql
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8020',
RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```
### D. Create external data source to reference Kerberos-secured Hadoop
To verify whether the Hadoop cluster is Kerberos-secured, check the value of the `hadoop.security.authentication` property in the Hadoop core-site.xml file. To reference a Kerberos-secured Hadoop cluster, you must specify a database scoped credential that contains your Kerberos username and password. The database master key is used to encrypt the database scoped credential secret.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in the next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';

-- Create a database scoped credential with the Kerberos user name and password.
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
    WITH IDENTITY = '<hadoop_user_name>',
         SECRET = '<hadoop_password>';
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
TYPE = HADOOP,
LOCATION = 'hdfs://10.10.10.10:8050',
CREDENTIAL = HadoopUser1,
RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```
### E. Create external data source to access data in Azure Storage using the wasb:// interface
In this example, the external data source is an Azure V2 Storage account named `logs`. The storage container is named `daily`. The Azure Storage external data source is for data transfer only. It doesn't support predicate push-down. Hierarchical namespaces aren't supported when accessing data via the `wasb://` interface. When connecting to the Azure Storage via `wasb` or `wasbs`, authentication must be done with a storage account key, not with a shared access signature (SAS).
This example shows how to create the database scoped credential for authentication to an Azure V2 Storage account. Specify the Azure Storage account key in the database credential secret. You can specify any string as the database scoped credential identity; it isn't used during authentication to Azure Storage.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in the next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';

-- Create a database scoped credential with the Azure Storage account key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
    WITH IDENTITY = '<my_account>',
         SECRET = '<azure_storage_account_key>';
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH (
TYPE = HADOOP,
LOCATION = 'wasbs://daily@logs.blob.core.windows.net/',
CREDENTIAL = AzureStorageCredential
);
```
### F. Create external data source to reference a SQL Server named instance via PolyBase connectivity
**Applies to:** [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later
To create an external data source that references a named instance of [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], use `CONNECTION_OPTIONS` to specify the instance name.
In the following example, `WINSQL2019` is the host name and `SQL2019` is the instance name. `'Server=%s\SQL2019'` is the key value pair.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019',
CONNECTION_OPTIONS = 'Server=%s\SQL2019',
CREDENTIAL = SQLServerCredentials
);
```
Alternatively, you can use a port to connect to a [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] default instance.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019:58137',
CREDENTIAL = SQLServerCredentials
);
```
### G. Create external data source to reference a readable secondary replica of Always On availability group
**Applies to:** [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later
To create an external data source that references a readable secondary replica of [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], use `CONNECTION_OPTIONS` to specify `ApplicationIntent=ReadOnly`. Also, you need to either set the availability database as `Database={dbname}` in `CONNECTION_OPTIONS`, or set the availability database as the default database of the login used for the database scoped credential. You must do this on all availability replicas of the availability group.
First, create the database scoped credential, storing credentials for a SQL authenticated login. The SQL ODBC Connector for PolyBase only supports basic authentication. Before you create a database scoped credential, the database must have a master key to protect the credential. For more information, see [CREATE MASTER KEY](create-master-key-transact-sql.md). The following sample creates a database scoped credential; provide your own login and password.
```sql
CREATE DATABASE SCOPED CREDENTIAL SQLServerCredentials
WITH IDENTITY = 'username',
SECRET = 'password';
```
Next, create the new external data source.
Whether you included `Database=dbname` in the `CONNECTION_OPTIONS` or set the availability database as the default database for the login in the database scoped credential, you must still provide the database name via a three-part name in the CREATE EXTERNAL TABLE statement, within the LOCATION parameter. For an example, see [CREATE EXTERNAL TABLE](create-external-table-transact-sql.md?view=sql-server-ver15&preserve-view=true#g-create-an-external-table-for-sql-server).
In the following example, `WINSQL2019AGL` is the availability group listener name and `dbname` is the name of the database to be the target of the CREATE EXTERNAL TABLE statement.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadOnly; Database=dbname',
CREDENTIAL = SQLServerCredentials
);
```
You can demonstrate the redirection behavior of the availability group by specifying `ApplicationIntent` and creating an external table on the system view `sys.servers`. In the following sample script, two external data sources are created, and one external table is created for each. Use the views to test which server is responding to the connection. Similar outcomes can also be achieved via the read-only routing feature. For more information, see [Configure read-only routing for an Always On availability group](../../database-engine/availability-groups/windows/configure-read-only-routing-for-an-availability-group-sql-server.md).
```sql
CREATE EXTERNAL DATA SOURCE [DataSource_SQLInstanceListener_ReadOnlyIntent]
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadOnly; Database=dbname',
CREDENTIAL = [SQLServerCredentials]
);
GO
CREATE EXTERNAL DATA SOURCE [DataSource_SQLInstanceListener_ReadWriteIntent]
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadWrite',
CREDENTIAL = [SQLServerCredentials]
);
GO
```
Inside the database in the availability group, create a view to return `sys.servers` and the name of the local instance, which helps you identify which replica is responding to the query. For more information, see [sys.servers](../../relational-databases/system-catalog-views/sys-servers-transact-sql.md).
```sql
CREATE VIEW vw_sys_servers
AS
SELECT [name]
FROM sys.servers
WHERE server_id = 0;
GO
```
Then, create an external table on the source instance:
```sql
CREATE EXTERNAL TABLE vw_sys_servers_ro (name SYSNAME NOT NULL)
WITH (
DATA_SOURCE = [DataSource_SQLInstanceListener_ReadOnlyIntent],
LOCATION = N'dbname.dbo.vw_sys_servers'
);
GO
CREATE EXTERNAL TABLE vw_sys_servers_rw (name SYSNAME NOT NULL)
WITH (
DATA_SOURCE = [DataSource_SQLInstanceListener_ReadWriteIntent],
LOCATION = N'dbname.dbo.vw_sys_servers'
);
GO
SELECT [name]
FROM dbo.vw_sys_servers_ro; -- should return the secondary replica instance

SELECT [name]
FROM dbo.vw_sys_servers_rw; -- should return the primary replica instance
GO
```
## Examples: Bulk operations
> [!IMPORTANT]
> Don't add a trailing **/**, file name, or shared access signature parameters at the end of the `LOCATION` URL when configuring an external data source for bulk operations.
### H. Create an external data source for bulk operations retrieving data from Azure Storage
**Applies to:** [!INCLUDE [sssql17-md](../../includes/sssql17-md.md)] and [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)]
Use the following data source for bulk operations using [BULK INSERT](bulk-insert-transact-sql.md) or [OPENROWSET BULK](../functions/openrowset-bulk-transact-sql.md). The credential must set `SHARED ACCESS SIGNATURE` as the identity, the SAS token mustn't have the leading `?`, the token must grant at least read permission on the file that should be loaded (for example, `srt=o&sp=r`), and the expiration period must be valid (all dates are in UTC time). For more information on shared access signatures, see [Using Shared Access Signatures (SAS)](/azure/storage/common/storage-sas-overview).
```sql
CREATE DATABASE SCOPED CREDENTIAL AccessAzureInvoices
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
         -- Remove the leading ? from the SAS token
         SECRET = '<your_SAS_token>';
CREATE EXTERNAL DATA SOURCE MyAzureInvoices
WITH (
TYPE = BLOB_STORAGE,
LOCATION = 'https://newinvoices.blob.core.windows.net/week3',
CREDENTIAL = AccessAzureInvoices
);
```
To see this example in use, see [BULK INSERT](bulk-insert-transact-sql.md#f-import-data-from-a-file-in-azure-blob-storage).
### I. Create external data source to access data in Azure Storage using the abfs:// interface
**Applies to:** [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] CU11 and later
In this example, the external data source is an Azure Data Lake Storage Gen2 account named `logs`, using [the Azure Blob Filesystem driver (ABFS)](/azure/storage/blobs/data-lake-storage-abfs-driver). The storage container is named `daily`. The Azure Data Lake Storage Gen2 external data source is for data transfer only, as predicate push-down isn't supported.
This example shows how to create the database scoped credential for authentication to an Azure Data Lake Storage Gen2 account. Specify the Azure Storage account key in the database credential secret. You can specify any string as the database scoped credential identity; it isn't used during authentication to Azure Storage.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
-- Create a database scoped credential with Azure storage account key as the secret.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
    WITH IDENTITY = '<my_caller>',
        SECRET = '<azure_storage_account_key>';
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH (
TYPE = HADOOP,
LOCATION = 'abfss://daily@logs.dfs.core.windows.net/',
CREDENTIAL = AzureStorageCredential
);
```
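The data source can then be referenced by an external table. The following is a minimal sketch, assuming the container holds delimited text files; the file format and table names are illustrative:

```sql
-- Define how the files are encoded (names here are illustrative).
CREATE EXTERNAL FILE FORMAT LogFileFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);
-- Map an external table over files under /logs in the daily container.
CREATE EXTERNAL TABLE dbo.DailyLogs (
    log_date DATE,
    message NVARCHAR(4000)
)
WITH (
    LOCATION = '/logs/',
    DATA_SOURCE = MyAzureStorage,
    FILE_FORMAT = LogFileFormat
);
```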
### J. Create external data source using generic ODBC to PostgreSQL
As in previous examples, first create a database master key and database scoped credential. The database scoped credential will be used for the external data source. This example also assumes that a generic ODBC data provider for PostgreSQL is installed on the server.
In this example, the generic ODBC data provider is used to connect to a PostgreSQL database server in the same network, where the fully qualified domain name of the PostgreSQL server is `POSTGRES1`, using the default port of TCP 5432.
```sql
CREATE EXTERNAL DATA SOURCE POSTGRES1
WITH (
LOCATION = 'odbc://POSTGRES1.domain:5432',
CONNECTION_OPTIONS = 'Driver={PostgreSQL Unicode(x64)};',
CREDENTIAL = postgres_credential
);
```
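An external table can then be mapped over a remote PostgreSQL object. This is a sketch only; the column list must match the remote table, the names below are illustrative, and the exact `LOCATION` format for the remote object depends on the ODBC driver:

```sql
-- Column definitions must match the remote PostgreSQL table; names are illustrative.
CREATE EXTERNAL TABLE dbo.pg_orders (
    order_id INT,
    order_total DECIMAL(10, 2)
)
WITH (
    LOCATION = 'orders', -- remote object name; format depends on the ODBC driver
    DATA_SOURCE = POSTGRES1
);
```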
## Related content
- [ALTER EXTERNAL DATA SOURCE (Transact-SQL)](alter-external-data-source-transact-sql.md)
- [CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)](create-database-scoped-credential-transact-sql.md)
- [CREATE EXTERNAL FILE FORMAT (Transact-SQL)](create-external-file-format-transact-sql.md)
- [CREATE EXTERNAL TABLE (Transact-SQL)](create-external-table-transact-sql.md)
- [sys.external_data_sources (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md)
- [Using Shared Access Signatures (SAS)](/azure/storage/common/storage-sas-overview)
- [PolyBase connectivity configuration (Transact-SQL)](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md)
::: moniker-end
::: moniker range="=sql-server-ver16||=sql-server-linux-ver16"
## Overview: SQL Server 2022
**Applies to**: [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions
Creates an external data source for PolyBase queries. External data sources are used to establish connectivity and support these primary use cases:
- Data virtualization and data load using [PolyBase in SQL Server](../../relational-databases/polybase/overview.md)
- Bulk load operations using `BULK INSERT` or `OPENROWSET`
> [!NOTE]
> The syntax of `CREATE EXTERNAL DATA SOURCE` varies in different versions of the [SQL Database Engine](../../database-engine/sql-database-engine.md). Use the version selector dropdown list to [choose the appropriate product version](../../sql-server/sql-docs-navigation-guide.md#what-the-applies-to-options-mean). This content applies to [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
## Syntax for SQL Server 2022 and later versions
```syntaxsql
CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:port]' ]
    [ [ , ] CONNECTION_OPTIONS = '<key_value_pairs>'[,...]]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] PUSHDOWN = { ON | OFF } ]
  )
[ ; ]
```
## Arguments
#### data_source_name
Specifies the user-defined name for the data source. The name must be unique within the database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
#### LOCATION = '*\<prefix\>://\<path\>*'
Provides the connectivity protocol and path to the external data source.
| External Data Source | Connector location prefix | Location path | Supported locations by product / service | Authentication |
| --- | --- | --- | --- | --- |
| Azure Storage Account (V2) | `abs` | `abs://<container_name>@<storage_account_name>.blob.core.windows.net/<path>`<br />or<br />`abs://<storage_account_name>.blob.core.windows.net/<container_name>` | Starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)]<br />Hierarchical Namespace is supported. | Shared access signature (SAS) |
| Azure Data Lake Storage Gen2 | `adls` | `adls://<container_name>@<storage_account_name>.dfs.core.windows.net/<path>`<br />or<br />`adls://<storage_account_name>.dfs.core.windows.net/<container_name>` | Starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] | Shared access signature (SAS) |
| [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] | `sqlserver` | `<server_name>[\<instance_name>][:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | SQL authentication only |
| Oracle | `oracle` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| Teradata | `teradata` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| MongoDB or Cosmos DB API for MongoDB | `mongodb` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| Generic ODBC | `odbc` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] - Windows only | Basic authentication only |
| Bulk Operations | `https` | `<storage_account>.blob.core.windows.net/<container>` | Starting with [!INCLUDE [ssSQL17](../../includes/sssql17-md.md)] | Shared access signature (SAS) |
| S3-compatible object storage | `s3` | - S3-compatible: `s3://<server_name>:<port>/`<br />- AWS S3: `s3://<bucket_name>.S3.amazonaws.com[:port]/<key_name>`<br />or `s3://s3.amazonaws.com[:port]/<bucket_name>/<key_name>` | Starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] | Basic or pass-through (STS) \* |
\* Must be a [database scoped credential](create-database-scoped-credential-transact-sql.md), where the IDENTITY is hard-coded to `IDENTITY = 'S3 Access Key'` and the SECRET argument is in the format `SECRET = '<AccessKeyID>:<SecretKeyID>'`, or use pass-through (STS) authorization. For more information, see [Configure PolyBase to access external data in S3-compatible object storage](../../relational-databases/polybase/polybase-configure-s3-compatible.md).
Location path:
- `port` = The port that the external data source is listening on. Optional in many cases, depending on network configuration.
- `<container_name>` = the container of the storage account holding the data. Root containers are read-only, data can't be written back to the container.
- `<storage_account_name>` = the storage account name of the Azure resource.
- `<server_name>` = the host name.
- `<instance_name>` = the name of the SQL Server named instance. Used if you have SQL Server Browser Service running on the target instance.
- `<server_name>:<port>` = For S3-compatible object storage only (starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)]), the endpoint and port used to connect to the S3-compatible storage.
- `<bucket_name>` = For S3-compatible object storage only (starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)]), specific to the storage platform.
- `<region>` = For S3-compatible object storage only (starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)]), specific to the storage platform.
- `<key_name>` = Part of the storage path within the storage URL.
Additional notes and guidance when setting the location:
- The [!INCLUDE [ssDEnoversion](../../includes/ssdenoversion-md.md)] doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.
- You can use the `sqlserver` connector to connect [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] to another [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] or to [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)].
- Specify the `Driver={}` when connecting via `ODBC`.
- The Hierarchical Namespace option for Azure Storage Accounts(V2) using the prefix `adls` is supported via Azure Data Lake Storage Gen2 in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)].
- SQL Server support for HDFS Cloudera (CDP) and Hortonworks (HDP) external data sources is retired and not included in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)]. There's no need to use the TYPE argument in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)].
- For more information on S3-compatible object storage and PolyBase starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], see [Configure PolyBase to access external data in S3-compatible object storage](../../relational-databases/polybase/polybase-configure-s3-compatible.md). For an example of querying a parquet file within S3-compatible object storage, see [Virtualize parquet file in a S3-compatible object storage with PolyBase](../../relational-databases/polybase/polybase-virtualize-parquet-file.md).
- Differing from previous versions, in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], the prefix used for Azure Storage Account (v2) changed from `wasb[s]` to `abs`.
- Differing from previous versions, in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], the prefix used for Azure Data Lake Storage Gen2 changed from `abfs[s]` to `adls`.
- For an example using PolyBase to virtualize a CSV file in Azure Storage, see [Virtualize CSV file with PolyBase](../../relational-databases/polybase/virtualize-csv.md).
- For an example using PolyBase to virtualize a delta table in ADLS Gen2, see [Virtualize delta table with PolyBase](../../relational-databases/polybase/virtualize-delta.md).
- [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] fully supports two URL formats for both Azure Storage Account v2 (`abs`) and Azure Data Lake Gen2 (`adls`).
- The LOCATION path can use the formats: `<container>@<storage_account_name>` (recommended) or `<storage_account_name>/<container>`. For example:
- Azure Storage Account v2: `abs://<container>@<storage_account_name>.blob.core.windows.net` (recommended) or `abs://<storage_account_name>.blob.core.windows.net/<container>`.
- Azure Data Lake Gen2 supports: `adls://<container>@<storage_account_name>.dfs.core.windows.net` (recommended) or `adls://<storage_account_name>.dfs.core.windows.net/<container>`.
#### CONNECTION_OPTIONS = *key_value_pair*
Specified for [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later versions. Specifies additional options when connecting over `ODBC` to an external data source. To use multiple connection options, separate them by a semi-colon.
Applies to generic `ODBC` connections, as well as built-in `ODBC` connectors for [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], Oracle, Teradata, MongoDB, and Azure Cosmos DB API for MongoDB.
The `key_value_pair` is the keyword and the value for a specific connection option. The available keywords and values depend on the external data source type. The name of the driver is required as a minimum, but there are other options, such as `APP='<your_application_name>'` or `ApplicationIntent=ReadOnly|ReadWrite`, that are also useful to set and can assist with troubleshooting.
Possible key value pairs are specific to the driver. For more information for each provider, see [CREATE EXTERNAL DATA SOURCE (Transact-SQL) CONNECTION_OPTIONS](create-external-data-source-connection-options.md).
Starting in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] Cumulative Update 2, additional keywords were introduced to support Oracle TNS files:
- The keyword `TNSNamesFile` specifies the filepath to the `tnsnames.ora` file located on the Oracle server.
- The keyword `ServerName` specifies the alias used inside the `tnsnames.ora` that will be used to replace the host name and the port.
#### PUSHDOWN = ON | OFF
**Applies to: [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later versions.** States whether computation can be pushed down to the external data source. It's on by default.
`PUSHDOWN` is supported when connecting to [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], Oracle, Teradata, MongoDB, the Azure Cosmos DB API for MongoDB, or ODBC at the external data source level.
Enabling or disabling push-down at the query level is achieved through the [EXTERNALPUSHDOWN hint](../../relational-databases/polybase/polybase-pushdown-computation.md#force-pushdown).
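For instance, with `PUSHDOWN = ON` at the data-source level, an individual query can still override the behavior with the hint (the external table name below is illustrative):

```sql
-- Force the remote source to evaluate the predicate for this query.
SELECT COUNT(*)
FROM dbo.SensorReadings_external
WHERE reading_value > 100
OPTION (FORCE EXTERNALPUSHDOWN);

-- Or bring the rows back and filter locally instead.
SELECT COUNT(*)
FROM dbo.SensorReadings_external
WHERE reading_value > 100
OPTION (DISABLE EXTERNALPUSHDOWN);
```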
#### CREDENTIAL = *credential_name*
Specifies a database-scoped credential for authenticating to the external data source.
Additional notes and guidance when creating a credential:
- `CREDENTIAL` is only required if the data has been secured. `CREDENTIAL` isn't required for data sets that allow anonymous access.
- When accessing Azure Storage Account (V2) or Azure Data Lake Storage Gen2, the `IDENTITY` must be `SHARED ACCESS SIGNATURE`.
- For an example, see [Create an external data source to execute bulk operations and retrieve data from Azure Storage into SQL Database](#h-create-an-external-data-source-for-bulk-operations-retrieving-data-from-azure-storage-1).
There are multiple ways to create a shared access signature:
- You can create a SAS token by navigating to the **Azure portal -> \<Your_Storage_Account\> -> Shared access signature -> Configure permissions -> Generate SAS and connection string**. For more information, see [Generate a shared access signature](/azure/storage/blobs/blob-containers-portal#generate-a-shared-access-signature).
- You can [create and configure a SAS with Azure Storage Explorer](/azure/vs-azure-tools-storage-explorer-blobs#get-the-sas-for-a-blob-container).
- You can create a SAS token programmatically via PowerShell, Azure CLI, .NET, and REST API. For more information, see [Grant limited access to Azure Storage resources using shared access signatures (SAS)](/azure/storage/common/storage-sas-overview?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json).
- The SAS token should be configured as follows:
- When a SAS token is generated, it includes a question mark ('?') at the beginning of the token. Exclude the leading `?` when configured as the SECRET.
- Use a valid expiration period (all dates are in UTC time).
- Grant at least read permission on the file that should be loaded (for example `srt=o&sp=r`). Multiple shared access signatures can be created for different use cases. Permissions should be granted as follows:
| Action | Permission |
| --- | --- |
| Read data from a file | Read |
| Read data from multiple files and subfolders | Read and List |
| Use Create External Table as Select (CETAS) | Read, Create, List and Write |
- For Azure Blob Storage and Azure Data Lake Gen 2:
- Allowed services: `Blob` must be selected to generate the SAS token
- Allowed resource types: `Container` and `Object` must be selected to generate the SAS token
For an example of using a `CREDENTIAL` with S3-compatible object storage and PolyBase, see [Configure PolyBase to access external data in S3-compatible object storage](../../relational-databases/polybase/polybase-configure-s3-compatible.md).
To create a database scoped credential, see [CREATE DATABASE SCOPED CREDENTIAL](create-database-scoped-credential-transact-sql.md).
## Permissions
Requires `CONTROL` permission on database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
## Locking
Takes a shared lock on the `EXTERNAL DATA SOURCE` object.
## Security
PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.
## Upgrade to SQL Server 2022
Starting in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], Hadoop external data sources are no longer supported. You must manually re-create any external data sources previously created with `TYPE = HADOOP`, and any external tables that use them.
Users also need to configure their external data sources to use the new connectors when connecting to Azure Storage.
| External Data Source | From | To |
| --- | --- | --- |
| Azure Blob Storage | `wasb[s]` | `abs` |
| ADLS Gen2 | `abfs[s]` | `adls` |
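A migration might look like the following sketch; the object and credential names are illustrative, and any external tables that depend on the data source must be dropped before the data source itself:

```sql
-- Drop objects that used the retired wasb/HADOOP data source.
DROP EXTERNAL TABLE dbo.LegacyBlobTable;
DROP EXTERNAL DATA SOURCE LegacyBlobSource;

-- Re-create the data source with the SQL Server 2022 abs connector.
CREATE EXTERNAL DATA SOURCE NewBlobSource
WITH (
    LOCATION = 'abs://<container>@<storage_account_name>.blob.core.windows.net/',
    CREDENTIAL = AzureStorageCredential -- illustrative SAS-based credential
);
-- Then re-create the external tables against NewBlobSource.
```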
## Examples
> [!IMPORTANT]
> For information on how to install and enable PolyBase, see [Install PolyBase on Windows](../../relational-databases/polybase/polybase-installation.md)
### A. Create external data source in SQL Server to reference Oracle
To create an external data source that references Oracle, ensure you have a database scoped credential. You might optionally also enable or disable push-down of computation against this data source.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
-- Create a database scoped credential with the Oracle username and password as the secret.
CREATE DATABASE SCOPED CREDENTIAL OracleProxyAccount
    WITH IDENTITY = 'oracle_username',
        SECRET = 'oracle_password';
CREATE EXTERNAL DATA SOURCE MyOracleServer
WITH (
LOCATION = 'oracle://145.145.145.145:1521',
PUSHDOWN = ON,
CREDENTIAL = OracleProxyAccount
);
```
Optionally, the external data source to Oracle can use proxy authentication to provide fine-grained access control. A proxy user can be configured to have limited access compared to the user being impersonated.
```sql
CREATE DATABASE SCOPED CREDENTIAL [OracleProxyCredential]
WITH IDENTITY = 'oracle_username',
SECRET = 'oracle_password';
CREATE EXTERNAL DATA SOURCE [OracleSalesSrvr]
WITH (
LOCATION = 'oracle://145.145.145.145:1521',
CONNECTION_OPTIONS = 'ImpersonateUser=%CURRENT_USER',
CREDENTIAL = [OracleProxyCredential]
);
```
Alternatively, you can authenticate using TNS.
Starting in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] Cumulative Update 2, `CREATE EXTERNAL DATA SOURCE` now supports the use of TNS files when connecting to Oracle.
The `CONNECTION_OPTIONS` parameter was expanded and now uses `TNSNamesFile` and `ServerName` as variables to browse the `tnsnames.ora` file and establish connection with the server.
In the following example, at runtime SQL Server searches for the `tnsnames.ora` file location specified by `TNSNamesFile`, and for the host and network port specified by `ServerName`.
```sql
CREATE EXTERNAL DATA SOURCE [external_data_source_name]
WITH (
LOCATION = N'oracle://XE',
CREDENTIAL = [OracleCredentialTest],
CONNECTION_OPTIONS = N'TNSNamesFile=C:\Temp\tnsnames.ora;ServerName=XE'
);
```
### B. Create external data source to reference a SQL Server named instance via PolyBase connectivity
**Applies to:** [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later
To create an external data source that references a named instance of [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], use `CONNECTION_OPTIONS` to specify the instance name.
First, create the database scoped credential, storing credentials for a SQL authenticated login. The SQL ODBC Connector for PolyBase only supports basic authentication. Before you create a database scoped credential, the database must have a master key to protect the credential. For more information, see [CREATE MASTER KEY](create-master-key-transact-sql.md). The following sample creates a database scoped credential; provide your own login and password.
```sql
CREATE DATABASE SCOPED CREDENTIAL SQLServerCredentials
WITH IDENTITY = 'username',
SECRET = 'password';
```
In the following example, `WINSQL2019` is the host name and `SQL2019` is the instance name. `'Server=%s\SQL2019'` is the key value pair.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019',
CONNECTION_OPTIONS = 'Server=%s\SQL2019',
CREDENTIAL = SQLServerCredentials
);
```
Alternatively, you can use a port to connect to a [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] default instance.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019:58137',
CREDENTIAL = SQLServerCredentials
);
```
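The data source can then back an external table over a remote object, using a three-part name in `LOCATION`. A sketch, with illustrative database, schema, table, and column names that must match the remote table:

```sql
-- Column definitions must match the remote table; names are illustrative.
CREATE EXTERNAL TABLE dbo.RemoteOrders (
    order_id INT NOT NULL,
    order_date DATE NOT NULL
)
WITH (
    LOCATION = 'WideWorldImporters.Sales.Orders', -- database.schema.table on the remote instance
    DATA_SOURCE = SQLServerInstance2
);

-- Query it like a local table; PolyBase handles the remote connection.
SELECT TOP (10) * FROM dbo.RemoteOrders;
```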
### C. Create external data source to reference a readable secondary replica of Always On availability group
**Applies to:** [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later
To create an external data source that references a readable secondary replica of [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], use `CONNECTION_OPTIONS` to specify the `ApplicationIntent=ReadOnly`. Also, you need to either set the availability database as `Database={dbname}` in `CONNECTION_OPTIONS`, or set the availability database as the default database of the login used for the database scoped credential. You need to do this on all availability replicas of the availability group.
First, create the database scoped credential, storing credentials for a SQL authenticated login. The SQL ODBC Connector for PolyBase only supports basic authentication. Before you create a database scoped credential, the database must have a master key to protect the credential. For more information, see [CREATE MASTER KEY](create-master-key-transact-sql.md). The following sample creates a database scoped credential; provide your own login and password.
```sql
CREATE DATABASE SCOPED CREDENTIAL SQLServerCredentials
WITH IDENTITY = 'username',
SECRET = 'password';
```
Next, create the new external data source.
Whether you included `Database=dbname` in the `CONNECTION_OPTIONS` or set the availability database as the default database for the login in the database scoped credential, you must still provide the database name via a three-part name in the CREATE EXTERNAL TABLE statement, within the LOCATION parameter. For an example, see [CREATE EXTERNAL TABLE](create-external-table-transact-sql.md?view=sql-server-ver15&preserve-view=true#g-create-an-external-table-for-sql-server).
In the following example, `WINSQL2019AGL` is the availability group listener name and `dbname` is the name of the database to be the target of the CREATE EXTERNAL TABLE statement.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadOnly; Database=dbname',
CREDENTIAL = SQLServerCredentials
);
```
You can demonstrate the redirection behavior of the availability group by specifying `ApplicationIntent` and creating an external table on the system view `sys.servers`. In the following sample script, two external data sources are created, and one external table is created for each. Use the views to test which server is responding to the connection. Similar outcomes can also be achieved via the read-only routing feature. For more information, see [Configure read-only routing for an Always On availability group](../../database-engine/availability-groups/windows/configure-read-only-routing-for-an-availability-group-sql-server.md).
```sql
CREATE EXTERNAL DATA SOURCE [DataSource_SQLInstanceListener_ReadOnlyIntent]
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadOnly; Database=dbname',
CREDENTIAL = [SQLServerCredentials]
);
GO
CREATE EXTERNAL DATA SOURCE [DataSource_SQLInstanceListener_ReadWriteIntent]
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadWrite',
CREDENTIAL = [SQLServerCredentials]
);
GO
```
Inside the database in the availability group, create a view to return `sys.servers` and the name of the local instance, which helps you identify which replica is responding to the query. For more information, see [sys.servers](../../relational-databases/system-catalog-views/sys-servers-transact-sql.md).
```sql
CREATE VIEW vw_sys_servers
AS
SELECT [name]
FROM sys.servers
WHERE server_id = 0;
GO
```
Then, create an external table on the source instance:
```sql
CREATE EXTERNAL TABLE vw_sys_servers_ro (name SYSNAME NOT NULL)
WITH (
DATA_SOURCE = [DataSource_SQLInstanceListener_ReadOnlyIntent],
LOCATION = N'dbname.dbo.vw_sys_servers'
);
GO
CREATE EXTERNAL TABLE vw_sys_servers_rw (name SYSNAME NOT NULL)
WITH (
DATA_SOURCE = [DataSource_SQLInstanceListener_ReadWriteIntent],
LOCATION = N'dbname.dbo.vw_sys_servers'
);
GO
SELECT [name]
FROM dbo.vw_sys_servers_ro;--should return secondary replica instance
SELECT [name]
FROM dbo.vw_sys_servers_rw;--should return primary replica instance
GO
```
### D. Create external data source to query a parquet file in S3-compatible object storage via PolyBase
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later
The following sample script creates the external data source `s3_ds` in the user database in SQL Server. The external data source references the `s3_dc` database scoped credential.
```sql
CREATE DATABASE SCOPED CREDENTIAL s3_dc
    WITH IDENTITY = 'S3 Access Key', -- for S3-compatible object storage the identity must always be S3 Access Key
        SECRET = '<AccessKeyID>:<SecretKeyID>'; -- provided by the S3-compatible object storage
GO
CREATE EXTERNAL DATA SOURCE s3_ds
WITH (
    LOCATION = 's3://<ip_address>:<port>/',
    CREDENTIAL = s3_dc
);
GO
```
Verify the new external data source with [sys.external_data_sources](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md).
```sql
SELECT *
FROM sys.external_data_sources;
```
Then, the following example demonstrates using T-SQL to query a parquet file stored in S3-compatible object storage via OPENROWSET query. For more information, see [Virtualize parquet file in a S3-compatible object storage with PolyBase](../../relational-databases/polybase/polybase-virtualize-parquet-file.md).
```sql
SELECT *
FROM OPENROWSET (
BULK '/<bucket>/<parquet_folder>',
FORMAT = 'PARQUET',
DATA_SOURCE = 's3_ds'
) AS [cc];
```
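Query results can also be written back to the S3 bucket with CREATE EXTERNAL TABLE AS SELECT (CETAS). A sketch, assuming PolyBase export is permitted on the instance; the file format, table, and path names are illustrative:

```sql
-- One-time setup: permit writing through PolyBase.
EXECUTE sp_configure 'allow polybase export', 1;
RECONFIGURE;
GO
-- Parquet file format for the exported data (name is illustrative).
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);
GO
-- Write query results to the S3 bucket as parquet files.
CREATE EXTERNAL TABLE dbo.ExportedSales
WITH (
    LOCATION = '/<bucket>/<export_folder>/',
    DATA_SOURCE = s3_ds,
    FILE_FORMAT = ParquetFormat
)
AS SELECT * FROM dbo.Sales;
```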
### E. Create external data source using generic ODBC to PostgreSQL
As in previous examples, first create a database master key and database scoped credential. The database scoped credential will be used for the external data source. This example also assumes that a generic ODBC data provider for PostgreSQL is installed on the server.
In this example, the generic ODBC data provider is used to connect to a PostgreSQL database server in the same network, where the fully qualified domain name of the PostgreSQL server is `POSTGRES1`, using the default port of TCP 5432.
```sql
CREATE EXTERNAL DATA SOURCE POSTGRES1
WITH (
LOCATION = 'odbc://POSTGRES1.domain:5432',
CONNECTION_OPTIONS = 'Driver={PostgreSQL Unicode(x64)};',
CREDENTIAL = postgres_credential
);
```
### Azure Storage
#### Create a shared access signature
For both Azure Blob Storage and Azure Data Lake Storage (ADLS) Gen2, the supported authentication method is shared access signature (SAS). One simple way to generate a shared access signature token is to follow these steps. For more information, see [CREDENTIAL](#credential--credential_name-3).
1. Navigate to the Azure portal, and the desired Storage Account.
1. Navigate to your desired Container under **Data Storage** menu.
1. Select **Shared access tokens**.
1. Choose the appropriate permission based on the desired action:
| Action | Permission |
| --- | --- |
| Read data from a file | Read |
| Read data from multiple files and subfolders | Read and List |
| Use Create External Table as Select (CETAS) | Read, Create and Write |
1. Choose the token expiration date.
1. Generate SAS token and URL.
1. Copy the SAS token.
### F. Create external data source to access data in Azure Blob Storage using the abs:// interface
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later
Starting in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], use the new prefix `abs` for Azure Storage Account v2. The `abs` prefix supports authentication using `SHARED ACCESS SIGNATURE` and replaces `wasb`, used in previous versions. HADOOP is no longer supported, so there's no need to use `TYPE = BLOB_STORAGE`.
The Azure storage account key is no longer needed; instead, a SAS token is used, as shown in the following example:
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
GO
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredentialv2
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE', -- to use SAS the identity must be fixed as-is
        SECRET = '<SAS_token>';
GO
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH (
    LOCATION = 'abs://<container>@<storage_account_name>.blob.core.windows.net/',
    CREDENTIAL = AzureStorageCredentialv2
);
```
For a more detailed example on how to access CSV files stored in Azure Blob Storage, see [Virtualize CSV file with PolyBase](../../relational-databases/polybase/virtualize-csv.md).
### G. Create external data source to access data in Azure Data Lake Gen2
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions
Starting in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)], use the new prefix `adls` for Azure Data Lake Gen2, replacing `abfs` used in previous versions. The `adls` prefix also supports the SAS token as an authentication method, as shown in this example:
```sql
--Create a database scoped credential using SAS Token
CREATE DATABASE SCOPED CREDENTIAL datalakegen2
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
        SECRET = '<SAS_token>';
GO
CREATE EXTERNAL DATA SOURCE data_lake_gen2_dfs
WITH (
    LOCATION = 'adls://<container>@<storage_account_name>.dfs.core.windows.net',
    CREDENTIAL = datalakegen2
);
```
For a more detailed example on how to access delta files stored on Azure Data Lake Gen2, see [Virtualize delta table with PolyBase](../../relational-databases/polybase/virtualize-delta.md).
## Examples: Bulk Operations
> [!IMPORTANT]
> Don't add a trailing **/**, file name, or shared access signature parameters at the end of the `LOCATION` URL when configuring an external data source for bulk operations.
### H. Create an external data source for bulk operations retrieving data from Azure Storage
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
Use the following data source for bulk operations using [BULK INSERT](bulk-insert-transact-sql.md) or [OPENROWSET BULK](../functions/openrowset-bulk-transact-sql.md). The credential must set `SHARED ACCESS SIGNATURE` as the identity, mustn't have the leading `?` in the SAS token, must have at least read permission on the file that should be loaded (for example `srt=o&sp=r`), and the expiration period should be valid (all dates are in UTC time). For more information on shared access signatures, see [Using Shared Access Signatures (SAS)](/azure/storage/common/storage-sas-overview).
```sql
CREATE DATABASE SCOPED CREDENTIAL AccessAzureInvoices
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
        -- Remove ? from the beginning of the SAS token
        SECRET = '<your_SAS_token>';
CREATE EXTERNAL DATA SOURCE MyAzureInvoices
WITH (
    LOCATION = 'abs://<container>@<storage_account_name>.blob.core.windows.net/',
    CREDENTIAL = AccessAzureInvoices
);
```
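As a usage sketch against this data source (the target table and file names are illustrative):

```sql
-- Load a CSV file through the external data source.
BULK INSERT Sales.Invoices
FROM 'invoices/invoice-2024-01.csv'
WITH (
    DATA_SOURCE = 'MyAzureInvoices',
    FORMAT = 'CSV'
);

-- Or read the file ad hoc with OPENROWSET.
SELECT *
FROM OPENROWSET(
    BULK 'invoices/invoice-2024-01.csv',
    DATA_SOURCE = 'MyAzureInvoices',
    SINGLE_CLOB
) AS invoice_doc;
```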
## Related content
- [ALTER EXTERNAL DATA SOURCE (Transact-SQL)](alter-external-data-source-transact-sql.md)
- [CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)](create-database-scoped-credential-transact-sql.md)
- [CREATE EXTERNAL FILE FORMAT (Transact-SQL)](create-external-file-format-transact-sql.md)
- [CREATE EXTERNAL TABLE (Transact-SQL)](create-external-table-transact-sql.md)
- [sys.external_data_sources (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md)
- [Using Shared Access Signatures (SAS)](/azure/storage/common/storage-sas-overview)
- [PolyBase connectivity configuration (Transact-SQL)](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md)
::: moniker-end
::: moniker range=">=sql-server-ver17 || >=sql-server-linux-ver17"
## Overview: SQL Server 2025
**Applies to**: [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)]
Creates an external data source for PolyBase queries. External data sources are used to establish connectivity and support these primary use cases:
- Data virtualization and data load using [PolyBase in SQL Server](../../relational-databases/polybase/overview.md)
- Bulk load operations using `BULK INSERT` or `OPENROWSET`
Supports Managed Identity connections for instances enabled by Azure Arc. For details, review [Connect to Azure Storage with managed identity from PolyBase](../../relational-databases/polybase/managed-identity.md).
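As a sketch of that scenario, assuming the instance's Arc-enabled managed identity has been granted access to the storage account and that the `'Managed Identity'` credential identity string applies as described in the linked article (verify the exact string there):

```sql
-- No secret is stored; authentication uses the instance's managed identity.
CREATE DATABASE SCOPED CREDENTIAL ManagedIdentityCredential
    WITH IDENTITY = 'Managed Identity';
CREATE EXTERNAL DATA SOURCE MyAzureStorageMI
WITH (
    LOCATION = 'abs://<container>@<storage_account_name>.blob.core.windows.net/',
    CREDENTIAL = ManagedIdentityCredential
);
```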
> [!NOTE]
> The syntax of `CREATE EXTERNAL DATA SOURCE` varies in different versions of the [SQL Database Engine](../../database-engine/sql-database-engine.md). Use the version selector dropdown list to [choose the appropriate product version](../../sql-server/sql-docs-navigation-guide.md#what-the-applies-to-options-mean). This content applies to [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)] and later versions.
## Syntax for SQL Server 2025 and later versions
For more information about the syntax conventions, see [Transact-SQL syntax conventions](../../t-sql/language-elements/transact-sql-syntax-conventions-transact-sql.md).
```syntaxsql
CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
  ( [ LOCATION = '<prefix>://<path>[:<port>]' ]
    [ [ , ] CONNECTION_OPTIONS = '<key_value_pairs>' [ , ...n ] ]
    [ [ , ] CREDENTIAL = <credential_name> ]
    [ [ , ] PUSHDOWN = { ON | OFF } ]
  )
[ ; ]
```
## Arguments
#### data_source_name
Specifies the user-defined name for the data source. The name must be unique within the database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
#### LOCATION = '*\<prefix\>://\<path\>*'
Provides the connectivity protocol and path to the external data source.
| External data source | Connector location prefix | Location path | Supported locations by product / service | Authentication |
| --- | --- | --- | --- | --- |
| Azure Storage Account (V2) | `abs` | `abs://<container_name>@<storage_account_name>.blob.core.windows.net/`<br />or<br />`abs://<storage_account_name>.blob.core.windows.net/<container_name>` | Starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)]. Hierarchical Namespace is supported. | Shared access signature (SAS)<br />or<br />[PolyBase support for Managed Identity to Azure Storage](../../relational-databases/polybase/managed-identity.md) <sup>1</sup> |
| Azure Data Lake Storage Gen2 | `adls` | `adls://<container_name>@<storage_account_name>.dfs.core.windows.net/`<br />or<br />`adls://<storage_account_name>.dfs.core.windows.net/<container_name>` | Starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] | Shared access signature (SAS)<br />or<br />[PolyBase support for Managed Identity to Azure Storage](../../relational-databases/polybase/managed-identity.md) <sup>1</sup> |
| [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] | `sqlserver` | `<server_name>[\<instance_name>][:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | SQL authentication only |
| Oracle | `oracle` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| Teradata | `teradata` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| MongoDB or Cosmos DB API for MongoDB | `mongodb` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] | Basic authentication only |
| Generic ODBC | `odbc` | `<server_name>[:port]` | Starting with [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] - Windows only | Basic authentication only |
| Bulk Operations | `https` | `<storage_account>.blob.core.windows.net/<container>` | Starting with [!INCLUDE [ssSQL17](../../includes/sssql17-md.md)] | Shared access signature (SAS) |
| S3-compatible object storage | `s3` | - S3-compatible: `s3://<server_name>:<port>/`<br />- AWS S3: `s3://<bucket_name>.S3.amazonaws.com[:port]/<key>`<br />or<br />`s3://s3.amazonaws.com[:port]/<bucket_name>/<key>` | Starting with [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] | Basic or pass-through (STS) <sup>2</sup> |
<sup>1</sup> Requires [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)] instance enabled by Azure Arc. For details, review [Connect to Azure Storage with managed identity from PolyBase](../../relational-databases/polybase/managed-identity.md).
<sup>2</sup> Must be a [database scoped credential](create-database-scoped-credential-transact-sql.md), where the `IDENTITY` is hard-coded to `IDENTITY = 'S3 Access Key'` and the `SECRET` argument is in the format `SECRET = '<AccessKeyID>:<SecretKeyID>'`, or use pass-through (STS) authorization. For more information, see [Configure PolyBase to access external data in S3-compatible object storage](../../relational-databases/polybase/polybase-configure-s3-compatible.md).
Location path:
| Location path | Description |
| --- | --- |
| `<port>` | The port that the external data source is listening on. Optional in many cases, depending on the network configuration. |
| `<container_name>` | The container of the storage account holding the data. Root containers are read-only; data can't be written back to the container. |
| `<storage_account>` | The storage account name of the Azure resource. |
| `<server_name>` | The host name. |
| `<instance_name>` | The name of the SQL Server named instance. Used if you have SQL Server Browser Service running on the target instance. |
| `<ip_address>:<port>` <sup>1</sup> | For S3-compatible object storage only, the endpoint and port used to connect to the S3-compatible storage. |
| `<bucket_name>` <sup>1</sup> | For S3-compatible object storage only, specific to the storage platform. |
| `<region>` <sup>1</sup> | For S3-compatible object storage only, specific to the storage platform. |
| `<key>` | Part of the storage path within the storage URL. |
<sup>1</sup> [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
Additional notes and guidance when setting the location:
- The [!INCLUDE [ssDEnoversion](../../includes/ssdenoversion-md.md)] doesn't verify the existence of the external data source when the object is created. To validate, create an external table using the external data source.
- You can use the `sqlserver` connector to connect [!INCLUDE [sql-server-2019](../../includes/sssql19-md.md)] to another [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] or to [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)].
- Specify the `Driver={<Name of Driver>}` connection option when connecting via `ODBC`.
- The Hierarchical Namespace option for Azure Storage Accounts (V2) using the prefix `adls` is supported via Azure Data Lake Storage Gen2 in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
- SQL Server support for HDFS Cloudera (CDP) and Hortonworks (HDP) external data sources is retired and not included in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions. There's no need to use the `TYPE` argument in [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)].
- For more information on S3-compatible object storage and PolyBase in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions, see [Configure PolyBase to access external data in S3-compatible object storage](../../relational-databases/polybase/polybase-configure-s3-compatible.md). For an example of querying a parquet file within S3-compatible object storage, see [Virtualize parquet file in a S3-compatible object storage with PolyBase](../../relational-databases/polybase/polybase-virtualize-parquet-file.md).
In [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions:
- The prefix used for Azure Storage Account (v2) changed from `wasb[s]` to `abs`.
- The prefix used for Azure Data Lake Storage Gen2 changed from `abfs[s]` to `adls`.
- For an example using PolyBase to virtualize a CSV file in Azure Storage, see [Virtualize CSV file with PolyBase](../../relational-databases/polybase/virtualize-csv.md).
- For an example using PolyBase to virtualize a delta table in ADLS Gen2, see [Virtualize delta table with PolyBase](../../relational-databases/polybase/virtualize-delta.md).
- [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions fully support two URL formats for both Azure Storage Account v2 (`abs`) and Azure Data Lake Gen2 (`adls`):
  - The `LOCATION` path can use the formats `<container>@<storage_account_name>..` (recommended) or `<storage_account_name>../<container>`. For example:
    - Azure Storage Account v2: `abs://<container>@<storage_account_name>.blob.core.windows.net` (recommended) or `abs://<storage_account_name>.blob.core.windows.net/<container>`.
    - Azure Data Lake Gen2: `adls://<container>@<storage_account_name>.dfs.core.windows.net` (recommended) or `adls://<storage_account_name>.dfs.core.windows.net/<container>`.
#### CONNECTION_OPTIONS = *key_value_pair*
**Applies to**: [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later versions.
Specifies additional options when connecting over `ODBC` to an external data source. To use multiple connection options, separate them with semi-colons.
Applies to generic `ODBC` connections, as well as built-in `ODBC` connectors for [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], Oracle, Teradata, MongoDB, and Azure Cosmos DB API for MongoDB.
The `key_value_pair` is the keyword and the value for a specific connection option. The available keywords and values depend on the external data source type. The name of the driver is required as a minimum, but there are other options, such as `APP='<your_application_name>'` or `ApplicationIntent=ReadOnly|ReadWrite`, that are also useful to set and can assist with troubleshooting.
Possible key value pairs are specific to the driver. For more information for each provider, see [CREATE EXTERNAL DATA SOURCE (Transact-SQL) CONNECTION_OPTIONS](create-external-data-source-connection-options.md).
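As an illustrative sketch, several key-value pairs can be combined in one `CONNECTION_OPTIONS` string, separated by semi-colons. The server, instance, database, and application names here are hypothetical placeholders, not values from this article:

```sql
-- Hypothetical sketch: multiple CONNECTION_OPTIONS key-value pairs in one string.
-- WINSQL2019\SQL2019, Sales, and MyReportingApp are placeholder names.
CREATE EXTERNAL DATA SOURCE SalesReadOnlySource
WITH (
    LOCATION = 'sqlserver://WINSQL2019',
    CONNECTION_OPTIONS = 'Server=%s\SQL2019;Database=Sales;ApplicationIntent=ReadOnly;APP=MyReportingApp',
    CREDENTIAL = SQLServerCredentials
);
```

Here `Server` names the instance, `Database` sets the target database, and `APP` labels the connection for easier troubleshooting on the remote server.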
Starting in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] Cumulative Update 2, additional keywords were introduced to support Oracle TNS files:
- The keyword `TNSNamesFile` specifies the filepath to the `tnsnames.ora` file located on the Oracle server.
- The keyword `ServerName` specifies the alias inside the `tnsnames.ora` file that's used to replace the host name and port.
**Encryption options in [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)]**
Starting in [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)], when using `sqlserver` as the data source, the Microsoft ODBC Driver version 18 for SQL Server is the default driver. The `Encryption` option is required (`Yes`, `No`, or `Strict`), and `TrustServerCertificate` is available (`Yes` or `No`). If `Encryption` isn't specified, the default behavior is `Encrypt=Yes;TrustServerCertificate=No;`, and requires a server certificate.
To connect using the TDS 8.0 protocol, strict mode (`Encrypt=Strict`) was added. In this mode, a trusted server certificate must be installed and is always verified (`TrustServerCertificate` is ignored). A new keyword, `HostnameInCertificate`, can be used to specify the expected host name found in the certificate if it differs from the specified server. `HostnameInCertificate` is usable in all encryption modes and also applies if the server-side **Force Encryption** option is enabled, which causes the driver to verify the certificate in **Optional** or **Mandatory** modes unless disabled by using `TrustServerCertificate`.
For more information about `Encryption` options, server certificates, and `TrustServerCertificate`, see [Features of the Microsoft ODBC Driver for SQL Server on Windows](../../connect/odbc/windows/features-of-the-microsoft-odbc-driver-for-sql-server-on-windows.md).
You should always use the latest driver. However, [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)] also supports Microsoft ODBC Driver version 17 for SQL Server for backward compatibility. For more information on how to change the driver version used by PolyBase, see [Change the SQL Server driver version for PolyBase](../../relational-databases/polybase/polybase-change-odbc-driver.md).
#### PUSHDOWN = ON | OFF
**Applies to**: [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later versions.
States whether computation can be pushed down to the external data source. Enabled by default.
`PUSHDOWN` is supported when connecting to [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], Oracle, Teradata, MongoDB, the Azure Cosmos DB API for MongoDB, or ODBC at the external data source level.
Enabling or disabling push-down at the query level is achieved through a [hint](../../relational-databases/polybase/polybase-pushdown-computation.md#force-pushdown).
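For instance, the following sketch disables pushdown for a single query by using the `DISABLE EXTERNALPUSHDOWN` hint; the external table name is hypothetical:

```sql
-- Hypothetical external table; the hint applies to this query only,
-- overriding the data source's PUSHDOWN setting.
SELECT COUNT(*) AS row_count
FROM dbo.MyExternalTable
OPTION (DISABLE EXTERNALPUSHDOWN);
```

`OPTION (FORCE EXTERNALPUSHDOWN)` is the corresponding hint to require pushdown for a query.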
#### CREDENTIAL = *credential_name*
Specifies a database-scoped credential for authenticating to the external data source.
Additional notes and guidance when creating a credential:
- `CREDENTIAL` is only required if the data has been secured. `CREDENTIAL` isn't required for data sets that allow anonymous access.
- When accessing Azure Storage Account (V2) or Azure Data Lake Storage Gen2, the `IDENTITY` must be `SHARED ACCESS SIGNATURE`.
- For an example, see [Create an external data source to execute bulk operations and retrieve data from Azure Storage into SQL Database](#g-create-external-data-source-to-access-data-in-azure-data-lake-gen2).
There are multiple ways to create a shared access signature:
- You can create a SAS token by navigating to the **Azure portal** > **\<Your_Storage_Account\>** > **Shared access signature** > **Configure permissions** > **Generate SAS and connection string**. For more information, see [Generate a shared access signature](/azure/storage/blobs/blob-containers-portal#generate-a-shared-access-signature).
- You can [create and configure a SAS with Azure Storage Explorer](/azure/vs-azure-tools-storage-explorer-blobs#get-the-sas-for-a-blob-container).
- You can create a SAS token programmatically via PowerShell, Azure CLI, .NET, and REST API. For more information, see [Grant limited access to Azure Storage resources using shared access signatures (SAS)](/azure/storage/common/storage-sas-overview?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json).
- The SAS token should be configured as follows:
- When a SAS token is generated, it includes a question mark ('?') at the beginning of the token. Exclude the leading `?` when configured as the `SECRET`.
- Use a valid expiration period (all dates are in UTC time).
- Grant at least read permission on the file that should be loaded (for example `srt=o&sp=r`). Multiple shared access signatures can be created for different use cases. Permissions should be granted as follows:
| Action | Permission |
| --- | --- |
| Read data from a file | Read |
| Read data from multiple files and subfolders | Read and List |
| Use Create External Table as Select (CETAS) | Read, Create, List and Write |
- For Azure Blob Storage and Azure Data Lake Gen 2:
- Allowed services: `Blob` must be selected to generate the SAS token
- Allowed resource types: `Container` and `Object` must be selected to generate the SAS token
For an example of using a `CREDENTIAL` with S3-compatible object storage and PolyBase, see [Configure PolyBase to access external data in S3-compatible object storage](../../relational-databases/polybase/polybase-configure-s3-compatible.md).
To create a database scoped credential, see [CREATE DATABASE SCOPED CREDENTIAL](create-database-scoped-credential-transact-sql.md).
## Permissions
Requires `CONTROL` permission on database in [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)].
## Locking
Takes a shared lock on the `EXTERNAL DATA SOURCE` object.
## Security
PolyBase supports proxy-based authentication for most external data sources. Create a database scoped credential to create the proxy account.
## Upgrade to SQL Server 2025
In [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions, Hadoop external data sources aren't supported. You must manually re-create external data sources previously created with `TYPE = HADOOP`, and any external table that uses them.
You also need to configure external data sources to use the new connectors when connecting to Azure Storage.
| External data source | From | To |
| --- | --- | --- |
| Azure Blob Storage | `wasb[s]` | `abs` |
| ADLS Gen2 | `abfs[s]` | `adls` |
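As a sketch of this migration, an external data source that previously used the `wasbs` prefix can be dropped and re-created with the `abs` prefix. The data source, credential, container, and storage account names below are placeholders, and any external tables that depend on the data source must be dropped first:

```sql
-- Placeholder names throughout; drop dependent external tables before the data source.
DROP EXTERNAL DATA SOURCE LegacyBlobSource;
GO

-- Re-create with the abs prefix; TYPE = HADOOP is no longer specified.
CREATE EXTERNAL DATA SOURCE LegacyBlobSource
WITH (
    LOCATION = 'abs://mycontainer@mystorageaccount.blob.core.windows.net/',
    CREDENTIAL = AzureStorageSasCredential
);
```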
## Examples
> [!IMPORTANT]
> For information on how to install and enable PolyBase, see [Install PolyBase on Windows](../../relational-databases/polybase/polybase-installation.md).
### A. Create external data source in SQL Server to reference Oracle
To create an external data source that references Oracle, ensure you have a database scoped credential. You might optionally also enable or disable push-down of computation against this data source.
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
-- Create a database scoped credential with the Oracle username and password as the secret.
CREATE DATABASE SCOPED CREDENTIAL OracleProxyAccount
WITH IDENTITY = 'oracle_username',
SECRET = 'oracle_password';
CREATE EXTERNAL DATA SOURCE MyOracleServer
WITH (
LOCATION = 'oracle://145.145.145.145:1521',
PUSHDOWN = ON,
CREDENTIAL = OracleProxyAccount
);
```
Optionally, the external data source to Oracle can use proxy authentication to provide fine-grained access control. A proxy user can be configured to have limited access compared to the user being impersonated.
```sql
CREATE DATABASE SCOPED CREDENTIAL [OracleProxyCredential]
WITH IDENTITY = 'oracle_username',
SECRET = 'oracle_password';
CREATE EXTERNAL DATA SOURCE [OracleSalesSrvr]
WITH (
LOCATION = 'oracle://145.145.145.145:1521',
CONNECTION_OPTIONS = 'ImpersonateUser=%CURRENT_USER',
CREDENTIAL = [OracleProxyCredential]
);
```
Alternatively, you can authenticate using TNS.
Starting in [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] Cumulative Update 2, `CREATE EXTERNAL DATA SOURCE` now supports the use of TNS files when connecting to Oracle.
The `CONNECTION_OPTIONS` parameter was expanded and now uses `TNSNamesFile` and `ServerName` as variables to browse the `tnsnames.ora` file and establish connection with the server.
In the following example, at runtime [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] searches for the `tnsnames.ora` file at the location specified by `TNSNamesFile`, and looks up the host and network port specified by `ServerName`.
```sql
CREATE EXTERNAL DATA SOURCE [external_data_source_name]
WITH (
LOCATION = N'oracle://XE',
CREDENTIAL = [OracleCredentialTest],
CONNECTION_OPTIONS = N'TNSNamesFile=C:\Temp\tnsnames.ora;ServerName=XE'
);
```
### B. Create external data source to reference a SQL Server named instance via PolyBase connectivity
**Applies to:** [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later versions.
To create an external data source that references a named instance of [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], use `CONNECTION_OPTIONS` to specify the instance name.
First, create the database scoped credential, storing credentials for a SQL authenticated login. The SQL ODBC Connector for PolyBase only supports basic authentication. Before you create a database scoped credential, the database must have a master key to protect the credential. For more information, see [CREATE MASTER KEY](create-master-key-transact-sql.md). The following sample creates a database scoped credential; provide your own login and password.
```sql
CREATE DATABASE SCOPED CREDENTIAL SQLServerCredentials
WITH IDENTITY = 'username',
SECRET = 'password';
```
In the following example, `WINSQL2019` is the host name and `SQL2019` is the instance name. `'Server=%s\SQL2019'` is the key value pair.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019',
CONNECTION_OPTIONS = 'Server=%s\SQL2019',
CREDENTIAL = SQLServerCredentials
);
```
Alternatively, you can use a port to connect to a [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)] default instance.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019:58137',
CREDENTIAL = SQLServerCredentials
);
```
### C. Create external data source to reference a readable secondary replica of Always On availability group
**Applies to:** [!INCLUDE [sssql19-md](../../includes/sssql19-md.md)] and later versions.
To create an external data source that references a readable secondary replica of [!INCLUDE [ssNoVersion](../../includes/ssnoversion-md.md)], use `CONNECTION_OPTIONS` to specify `ApplicationIntent=ReadOnly`. In addition, you need to either set the availability database as `Database={dbname}` in `CONNECTION_OPTIONS`, or set the availability database as the default database of the login used for the database scoped credential. You must do this on all availability replicas of the availability group.
First, create the database scoped credential, storing credentials for a SQL authenticated login. The SQL ODBC Connector for PolyBase only supports basic authentication. Before you create a database scoped credential, the database must have a master key to protect the credential. For more information, see [CREATE MASTER KEY](create-master-key-transact-sql.md). The following sample creates a database scoped credential; provide your own login and password.
```sql
CREATE DATABASE SCOPED CREDENTIAL SQLServerCredentials
WITH IDENTITY = 'username',
SECRET = 'password';
```
Next, create the new external data source.
Whether you included `Database=dbname` in the `CONNECTION_OPTIONS` or set the availability database as the default database for the login in the database scoped credential, you must still provide the database name via a three-part name in the `CREATE EXTERNAL TABLE` statement, within the `LOCATION` parameter. For an example, see [CREATE EXTERNAL TABLE](create-external-table-transact-sql.md?view=sql-server-ver15&preserve-view=true#g-create-an-external-table-for-sql-server).
In the following example, `WINSQL2019AGL` is the availability group listener name and `dbname` is the name of the database to be the target of the `CREATE EXTERNAL TABLE` statement.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadOnly; Database=dbname',
CREDENTIAL = SQLServerCredentials
);
```
You can demonstrate the redirection behavior of the availability group by specifying `ApplicationIntent` and creating an external table on the system view `sys.servers`. In the following sample script, two external data sources are created, and one external table is created for each. Use the views to test which server is responding to the connection. Similar outcomes can also be achieved via the read-only routing feature. For more information, see [Configure read-only routing for an Always On availability group](../../database-engine/availability-groups/windows/configure-read-only-routing-for-an-availability-group-sql-server.md).
```sql
CREATE EXTERNAL DATA SOURCE [DataSource_SQLInstanceListener_ReadOnlyIntent]
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadOnly; Database=dbname',
CREDENTIAL = [SQLServerCredentials]
);
GO
CREATE EXTERNAL DATA SOURCE [DataSource_SQLInstanceListener_ReadWriteIntent]
WITH (
LOCATION = 'sqlserver://WINSQL2019AGL',
CONNECTION_OPTIONS = 'ApplicationIntent=ReadWrite',
CREDENTIAL = [SQLServerCredentials]
);
GO
```
Inside the database in the availability group, create a view to return `sys.servers` and the name of the local instance, which helps you identify which replica is responding to the query. For more information, see [sys.servers](../../relational-databases/system-catalog-views/sys-servers-transact-sql.md).
```sql
CREATE VIEW vw_sys_servers AS
SELECT [name]
FROM sys.servers
WHERE server_id = 0;
GO
```
Then, create an external table on the source instance:
```sql
CREATE EXTERNAL TABLE vw_sys_servers_ro
(
name SYSNAME NOT NULL
)
WITH (
DATA_SOURCE = [DataSource_SQLInstanceListener_ReadOnlyIntent],
LOCATION = N'dbname.dbo.vw_sys_servers'
);
GO
CREATE EXTERNAL TABLE vw_sys_servers_rw
(
name SYSNAME NOT NULL
)
WITH (
DATA_SOURCE = [DataSource_SQLInstanceListener_ReadWriteIntent],
LOCATION = N'dbname.dbo.vw_sys_servers'
);
GO
SELECT [name]
FROM dbo.vw_sys_servers_ro;
--should return secondary replica instance
SELECT [name]
FROM dbo.vw_sys_servers_rw;
--should return primary replica instance
GO
```
### D. Create external data source to query a parquet file in S3-compatible object storage via PolyBase
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
The following sample script creates an external data source `s3_ds` in the source user database in SQL Server. The external data source references the `s3_dc` database scoped credential.
```sql
CREATE DATABASE SCOPED CREDENTIAL s3_dc
WITH IDENTITY = 'S3 Access Key', -- for S3-compatible object storage the identity must always be S3 Access Key
SECRET = '<AccessKeyID>:<SecretKeyID>'; -- provided by the S3-compatible object storage
GO
CREATE EXTERNAL DATA SOURCE s3_ds
WITH (
LOCATION = 's3://<ip_address>:<port>/',
CREDENTIAL = s3_dc
);
GO
```
Verify the new external data source with [sys.external_data_sources](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md).
```sql
SELECT *
FROM sys.external_data_sources;
```
Then, the following example demonstrates using T-SQL to query a parquet file stored in S3-compatible object storage via an `OPENROWSET` query. For more information, see [Virtualize parquet file in a S3-compatible object storage with PolyBase](../../relational-databases/polybase/polybase-virtualize-parquet-file.md).
```sql
SELECT *
FROM OPENROWSET (
BULK '/<bucket>/<parquet_file_path>',
FORMAT = 'PARQUET',
DATA_SOURCE = 's3_ds'
) AS [cc];
```
### E. Create external data source using generic ODBC to PostgreSQL
As in previous examples, first create a database master key and database scoped credential. The database scoped credential will be used for the external data source. This example also assumes that a generic ODBC data provider for PostgreSQL is installed on the server.
In this example, the generic ODBC data provider is used to connect to a PostgreSQL database server in the same network, where the fully qualified domain name of the PostgreSQL server is `POSTGRES1`, using the default port of TCP 5432.
```sql
CREATE EXTERNAL DATA SOURCE POSTGRES1
WITH (
LOCATION = 'odbc://POSTGRES1.domain:5432',
CONNECTION_OPTIONS = 'Driver={PostgreSQL Unicode(x64)};',
CREDENTIAL = postgres_credential
);
```
### Azure Storage
#### Create a shared access signature
For both Azure Blob Storage and Azure Data Lake Gen2, the supported authentication method is shared access signature (SAS). One simple way to generate a shared access signature token is to follow these steps. For more information, see [CREDENTIAL](#credential--credential_name-3).
1. Navigate to the Azure portal, and the desired Storage Account.
1. Navigate to your desired Container under **Data Storage** menu.
1. Select **Shared access tokens**.
1. Choose the appropriate permission based on the desired action. For reference, use the following table:
| Action | Permission |
| --- | --- |
| Read data from a file | Read |
| Read data from multiple files and subfolders | Read and List |
| Use Create External Table as Select (CETAS) | Read, Create and Write |
1. Choose the token expiration date.
1. Generate SAS token and URL.
1. Copy the SAS token.
### F. Create external data source to access data in Azure Blob Storage using the abs:// interface
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
Use the new prefix `abs` for Azure Storage Account v2. The `abs` prefix supports authentication using `SHARED ACCESS SIGNATURE` and replaces `wasb`, used in previous versions. Hadoop is no longer supported, so there's no need to use `TYPE = BLOB_STORAGE`.
The Azure storage account key is no longer needed; instead, use a SAS token, as shown in the following example:
```sql
-- Create a database master key if one does not already exist, using your own password.
-- This key is used to encrypt the credential secret in next step.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<password>';
GO
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredentialv2
WITH IDENTITY = 'SHARED ACCESS SIGNATURE', -- to use SAS the identity must be fixed as-is
SECRET = '<SAS_token>';
GO
-- Create an external data source with CREDENTIAL option.
CREATE EXTERNAL DATA SOURCE MyAzureStorage
WITH (
LOCATION = 'abs://<container>@<storage_account_name>.blob.core.windows.net/',
CREDENTIAL = AzureStorageCredentialv2
);
```
For a more detailed example on how to access CSV files stored in Azure Blob Storage, see [Virtualize CSV file with PolyBase](../../relational-databases/polybase/virtualize-csv.md).
### G. Create external data source to access data in Azure Data Lake Gen2
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
Use the new prefix `adls` for Azure Data Lake Gen2, which replaces `abfs` used in previous versions. The `adls` prefix also supports a SAS token as the authentication method, as shown in this example:
```sql
--Create a database scoped credential using SAS Token
CREATE DATABASE SCOPED CREDENTIAL datalakegen2
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '<SAS_token>';
GO
CREATE EXTERNAL DATA SOURCE data_lake_gen2_dfs
WITH (
LOCATION = 'adls://<container>@<storage_account_name>.dfs.core.windows.net',
CREDENTIAL = datalakegen2
);
```
For a more detailed example on how to access delta files stored on Azure Data Lake Gen2, see [Virtualize delta table with PolyBase](../../relational-databases/polybase/virtualize-delta.md).
## Examples: Bulk Operations
> [!IMPORTANT]
> Don't add a trailing `/`, file name, or shared access signature parameters at the end of the `LOCATION` URL when configuring an external data source for bulk operations.
### H. Create an external data source for bulk operations retrieving data from Azure Storage
**Applies to:** [!INCLUDE [sssql22-md](../../includes/sssql22-md.md)] and later versions.
Use the following data source for bulk operations using [BULK INSERT](bulk-insert-transact-sql.md) or [OPENROWSET](../functions/openrowset-transact-sql.md). The credential must set `SHARED ACCESS SIGNATURE` as the identity, mustn't have the leading `?` in the SAS token, must have at least read permission on the file that should be loaded (for example `srt=o&sp=r`), and the expiration period should be valid (all dates are in UTC time). For more information on shared access signatures, see [Using Shared Access Signatures (SAS)](/azure/storage/storage-dotnet-shared-access-signature-part-1).
```sql
CREATE DATABASE SCOPED CREDENTIAL AccessAzureInvoices
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
-- Remove ? from the beginning of the SAS token
SECRET = '<SAS_token>';
CREATE EXTERNAL DATA SOURCE MyAzureInvoices
WITH (
LOCATION = 'abs://<container>@<storage_account_name>.blob.core.windows.net/',
CREDENTIAL = AccessAzureInvoices
);
```
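The data source can then be referenced from a bulk operation. The following sketch loads a CSV file through `MyAzureInvoices` with `BULK INSERT`; the target table and file name are hypothetical placeholders:

```sql
-- Hypothetical target table and file name; the file lives in the container
-- referenced by the MyAzureInvoices external data source.
BULK INSERT dbo.Invoices
FROM 'invoices-2023-01.csv'
WITH (
    DATA_SOURCE = 'MyAzureInvoices',
    FORMAT = 'CSV',
    FIRSTROW = 2 -- skip the header row
);
```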
### I. Create external data source using TDS 8.0 to connect with another SQL Server
**Applies to**: [!INCLUDE [sssql25-md](../../includes/sssql25-md.md)] and later versions.
When using the latest Microsoft ODBC Driver 18 for SQL Server, you must use the `Encryption` option under `CONNECTION_OPTIONS`, and `TrustServerCertificate` is also supported. If `Encryption` isn't specified, the default behavior is `Encrypt=Yes;TrustServerCertificate=No;`, and you require a server certificate.
In this example, SQL Authentication is used. To protect the credential, you need a database master key (DMK). For more information, see [CREATE MASTER KEY](create-master-key-transact-sql.md). The following sample creates a database scoped credential, with a custom login and password.
```sql
CREATE DATABASE SCOPED CREDENTIAL SQLServerCredentials
WITH IDENTITY = '<username>',
SECRET = '<password>';
```
The target server name is `WINSQL2022`, listening on port `58137` as a default instance. By specifying `Encrypt=Strict`, the connection uses TDS 8.0, and the server certificate is always verified. In this example, the `HostnameInCertificate` value is `WINSQL2022`:
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2022:58137',
CONNECTION_OPTIONS = 'Encrypt=Strict;HostnameInCertificate=WINSQL2022;',
CREDENTIAL = SQLServerCredentials
);
```
### J. Create external data source using encryption and TrustServerCertificate option
Following the previous example, here are two code samples. The first snippet has `Encryption` and `TrustServerCertificate` set.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2022:58137',
CONNECTION_OPTIONS = 'Encrypt=Yes;HostnameInCertificate=WINSQL2022;TrustServerCertificate=Yes;',
CREDENTIAL = SQLServerCredentials
);
```
The following snippet doesn't have `Encryption` enabled.
```sql
CREATE EXTERNAL DATA SOURCE SQLServerInstance2
WITH (
LOCATION = 'sqlserver://WINSQL2022:58137',
CONNECTION_OPTIONS = 'Encrypt=No;',
CREDENTIAL = SQLServerCredentials
);
```
## Related content
- [ALTER EXTERNAL DATA SOURCE (Transact-SQL)](alter-external-data-source-transact-sql.md)
- [CREATE DATABASE SCOPED CREDENTIAL (Transact-SQL)](create-database-scoped-credential-transact-sql.md)
- [CREATE EXTERNAL FILE FORMAT (Transact-SQL)](create-external-file-format-transact-sql.md)
- [CREATE EXTERNAL TABLE (Transact-SQL)](create-external-table-transact-sql.md)
- [sys.external_data_sources (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md)
- [Using Shared Access Signatures (SAS)](/azure/storage/storage-dotnet-shared-access-signature-part-1)
- [PolyBase connectivity configuration (Transact-SQL)](../../database-engine/configure-windows/polybase-connectivity-configuration-transact-sql.md)
::: moniker-end
::: moniker range="=azuresqldb-current||=azuresqledge-current"
:::row:::
:::column:::
[SQL Server](create-external-data-source-transact-sql.md?view=sql-server-ver15&preserve-view=true)
:::column-end:::
:::column:::
**_\* SQL Database \*_**
:::column-end:::
:::column:::
[SQL Managed
Instance](create-external-data-source-transact-sql.md?view=azuresqldb-mi-current&preserve-view=true)
:::column-end:::
:::column:::
[Azure Synapse
Analytics](create-external-data-source-transact-sql.md?view=azure-sqldw-latest&preserve-view=true)
:::column-end:::
:::column:::
[Analytics Platform
System (PDW)](create-external-data-source-transact-sql.md?view=aps-pdw-2016-au7&preserve-view=true)
:::column-end:::
:::column:::
[Microsoft Fabric Data Warehouse](create-external-data-source-transact-sql.md?view=fabric&preserve-view=true)
:::column-end:::
:::column:::
[Microsoft Fabric SQL database](create-external-data-source-transact-sql.md?view=fabric-sqldb&preserve-view=true)
:::column-end:::
:::row-end:::
## Overview: Azure SQL Database
**Applies to**: [!INCLUDE [ssazure-sqldb](../../includes/ssazure-sqldb.md)]
Creates an external data source for elastic queries. External data sources are used to establish connectivity and support these primary use cases:
- [Data virtualization (preview)](/azure/azure-sql/database/data-virtualization-overview?view=azuresql-db&preserve-view=true)
- Bulk load operations using `BULK INSERT` or `OPENROWSET`
- Query remote SQL Database or Azure Synapse instances using SQL Database with [elastic query](/azure/azure-sql/database/elastic-query-getting-started-vertical?view=azuresql-db&preserve-view=true)
- Query a sharded SQL Database using [elastic query](/azure/azure-sql/database/elastic-query-getting-started?view=azuresql-db&preserve-view=true)
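As a sketch of the bulk-load use case above (the storage account, container, credential, file path, and target table names are all placeholder assumptions, and the database scoped credential must already hold a shared access signature):

```sql
-- Hypothetical BLOB_STORAGE data source for bulk operations.
-- MyAzureBlobStorageCredential is assumed to be a database scoped
-- credential whose SECRET is a SAS token for the container.
CREATE EXTERNAL DATA SOURCE MyAzureBlobStorage
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://myaccount.blob.core.windows.net/mycontainer',
    CREDENTIAL = MyAzureBlobStorageCredential
);

-- Load a CSV file from the container into an existing table.
BULK INSERT dbo.Product
FROM 'data/product.csv'
WITH (
    DATA_SOURCE = 'MyAzureBlobStorage',
    FORMAT = 'CSV',
    FIRSTROW = 2  -- skip the header row
);
```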
:::image type="icon" source="../../includes/media/topic-link-icon.svg" border="false"::: [Transact-SQL syntax conventions](../../t-sql/language-elements/transact-sql-syntax-conventions-transact-sql.md)
## Syntax
```syntaxsql
CREATE EXTERNAL DATA SOURCE <data_source_name>
WITH
( [ LOCATION = '<prefix>://<path>[:<port>]' ]
[ [ , ] CREDENTIAL = <credential_name> ]
[ [ , ] TYPE = { BLOB_STORAGE | RDBMS | SHARD_MAP_MANAGER } ]
[ [ , ] DATABASE_NAME = '<database_name>' ]
[ [ , ] SHARD_MAP_NAME = '<shard_map_name>' ] )
[ ; ]
```
## Arguments
#### data_source_name
Specifies the user-defined name for the data source. The name must be unique within the database in SQL Database.
#### LOCATION = '*\<prefix\>://\<path\>*'
Provides the connectivity protocol and path to the external data source.
| External Data Source | Connector location prefix | Location path | Availability |
| --- | --- | --- | --- |
| Bulk Operations | `https` | `<storage_account>.blob.core.windows.net/<container>` | |
| Elastic Query (shard) | Not required | `<shard_map_server_name>.database.windows.net` | |
| Elastic Query (remote) | Not required | `<remote_server_name>.database.windows.net` | |
| EdgeHub | `edgehub` | `edgehub://` | Available in [Azure SQL Edge](/azure/azure-sql-edge/overview) *only*. EdgeHub is always local to the instance of [Azure SQL Edge](/azure/azure-sql-edge/overview). As such there's no need to specify a path or port value. |
| Kafka | `kafka` | `kafka://: