---
title: "Configure PolyBase Hadoop security"
description: Explains how to configure PolyBase in Parallel Data Warehouse to connect to external Hadoop.
author: mzaman1
ms.prod: sql
ms.technology: data-warehouse
ms.topic: conceptual
ms.date: 10/26/2018
ms.author: murshedz
ms.reviewer: martinle
ms.custom: seo-dt-2019
---
# PolyBase configuration and security for Hadoop
This article provides a reference for the configuration settings that affect PolyBase connectivity from APS to an external Hadoop cluster. For an overview of PolyBase, see [What is PolyBase](configure-polybase-connectivity-to-external-data.md).
> [!NOTE]
> On APS, changes to XML files must be made on the control node and on all compute nodes.
>
> Take special care when modifying XML files in APS. Missing tags or unwanted characters can invalidate the XML file and prevent the feature from working.
> Hadoop configuration files are located in the following path:
> ```
> C:\Program Files\Microsoft SQL Server Parallel Data Warehouse\100\Hadoop\conf
> ```
> Any changes to the XML files require a service restart to take effect.
## Hadoop.RPC.Protection setting
A common way to secure communication in a Hadoop cluster is to set the hadoop.rpc.protection configuration to 'Privacy' or 'Integrity'. By default, PolyBase assumes the configuration is set to 'Authentication'. To override this default, add the following property to the core-site.xml file. Changing this configuration enables secure data transfer among the Hadoop nodes and an SSL connection to SQL Server.
```xml
<property>
  <name>hadoop.rpc.protection</name>
  <!-- 'Privacy' or 'Integrity'; overrides the default 'Authentication' -->
  <value>Privacy</value>
</property>
```
## Kerberos configuration
Note that when PolyBase authenticates to a Kerberos-secured cluster, it expects the hadoop.rpc.protection setting to be 'Authentication', the default. This leaves data communication between Hadoop nodes unencrypted. To use the 'Privacy' or 'Integrity' setting for hadoop.rpc.protection, update the core-site.xml file on the PolyBase server. For more information, see the previous section, [Hadoop.RPC.Protection setting](#hadooprpcprotection-setting).
To connect to a Kerberos-secured Hadoop cluster that uses MIT KDC, make the following changes on the control node and on all APS compute nodes:
1. Find the Hadoop configuration directories in the installation path of APS. Typically, the path is:
```
C:\Program Files\Microsoft SQL Server Parallel Data Warehouse\100\Hadoop\conf
```
2. Find the Hadoop side configuration value of the configuration keys listed in the table. (On the Hadoop machine, find the files in the Hadoop configuration directory.)
3. Copy the configuration values into the value property in the corresponding files on the SQL Server machine.
|**#**|**Configuration file**|**Configuration key**|**Action**|
|------------|----------------|---------------------|----------|
|1|core-site.xml|polybase.kerberos.kdchost|Specify the KDC hostname. For example: kerberos.your-realm.com.|
|2|core-site.xml|polybase.kerberos.realm|Specify the Kerberos realm. For example: YOUR-REALM.COM|
|3|core-site.xml|hadoop.security.authentication|Find the Hadoop side configuration and copy to SQL Server machine. For example: KERBEROS. **Security note:** KERBEROS must be written in upper case; if written in lower case, it might not be recognized.|
|4|hdfs-site.xml|dfs.namenode.kerberos.principal|Find the Hadoop side configuration and copy to SQL Server machine. For example: hdfs/_HOST@YOUR-REALM.COM|
|5|mapred-site.xml|mapreduce.jobhistory.principal|Find the Hadoop side configuration and copy to SQL Server machine. For example: mapred/_HOST@YOUR-REALM.COM|
|6|mapred-site.xml|mapreduce.jobhistory.address|Find the Hadoop side configuration and copy to SQL Server machine. For example: 10.193.26.174:10020|
|7|yarn-site.xml|yarn.resourcemanager.principal|Find the Hadoop side configuration and copy to SQL Server machine. For example: yarn/_HOST@YOUR-REALM.COM|
**core-site.xml**
```xml
<property>
  <name>polybase.kerberos.realm</name>
  <value>YOUR-REALM.COM</value>
</property>
<property>
  <name>polybase.kerberos.kdchost</name>
  <value>kerberos.your-realm.com</value>
</property>
<property>
  <name>hadoop.security.authentication</name>
  <value>KERBEROS</value>
</property>
```
**hdfs-site.xml**
```xml
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>hdfs/_HOST@YOUR-REALM.COM</value>
</property>
```
**mapred-site.xml**
```xml
<property>
  <name>mapreduce.jobhistory.principal</name>
  <value>mapred/_HOST@YOUR-REALM.COM</value>
</property>
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>10.193.26.174:10020</value>
</property>
```
**yarn-site.xml**
```xml
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>yarn/_HOST@YOUR-REALM.COM</value>
</property>
```
4. Create a database-scoped credential object to specify the authentication information for each Hadoop user. See [PolyBase T-SQL objects](../relational-databases/polybase/polybase-t-sql-objects.md).
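As a sketch of step 4, the T-SQL below creates a database-scoped credential and a Hadoop external data source that uses it. The credential, user, and cluster names here are hypothetical; for a Kerberos-secured cluster, IDENTITY is the Kerberos user name and SECRET is its password.

```sql
-- A database master key must already exist in the database.
CREATE DATABASE SCOPED CREDENTIAL HadoopKerberosUser
WITH IDENTITY = 'pdw_user@YOUR-REALM.COM', SECRET = '<kerberos_password>';

-- Hypothetical NameNode address; 8020 is a common NameNode RPC port.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://namenode.your-realm.com:8020',
    CREDENTIAL = HadoopKerberosUser
);
```

External tables created against this data source then authenticate with the Kerberos identity stored in the credential.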
## Hadoop Encryption Zone setup
If you are using a Hadoop encryption zone, modify core-site.xml and hdfs-site.xml as follows. Provide the IP address of the machine where the KMS service is running, along with the corresponding port number. The default KMS port on CDH is 16000.
**core-site.xml**
```xml
<property>
  <name>hadoop.security.key.provider.path</name>
  <!-- Replace [KMS IP] with the IP address of the KMS host -->
  <value>kms://http@[KMS IP]:16000/kms</value>
</property>
```
**hdfs-site.xml**
```xml
<property>
  <name>dfs.encryption.key.provider.uri</name>
  <!-- Replace [KMS IP] with the IP address of the KMS host -->
  <value>kms://http@[KMS IP]:16000/kms</value>
</property>
<property>
  <name>hadoop.security.key.provider.path</name>
  <value>kms://http@[KMS IP]:16000/kms</value>
</property>
```