--- title: "PolyBase troubleshooting | Microsoft Docs" ms.custom: - "SQL2016_New_Updated" ms.date: "8/29/2017" ms.prod: "sql-server-2016" ms.reviewer: "" ms.suite: "" ms.technology: - "database-engine-polybase" ms.tgt_pltfrm: "" ms.topic: "article" f1_keywords: - "PolyBase, monitoring" - "PolyBase, performance monitoring" helpviewer_keywords: - "PolyBase, troubleshooting" ms.assetid: f119e819-c3ae-4e0b-a955-3948388a9cfe caps.latest.revision: 22 author: "barbkess" ms.author: "barbkess" manager: "jhubbard" --- # PolyBase troubleshooting [!INCLUDE[tsql-appliesto-ss2016-xxxx-xxxx-xxx_md](../../includes/tsql-appliesto-ss2016-xxxx-xxxx-xxx-md.md)] To troubleshoot PolyBase, use the techniques found in this topic. ## Catalog views Use the catalog views listed here to manage PolyBase operations. ||| |-|-| |View|Description| |[sys.external_tables (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-tables-transact-sql.md)|Identifies external tables.| |[sys.external_data_sources (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-data-sources-transact-sql.md)|Identifies external data sources.| |[sys.external_file_formats (Transact-SQL)](../../relational-databases/system-catalog-views/sys-external-file-formats-transact-sql.md)|Identifies external file formats.| ## Dynamic Management Views ||| |-|-| |[sys.dm_exec_compute_node_errors (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-compute-node-errors-transact-sql.md)|[sys.dm_exec_compute_node_status (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-compute-node-status-transact-sql.md)| |[sys.dm_exec_compute_nodes (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-compute-nodes-transact-sql.md)|[sys.dm_exec_distributed_request_steps (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-distributed-request-steps-transact-sql.md)| |[sys.dm_exec_distributed_requests (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-distributed-requests-transact-sql.md)|[sys.dm_exec_distributed_sql_requests (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-distributed-sql-requests-transact-sql.md)| |[sys.dm_exec_dms_services (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-dms-services-transact-sql.md)|[sys.dm_exec_dms_workers (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-dms-workers-transact-sql.md)| |[sys.dm_exec_external_operations (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-external-operations-transact-sql.md)|[sys.dm_exec_external_work (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-external-work-transact-sql.md)| PolyBase queries are broken into a series of steps within sys.dm_exec_distributed_request_steps. The following table provides a mapping from the step name to the associated DMV. |PolyBase Step|Associated DMV| |-|-| |HadoopJobOperation | sys.dm_exec_external_operations| |RandomIdOperation | sys.dm_exec_distributed_request_steps| |HadoopRoundRobinOperation | sys.dm_exec_dms_workers| |StreamingReturnOperation | sys.dm_exec_dms_workers| |OnOperation | sys.dm_exec_distributed_sql_requests | ## To monitor PolyBase queries using DMVs Monitor and troubleshoot PolyBase queries using the following DMVs. 1. **Find the longest running queries** Record the execution ID of the longest running query. ```tsql -- Find the longest running query SELECT execution_id, st.text, dr.total_elapsed_time FROM sys.dm_exec_distributed_requests dr cross apply sys.dm_exec_sql_text(sql_handle) st ORDER BY total_elapsed_time DESC; ``` 2. **Find the longest running step of the distributed query** Use the execution ID recorded in the previous step. Record the step index of the longest running step. Check the location_type of longest running step: - Head or Compute: implies a SQL operation. Proceed with Step 3a. - DMS: implies a PolyBase Data Movement Service operation. Proceed with Step 3b. ```tsql -- Find the longest running step of the distributed query plan SELECT execution_id, step_index, operation_type, distribution_type, location_type, status, total_elapsed_time, command FROM sys.dm_exec_distributed_request_steps WHERE execution_id = 'QID4547' ORDER BY total_elapsed_time DESC; ``` 3. **Find the execution progress of the longest running step** 1. **Find the execution progress of a SQL step** Use the execution ID and step index recorded in the previous steps. Use the execution ID and step index recorded in the previous steps. ```tsql -- Find the execution progress of SQL step SELECT execution_id, step_index, distribution_id, status, total_elapsed_time, row_count, command FROM sys.dm_exec_distributed_sql_requests WHERE execution_id = 'QID4547' and step_index = 1; ``` 2. **Find the execution progress of a DMS step** Use the execution ID and step index recorded in the previous steps. ```tsql -- Find the execution progress of DMS step SELECT execution_id, step_index, dms_step_index, status, type, bytes_processed, total_elapsed_time FROM sys.dm_exec_dms_workers WHERE execution_id = 'QID4547' ORDER BY total_elapsed_time DESC; ``` 4. **Find the information about external DMS operations** Use the execution ID and step index recorded in the previous steps. ```tsql SELECT execution_id, step_index, dms_step_index, compute_node_id, type, input_name, length, total_elapsed_time, status FROM sys.dm_exec_external_work WHERE execution_id = 'QID4547' and step_index = 7 ORDER BY total_elapsed_time DESC; ``` ## To view the PolyBase query plan 1. In SSMS, enable **Include Actual Execution Plan** (Ctrl + M) and run the query. 2. Click the **Execution plan** tab. ![PolyBase query plan](../../relational-databases/polybase/media/polybase-query-plan.png "PolyBase query plan") 3. Right-click on the **Remote Query operator** and select **Properties**. 4. Copy and paste the Remote Query value into a text editor to view the XML remote query plan. An example is shown below. ```xml ExecuteMemo explain query TEMP_ID_74 CREATE TABLE [tempdb].[dbo].[TEMP_ID_74] ([SensorKey] INT NOT NULL, [CustomerKey] INT NOT NULL, [GeographyKey] INT, [Speed] FLOAT(53) NOT NULL, [YearMeasured] INT NOT NULL ) WITH(DATA_COMPRESSION=PAGE); EXEC [tempdb].[sys].[sp_addextendedproperty] @name=N'IS_EXTERNAL_STREAMING_TABLE', @value=N'true', @level0type=N'SCHEMA', @level0name=N'dbo', @level1type=N'TABLE', @level1name=N'TEMP_ID_74' UPDATE STATISTICS [tempdb].[dbo].[TEMP_ID_74] WITH ROWCOUNT = 2401, PAGECOUNT = 7 hdfs://10.193.26.177:8020/Demo/car_sensordata.tbl/ [TEMP_ID_74] DROP TABLE [tempdb].[dbo].[TEMP_ID_74] ``` ## To monitor nodes in a PolyBase group After configuring a set of machines as part of a PolyBase scale out group, you can monitor the status of the machines. For details on creating a scale out group, see [PolyBase scale-out groups](../../relational-databases/polybase/polybase-scale-out-groups.md). 1. Connect to SQL Server on the head node of a group. 2. Run the DMV [sys.dm_exec_compute_nodes (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-compute-nodes-transact-sql.md) to view all the nodes in the PolyBase Group. 3. Run the DMV [sys.dm_exec_compute_node_status (Transact-SQL)](../../relational-databases/system-dynamic-management-views/sys-dm-exec-compute-node-status-transact-sql.md) to view the status of all the nodes in the PolyBase Group. ## Known Limitations PolyBase has the following limitations: - The maximum possible row size, including the full length of variable length columns, can not exceed 1 MB. - PolyBase doesn’t support the Hive 0.12+ data types (i.e. Char(), VarChar()) - When exporting data into an ORC File Format from SQL Server or Azure SQL Data Warehouse text heavy columns can be limited to as few as 50 columns due to java out of memory errors. To work around this, export only a subset of the columns. - Cannot Read or Write data encrypted at rest in Hadoop. This includes HDFS Encrypted Zones or Transparent Encryption. - PolyBase cannot connect to a Hortonworks instance if KNOX is enabled. - PolyBase cannot connect to Hadoop instance if hadoop.RPC.Protection setting is set to anything other than "authenticate". [PolyBase doesn't install when you add a node to a SQL Server 2016 Failover Cluster](https://support.microsoft.com/en-us/help/3173087/fix-polybase-feature-doesn-t-install-when-you-add-a-node-to-a-sql-server-2016-failover-cluster) ## Hadoop Name Node High Availability PolyBase does not interface with Name Node HA services like Zookeeper or Knox today. However, there is a proven workaround that can be used to provide the functionality. Work Around: Use DNS name to reroute connections to the active Name Node. In order to do this, you will need to ensure that the External Data Source is using a DNS name to communicate with the Name Node. When Name Node Failover occurs, you will need to change the IP address associated with the DNS name used in the External Data Source definition. This will reroute all new connections to the correct Name Node. Existing connections will fail when failover occurs. To automate this process, a "heartbeat" can ping the active Name Node. If the heart beat fails, one can assume a failover has occured and automatically switch to the secondaries IP address. ## Error messages and possible solutions To troubleshoot external table errors, see Murshed Zaman's blog [https://blogs.msdn.microsoft.com/sqlcat/2016/06/21/polybase-setup-errors-and-possible-solutions/](https://blogs.msdn.microsoft.com/sqlcat/2016/06/21/polybase-setup-errors-and-possible-solutions/ "PolyBase setup errors and possible solutions"). ## See also [Troubleshoot PolyBase Kerberos connectivity](polybase-troubleshoot-connectivity.md)