
Commit c60ac15

Merge branch 'master' of https://github.com/MicrosoftDocs/sql-docs-pr into 10-06-prep-articles-for-mi
2 parents fd0f79a + 89ea9b3 commit c60ac15

322 files changed

Lines changed: 1122 additions & 1131 deletions

Only a subset of the 322 changed files is shown below.

docs/azdata/reference/reference-azdata-bdc-spark-batch.md

Lines changed: 6 additions & 6 deletions
@@ -62,7 +62,7 @@ azdata bdc spark batch create --file -f
 ### Examples
 Create a new Spark batch.
 ```bash
-azdata spark batch create --code "2+2"
+azdata bdc spark batch create --code "2+2"
 ```
 ### Required Parameters
 #### `--file -f`
@@ -115,7 +115,7 @@ azdata bdc spark batch list
 ### Examples
 List all the active batches.
 ```bash
-azdata spark batch list
+azdata bdc spark batch list
 ```
 ### Global Arguments
 #### `--debug`
@@ -137,7 +137,7 @@ azdata bdc spark batch info --batch-id -i
 ### Examples
 Get batch info for batch with ID of 0.
 ```bash
-azdata spark batch info --batch-id 0
+azdata bdc spark batch info --batch-id 0
 ```
 ### Required Parameters
 #### `--batch-id -i`
@@ -162,7 +162,7 @@ azdata bdc spark batch log --batch-id -i
 ### Examples
 Get batch log for batch with ID of 0.
 ```bash
-azdata spark batch log --batch-id 0
+azdata bdc spark batch log --batch-id 0
 ```
 ### Required Parameters
 #### `--batch-id -i`
@@ -187,7 +187,7 @@ azdata bdc spark batch state --batch-id -i
 ### Examples
 Get batch state for batch with ID of 0.
 ```bash
-azdata spark batch state --batch-id 0
+azdata bdc spark batch state --batch-id 0
 ```
 ### Required Parameters
 #### `--batch-id -i`
@@ -212,7 +212,7 @@ azdata bdc spark batch delete --batch-id -i
 ### Examples
 Delete a batch.
 ```bash
-azdata spark batch delete --batch-id 0
+azdata bdc spark batch delete --batch-id 0
 ```
 ### Required Parameters
 #### `--batch-id -i`
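
Taken together, the corrections in this file add the `bdc` command group to every Spark batch example. A minimal sketch of the corrected end-to-end batch workflow, assuming an authenticated `azdata login` session against a deployed big data cluster (the batch ID `0` is illustrative):

```bash
# Submit a trivial Spark batch, then inspect it and clean it up.
azdata bdc spark batch create --code "2+2"    # returns a batch ID, assumed to be 0 below
azdata bdc spark batch list                   # list all active batches
azdata bdc spark batch info --batch-id 0      # metadata for batch 0
azdata bdc spark batch log --batch-id 0       # driver log for batch 0
azdata bdc spark batch state --batch-id 0     # current state of batch 0
azdata bdc spark batch delete --batch-id 0    # delete batch 0
```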

docs/azdata/reference/reference-azdata-bdc-spark-session.md

Lines changed: 6 additions & 6 deletions
@@ -60,7 +60,7 @@ azdata bdc spark session create [--session-kind -k]
 ### Examples
 Create a session.
 ```bash
-azdata spark session create --session-kind pyspark
+azdata bdc spark session create --session-kind pyspark
 ```
 ### Optional Parameters
 #### `--session-kind -k`
@@ -110,7 +110,7 @@ azdata bdc spark session list
 ### Examples
 List all the active sessions.
 ```bash
-azdata spark session list
+azdata bdc spark session list
 ```
 ### Global Arguments
 #### `--debug`
@@ -132,7 +132,7 @@ azdata bdc spark session info --session-id -i
 ### Examples
 Get session info for session with ID of 0.
 ```bash
-azdata spark session info --session-id 0
+azdata bdc spark session info --session-id 0
 ```
 ### Required Parameters
 #### `--session-id -i`
@@ -157,7 +157,7 @@ azdata bdc spark session log --session-id -i
 ### Examples
 Get session log for session with ID of 0.
 ```bash
-azdata spark session log --session-id 0
+azdata bdc spark session log --session-id 0
 ```
 ### Required Parameters
 #### `--session-id -i`
@@ -182,7 +182,7 @@ azdata bdc spark session state --session-id -i
 ### Examples
 Get session state for session with ID of 0.
 ```bash
-azdata spark session state --session-id 0
+azdata bdc spark session state --session-id 0
 ```
 ### Required Parameters
 #### `--session-id -i`
@@ -207,7 +207,7 @@ azdata bdc spark session delete --session-id -i
 ### Examples
 Delete a session.
 ```bash
-azdata spark session delete --session-id 0
+azdata bdc spark session delete --session-id 0
 ```
 ### Required Parameters
 #### `--session-id -i`
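
The session commands gain the same `bdc` prefix. A minimal sketch of the corrected session lifecycle, assuming an authenticated `azdata` connection (the session ID `0` is illustrative):

```bash
# Create a PySpark session, check on it, then tear it down.
azdata bdc spark session create --session-kind pyspark   # returns a session ID, assumed to be 0 below
azdata bdc spark session list                            # list all active sessions
azdata bdc spark session state --session-id 0            # current state of session 0
azdata bdc spark session delete --session-id 0           # delete session 0
```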

docs/azdata/reference/reference-azdata-bdc-spark-statement.md

Lines changed: 4 additions & 4 deletions
@@ -34,7 +34,7 @@ azdata bdc spark statement list --session-id -i
 ### Examples
 List all the session statements.
 ```bash
-azdata spark statement list --session-id 0
+azdata bdc spark statement list --session-id 0
 ```
 ### Required Parameters
 #### `--session-id -i`
@@ -59,7 +59,7 @@ azdata bdc spark statement create --session-id -i
 ### Examples
 Run a statement.
 ```bash
-azdata spark statement create --session-id 0 --code "2+2"
+azdata bdc spark statement create --session-id 0 --code "2+2"
 ```
 ### Required Parameters
 #### `--session-id -i`
@@ -86,7 +86,7 @@ azdata bdc spark statement info --session-id -i
 ### Examples
 Get statement info for session with ID of 0 and statement ID of 0.
 ```bash
-azdata spark statement info --session-id 0 --statement-id 0
+azdata bdc spark statement info --session-id 0 --statement-id 0
 ```
 ### Required Parameters
 #### `--session-id -i`
@@ -113,7 +113,7 @@ azdata bdc spark statement cancel --session-id -i
 ### Examples
 Cancel a statement.
 ```bash
-azdata spark statement cancel --session-id 0 --statement-id 0
+azdata bdc spark statement cancel --session-id 0 --statement-id 0
 ```
 ### Required Parameters
 #### `--session-id -i`
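
Statements run inside an existing session, so the corrected statement commands combine naturally with the session commands above. A minimal sketch, assuming session `0` was created with `azdata bdc spark session create` (the statement ID `0` is illustrative):

```bash
# Run a statement in session 0, inspect it, and cancel it if it is still running.
azdata bdc spark statement create --session-id 0 --code "2+2"     # returns a statement ID, assumed to be 0 below
azdata bdc spark statement list --session-id 0                    # list all statements in session 0
azdata bdc spark statement info --session-id 0 --statement-id 0   # result and status of statement 0
azdata bdc spark statement cancel --session-id 0 --statement-id 0 # cancel statement 0
```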

docs/big-data-cluster/concept-compute-pool.md

Lines changed: 30 additions & 5 deletions
@@ -5,30 +5,55 @@ description: This article describes the compute pool in a SQL Server 2019 big da
 author: MikeRayMSFT
 ms.author: mikeray
 ms.reviewer: mihaelab
-ms.date: 11/04/2019
+ms.date: 10/15/2020
 ms.topic: conceptual
 ms.prod: sql
 ms.technology: big-data-cluster
 ---
 
-# What are compute pools in a SQL Server big data cluster?
+# What are compute pools in SQL Server Big Data Clusters?
 
 [!INCLUDE[SQL Server 2019](../includes/applies-to-version/sqlserver2019.md)]
 
-This article describes the role of *SQL Server compute pools* in a SQL Server big data cluster. Compute pools provide scale-out computational resources for a big data cluster. The following sections describe the architecture and functionality of a compute pool.
+This article describes the role of *SQL Server compute pools* in SQL Server Big Data Clusters. Compute pools provide scale-out computational resources for a Big Data Cluster. They are used to offload computational work, or intermediate result sets, from the SQL Server master instance. The following sections describe the architecture, functionality, and usage scenarios of a compute pool.
 
 You can also watch this 5-minute video for an introduction into compute pools:
 
 > [!VIDEO https://channel9.msdn.com/Shows/Data-Exposed/Overview-Big-Data-Cluster-Compute-Pool/player?WT.mc_id=dataexposed-c9-niner]
 
-
 ## Compute pool architecture
 
 A compute pool is made of one or more compute pods running in Kubernetes. The automated creation and management of these pods is coordinated by the [SQL Server master instance](concept-master-instance.md). Each pod contains a set of base services and an instance of the SQL Server database engine.
 
+![Compute pool architecture](media/concept-compute-pool/compute-pool-architecture.png)
+
 ## Scale-out groups
 
-A compute pool can act as a PolyBase scale-out group for distributed queries over different data sources--such as HDFS, Oracle, MongoDB, or Teradata. By using compute pods in Kubernetes, big data clusters can automate creating and configuring compute pods for PolyBase scale-out groups.
+A compute pool can act as a PolyBase scale-out group for distributed queries over different external data sources such as SQL Server, Oracle, MongoDB, Teradata, and HDFS. By using compute pods in Kubernetes, Big Data Clusters can automate creating and configuring compute pods for PolyBase scale-out groups.
+
+## Compute pool scenarios
+
+Scenarios where the compute pool is used include:
+
+- When queries submitted to the master instance use one or more tables located in the [Storage Pool](concept-storage-pool.md).
+
+- When queries submitted to the master instance use one or more tables with round-robin distribution located in the [Data Pool](concept-data-pool.md).
+
+- When queries submitted to the master instance use **partitioned** tables with external data sources of SQL Server, Oracle, MongoDB, and Teradata. For this scenario, the query hint OPTION (FORCE SCALEOUTEXECUTION) must be enabled.
+
+- When queries submitted to the master instance use one or more tables located in [HDFS Tiering](hdfs-tiering.md).
+
+Scenarios where the compute pool is **not** used include:
+
+- When queries submitted to the master instance use one or more tables in an external Hadoop HDFS cluster.
+
+- When queries submitted to the master instance use one or more tables in Azure Blob Storage.
+
+- When queries submitted to the master instance use **non-partitioned** tables with external data sources of SQL Server, Oracle, MongoDB, and Teradata.
+
+- When the query hint OPTION (DISABLE SCALEOUTEXECUTION) is enabled.
+
+- When queries submitted to the master instance apply to databases located on the master instance.
 
 ## Next steps
 
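
The new architecture section describes compute pods that run in Kubernetes and are coordinated by the master instance. A minimal sketch for confirming those pods exist, assuming `kubectl` is configured against the cluster; the namespace name and the `compute` name prefix are assumptions, not part of the source article:

```bash
# List pods in the big data cluster namespace and filter for the compute pool.
# "mssql-cluster" and the "compute" prefix are illustrative assumptions.
kubectl get pods --namespace mssql-cluster | grep -i compute
```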

docs/big-data-cluster/data-ingestion-curl.md

Lines changed: 18 additions & 6 deletions
@@ -42,51 +42,63 @@ For example:
 
 `https://13.66.190.205:30443/gateway/default/webhdfs/v1/`
 
+## Authentication with Active Directory
+
+For deployments with Active Directory, use the `--anyauth` authentication parameter with `curl` so that Negotiate authentication can be used.
+
+To use `curl` with Active Directory authentication, run this command:
+
+```
+kinit <username>
+```
+
+The command generates a Kerberos token for `curl` to use. The commands demonstrated in the next sections specify the `--anyauth` parameter for `curl`. For URLs that require Negotiate authentication, `curl` automatically detects and uses the generated Kerberos token instead of a username and password to authenticate.
+
 ## List a file
 
 To list file under **hdfs:///product_review_data**, use the following curl command:
 
 ```terminal
-curl -i -k -u root:<AZDATA_PASSWORD> -X GET 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/?op=liststatus'
+curl -i -k --anyauth -u root:<AZDATA_PASSWORD> -X GET 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/?op=liststatus'
 ```
 
 [!INCLUDE [big-data-cluster-root-user](../includes/big-data-cluster-root-user.md)]
 
 For endpoints that do not use root, use the following curl command:
 
 ```terminal
-curl -i -k -u <AZDATA_USERNAME>:<AZDATA_PASSWORD> -X GET 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/?op=liststatus'
+curl -i -k --anyauth -u <AZDATA_USERNAME>:<AZDATA_PASSWORD> -X GET 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/?op=liststatus'
 ```
 
 ## Put a local file into HDFS
 
 To put a new file **test.csv** from local directory to product_review_data directory, use the following curl command (the **Content-Type** parameter is required):
 
 ```terminal
-curl -i -L -k -u root:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/test.csv?op=create' -H 'Content-Type: application/octet-stream' -T 'test.csv'
+curl -i -L -k --anyauth -u root:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/test.csv?op=create' -H 'Content-Type: application/octet-stream' -T 'test.csv'
 ```
 
 [!INCLUDE [big-data-cluster-root-user](../includes/big-data-cluster-root-user.md)]
 
 For endpoints that do not use root, use the following curl command:
 
 ```terminal
-curl -i -L -k -u <AZDATA_USERNAME>:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/test.csv?op=create' -H 'Content-Type: application/octet-stream' -T 'test.csv'
+curl -i -L -k --anyauth -u <AZDATA_USERNAME>:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/test.csv?op=create' -H 'Content-Type: application/octet-stream' -T 'test.csv'
 ```
 
 ## Create a directory
 
 To create a directory **test** under `hdfs:///`, use the following command:
 
 ```terminal
-curl -i -L -k -u root:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/test?op=MKDIRS'
+curl -i -L -k --anyauth -u root:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/test?op=MKDIRS'
 ```
 
 [!INCLUDE [big-data-cluster-root-user](../includes/big-data-cluster-root-user.md)]
 For endpoints that do not use root, use the following curl command:
 
 ```terminal
-curl -i -L -k -u <AZDATA_USERNAME>:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/test?op=MKDIRS'
+curl -i -L -k --anyauth -u <AZDATA_USERNAME>:<AZDATA_PASSWORD> -X PUT 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/test?op=MKDIRS'
 ```
 
 ## Next steps
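
For the new Active Directory path added in this file, the only local prerequisite is a valid Kerberos ticket before `curl` is invoked. A minimal sketch, assuming the Kerberos client tools are installed and `user1@CONTOSO.COM` is a placeholder principal:

```bash
# Obtain and verify a Kerberos ticket, then call the gateway with --anyauth so
# curl can negotiate with the ticket instead of the supplied password.
kinit user1@CONTOSO.COM
klist    # confirm a ticket-granting ticket was issued
curl -i -k --anyauth -u <AZDATA_USERNAME>:<AZDATA_PASSWORD> -X GET 'https://<gateway-svc-external IP external address>:30443/gateway/default/webhdfs/v1/product_review_data/?op=liststatus'
```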

docs/connect/odbc/linux-mac/known-issues-in-this-version-of-the-driver.md

Lines changed: 2 additions & 2 deletions
@@ -28,7 +28,7 @@ Additional issues will be posted on the [SQL Server Drivers blog](https://techco
 
 - If the client encoding is UTF-8, the driver manager does not always correctly convert from UTF-8 to UTF-16. Currently, data corruption occurs when one or more characters in the string are not valid UTF-8 characters. ASCII characters are mapped correctly. The driver manager attempts this conversion when calling the SQLCHAR versions of the ODBC API (for example, SQLDriverConnectA). The driver manager will not attempt this conversion when calling the SQLWCHAR versions of the ODBC API (for example, SQLDriverConnectW).
 
-- The *ColumnSize* parameter of **SQLBindParameter** refers to the number of characters in the SQL type, while *BufferLength* is the number of bytes in the application's buffer. However, if the SQL data type is `varchar(n)` or `char(n)`, the application binds the parameter as SQL_C_CHAR or SQL_C_VARCHAR, and the character encoding of the client is UTF-8, you may get a "String data, right truncation" error from the driver even if the value of *ColumnSize* is aligned with the size of the data type on the server. This error occurs since conversions between character encodings may change the length of the data. For example, a right apostrophe character (U+2019) is encoded in CP-1252 as the single byte 0x92, but in UTF-8 as the 3-byte sequence 0xe2 0x80 0x99.
+- The *ColumnSize* parameter of **SQLBindParameter** refers to the number of characters in the SQL type, while *BufferLength* is the number of bytes in the application's buffer. However, if the SQL data type is `varchar(n)` or `char(n)`, the application binds the parameter as SQL_C_CHAR for the C type, and SQL_CHAR or SQL_VARCHAR for the SQL type, and the character encoding of the client is UTF-8, you may get a "String data, right truncation" error from the driver even if the value of *ColumnSize* is aligned with the size of the data type on the server. This error occurs since conversions between character encodings may change the length of the data. For example, a right apostrophe character (U+2019) is encoded in CP-1252 as the single byte 0x92, but in UTF-8 as the 3-byte sequence 0xe2 0x80 0x99.
 
 For example, if your encoding is UTF-8 and you specify 1 for both *BufferLength* and *ColumnSize* in **SQLBindParameter** for an out-parameter, and then attempt to retrieve the preceding character stored in a `char(1)` column on the server (using CP-1252), the driver attempts to convert it to the 3-byte UTF-8 encoding, but cannot fit the result into a 1-byte buffer. In the other direction, it compares *ColumnSize* with the *BufferLength* in **SQLBindParameter** before doing the conversion between the different code pages on the client and server. Because a *ColumnSize* of 1 is less than a *BufferLength* of (for example) 3, the driver generates an error. To avoid this error, ensure that the length of the data after conversion fits into the specified buffer or column. Note that *ColumnSize* cannot be greater than 8000 for the `varchar(n)` type.
 
@@ -86,4 +86,4 @@ For ODBC driver installation instructions, see the following articles:
 - [Installing the Microsoft ODBC Driver for SQL Server on Linux](installing-the-microsoft-odbc-driver-for-sql-server.md)
 - [Installing the Microsoft ODBC Driver for SQL Server on macOS](install-microsoft-odbc-driver-sql-server-macos.md)
 
-For more information, see the [Programming guidelines](programming-guidelines.md) and the [Release notes](release-notes-odbc-sql-server-linux-mac.md).
+For more information, see the [Programming guidelines](programming-guidelines.md) and the [Release notes](release-notes-odbc-sql-server-linux-mac.md).

docs/connect/spark/connector.md

Lines changed: 8 additions & 5 deletions
@@ -20,7 +20,7 @@ This library contains the source code for the Apache Spark Connector for SQL Ser
 
 [Apache Spark](https://spark.apache.org/) is a unified analytics engine for large-scale data processing.
 
-You can build the connector from source or download the jar from the Release section in GitHub. For the latest information about the connector, see [SQL Spark connector GitHub repository](https://github.com/microsoft/sql-spark-connector).
+You can import the connector into your project through the Maven coordinates: `com.microsoft.azure:spark-mssql-connector:1.0.0`. You can also build the connector from source or download the jar from the Release section in GitHub. For the latest information about the connector, see [SQL Spark connector GitHub repository](https://github.com/microsoft/sql-spark-connector).
 
 ## Supported Features
 
@@ -44,14 +44,15 @@ You can build the connector from source or download the jar from the Release sec
 ### Supported Options
 The Apache Spark Connector for SQL Server and Azure SQL supports the options defined here: [SQL DataSource JDBC](https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html)
 
-In addition, the following options are supported
+In addition, the following options are supported:
 
 | Option | Default | Description |
 | --------- | ------------------ | ------------------------------------------ |
-| `reliabilityLevel` | `BEST_EFFORT` | `BEST_EFFORT` or `NO_DUPLICATES`. `NO_DUPLICATES` implements a reliable insert in executor restart scenarios |
-| `dataPoolDataSource` | `none` | `none` implies the value is not set and the connector should write to a single instance of SQL Server. Set this value to data source name to write to a data pool table in a SQL Server Big Data Cluster|
+| `reliabilityLevel` | `BEST_EFFORT` | `BEST_EFFORT` or `NO_DUPLICATES`. `NO_DUPLICATES` implements a reliable insert in executor restart scenarios |
+| `dataPoolDataSource` | `none` | `none` implies the value is not set and the connector should write to a single SQL Server instance. Set this value to the data source name to write to a data pool table in Big Data Clusters |
 | `isolationLevel` | `READ_COMMITTED` | Specify the isolation level |
 | `tableLock` | `false` | Implements an insert with `TABLOCK` option to improve write performance |
+| `schemaCheckEnabled` | `true` | Disables strict data frame and SQL table schema check when set to false |
 
 Other [bulk copy options](../jdbc/using-bulk-copy-with-the-jdbc-driver.md#sqlserverbulkcopyoptions) can be set as options on the `dataframe` and will be passed to `bulkcopy` APIs on write
 
@@ -223,4 +224,6 @@ The Apache Spark Connector for Azure SQL and SQL Server is an open-source projec
 
 ## Next steps
 
-Visit the [SQL Spark connector GitHub repository](https://github.com/microsoft/sql-spark-connector).
+Visit the [SQL Spark connector GitHub repository](https://github.com/microsoft/sql-spark-connector).
+
+For information about isolation levels, see [SET TRANSACTION ISOLATION LEVEL (Transact-SQL)](../../t-sql/statements/set-transaction-isolation-level-transact-sql.md).
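
The Maven coordinates introduced in the first hunk of this file can be passed straight to the Spark launchers instead of bundling the jar. A minimal sketch, assuming network access to Maven Central; the application script name is a placeholder:

```bash
# Resolve the connector from Maven Central at submit time.
spark-submit \
  --packages com.microsoft.azure:spark-mssql-connector:1.0.0 \
  my_app.py    # placeholder application that reads or writes through the connector

# The same coordinates work for an interactive shell.
pyspark --packages com.microsoft.azure:spark-mssql-connector:1.0.0
```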
