@@ -107,7 +107,7 @@ In this example, if LOCATION='/webdata/', a PolyBase query will return rows from
107
107
To change the default and only read from the root folder, set the attribute \<polybase.recursive.traversal> to 'false' in the core-site.xml configuration file. This file is located under `<SqlBinRoot>\PolyBase\Hadoop\Conf`, where SqlBinRoot is the bin root of SQL Server. For example, `C:\\Program Files\\Microsoft SQL Server\\MSSQL13.XD14\\MSSQL\\Binn`.
108
108
109
109
DATA_SOURCE = *external_data_source_name*
110
-
Specifies the name of the external data source that contains the location of the external data. This location is either a Hadoop or Azure blob storage. To create an external data source, use [CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md).
110
+
Specifies the name of the external data source that contains the location of the external data. This location is a Hadoop File System (HDFS), an Azure storage blob container, or Azure Data Lake Store. To create an external data source, use [CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md).
111
111
112
112
FILE_FORMAT = *external_file_format_name*
113
113
Specifies the name of the external file format object that stores the file type and compression method for the external data. To create an external file format, use [CREATE EXTERNAL FILE FORMAT](../../t-sql/statements/create-external-file-format-transact-sql.md).
@@ -155,9 +155,6 @@ This example shows how the three REJECT options interact with each other. For ex
155
155
- Percent of failed rows is recalculated as 50%. The percentage of failed rows has exceeded the 30% reject value.
156
156
- The PolyBase query fails with 50% rejected rows after attempting to return the first 200 rows. Notice that matching rows have been returned before the PolyBase query detects the reject threshold has been exceeded.
157
157
158
-
DATA_SOURCE
159
-
An external data source such as data stored in Azure blob storage or a [shard map manager](https://azure.microsoft.com/documentation/articles/sql-database-elastic-scale-shard-map-management/).
160
-
161
158
SCHEMA_NAME
162
159
The SCHEMA_NAME clause provides the ability to map the external table definition to a table in a different schema on the remote database. Use this clause to disambiguate between schemas that exist on both the local and remote databases.
163
160
@@ -376,8 +373,8 @@ WITH
376
373
(
377
374
DATA_SOURCE = MyExtSrc,
378
375
SCHEMA_NAME ='sys',
379
-
OBJECT_NAME ='dm_exec_requests',
380
-
DISTRIBUTION=
376
+
OBJECT_NAME ='dm_exec_requests',
377
+
DISTRIBUTION=ROUND_ROBIN
381
378
);
382
379
```
383
380
@@ -632,34 +629,24 @@ The column definitions, including the data types and number of columns, must mat
632
629
633
630
Sharded external table options
634
631
635
-
Specifies the external data source (a non-SQL Server data source) and a distribution method for the [Elastic Database query](https://azure.microsoft.com/documentation/articles/sql-database-elastic-query-overview/).
632
+
Specifies the external data source (a non-SQL Server data source) and a distribution method for the [Elastic query](https://azure.microsoft.com/documentation/articles/sql-database-elastic-query-overview/).
636
633
637
634
DATA_SOURCE
638
-
An external data source such as data stored in a Hadoop File System, Azure blob storage, or a [shard map manager](https://azure.microsoft.com/documentation/articles/sql-database-elastic-scale-shard-map-management/).
635
+
The DATA_SOURCE clause defines the external data source (a shard map) that is used for the external table. For an example, see [Create external tables](https://docs.microsoft.com/azure/sql-database/sql-database-elastic-query-horizontal-partitioning#13-create-external-tables).
639
636
640
-
SCHEMA_NAME
641
-
The SCHEMA_NAME clause provides the ability to map the external table definition to a table in a different schema on the remote database. Use this clause to disambiguate between schemas that exist on both the local and remote databases.
642
-
643
-
OBJECT_NAME
644
-
The OBJECT_NAME clause provides the ability to map the external table definition to a table with a different name on the remote database. Use this clause to disambiguate between object names that exist on both the local and remote databases.
637
+
SCHEMA_NAME and OBJECT_NAME
638
+
The SCHEMA_NAME and OBJECT_NAME clauses map the external table definition to a table in a different schema. If omitted, the schema of the remote object is assumed to be “dbo” and its name is assumed to be identical to the external table name being defined. This is useful if the name of your remote table is already taken in the database where you want to create the external table. For example, you want to define an external table to get an aggregate view of catalog views or DMVs on your scaled out data tier. Since catalog views and DMVs already exist locally, you cannot use their names for the external table definition. Instead, use a different name and use the catalog view’s or the DMV’s name in the SCHEMA_NAME and/or OBJECT_NAME clauses. For an example, see [Create external tables](https://docs.microsoft.com/azure/sql-database/sql-database-elastic-query-horizontal-partitioning#13-create-external-tables).
645
639
646
640
DISTRIBUTION
647
-
Optional. This argument is only required for databases of type SHARD_MAP_MANAGER. This argument controls whether a table is treated as a sharded table or a replicated table. With **SHARDED** (*column name*) tables, the data from different tables don't overlap. **REPLICATED** specifies that tables have the same data on every shard. **ROUND_ROBIN** indicates that an application-specific method is used to distribute the data.
648
-
649
-
## Permissions
641
+
The DISTRIBUTION clause specifies the data distribution used for this table. The query processor utilizes the information provided in the DISTRIBUTION clause to build the most efficient query plans.
650
642
651
-
Requires these user permissions:
643
+
- SHARDED means data is horizontally partitioned across the databases. The partitioning key for the data distribution is the <sharding_column_name> parameter.
644
+
- REPLICATED means that identical copies of the table are present on each database. It is your responsibility to ensure that the replicas are identical across the databases.
645
+
- ROUND_ROBIN means that the table is horizontally partitioned using an application-dependent distribution method.
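As a rough illustration of the options listed above, the following sketch defines one sharded and one replicated external table. The data source name MyExtSrc is borrowed from the example earlier in this article and is assumed to already exist with TYPE = SHARD_MAP_MANAGER; the table and column names are hypothetical.

```sql
-- Hypothetical sharded table: rows are horizontally partitioned across the
-- databases, with ol_w_id assumed to be the sharding key.
CREATE EXTERNAL TABLE [dbo].[order_line] (
    [ol_o_id] int NOT NULL,
    [ol_w_id] int NOT NULL,
    [ol_amount] decimal(10, 2) NULL
)
WITH
(
    DATA_SOURCE = MyExtSrc,
    DISTRIBUTION = SHARDED(ol_w_id)
);

-- Hypothetical reference table: an identical copy is assumed to exist on
-- every database, so it is declared as REPLICATED.
CREATE EXTERNAL TABLE [dbo].[region] (
    [r_id] int NOT NULL,
    [r_name] nvarchar(50) NOT NULL
)
WITH
(
    DATA_SOURCE = MyExtSrc,
    DISTRIBUTION = REPLICATED
);
```

The column definitions in each statement would need to match the remote tables, as noted elsewhere in this article.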
652
646
653
-
-**CREATE TABLE**
654
-
-**ALTER ANY SCHEMA**
655
-
-**ALTER ANY EXTERNAL DATA SOURCE**
656
-
-**ALTER ANY EXTERNAL FILE FORMAT**
657
-
-**CONTROL DATABASE**
658
-
659
-
Note, the login that creates the external data source must have permission to read and write to the external data source, located in Hadoop or Azure blob storage.
647
+
## Permissions
660
648
661
-
> [!IMPORTANT]
662
-
> The ALTER ANY EXTERNAL DATA SOURCE permission grants any principal the ability to create and modify any external data source object, and therefore, it also grants the ability to access all database scoped credentials on the database. This permission must be considered as highly privileged, and therefore must be granted only to trusted principals in the system.
649
+
Users with access to the external table automatically gain access to the underlying remote tables under the credential given in the external data source definition. Avoid undesired elevation of privileges through the credential of the external data source. Use GRANT or REVOKE for an external table just as though it were a regular table. Once you have defined your external data source and your external tables, you can now use full T-SQL over your external tables.
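A minimal sketch of managing access to an external table with ordinary permission statements. The table name dbo.order_line and the principal ReportingUser are hypothetical and stand in for objects in your own database.

```sql
-- Hypothetical names: allow a reporting principal to query the external table.
GRANT SELECT ON OBJECT::[dbo].[order_line] TO [ReportingUser];

-- Withdraw that access again. Remote access itself still runs under the
-- credential stored in the external data source definition.
REVOKE SELECT ON OBJECT::[dbo].[order_line] FROM [ReportingUser];
```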
663
650
664
651
## Error Handling
665
652
@@ -689,7 +676,7 @@ Constructs and operations not supported:
689
676
- The DEFAULT constraint on external table columns
690
677
- Data Manipulation Language (DML) operations of delete, insert, and update
691
678
692
-
Only literal predicates defined in a query can be pushed down to the external data source. This is unlike linked servers and accessing where predicates determined during query execution can be used, i.e. when used in connjunction with a nested loop in a query plan. This will often lead to the whole external table being copied locally and then joined to.
679
+
Only literal predicates defined in a query can be pushed down to the external data source. This is unlike linked servers and accessing where predicates determined during query execution can be used, i.e. when used in conjunction with a nested loop in a query plan. This will often lead to the whole external table being copied locally and then joined to.
693
680
694
681
```sql
695
682
-- Assuming External.Orders is an external table and Customer is a local table.
@@ -705,7 +692,7 @@ Only literal predicates defined in a query can be pushed down to the external da
705
692
706
693
Use of External Tables prevents use of parallelism in the query plan.
707
694
708
-
External tables are implemented as Remote Query and as such the estimated number of rows returned is generally 1000, there are other rules based on the type of predicate used to filter the external table. They are rulesbased estimates rather than estimates based on the actual data in the external table. The optimiser doesn't access the remote data source to obtain a more accurate estimate.
695
+
External tables are implemented as Remote Query and as such the estimated number of rows returned is generally 1000, there are other rules based on the type of predicate used to filter the external table. They are rules-based estimates rather than estimates based on the actual data in the external table. The optimizer doesn't access the remote data source to obtain a more accurate estimate.
709
696
710
697
## Locking
711
698
@@ -726,7 +713,9 @@ WITH
726
713
727
714
## See Also
728
715
729
-
[CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md)
- [Reporting across scaled-out cloud databases](https://docs.microsoft.com/azure/sql-database/sql-database-elastic-query-horizontal-partitioning)
718
+
- [Get started with cross-database queries (vertical partitioning)](https://docs.microsoft.com/azure/sql-database/sql-database-elastic-query-getting-started-vertical)