Commit 4f76cc9

Merge pull request #7177 from abiolao96/aboke1
Aboke1- Pushing latest changes to Doc
2 parents 11e2504 + db9abaa commit 4f76cc9

2 files changed

Lines changed: 96 additions & 5 deletions

File tree

docs/relational-databases/polybase/data-virtualization.md

Lines changed: 9 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -14,17 +14,17 @@ monikerRange: ">= sql-server-ver15 || = sqlallproducts-allversions"
1414

1515
# Use the Data Virtualization Wizard with external tables
1616

17-
One of the key scenarios for SQL Server 2019 CTP 2.0 is the ability to virtualize data such that the data can remain in it’s original location yet you can **virtualize** the data in a SQL Server instance so that it can be queried there like any other table in SQL Server. This will minimize the need for ETL processes.
17+
One of the key scenarios for SQL Server 2019 CTP 2.0 is the ability to virtualize data. This process allows the data to stay in its original location while you **virtualize** it in a SQL Server instance, so that it can be queried there like any other table in SQL Server. This minimizes the need for ETL processes and is made possible by PolyBase connectors. For more information on data virtualization, see [Get started with PolyBase](polybase-guide.md).
1818

1919
## Launch the virtualize data wizard
2020

2121
Connect to the master instance using the IP address / port number (31433) obtained at the end of the [deployment script](../../big-data-cluster/quickstart-big-data-cluster-deploy.md). Expand your **Databases** node in the Object Explorer and select one of the databases where you would like to virtualize the data into from an existing SQL Server instance. Right-click on the Database and select **Virtualize Data** from the context menu. This launches the Virtualize Data wizard. You can also launch the Virtualize Data wizard from the command palette by typing Ctrl+Shift+P (in Windows) and Cmd+Shift+P (in Mac).
2222

2323
![Virtualize data wizard](media/data-virtualization/virtualize-data-wizard.png)
24-
24+
bue
2525
## Select a data source
2626

27-
Once you launch the wizard from one of the databases you can see that the Destination Database dropdown in the wizard is auto-filled with the database name you have just selected. You also have an option to change the Destination Database if you need to at this step. The External Data Source Type which is supported right now is SQL Server and we will soon expand this offering to support other data sources. By default, SQL Server is selected. Other data source types such as Oracle, Teradata, MongoDB, etc. will be added here in the future.
27+
Once you launch the wizard from one of the databases, the Destination Database dropdown is auto-filled with the name of the database you selected. You can change the Destination Database at this step if you need to. The External Data Source Types currently supported are SQL Server and Oracle; SQL Server is selected by default. Other data source types, such as Teradata and MongoDB, will be added in the future.
2828

2929
![Select a data source](media/data-virtualization/select-data-source.png)
3030

@@ -52,9 +52,13 @@ The next step is to Configure Credential, so provide a Credential Name, this is
5252

5353
## External Data Table Mapping
5454

55-
In the next window, you will be able to select the databases you want to create external views of. Note: Selecting the parent will include all child tables as well. When the tables are selected, the mapping of the external table can be seen on the right hand side. Here you can make any type changes or change the name of the external tables. Note: Double clicking a selected table will change the mapping view.
55+
In the next window, you can select the tables you want to create external views of. Selecting a parent database includes all of its child tables as well. When tables are selected, a mapping table appears on the right-hand side. Here you can change column types or rename the selected external table.
56+
57+
[!NOTE]
58+
Double clicking another selected table will change the mapping view.
5659

57-
[IMPORTANT] Photo type is not yet supported by the data virtualization tool. Creating an external view with a photo type in it will throw an error after creation of table. The table will still be created though.
60+
[!IMPORTANT]
61+
The photo type is not yet supported by the data virtualization tool. Creating an external view that contains a photo type throws an error after the table is created, although the table is still created.
5862

5963
## Summary
6064

docs/relational-databases/polybase/polybase-configure-sql-server.md

Lines changed: 87 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,93 @@ The article explains how to use PolyBase on a SQL Server instance to query exter
2323

2424
If you haven't installed PolyBase, see [PolyBase installation](polybase-installation.md). The installation article explains the prerequisites.
2525

26+
## Configure an External Table
27+
28+
To query the data in your SQL Server data source, you must define an external table to use in Transact-SQL queries. The following steps describe how to configure the external table.
29+
30+
1. Create a master key on the database. This is required to encrypt the credential secret.
31+
```sql
32+
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'S0me!nfo';
33+
```
34+
35+
2. Create a database scoped credential for the external data source.
36+
37+
```sql
38+
/* specify credentials to external data source
39+
* IDENTITY: user name for external source.
40+
* SECRET: password for external source.
41+
*/
42+
CREATE DATABASE SCOPED CREDENTIAL SqlServerCredentials
43+
WITH IDENTITY = 'username', SECRET = 'password';
44+
```
45+
46+
3. Create an external data source with [CREATE EXTERNAL DATA SOURCE](../../t-sql/statements/create-external-data-source-transact-sql.md). Specify the external data source location and credentials for SQL Server.
47+
48+
```sql
49+
/* LOCATION: Server DNS name or IP address.
50+
* PUSHDOWN: specify whether computation should be pushed down to the source. ON by default.
51+
* CREDENTIAL: the database scoped credential, created above.
52+
*/
53+
CREATE EXTERNAL DATA SOURCE SqlServerInstance
54+
WITH (
55+
LOCATION = 'sqlserver://TestDBs',
56+
-- PUSHDOWN = ON | OFF,
57+
CREDENTIAL = SqlServerCredentials
58+
);
59+
60+
```
61+
62+
4. Create an external file format with [CREATE EXTERNAL FILE FORMAT](../../t-sql/statements/create-external-file-format-transact-sql.md).
63+
64+
```sql
65+
/* specify external file format
66+
* FORMAT_TYPE: Type of format in Hadoop - DELIMITEDTEXT, RCFILE, ORC, PARQUET
67+
* for compressed Parquet files, specify DATA_COMPRESSION
68+
*/
69+
CREATE EXTERNAL FILE FORMAT Parquet
70+
WITH (
71+
FORMAT_TYPE = PARQUET
72+
-- , DATA_COMPRESSION = 'org.apache.hadoop.io.compress.SnappyCodec'
73+
);
74+
```
75+
76+
5. Create a schema for the external data.
77+
78+
```sql
79+
CREATE SCHEMA sqlserver;
80+
GO
81+
82+
```
83+
6. Create external tables that represent data stored in the external SQL Server instance with [CREATE EXTERNAL TABLE](../../t-sql/statements/create-external-table-transact-sql.md).
84+
85+
```sql
86+
/* LOCATION: sql server table/view in 'database_name.schema_name.object_name' format
87+
* DATA_SOURCE: the external data source, created above.
88+
*/
89+
CREATE EXTERNAL TABLE sqlserver.customer(
90+
C_CUSTKEY INT NOT NULL,
91+
C_NAME VARCHAR(25) NOT NULL,
92+
C_ADDRESS VARCHAR(40) NOT NULL,
93+
C_NATIONKEY INT NOT NULL,
94+
C_PHONE CHAR(15) NOT NULL,
95+
C_ACCTBAL DECIMAL(15,2) NOT NULL,
96+
C_MKTSEGMENT CHAR(10) NOT NULL,
97+
C_COMMENT VARCHAR(117) NOT NULL
98+
)
99+
WITH (
100+
LOCATION='tpch_10.dbo.customer',
101+
DATA_SOURCE=SqlServerInstance
102+
);
103+
104+
```
105+
7. Create statistics on the external table.
106+
107+
```sql
108+
CREATE STATISTICS CustomerCustKeyStatistics ON sqlserver.customer(C_CUSTKEY) WITH FULLSCAN;
109+
```
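
Once the steps above have run, the external table can be used in Transact-SQL queries like a local table. The following is a minimal sketch, not part of the committed article. It assumes the objects were created with the names used in the steps (`SqlServerInstance`, `sqlserver.customer`) and that the remote `tpch_10.dbo.customer` table is reachable with the stored credential. The `TOP (10)` filter and the chosen columns are illustrative only.

```sql
-- Sketch only: assumes the external data source and external table
-- from the preceding steps already exist in the current database.

-- Confirm that the external data source was registered.
SELECT name, location
FROM sys.external_data_sources
WHERE name = 'SqlServerInstance';

-- Confirm that the external table was registered and where it points.
SELECT name, location
FROM sys.external_tables
WHERE name = 'customer';

-- Query the external table as if it were a local table.
SELECT TOP (10) C_CUSTKEY, C_NAME, C_ACCTBAL
FROM sqlserver.customer
ORDER BY C_ACCTBAL DESC;
```

If the first two queries return no rows, the corresponding `CREATE EXTERNAL DATA SOURCE` or `CREATE EXTERNAL TABLE` statement most likely did not complete in this database.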
110+
111+
112+
26113
## Next steps
27114

28115
To learn more about PolyBase, see [Overview of SQL Server PolyBase](polybase-guide.md).
