| title | Virtualize external data in SQL Server 2019 CTP 2.0 | Microsoft Docs |
|---|---|
| description | |
| author | Abiola |
| ms.author | aboke |
| manager | craigg |
| ms.date | 12/13/2018 |
| ms.topic | conceptual |
| ms.prod | sql |
| ms.technology | polybase |
| monikerRange | >= sql-server-ver15 || = sqlallproducts-allversions |
SQL Server 2019 also allows the ability to virtualize data from a CSV file in HDFS. This process allows the data to stay in its original location, however you can virtualize the data in a SQL Server instance so that it can be queried there like any other table in SQL Server. This will minimize the need for ETL processes. This is possible with the use of Polybase connectors. For more information on Data Virtualization please refer to our Get started with PolyBase Document.
Connect to the HDFS root using the IP address. Expand the elements in the object explorer, Then select one of the CSV from which you would like to virtualize the data into an existing SQL Server instance. Right-click on the file and select Create External Table From CSV File from the context menu. This launches the Virtualize Data wizard. You can also launch the Virtualize Data wizard from the command palette by typing Ctrl+Shift+P (in Windows) and Cmd+Shift+P (in Mac).
Here you can specify which SQL Master Instance you will connect too using the IP, Port, and Credential information. SAved connection can be access via the Active SQL Server connections drop down box.
Note
If you are using a saved connection the other fields will be blocked
Click Next to proceed to the next step in the wizard which sets the Database Master Key.
In this step, you will choose the destination databse you wish to virtualize the data into. The drop down field will contain all acceptable databases in the SQL Master instance specified in the previous screen. Here you can also name the new external table and see the schema it will use.
On this window you will be able to see a preview of the first 50 rows of your CSV file for validation.
Once done viewing the preview, click "Next" to continue
In the next window, you will be able to Modify the columns of the external table you intend to create. you are able to alter the column name, Change the data type, and allow for Nullable rows.
This step provides a summary of your selections. It provides the SQL Master Instance and Proposed External table information.. In this step, you have the option to "Generate Script" which will script out in T-SQL the syntax to create the external data source or Create which will create the External Data Source object.
If you click "Create" you will be able to see the External table created in the Destination database.
If you click, Generate Script you will see the T-SQL query being generated for creating the External Data Source object.
Note
Generate Script should be only visible in the last page of the wizard. Currently it shows in all pages.
For more information on SQL Server Big Data Cluster and related scenarios, see What is SQL Server Big Data Cluster?.







