title: Load sample data
titleSuffix: SQL Server 2019 big data clusters
description: This tutorial demonstrates how to load sample data into a SQL Server big data cluster. The sample data includes relational data in the SQL Server master instance. It also includes HDFS data in the storage pool. This data supports other tutorials in this section.
author: rothja
ms.author: jroth
manager: craigg
ms.date: 12/13/2018
ms.topic: tutorial
ms.prod: sql
ms.custom: seodec18

Tutorial: Load sample data into a SQL Server 2019 big data cluster

This tutorial explains how to use a script to load sample data into a SQL Server 2019 big data cluster (preview). Many of the other tutorials in the documentation use this sample data.

Tip

You can find additional samples for SQL Server 2019 big data clusters (preview) in the sql-server-samples GitHub repository, under the sql-server-samples/samples/features/sql-big-data-cluster/ path.

Prerequisites

Load sample data

The following steps use a bootstrap script to download a SQL Server database backup and load the data into your big data cluster. The steps are given separately for Windows and Linux clients.

Windows

The following steps describe how to use a Windows client to load the sample data into your big data cluster.

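  1. In Windows PowerShell, use curl to download the bootstrap script. In Windows PowerShell, curl is an alias for Invoke-WebRequest, so if the command reports that the -o parameter is ambiguous, call curl.exe explicitly.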

    curl -o bootstrap-sample-db.cmd "https://raw.githubusercontent.com/Microsoft/sql-server-samples/master/samples/features/sql-big-data-cluster/bootstrap-sample-db.cmd"

  2. Download the bootstrap-sample-db.sql Transact-SQL script. This script is called by the bootstrap script.

    curl -o bootstrap-sample-db.sql "https://raw.githubusercontent.com/Microsoft/sql-server-samples/master/samples/features/sql-big-data-cluster/bootstrap-sample-db.sql"

  3. The bootstrap script requires the following positional parameters for your big data cluster:

    Parameter                   Description
    <CLUSTER_NAMESPACE>         The name you gave your big data cluster.
    <SQL_MASTER_IP>             The IP address of your master instance.
    <SQL_MASTER_SA_PASSWORD>    The SA password for the master instance.
    <KNOX_IP>                   The IP address of the HDFS/Spark Gateway.
    <KNOX_PASSWORD>             The password for the HDFS/Spark Gateway.

    Tip

    Use kubectl to find the IP addresses for the SQL Server master instance and Knox. Run kubectl get svc -n <your-cluster-name> and look at the EXTERNAL-IP addresses for the master instance (endpoint-master-pool) and Knox (service-security-lb or service-security-nodeport).

  4. Run the bootstrap script.

    .\bootstrap-sample-db.cmd <CLUSTER_NAMESPACE> <SQL_MASTER_IP> <SQL_MASTER_SA_PASSWORD> <KNOX_IP> <KNOX_PASSWORD>

Linux

The following steps describe how to use a Linux client to load the sample data into your big data cluster.

  1. Download the bootstrap script, and assign executable permissions to it.

    curl -o bootstrap-sample-db.sh "https://raw.githubusercontent.com/Microsoft/sql-server-samples/master/samples/features/sql-big-data-cluster/bootstrap-sample-db.sh"
    chmod +x bootstrap-sample-db.sh

  2. Download the bootstrap-sample-db.sql Transact-SQL script. This script is called by the bootstrap script.

    curl -o bootstrap-sample-db.sql "https://raw.githubusercontent.com/Microsoft/sql-server-samples/master/samples/features/sql-big-data-cluster/bootstrap-sample-db.sql"

  3. The bootstrap script requires the following positional parameters for your big data cluster:

    Parameter                   Description
    <CLUSTER_NAMESPACE>         The name you gave your big data cluster.
    <SQL_MASTER_IP>             The IP address of your master instance.
    <SQL_MASTER_SA_PASSWORD>    The SA password for the master instance.
    <KNOX_IP>                   The IP address of the HDFS/Spark Gateway.
    <KNOX_PASSWORD>             The password for the HDFS/Spark Gateway.

    Tip

    Use kubectl to find the IP addresses for the SQL Server master instance and Knox. Run kubectl get svc -n <your-cluster-name> and look at the EXTERNAL-IP addresses for the master instance (endpoint-master-pool) and Knox (service-security-lb or service-security-nodeport).

  4. Run the bootstrap script.

    ./bootstrap-sample-db.sh <CLUSTER_NAMESPACE> <SQL_MASTER_IP> <SQL_MASTER_SA_PASSWORD> <KNOX_IP> <KNOX_PASSWORD>
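
The kubectl lookup described in step 3 and a parameter check before step 4 can be sketched as two small shell helpers. This is a minimal sketch, not part of the sample repository: the function names get_external_ip and run_bootstrap are hypothetical, and the jsonpath expression assumes the services are exposed through a load balancer that reports an IP address (with NodePort services that field is typically empty).

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers around bootstrap-sample-db.sh.

# Print the external IP of one service in the cluster namespace.
# Assumption: the service is of type LoadBalancer and reports an IP.
get_external_ip() {
    local namespace="$1" service="$2"
    kubectl get svc "$service" -n "$namespace" \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Run the bootstrap script only when all five positional parameters
# are present and non-empty; otherwise print a message and return 1.
run_bootstrap() {
    if [ "$#" -ne 5 ]; then
        echo "usage: run_bootstrap <CLUSTER_NAMESPACE> <SQL_MASTER_IP> <SQL_MASTER_SA_PASSWORD> <KNOX_IP> <KNOX_PASSWORD>" >&2
        return 1
    fi
    local arg
    for arg in "$@"; do
        if [ -z "$arg" ]; then
            echo "error: empty parameter" >&2
            return 1
        fi
    done
    ./bootstrap-sample-db.sh "$@"
}

# Example (needs a running cluster and the downloaded scripts):
#   SQL_MASTER_IP=$(get_external_ip "$CLUSTER_NAMESPACE" endpoint-master-pool)
#   KNOX_IP=$(get_external_ip "$CLUSTER_NAMESPACE" service-security-lb)
#   run_bootstrap "$CLUSTER_NAMESPACE" "$SQL_MASTER_IP" "$SQL_MASTER_SA_PASSWORD" "$KNOX_IP" "$KNOX_PASSWORD"
```

Checking the parameter count first is useful because the bootstrap script takes its parameters by position, so an empty IP lookup would otherwise shift the remaining values.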

Next steps

After the bootstrap script runs, your big data cluster has the sample databases and HDFS data. To start exploring this data and big data clusters, see the Tutorials in this section.
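
One way to confirm that the relational data was loaded is to list the databases on the master instance. The following is a minimal sketch, assuming the sqlcmd command-line tool is installed on the client and the SA password is exported in the SQLCMDPASSWORD environment variable. The helper name list_databases is hypothetical, and the names of the sample databases are not given in this tutorial, so the sketch only lists what is present.

```shell
#!/usr/bin/env bash
# Sketch: list the databases on the master instance after the bootstrap script runs.
# list_databases is a hypothetical helper; it is not part of the sample scripts.
# sqlcmd reads the password from SQLCMDPASSWORD when -P is not given.
list_databases() {
    local server="$1"   # <SQL_MASTER_IP>, optionally followed by ",<port>"
    sqlcmd -S "$server" -U sa -Q "SELECT name FROM sys.databases;"
}

# Example (needs a reachable master instance):
#   export SQLCMDPASSWORD='<SQL_MASTER_SA_PASSWORD>'
#   list_databases "<SQL_MASTER_IP>"
```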