---
title: What is SQL Server Big Data Cluster? | Microsoft Docs
description:
author: rothja
ms.author: jroth
ms.topic: overview
ms.prod: sql
---
# What is SQL Server Big Data Cluster?
[!INCLUDE[SQL Server 2019 CTP 2.0](../includes/sssqlv15-md.md)] CTP 2.0 enables you to integrate your "high-value" relational data in SQL Server with your "high-volume" data in big data environments, such as Hadoop.
## Architecture
CTP 2.0 allows you to create and deploy a *data pool* that consists of many SQL Server *data pool instances* in your cluster. You can then ingest your high-volume data from HDFS via Spark streaming jobs into the SQL Server data pool instances by partitioning the data and spreading the partitions across the instances in the pool.
Once the high-volume data is stored in partitions in the SQL Server data pool instances on the cluster, you can create an *external table* in the SQL Server *master instance* that represents the partitioned data residing in the data pool. This external table can be queried in the master instance just like any other table, but in this case a fan-out query is executed simultaneously against each of the SQL Server data pool instances to query the partitioned data. This fan-out query runs the filter part of the query and local aggregations in parallel across all of the data pool instances. The results of these queries are brought back to the master instance, and you can optionally join the results of the high-volume data fan-out query with the results of a high-value data query in the SQL Server master instance.
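To make this flow concrete, the sketch below shows what such a query could look like from a client connected to the master instance. The endpoint, credentials, database, and table names are placeholders, and the external table over the data pool is assumed to already exist; this is an illustration, not the product's exact syntax.

```bash
# Hypothetical fan-out query example. The server endpoint, credentials, database,
# and table names are placeholders; dbo.web_clickstream_hot is assumed to be an
# external table over the data pool, and dbo.customers a local high-value table.
sqlcmd -S <master-instance-endpoint> -U <sql-login> -P '<password>' -d Sales -Q "
SELECT   c.customer_name,
         SUM(w.click_count) AS total_clicks
FROM     dbo.customers AS c
JOIN     dbo.web_clickstream_hot AS w ON c.customer_id = w.customer_id
GROUP BY c.customer_name;"
```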
The following diagram shows the eventual state of the Big Data Cluster architecture:
docs/big-data-cluster/concept-controller.md
---
title: Overview of the SQL Server Big Data Cluster controller | Microsoft Docs
description:
author: rothja
ms.author: jroth
ms.topic: conceptual
ms.prod: sql
---
# Overview of the SQL Server Big Data Cluster controller
## What is the cluster Controller?
The Controller hosts the core logic for building and managing the cluster. It takes care of all interactions with Kubernetes, the SQL Server instances that are part of the cluster, and Hadoop components such as HDFS.
The Controller service provides the following core functionality:
- Manage cluster lifecycle: cluster bootstrap and delete, update configurations, and upgrade (upgrade not available in CTP 2.0)
- Manage master SQL Server instances
- Manage compute, data and storage pools
- Expose monitoring tools to observe the state of the cluster
- Expose troubleshooting tools to detect and repair unexpected issues
- Manage cluster security: ensure secure cluster endpoints, manage users and roles, configure credentials for intra-cluster communication
- Manage the workflow of upgrades so that they are implemented safely (not available in CTP 2.0)
- Manage high availability and DR for stateful services in the cluster (not in CTP 2.0)
## Deploying the Controller service
The Controller is hosted as a daemon set in the same Kubernetes environment where the customer wants to build out the cluster. This service is installed by a Kubernetes administrator during cluster bootstrap, using the mssqlctl command-line utility:
```
python mssqlctl.py create cluster <name of your cluster>
```
The buildout workflow lays out on top of Kubernetes a fully functional Big Data Cluster that includes all the components described in the Overview (TBD add link) section. The workflow creates the Controller first, and once it is deployed, the Controller coordinates the installation and configuration of the rest of the services that are part of the Master, Compute, Data, and Storage pools.
## Managing the cluster through the Controller
Customers are expected to manage the cluster purely through the Controller, using either the `mssqlctl` APIs or the administration portal that is hosted within the cluster. If customers deploy additional Kubernetes objects (for example, pods) into the same namespace, those objects are not managed or monitored by the Controller.
The Controller and the Kubernetes objects (stateful sets, pods, secrets, and so on) created for the cluster reside in a dedicated Kubernetes namespace. The Controller service is granted permission by the Kubernetes cluster administrator to manage all resources within that namespace. The RBAC policy for this scenario is configured automatically as part of the initial cluster deployment.
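For reference, you can observe (but should not modify) these objects with standard Kubernetes tooling; the namespace and pod names below are placeholders.

```bash
# Inspect the objects the Controller manages in the cluster's dedicated namespace.
# The namespace and pod names are placeholders; do not modify these objects directly.
kubectl get statefulsets,pods,services,secrets --namespace <your-cluster-namespace>
kubectl describe pod <pod-name> --namespace <your-cluster-namespace>
```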
### mssqlctl
`mssqlctl` is a command-line utility written in Python that enables cluster administrators to bootstrap and manage the Big Data Cluster via REST APIs.
TBD Acquisition experience, i.e. from any client machine -> pip install....
### Cluster Admin Portal
TBD (see service status, where are the logs)
## Controller service security
All communication to the Controller is conducted via a REST API over HTTPS. Certificates for the Controller endpoint are configured at bootstrap time. A self-signed certificate is automatically generated for CTP 2.0. In future releases, we will provide a mechanism for customers to supply certificates from their own certificate authority for production deployments.
Authentication to the Controller endpoint is based on a username and password. These credentials are provisioned at cluster bootstrap time using the values of the environment variables `CONTROLLER_USERNAME` and `CONTROLLER_PASSWORD`. For example, from a Linux client, you can run a script like the one below to set the environment variables:
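The exact script depends on your environment; a minimal sketch with placeholder values looks like this:

```bash
# Placeholder credentials for the Controller endpoint. mssqlctl reads these
# environment variables when it bootstraps the cluster.
export CONTROLLER_USERNAME=<controller-admin-username>
export CONTROLLER_PASSWORD=<controller-admin-password>
```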
## Next steps
- [Deploy SQL Server Big Data Cluster on Kubernetes](quickstart-big-data-cluster-deploy.md)
---
title: Data persistence with SQL Server Big Data Cluster on Kubernetes | Microsoft Docs
description:
author: rothja
ms.author: jroth
ms.topic: conceptual
ms.prod: sql
---
# Data persistence with SQL Server Big Data Cluster on Kubernetes
[Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/) provide a plugin model for storage in Kubernetes, where how storage is provided is completely abstracted from how it is consumed. Therefore, you can bring your own highly available storage and plug it into the SQL Server Big Data Cluster. This gives you full control over the type of storage, availability, and performance that you require. Kubernetes supports various kinds of storage solutions, including Azure disks/files, NFS, local storage, and more.
## Configure persistent volumes
The way SQL Server Big Data Cluster consumes these persistent volumes is by using [Storage Classes](https://kubernetes.io/docs/concepts/storage/storage-classes/). You can create different storage classes for different kinds of storage and specify them at Big Data Cluster deployment time. You can configure which storage class to use for which purpose (pool). SQL Server Big Data Cluster creates [persistent volume claims](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims) with the specified storage class name for each pod that requires persistent volumes. It then mounts the corresponding persistent volume(s) in the pod.
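After deployment, you can see the resulting claims and bound volumes with standard Kubernetes commands; the namespace name below is a placeholder.

```bash
# List the persistent volume claims created in the cluster's namespace and the
# persistent volumes bound to them (namespace name is a placeholder).
kubectl get pvc --namespace <your-cluster-namespace>
kubectl get pv
```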
> [!NOTE]
> For CTP 2.0, you can only have one storage class with static size and access mode for the whole cluster.
## Deployment settings
To use persistent storage during deployment, configure the **USE_PERSISTENT_VOLUME** and **STORAGE_CLASS_NAME** flags with mssqlctl. **USE_PERSISTENT_VOLUME** is set to false by default; in this case, SQL Server Big Data Cluster uses emptyDir mounts. If you set the flag to true, you must also provide **STORAGE_CLASS_NAME** as a parameter at deployment time.
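As an illustration only, the sketch below assumes these settings are supplied as environment variables that mssqlctl reads at deployment time, in the same way as the controller credentials; the storage class name is a placeholder.

```bash
# Assumption: USE_PERSISTENT_VOLUME and STORAGE_CLASS_NAME are supplied as
# environment variables read by mssqlctl at deployment time, as with the
# controller credentials. The storage class name is a placeholder.
export USE_PERSISTENT_VOLUME=true
export STORAGE_CLASS_NAME=<your-storage-class>
python mssqlctl.py create cluster <name of your cluster>
```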
## AKS/ACS storage classes
AKS and ACS both come with two built-in storage classes, **default** and **premium-storage**, along with a dynamic provisioner for them. You can specify either of those, or create your own storage class, when deploying a Big Data Cluster with persistent storage enabled.
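To check which storage classes your Kubernetes cluster exposes before you deploy, you can list them with kubectl:

```bash
# List the storage classes available in the connected Kubernetes cluster
# (on AKS/ACS this includes the built-in classes mentioned above).
kubectl get storageclass
```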
## Minikube storage class
## On-premises cluster
On-premises clusters do not come with any built-in storage class; therefore, you must set up persistent volumes and provisioners beforehand, and then use the corresponding storage classes during SQL Server Big Data Cluster deployment.
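As a hedged sketch only, a statically provisioned local volume and a matching storage class could be set up as follows before deployment; the class name, capacity, path, and node name are placeholders, and your storage solution may look quite different.

```bash
# Illustration only: create a storage class with no dynamic provisioner and a
# statically provisioned local persistent volume. The names, capacity, path,
# and node name are all placeholders.
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-01
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/disk01
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <node-name>
EOF
```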
## Next steps
For complete documentation about volumes in Kubernetes, see the [Kubernetes documentation on Volumes](https://kubernetes.io/docs/concepts/storage/volumes/).
For more information about deploying SQL Server Big Data Cluster, see [How to deploy SQL Server Big Data Cluster on Kubernetes](deployment-guidance.md).