| title | What are data pools? |
|---|---|
| titleSuffix | SQL Server big data clusters |
| description | This article describes the data pool in a SQL Server 2019 big data cluster. |
| author | MikeRayMSFT |
| ms.author | mikeray |
| ms.reviewer | mihaelab |
| ms.date | 08/21/2019 |
| ms.topic | conceptual |
| ms.prod | sql |
| ms.technology | big-data-cluster |
Applies to: SQL Server 2019 (15.x)
This article describes the role of SQL Server data pools in a SQL Server 2019 big data cluster. The following sections describe the architecture and functionality of a data pool.
A data pool consists of one or more SQL Server data pool instances. SQL data pool instances provide persistent SQL Server storage for the cluster. A data pool is used to ingest data from SQL queries or Spark jobs. To provide better performance across large data sets, data in a data pool is distributed into shards across the member SQL data pool instances.
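The sharding idea above can be illustrated with a small conceptual sketch. It is not how SQL Server implements data pools; it only assumes a round-robin placement policy (the article does not specify one), and the function name `distribute_round_robin` and the sample rows are hypothetical.

```python
from itertools import cycle


def distribute_round_robin(rows, instance_count):
    """Spread rows across shards, one shard per data pool instance.

    Assumption: rows are placed round-robin; this article does not
    state which placement policy a data pool actually uses.
    """
    if instance_count < 1:
        raise ValueError("a data pool needs at least one instance")
    shards = [[] for _ in range(instance_count)]
    # cycle() walks the shards in order and wraps around; zip() stops
    # once every row has been placed.
    for shard, row in zip(cycle(shards), rows):
        shard.append(row)
    return shards


if __name__ == "__main__":
    # Hypothetical ingested rows, for illustration only.
    rows = [{"id": i, "amount": i * 10} for i in range(1, 8)]
    for index, shard in enumerate(distribute_round_robin(rows, 3)):
        print(f"instance {index}: ids {[row['id'] for row in shard]}")
```

With this kind of placement, each instance holds roughly the same number of rows, so no single instance has to store or scan the whole data set.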
Data pools enable the creation of scale-out data marts, where external data from multiple sources is ingested into the data pool. Because data is distributed across data pool instances, parallel queries against the curated data are more efficient.
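To see why distribution helps parallel queries, consider a rough scatter-gather sketch: each shard computes a partial result independently, and the partials are then combined. This is an analogy in plain Python, not the SQL Server query engine; the function names are hypothetical, and each inner list stands in for the rows held by one data pool instance.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum_and_count(shard):
    """Work done on one instance: aggregate only the local rows."""
    return sum(shard), len(shard)


def parallel_average(shards):
    """Scatter the aggregate to every shard, then gather and combine."""
    # One worker per shard; max() guards against an empty shard list.
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(partial_sum_and_count, shards))
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count if count else None


if __name__ == "__main__":
    # Hypothetical sales amounts spread over three instances.
    shards = [[120.0, 75.5, 310.0], [42.0, 18.5], [99.0, 101.0, 5.0]]
    print(parallel_average(shards))
```

Each worker touches only its own shard, so the amount of data scanned per worker shrinks as instances are added; only the small partial results need to be brought together at the end.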
To learn more about SQL Server 2019 big data clusters, see the following resources:
- [What are SQL Server 2019 big data clusters?](big-data-cluster-overview.md)
- [Workshop: Microsoft SQL Server 2019 big data clusters Architecture](https://github.com/Microsoft/sqlworkshops/tree/master/sqlserver2019bigdataclusters)
