| title | What is the storage pool? |
|---|---|
| titleSuffix | SQL Server big data clusters |
| description | This article describes the storage pool in a SQL Server 2019 big data cluster. |
| author | MikeRayMSFT |
| ms.author | mikeray |
| ms.reviewer | mihaelab |
| ms.date | 08/21/2019 |
| ms.topic | conceptual |
| ms.prod | sql |
| ms.technology | big-data-cluster |
What is the storage pool ([!INCLUDEbig-data-clusters-2019])?
[!INCLUDEtsql-appliesto-ssver15-xxxx-xxxx-xxx]
This article describes the role of the storage pool in a [!INCLUDEbig-data-clusters-2019]. The following sections describe the architecture and functionality of the storage pool.
The storage pool consists of storage nodes made up of SQL Server on Linux, Spark, and HDFS. All the storage nodes in a SQL big data cluster are members of an HDFS cluster.
Storage nodes are responsible for:
- Data ingestion through Spark.
- Data storage in HDFS (Parquet and delimited text formats). HDFS also provides data persistence, because HDFS data is spread across all the storage nodes in the SQL big data cluster.
- Data access through HDFS and SQL Server endpoints.
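As a rough illustration of access through the SQL Server endpoint, the following Python sketch builds the T-SQL that would expose an HDFS directory in the storage pool as an external table. It is a sketch under assumptions, not a definitive procedure: the helper `storage_pool_external_table_sql`, the table and column names, the HDFS directory, and the file format name `csv_file` are all hypothetical, and the data source name `SqlStoragePool` and location `sqlhdfs://controller-svc/default` are assumed from Microsoft's tutorials and should be verified against your cluster.

```python
# Sketch only: builds T-SQL text; it does not connect to a cluster.
# All object names and paths below are hypothetical placeholders.

def storage_pool_external_table_sql(table, columns, hdfs_dir, file_format="csv_file"):
    """Return T-SQL for an external table over an HDFS directory.

    table:       hypothetical external table name
    columns:     list of (column_name, sql_type) pairs
    hdfs_dir:    hypothetical HDFS directory that holds the data files
    file_format: name of an external file format assumed to exist already
    """
    column_list = ",\n    ".join(f"[{name}] {sql_type}" for name, sql_type in columns)
    return (
        "-- Assumption: data source name and location follow Microsoft's tutorials.\n"
        "CREATE EXTERNAL DATA SOURCE SqlStoragePool\n"
        "WITH (LOCATION = 'sqlhdfs://controller-svc/default');\n\n"
        f"CREATE EXTERNAL TABLE [{table}] (\n    {column_list}\n)\n"
        f"WITH (DATA_SOURCE = SqlStoragePool, LOCATION = '{hdfs_dir}', "
        f"FILE_FORMAT = {file_format});\n"
    )


if __name__ == "__main__":
    # Hypothetical example: delimited text files under /clickstream_data.
    print(storage_pool_external_table_sql(
        "web_clickstreams_hdfs",
        [("wcs_user_sk", "BIGINT"), ("wcs_click_date_sk", "BIGINT")],
        "/clickstream_data",
    ))
```

The generated statements would be run against the SQL Server master instance. Once the external table exists, the files in HDFS can be queried with ordinary `SELECT` statements without moving the data out of the storage pool.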
To learn more about the [!INCLUDEbig-data-clusters-2019], see the following resources:
- [What are [!INCLUDEbig-data-clusters-2019]?](big-data-cluster-overview.md)
- [Workshop: Microsoft [!INCLUDEbig-data-clusters-2019] Architecture](https://github.com/Microsoft/sqlworkshops/tree/master/sqlserver2019bigdataclusters)
