title

APPROX_COUNT_DISTINCT (Transact-SQL) | Microsoft Docs

ms.custom

ms.date

11/12/2019

ms.prod

sql

ms.prod_service

database-engine, sql-database, sql-data-warehouse, pdw

ms.reviewer

ms.technology

t-sql

ms.topic

language-reference

f1_keywords

APPROX_COUNT_DISTINCT

dev_langs

TSQL

author

joesackmsft

ms.author

josack

monikerRange

>=aps-pdw-2016||=azuresqldb-current||=azure-sqldw-latest||>=sql-server-2016||=sqlallproducts-allversions||>=sql-server-linux-2017||=azuresqldb-mi-current

APPROX_COUNT_DISTINCT (Transact-SQL)

[!INCLUDEappliesto-xx-asdb-asdw-pdw-md]

This function returns the approximate number of unique non-null values in a group.

Transact-SQL Syntax Conventions

Syntax

-- Syntax for Azure SQL Database, Azure SQL Data Warehouse and Parallel Data Warehouse  

APPROX_COUNT_DISTINCT ( expression )

Arguments

expression
An expression of any type, except image, sql_variant, ntext, or text.

Return types

bigint

Remarks

APPROX_COUNT_DISTINCT( expression ) evaluates an expression for each row in a group, and returns the approximate number of unique non-null values in a group. This function is designed to provide aggregations across large data sets where responsiveness is more critical than absolute precision.

APPROX_COUNT_DISTINCT is designed for use in big data scenarios and is optimized for the following conditions:

Access of data sets that are millions of rows or higher and
Aggregation of a column or columns that have many distinct values

The function implementation guarantees up to a 2% error rate within a 97% probability.

APPROX_COUNT_DISTINCT requires less memory than an exhaustive COUNT DISTINCT operation. Given the smaller memory footprint, APPROX_COUNT_DISTINCT is less likely to spill memory to disk compared to a precise COUNT DISTINCT operation. To learn more about the algorithm used to achieve this, see HyperLogLog.

Note

With collation sensitive strings, APPROX_COUNT_DISTINCT uses a binary match and provides results that would have been generated in the presence of BIN collations and not BIN2.

Examples

A. Using APPROX_COUNT_DISTINCT

This example returns the approximate number of different order keys from the orders table.

SELECT APPROX_COUNT_DISTINCT(O_OrderKey) AS Approx_Distinct_OrderKey
FROM dbo.Orders;

[!INCLUDEssResult]

Approx_Distinct_OrderKey
------------------------
15164704

B. Using APPROX_COUNT_DISTINCT with GROUP BY

This example returns the approximate number of different order keys by order status from the orders table.

SELECT O_OrderStatus, APPROX_COUNT_DISTINCT(O_OrderKey) AS Approx_Distinct_OrderKey
FROM dbo.Orders
GROUP BY O_OrderStatus
ORDER BY O_OrderStatus;

[!INCLUDEssResult]

O_OrderStatus                                                    Approx_Distinct_OrderKey
---------------------------------------------------------------- ------------------------
F                                                                7397838
O                                                                7387803
P                                                                388036

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

APPROX_COUNT_DISTINCT (Transact-SQL)

Syntax

Arguments

Return types

Remarks

Examples

A. Using APPROX_COUNT_DISTINCT

B. Using APPROX_COUNT_DISTINCT with GROUP BY

See also

FilesExpand file tree

approx-count-distinct-transact-sql.md

Latest commit

History

approx-count-distinct-transact-sql.md

File metadata and controls

APPROX_COUNT_DISTINCT (Transact-SQL)

Syntax

Arguments

Return types

Remarks

Examples

A. Using APPROX_COUNT_DISTINCT

B. Using APPROX_COUNT_DISTINCT with GROUP BY

See also