Commit b98d6d8

Merge pull request #17443 from XiaoyuMSFT/MVfunction
add distinct function example
2 parents a4eacae + eda34e5 commit b98d6d8

1 file changed

Lines changed: 49 additions & 1 deletion

docs/t-sql/statements/create-materialized-view-as-select-transact-sql.md

@@ -41,7 +41,7 @@ monikerRange: "=azure-sqldw-latest||=sqlallproducts-allversions"
 ---
 # CREATE MATERIALIZED VIEW AS SELECT (Transact-SQL)
 
-[!INCLUDE [asa](../../includes/applies-to-version/asa.md)]
+[!INCLUDE [Azure Synapse Analytics](../../includes/applies-to-version/asa.md)]
 
 This article explains the CREATE MATERIALIZED VIEW AS SELECT T-SQL statement in [!INCLUDE[ssSDW](../../includes/sssdwfull-md.md)] for developing solutions. The article also provides code examples.
 
@@ -107,6 +107,11 @@ When MIN/MAX aggregates are used in the SELECT list of materialized view definit
 
 A materialized view in Azure data warehouse is similar to an indexed view in SQL Server.  It shares almost the same restrictions as indexed view (see [Create Indexed Views](/sql/relational-databases/views/create-indexed-views) for details) except that a materialized view supports aggregate functions.  
 
+>[!Note]
+>Although CREATE MATERIALIZED VIEW does not support COUNT, DISTINCT, COUNT(DISTINCT expression), or COUNT_BIG (DISTINCT expression), SELECT queries with these functions can still benefit from materialized views for faster performance as the Synapse SQL optimizer can automatically re-write those aggregations in the user query to match existing materialized views. For details, check this article's example section.
+
+APPROX_COUNT_DISTINCT is not supported in CREATE MATERIALIZED VIEW AS SELECT.
+
 Only CLUSTERED COLUMNSTORE INDEX is supported by materialized view.
 
 A materialized view cannot reference other views.
@@ -141,6 +146,49 @@ To find out if a SQL statement can benefit from a new materialized view, run the
 
 Requires 1) REFERENCES and CREATE VIEW permission OR 2) CONTROL permission on the schema in which the view is being created.
 
+## Example
+A. This example shows how the Synapse SQL optimizer automatically uses materialized views to execute a query faster, even when the query uses functions that are unsupported in CREATE MATERIALIZED VIEW, such as COUNT(DISTINCT expression). A query that used to take multiple seconds to complete now finishes in under a second without any change to the user query.
+
+``` sql
+
+-- Create a table with ~536 million rows
+create table t(a int not null, b int not null, c int not null) with (distribution=hash(a), clustered columnstore index);
+
+insert into t values(1,1,1);
+
+declare @p int =1;
+while (@p < 30)
+begin
+    insert into t select a+1,b+2,c+3 from t;
+    select @p +=1;
+end
+
+-- This SELECT query with COUNT_BIG(DISTINCT expression) takes multiple seconds to complete because it reads data directly from the base table t.
+select a, count_big(distinct b) from t group by a;
+
+-- Create a materialized view that does not use COUNT_BIG(DISTINCT expression).
+create materialized view V1 with(distribution=hash(a)) as select a, b from dbo.t group by a, b;
+
+-- Clear all cache.
+
+DBCC DROPCLEANBUFFERS;
+DBCC freeproccache;
+
+-- Check the estimated execution plan in SQL Server Management Studio. It shows that the first step (GET operator) of the SELECT query reads data from the materialized view V1, not from the base table t.
+select a, count_big(distinct b) from t group by a;
+
+-- Now execute this SELECT query. This time it completes in under a second because the Synapse SQL engine automatically matches the query with materialized view V1 and uses it for faster query execution. There was no change in the user query.
+
+DECLARE @timerstart datetime2, @timerend datetime2;
+SET @timerstart = sysdatetime();
+
+select a, count_big(distinct b) from t group by a;
+
+SET @timerend = sysdatetime();
+select DATEDIFF(ms,@timerstart,@timerend);
+
+```
+
 
 ## See also

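The added example relies on the optimizer matching the query to `V1`. The following is not part of the commit. It is a minimal sketch of how that match might be checked without SQL Server Management Studio, and how the example objects could be removed afterwards. It assumes a dedicated SQL pool in Azure Synapse where `dbo.t` and `dbo.V1` from the example already exist, and that `EXPLAIN`, the `sys.views`/`sys.indexes` catalog views, and `DBCC PDW_SHOWMATERIALIZEDVIEWOVERHEAD` are available to the caller.

```sql
-- Sketch only: assumes dbo.t and dbo.V1 from the example in this diff
-- already exist in an Azure Synapse dedicated SQL pool.

-- Return the estimated plan as XML without executing the query.
-- If the optimizer matched the view, V1 should appear in the plan text.
EXPLAIN
SELECT a, COUNT_BIG(DISTINCT b) FROM dbo.t GROUP BY a;

-- List materialized views in the database (views that carry an index).
SELECT V.name AS materialized_view, V.object_id
FROM sys.views V
JOIN sys.indexes I
    ON V.object_id = I.object_id AND I.index_id < 2;

-- Report how much incremental-change overhead the view has accumulated.
DBCC PDW_SHOWMATERIALIZEDVIEWOVERHEAD ("dbo.V1");

-- Clean up the example objects. A materialized view is dropped with
-- DROP VIEW, and it must be dropped before its base table.
DROP VIEW dbo.V1;
DROP TABLE dbo.t;
```

Run each statement as its own batch. `EXPLAIN` is used instead of the graphical plan so the check can be scripted, and the plan output format may differ between service versions.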