
Commit d7fa1ab

1 parent 287611a commit d7fa1ab

1 file changed

Lines changed: 92 additions & 0 deletions

@@ -0,0 +1,92 @@
---
title: Azure Synapse Pathway behind the scenes
description: Technical deep dive.
author: anshul82-ms
ms.author: anrampal
ms.service: Azure Synapse Pathway
ms.topic: conceptual #Required; leave this attribute/value as-is.
ms.date: 02/24/2021 #Required.
ms.custom: template-concept #Required; leave this attribute/value as-is.
---

<!--Remove all the comments in this template before you sign-off or merge to the
main branch.
-->

<!--
This template provides the basic structure of a concept article.
See the [concept guidance](contribute-how-write-concept.md) in the contributor guide.

To provide feedback on this template contact
[the templates workgroup](mailto:templateswg@microsoft.com).
-->

<!-- 1. H1
Required. Set expectations for what the content covers, so customers know the
content meets their needs. Should NOT begin with a verb.
-->

# Azure Synapse Pathway behind the scenes

The goal of Azure Synapse Pathway is to preserve the functional intent of the original code while optimizing it for the modern Synapse SQL data warehouse. Synapse Pathway uses a three-stage process to translate SQL code from a source system to Azure Synapse SQL. Each stage preserves and augments the knowledge of the source with original location information and source-specific metadata to ensure the highest-quality translation.

![Azure Synapse Pathway.](./media/technical-deep-dive/behind-the-scene.png)

## Stage 1 – Lexing and Parsing
SQL language parsing is a problem that has been solved many times over. Commercial and open-source parsers exist to handle the underlying process: taking a source statement or fragment, breaking it down into logical tokens, and then checking those tokens against a set of parser rules to ensure language consistency.

Synapse Pathway defines source grammars that allow the tool to identify and process the SQL input into an augmented Abstract Syntax Tree (AST), which is used in further processing.

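As a rough illustration of the tokenizing step described above, the following Python sketch breaks a SQL fragment into tokens and records where each token came from in the source. It is not Synapse Pathway's grammar or API: the `Token` type, the `tokenize` function, the token categories, and the small keyword set are all hypothetical and chosen only for this example.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str    # hypothetical categories: KEYWORD, IDENT, NUMBER, SYMBOL
    text: str
    line: int    # 1-based line in the source script
    column: int  # 1-based column in the source script


# A deliberately tiny keyword set; a real source grammar is far larger.
KEYWORDS = {"SELECT", "FROM", "WHERE", "INSERT", "INTO", "DELETE", "ALL"}

TOKEN_RE = re.compile(
    r"(?P<NEWLINE>\n)"
    r"|(?P<SPACE>[ \t\r]+)"
    r"|(?P<NUMBER>\d+(?:\.\d+)?)"
    r"|(?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<SYMBOL>[(),.;*=<>])"
)


def tokenize(sql: str) -> List[Token]:
    """Split SQL text into tokens, keeping the original location of each one."""
    tokens: List[Token] = []
    line, line_start, pos = 1, 0, 0
    while pos < len(sql):
        match = TOKEN_RE.match(sql, pos)
        if match is None:
            column = pos - line_start + 1
            raise ValueError(
                f"Unexpected character {sql[pos]!r} at line {line}, column {column}"
            )
        kind, text = match.lastgroup, match.group()
        if kind == "NEWLINE":
            line += 1
            line_start = match.end()
        elif kind != "SPACE":
            if kind == "IDENT" and text.upper() in KEYWORDS:
                kind = "KEYWORD"
            tokens.append(Token(kind, text, line, match.start() - line_start + 1))
        pos = match.end()
    return tokens


tokens = tokenize("DELETE staging.table1 ALL;\nSELECT * FROM staging.table2;")
for token in tokens:
    print(token)
```

Keeping the line and column on every token is what lets later stages attach original location information to the nodes they build.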
## Stage 2 – Augmented Abstract Syntax Tree (AST)
Synapse Pathway defines a common representation of all objects in an augmented Abstract Syntax Tree (AST). The Synapse Pathway AST includes additional statement or fragment metadata that is used to make the proper decisions when translating code between systems.

By tracking not only that a token is a function but also the type requirements that the source system places on it, the script generation components can make smarter decisions about translating to Synapse SQL.

For example, the source system defines the absolute value function as:
```sql
ABS( float_expression )
```
Azure Synapse SQL defines the absolute value function as:
```sql
ABS ( numeric_expression )
```
In this simple case, Synapse Pathway understands that the conversion from float to numeric in Synapse SQL is an [implicit conversion](https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-ver15#implicit-conversions) and requires no further type casting. The result is a simple, clean, and effective code translation.

Retaining this metadata about the source statements and fragments also helps Synapse Pathway handle structural differences between platforms, for example conversions in opt-out logic for search condition predicates in a WHERE clause.

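The role of type metadata in the `ABS` example can be sketched in Python as follows. This is only an illustration under stated assumptions: the `FunctionCall` node, the `translate` function, and both lookup tables are hypothetical, and the conversion table holds only the float-to-numeric pair mentioned above. Treating every other pair as needing an explicit cast is a simplification, not a statement of Synapse SQL's actual conversion rules.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass
class FunctionCall:
    """Hypothetical AST node: a function call plus metadata from the source."""
    name: str
    argument: str                     # argument expression as source text
    source_arg_type: str              # type the source system assigns to the argument
    source_location: Tuple[int, int]  # (line, column) in the original script


# Parameter type the target expects per function (only ABS, from the text above).
TARGET_PARAM_TYPES: Dict[str, str] = {"ABS": "numeric"}

# (source type, target type) pairs assumed to convert implicitly.
# Deliberately incomplete: only the pair named in the text is listed.
IMPLICIT_CONVERSIONS: Set[Tuple[str, str]] = {("float", "numeric")}


def translate(node: FunctionCall) -> str:
    """Emit the target call, adding a CAST only when the sketch's table requires it."""
    target_type = TARGET_PARAM_TYPES[node.name]
    pair = (node.source_arg_type, target_type)
    if node.source_arg_type == target_type or pair in IMPLICIT_CONVERSIONS:
        argument = node.argument
    else:
        # A real generator would also choose precision and scale here.
        argument = f"CAST({node.argument} AS {target_type})"
    return f"{node.name}({argument})"


print(translate(FunctionCall("ABS", "price_delta", "float", (12, 8))))
# -> ABS(price_delta)
print(translate(FunctionCall("ABS", "raw_amount", "varchar", (13, 8))))
# -> ABS(CAST(raw_amount AS numeric))
```

Because the node carries the source argument type rather than just the function name, the generator can decide per call whether the plain translation is enough.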
## Stage 3 – Syntax Generation

The last stage of the process is to generate syntax for Synapse SQL. Using the AST structure generated from the source files, Synapse Pathway writes each DDL object to an individual file. The syntax generators utilize in-depth knowledge of the target platform to optimize statements.

For example, a common pattern in data loading scenarios is to first delete all of the contents of a staging table and then load data from another staging table in an INSERT/SELECT fashion.
```sql
DELETE staging.table1 ALL;
INSERT INTO staging.table1 SELECT * FROM staging.table2;
```
Synapse SQL has an optimized path for this scenario: [CREATE TABLE AS SELECT](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-develop-ctas) (CTAS). The CTAS statement is a batch-based, minimally logged operation that improves performance by using all of the available compute infrastructure. Without this insight into Synapse SQL, tools often produce a TRUNCATE followed by an INSERT/SELECT statement.
```sql
TRUNCATE TABLE staging.table1;
INSERT INTO staging.table1 SELECT * FROM staging.table2;
```
While not bad, this can be optimized to a DROP TABLE followed by a CTAS to provide better performance.
```sql
DROP TABLE staging.table1;
CREATE TABLE staging.table1
WITH
(
    -- Derived from the original table definition
    DISTRIBUTION = HASH(column1),
    -- Derived from the original table definition
    CLUSTERED COLUMNSTORE INDEX
)
AS SELECT * FROM staging.table2;
```
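One way to picture the rewrite described above is as a pattern match over adjacent statements. The Python sketch below is a hypothetical illustration, not Synapse Pathway's generator: `DeleteAll`, `InsertSelect`, `TableOptions`, and `generate` are invented names, and the table options are passed in by hand where a real tool would derive them from the original table definition.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class DeleteAll:
    table: str


@dataclass
class InsertSelect:
    table: str
    select_sql: str


@dataclass
class TableOptions:
    distribution: str  # for example "HASH(column1)"
    index: str         # for example "CLUSTERED COLUMNSTORE INDEX"


Statement = Union[DeleteAll, InsertSelect]


def generate(statements: List[Statement], options: Dict[str, TableOptions]) -> List[str]:
    """Emit target SQL, folding 'delete all + INSERT/SELECT' pairs into DROP + CTAS."""
    output: List[str] = []
    i = 0
    while i < len(statements):
        current = statements[i]
        following = statements[i + 1] if i + 1 < len(statements) else None
        if (
            isinstance(current, DeleteAll)
            and isinstance(following, InsertSelect)
            and following.table == current.table
            and current.table in options
        ):
            opts = options[current.table]
            output.append(f"DROP TABLE {current.table};")
            output.append(
                "\n".join(
                    [
                        f"CREATE TABLE {current.table}",
                        "WITH",
                        "(",
                        f"    DISTRIBUTION = {opts.distribution},",
                        f"    {opts.index}",
                        ")",
                        f"AS {following.select_sql};",
                    ]
                )
            )
            i += 2  # both source statements were consumed by the rewrite
        elif isinstance(current, DeleteAll):
            output.append(f"TRUNCATE TABLE {current.table};")
            i += 1
        else:
            output.append(f"INSERT INTO {current.table} {current.select_sql};")
            i += 1
    return output


script = [
    DeleteAll("staging.table1"),
    InsertSelect("staging.table1", "SELECT * FROM staging.table2"),
]
known = {"staging.table1": TableOptions("HASH(column1)", "CLUSTERED COLUMNSTORE INDEX")}
print("\n".join(generate(script, known)))
```

When the table definition is unknown, this sketch falls back to the plain TRUNCATE and INSERT/SELECT form, since it cannot fill in the distribution and index options that the CTAS needs.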
## Next steps
<!-- Add a context sentence for the following links -->
- [Download Azure Synapse Pathway](synapse-pathway-download.md)
- [FAQ](pathway-faq.md)

<!--
Remove all the comments in this template before you sign-off or merge to the
main branch.
-->
