|
| 1 | +--- |
| 2 | +title: #Azure Synapse Pathway behind the scenes. |
| 3 | +description: #Technical deep dive. |
| 4 | +author: #anshul82-ms |
| 5 | +ms.author: #anrampal. |
| 6 | +ms.service: #Azure Synapse Pathway. |
| 7 | +ms.topic: conceptual #Required; leave this attribute/value as-is. |
| 8 | +ms.date: #Required; 02/24/2021. |
| 9 | +ms.custom: template-concept #Required; leave this attribute/value as-is. |
| 10 | +--- |
| 11 | + |
| 12 | +<!--Remove all the comments in this template before you sign-off or merge to the |
| 13 | +main branch. |
| 14 | +--> |
| 15 | + |
| 16 | +<!-- |
| 17 | +This template provides the basic structure of a concept article. |
| 18 | +See the [concept guidance](contribute-how-write-concept.md) in the contributor guide. |
| 19 | +
|
| 20 | +To provide feedback on this template contact |
| 21 | +[the templates workgroup](mailto:templateswg@microsoft.com). |
| 22 | +--> |
| 23 | + |
| 24 | +<!-- 1. H1 |
| 25 | +Required. Set expectations for what the content covers, so customers know the |
| 26 | +content meets their needs. Should NOT begin with a verb. |
| 27 | +--> |
| 28 | + |
| 29 | +# Azure Synapse Pathway behind the scenes |
| 30 | + |
| 31 | +Azure Synapse Pathway aims to preserve the functional intent of the original code while optimizing it for the modern Synapse SQL data warehouse. Synapse Pathway translates SQL code from a source system to Azure Synapse SQL in a three-stage process. Each stage preserves what is known about the source and augments it with original location information and source-specific metadata, so that later stages can produce the highest-quality translation. |
| 32 | +  |
| 33 | + |
| 34 | +## Stage 1 – Lexing and Parsing |
| 35 | +SQL language parsing is a problem that has been solved many times over. Commercial and open-source parsers exist to help with the underlying process: taking a source statement or fragment, breaking it down into logical tokens, and then checking those tokens against a set of parser rules to ensure language consistency. |
| 36 | + |
| 37 | +Synapse Pathway defines source grammars that allow the tool to identify and process the SQL input into an augmented Abstract Syntax Tree (AST) used in further processing. |
| 38 | + |
| 39 | +## Stage 2 – Augmented Abstract Syntax Tree (AST) |
| 40 | +Synapse Pathway defines a common representation of all objects in an augmented Abstract Syntax Tree (AST). The Synapse Pathway AST includes additional statement or fragment metadata that is used to make the proper decisions when translating code between systems. |
| 41 | + |
| 42 | +Because the AST records not only that a token is a function but also the data types that the source system requires for it, the script generation components can make smarter decisions when translating to Synapse SQL. |
| 43 | + |
| 44 | +For example, the source system defines the absolute value function as: |
| 45 | +```sql |
| 46 | +ABS( float_expression ) |
| 47 | +``` |
| 48 | +Azure Synapse SQL defines the absolute value function as: |
| 49 | +```sql |
| 50 | +ABS ( numeric_expression ) |
| 51 | +``` |
| 52 | +In this simple case, Synapse Pathway understands that the conversion in Synapse SQL from float to numeric is an [implicit conversion](https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-ver15#implicit-conversions) and requires no further type casting. The result is a simple, clean, and effective code translation. |
| 53 | + |
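|  | +As a minimal sketch of the point above, the following query passes a float value directly to `ABS` with no cast wrapped around the function call. The literal value and the alias `abs_value` are illustrative assumptions, not output produced by Synapse Pathway: |
|  | +```sql |
|  | +-- The inner CAST only manufactures a float input for this standalone example. |
|  | +-- No additional cast is needed around ABS itself: the float expression |
|  | +-- is accepted as the numeric_expression argument. |
|  | +SELECT ABS(CAST(-12.5 AS float)) AS abs_value; |
|  | +``` |
|  | + |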
| 54 | +Retaining this meta-information about the source statements and fragments also helps with structural differences between platforms, for example when converting opt-out logic for search condition predicates in a WHERE clause. |
| 55 | + |
| 56 | + |
| 57 | +## Stage 3 – Syntax Generation |
| 58 | + |
| 59 | +The last stage of the process is to generate syntax for Synapse SQL. Using the AST structure generated from the source files, Synapse Pathway writes each DDL object to an individual file. The syntax generators utilize in-depth knowledge of the target platform to optimize statements. |
| 60 | + |
| 61 | +For example, a common pattern in data loading scenarios is to first delete all of the contents of a staging table and then load the data from another staging table in an INSERT/SELECT fashion. |
| 62 | +```sql |
| 63 | +DELETE staging.table1 ALL; |
| 64 | +INSERT INTO staging.table1…FROM staging.table2; |
| 65 | +``` |
| 66 | +Synapse SQL has an optimized path for this scenario: a [CREATE TABLE AS SELECT](https://docs.microsoft.com/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-develop-ctas) (CTAS) statement. CTAS is a batch-based, minimally logged operation that improves performance by using all of the available compute infrastructure. Without this insight about Synapse SQL, tools often produce a TRUNCATE followed by an INSERT/SELECT statement. |
| 67 | +```sql |
| 68 | +TRUNCATE TABLE staging.table1; |
| 69 | +INSERT INTO staging.table1…FROM staging.table2; |
| 70 | +``` |
| 71 | +This works, but it can be optimized to a DROP TABLE followed by a CTAS, which provides better performance. |
| 72 | +```sql |
| 73 | +DROP TABLE staging.table1; |
| 74 | +CREATE TABLE staging.table1 |
| 75 | +WITH |
| 76 | +( |
| 77 | + -- Derived from the original table definition |
| 78 | + DISTRIBUTION = HASH(column1), |
| 79 | + -- Derived from the original table definition |
| 80 | + CLUSTERED COLUMNSTORE INDEX |
| 81 | +) |
| 82 | +AS SELECT * FROM staging.table2; |
| 83 | +``` |
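|  | +The DROP TABLE and CTAS sequence above leaves a short window in which `staging.table1` does not exist. A possible variation, shown here only as a sketch and not as output that Synapse Pathway is known to generate, builds the replacement table first and then swaps it in with `RENAME OBJECT`. The names `table1_new` and `table1_old` are hypothetical: |
|  | +```sql |
|  | +-- Build the replacement while staging.table1 is still available to readers. |
|  | +CREATE TABLE staging.table1_new |
|  | +WITH |
|  | +( |
|  | +    -- Derived from the original table definition |
|  | +    DISTRIBUTION = HASH(column1), |
|  | +    -- Derived from the original table definition |
|  | +    CLUSTERED COLUMNSTORE INDEX |
|  | +) |
|  | +AS SELECT * FROM staging.table2; |
|  | + |
|  | +-- Swap the tables by name, then remove the old copy. |
|  | +RENAME OBJECT staging.table1 TO table1_old; |
|  | +RENAME OBJECT staging.table1_new TO table1; |
|  | +DROP TABLE staging.table1_old; |
|  | +``` |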
| 84 | +## Next steps |
| 85 | +<!-- Add a context sentence for the following links --> |
| 86 | +- [Download Azure Synapse Pathway](synapse-pathway-download.md) |
| 87 | +- [FAQ](pathway-faq.md) |
| 88 | + |
| 89 | +<!-- |
| 90 | +Remove all the comments in this template before you sign-off or merge to the |
| 91 | +main branch. |
| 92 | +--> |