Simplifying Data Integration: Data Transformations with ADF to Merge Sources and Export to Parquet
Table of contents
- Step 1: Inspect the source data: the CSV file in the data lake and the table in Azure SQL DB
- Step 2: Review the overall data flow for the task, then dig into each step; add both sources, the SQL DB table and the CSV file in ADLS
- Step 3: Add a Join transformation after the sources and select the Inner join type
- Step 4: Configure the join condition on Customer ID, the field common to both sources
- Step 5: Apply a Filter transformation to keep customer IDs between 29500 and 30000
- Step 6: Point the sink at a Parquet dataset in the data lake (the join, filter, and sink logic is sketched in code after this list)
- Step 7: Integrate the data flow into a pipeline so the output is written in Parquet format to the ADLS Parquet folder
- Step 8: Run the pipeline and confirm the data transfer and transformations complete successfully
- Step 9: Verify the Parquet file in the Azure data lake (a read-back check is sketched below)
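
ADF mapping data flows execute on Spark under the hood, so the join, filter, and Parquet sink built visually in Steps 3 through 7 correspond roughly to the PySpark sketch below. This is a minimal illustration, not the data flow itself: the storage paths, server and database names, the `dbo.Customers` table, the `CustomerID` column name, the inclusive filter bounds, and the JDBC credentials are all assumptions; substitute your own.

```python
from pyspark.sql import SparkSession

# Minimal sketch of the data flow's logic; every connection detail,
# path, and column name below is a hypothetical placeholder.
spark = SparkSession.builder.appName("merge-sources-to-parquet").getOrCreate()

# Source 1: the CSV file in ADLS (Steps 1-2) -- hypothetical path.
csv_df = spark.read.csv(
    "abfss://raw@<storageaccount>.dfs.core.windows.net/customers.csv",
    header=True,
    inferSchema=True,
)

# Source 2: the table in Azure SQL DB (Steps 1-2) -- hypothetical JDBC settings.
sql_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
    .option("dbtable", "dbo.Customers")
    .option("user", "<user>")
    .option("password", "<password>")
    .load()
)

# Steps 3-4: inner join on the common Customer ID field.
joined = sql_df.join(csv_df, on="CustomerID", how="inner")

# Step 5: keep only customer IDs between 29500 and 30000
# (inclusive bounds assumed here).
filtered = joined.filter(
    (joined["CustomerID"] >= 29500) & (joined["CustomerID"] <= 30000)
)

# Steps 6-7: sink the result as Parquet in the data lake -- hypothetical folder.
filtered.write.mode("overwrite").parquet(
    "abfss://curated@<storageaccount>.dfs.core.windows.net/parquet/customers"
)
```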
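
For Step 9, beyond browsing the file in the Azure portal, one quick sanity check is to read the Parquet folder back and confirm the schema, the row count, and that the ID range matches the filter. Again, the path and column name are placeholders carried over from the sketch above.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("verify-parquet").getOrCreate()

# Hypothetical output path from the sketch above.
result = spark.read.parquet(
    "abfss://curated@<storageaccount>.dfs.core.windows.net/parquet/customers"
)

result.printSchema()
print("row count:", result.count())

# min/max should fall inside the 29500-30000 filter window.
result.selectExpr("min(CustomerID) as lo", "max(CustomerID) as hi").show()
```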