Simplifying Data Integration: Data Transformations with ADF to Merge Sources and Export to Parquet
Table of contents
- Step 1: Inspect the source data: the CSV file in the data lake and the table in Azure SQL DB
- Step 2: Review the overall data flow for the task, then dig into each step; add both sources, the SQL DB table and the CSV file in ADLS
- Step 3: Add a Join transformation after the sources and select the Inner join type
- Step 4: Configure the join condition on Customer ID, the field common to both sources
- Step 5: Apply a Filter transformation to keep customer IDs between 29500 and 30000
- Step 6: Point the sink at a Parquet dataset in the data lake (the join, filter, and sink logic is sketched in code after this list)
- Step 7: Integrate the data flow into a pipeline so the output is written in Parquet format to the ADLS Parquet folder
- Step 8: Run the pipeline and confirm the data transfer and transformations complete successfully
- Step 9: Verify the Parquet file in the Azure data lake (a read-back check is sketched below)
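
ADF mapping data flows execute on Spark under the hood, so the join, filter, and Parquet sink built visually in Steps 3 through 7 correspond roughly to the PySpark sketch below. This is a minimal illustration, not the data flow itself: the storage paths, server and database names, the `dbo.Customers` table, the `CustomerID` column name, the inclusive filter bounds, and the JDBC credentials are all assumptions; substitute your own.

```python
from pyspark.sql import SparkSession

# Minimal sketch of the data flow's logic; every connection detail,
# path, and column name below is a hypothetical placeholder.
spark = SparkSession.builder.appName("merge-sources-to-parquet").getOrCreate()

# Source 1: the CSV file in ADLS (Steps 1-2) -- hypothetical path.
csv_df = spark.read.csv(
    "abfss://raw@<storageaccount>.dfs.core.windows.net/customers.csv",
    header=True,
    inferSchema=True,
)

# Source 2: the table in Azure SQL DB (Steps 1-2) -- hypothetical JDBC settings.
sql_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
    .option("dbtable", "dbo.Customers")
    .option("user", "<user>")
    .option("password", "<password>")
    .load()
)

# Steps 3-4: inner join on the common Customer ID field.
joined = sql_df.join(csv_df, on="CustomerID", how="inner")

# Step 5: keep only customer IDs between 29500 and 30000
# (inclusive bounds assumed here).
filtered = joined.filter(
    (joined["CustomerID"] >= 29500) & (joined["CustomerID"] <= 30000)
)

# Steps 6-7: sink the result as Parquet in the data lake -- hypothetical folder.
filtered.write.mode("overwrite").parquet(
    "abfss://curated@<storageaccount>.dfs.core.windows.net/parquet/customers"
)
```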
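
For Step 9, beyond browsing the file in the Azure portal, one quick sanity check is to read the Parquet folder back and confirm the schema, the row count, and that the ID range matches the filter. Again, the path and column name are placeholders carried over from the sketch above.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("verify-parquet").getOrCreate()

# Hypothetical output path from the sketch above.
result = spark.read.parquet(
    "abfss://curated@<storageaccount>.dfs.core.windows.net/parquet/customers"
)

result.printSchema()
print("row count:", result.count())

# min/max should fall inside the 29500-30000 filter window.
result.selectExpr("min(CustomerID) as lo", "max(CustomerID) as hi").show()
```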