Azure Data Factory: Upsert using Data Flows

Today we will learn how to perform an upsert in Azure Data Factory (ADF) using data flows.

Scenario:
  • We will ingest a CSV file stored in Azure Storage (ADLS Gen2) into Azure SQL using the upsert method
Steps:

1. Create two connections (linked services) in ADF (a programmatic sketch follows below):
       1. One for the CSV file stored in ADLS
       2. One for the target Azure SQL database
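
The same two linked services can also be created programmatically. Below is a minimal sketch using the azure-mgmt-datafactory Python SDK; the subscription, resource group, factory, and connection details are placeholders, not values from this walkthrough.

    # Sketch: create the two linked services (ADLS Gen2 source, Azure SQL target)
    # with the Azure SDK. All <angle-bracket> values are placeholders.
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        AzureBlobFSLinkedService,
        AzureSqlDatabaseLinkedService,
        LinkedServiceResource,
    )

    client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    # Linked service 1: the ADLS Gen2 account holding the source CSV
    adls_ls = LinkedServiceResource(
        properties=AzureBlobFSLinkedService(
            url="https://<storage-account>.dfs.core.windows.net",
            account_key="<storage-account-key>",
        )
    )
    client.linked_services.create_or_update(
        "<resource-group>", "<factory-name>", "ls_adls_source", adls_ls
    )

    # Linked service 2: the target Azure SQL database
    sql_ls = LinkedServiceResource(
        properties=AzureSqlDatabaseLinkedService(
            connection_string=(
                "Server=tcp:<server>.database.windows.net,1433;"
                "Database=<database>;User ID=<user>;Password=<password>;"
            )
        )
    )
    client.linked_services.create_or_update(
        "<resource-group>", "<factory-name>", "ls_azure_sql_target", sql_ls
    )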



Informatica Cloud - Copy Files from source to target

Today we will learn how to copy files from a source server to a target server using the Informatica Cloud File Processor connector.

Steps:

1. Create an Informatica File Processor connection to define the source and target destinations (a conceptual Python equivalent follows these steps):
       1. Source file directory: enter the source file path (e.g.: \\Source_Server_name\directory1)
       2. Target file directory: enter the target file path (e.g.: \\target_Server_name\target_dir\current)
       3. Select file: All (to select all files in the source directory)
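
Conceptually, the File Processor connection just performs a directory-to-directory copy. The Python sketch below shows the equivalent operation outside Informatica, reusing the example UNC paths from the steps above.

    # Conceptual equivalent of the File Processor copy task: copy every file
    # from the source directory to the target directory ("Select file: All").
    import shutil
    from pathlib import Path

    SOURCE_DIR = Path(r"\\Source_Server_name\directory1")
    TARGET_DIR = Path(r"\\target_Server_name\target_dir\current")

    TARGET_DIR.mkdir(parents=True, exist_ok=True)

    for entry in SOURCE_DIR.iterdir():
        if entry.is_file():
            shutil.copy2(entry, TARGET_DIR / entry.name)   # copy with metadata
            print(f"Copied {entry.name}")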

Informatica Cloud - Ingest Data from REST API into Azure SQL


Today we will learn how to fetch data from a REST API and ingest it into Azure SQL using Informatica Cloud Data Integration.

Steps to ingest data from a REST API into Azure SQL DB via Informatica (a minimal sketch of the flow follows the steps):

1. Create a test SQL Server database in Azure
  • Set up a new server for the database
  • For testing, use “Public endpoint” while setting up the Azure server for the database (this must be configured properly for production use)
  • Add your current computer’s IP address to the Azure server firewall
  • Add the Informatica server’s IP address to the Azure server firewall
    • If you don’t add the Informatica server’s IP, Informatica will throw the error “client with IP *** is not allowed to access the server”
  • Click Create to create the server
  • Once the server is created, click OK to create the database
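
For reference, the flow the Informatica mapping automates is “fetch JSON from the REST endpoint, then write rows into the Azure SQL table”. The Python sketch below shows that flow directly; the endpoint URL, table, and column names are hypothetical.

    # Minimal sketch of the REST -> Azure SQL flow. The endpoint, table,
    # and columns are hypothetical; the connection string uses placeholders.
    import pyodbc
    import requests

    API_URL = "https://api.example.com/v1/items"   # hypothetical endpoint

    CONN_STR = (
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:<server>.database.windows.net,1433;"
        "Database=<database>;Uid=<user>;Pwd=<password>;Encrypt=yes;"
    )

    records = requests.get(API_URL, timeout=30).json()

    with pyodbc.connect(CONN_STR) as conn:
        cursor = conn.cursor()
        for rec in records:
            cursor.execute(
                "INSERT INTO dbo.items (id, name) VALUES (?, ?)",
                rec["id"], rec["name"],
            )
        conn.commit()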

Azure Data Factory - Upsert using Pipeline approach instead of data flows

Today we will learn how to perform an upsert in Azure Data Factory (ADF) using the pipeline approach instead of data flows.

Task:
  • We will load data from a CSV file (stored in ADLS Gen2) into Azure SQL with upsert using Azure Data Factory. Once the data load finishes, we will move the file to an Archive directory and append a timestamp to the file name to record when the file was loaded into the database (sketches of the upsert and the archive step follow below).
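
In the pipeline approach, the upsert itself is usually pushed down to the database: a Copy activity lands the CSV rows in a staging table, and a Stored Procedure (or Script) activity runs a T-SQL MERGE into the target. Below is a sketch of that MERGE, run here via pyodbc; the staging/target tables (stg.customers, dbo.customers) and their columns are hypothetical.

    # Sketch of the upsert step: MERGE rows from a staging table (loaded by
    # the Copy activity) into the target. Table and column names are hypothetical.
    import pyodbc

    CONN_STR = (
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:<server>.database.windows.net,1433;"
        "Database=<database>;Uid=<user>;Pwd=<password>;Encrypt=yes;"
    )

    MERGE_SQL = """
    MERGE dbo.customers AS tgt
    USING stg.customers AS src
        ON tgt.id = src.id
    WHEN MATCHED THEN
        UPDATE SET tgt.name = src.name
    WHEN NOT MATCHED THEN
        INSERT (id, name) VALUES (src.id, src.name);
    """

    with pyodbc.connect(CONN_STR) as conn:
        conn.execute(MERGE_SQL)
        conn.commit()
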
Benefits of using Pipeline:
  • Triggering a data flow adds cluster start-up time (~5 minutes) to your job execution time. You can use the Time to Live (TTL) setting on your Azure integration runtime (IR) to reduce this, but even then a cluster may take around 2 minutes to start a Spark context. So, if you are building a near-real-time data load process, the pipeline approach works best: it does not need a cluster to run and can execute in seconds.
Pre-requisites:
  • An Azure Data Factory resource
  • An Azure Storage account (Gen 2)
  • An Azure SQL Database
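
The post-load archive step can be built in the pipeline with Copy + Delete activities; the sketch below shows the same move-and-timestamp operation with the azure-storage-file-datalake Python SDK. Account, container, and file names are placeholders.

    # Sketch of the archive step: move the loaded CSV into an Archive directory,
    # appending a load timestamp to the file name. All names are placeholders.
    from datetime import datetime, timezone

    from azure.storage.filedatalake import DataLakeServiceClient

    service = DataLakeServiceClient(
        account_url="https://<storage-account>.dfs.core.windows.net",
        credential="<storage-account-key>",
    )
    fs = service.get_file_system_client("<container>")

    stamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    file_client = fs.get_file_client("incoming/customers.csv")

    # rename_file expects "<filesystem>/<new path>", so the move and the
    # timestamped rename happen in one call
    file_client.rename_file(f"<container>/Archive/customers_{stamp}.csv")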

Informatica Cloud - Copy data from SQL to Azure ADLS

Today we will learn how to load data from a SQL database into Azure ADLS as a file.

  1. Purchase and enable the Microsoft Azure Data Lake Storage (ADLS) Gen 2 connector for Informatica
  2. Register your Informatica app with Azure AD
    1. This app will act as a service principal 
    2. Note down the client secret and client ID
    3. Give appropriate permissions to the Informatica app (in Azure)
      • Storage Blob Data Contributor (role on the storage account)
      • Internet access so Informatica can talk to Azure ADLS
  3. Create an Informatica connection to Azure ADLS Gen 2 (a sanity-check sketch using the same values follows this list)
    1. Enter the following information:
      • Client ID
      • Client Secret
      • Tenant ID
      • Account Name (Azure Storage account name)
      • File System Name (Container name within a storage account)
      • Directory Path: /   (leave as default)
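
As a quick sanity check that the values in step 3 are correct, the sketch below uses the same client ID, client secret, tenant ID, account name, and file system name to write a test file through the Python SDK; every <angle-bracket> value is a placeholder.

    # Sanity-check sketch: log in with the registered app (service principal)
    # and write a small test file to the container. All values are placeholders.
    from azure.identity import ClientSecretCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",
        client_id="<client-id>",
        client_secret="<client-secret>",
    )

    service = DataLakeServiceClient(
        account_url="https://<account-name>.dfs.core.windows.net",
        credential=credential,
    )

    fs = service.get_file_system_client("<file-system-name>")   # container name
    file_client = fs.get_file_client("connection_test.txt")     # under "/" root
    file_client.upload_data(b"hello from the service principal", overwrite=True)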