Data Factory - exception handling with files with no data rows but headers only

Today we will learn on how to apply exception handling when loading csv files into Azure SQL database that has no data rows but one header line
Scenario: Suppose you want to load data into Azure SQL incrementally that from csv files generated every N minutes into Azure data lake storage (ADLS). You want to apply exception handling to make sure when empty files comes in, data factory will not fail.

Solution: Please make sure you are generating empty files (from the source) when there is no data (i.e.: without header row). Data factory will succeed if the file is completely blank. This is out of the box data factory's functionality. In case of header row, it will fail. So tweak your source logic.


Steps after source file is generated:
  • Create a Get Metadata task in data factory
    • In the Field list, specify an argument "exists"
    • We will use this argument in the if condition to see if the file exists in the folder. 
      • This is done to prevent from error "path doesn't exist / invalid source"

  • Drag "If Condition" task and connect it to the Get Metadata task (as shown above in the first picture)
    • Specify an expression to check if the file exists (dataset path is valid)
    • Expression: @and(bool(activity('check if file exists').output.exists),bool(true))

    • In the True condition, insert your incremental load logic.
      • copy activity to copy data from source to target
  • You are done.
  • Run your Pipeline.
  • Here after successfully loading data into SQL, i am moving the file to an archive ADLS container.




Note: Above image shows that now both blank files (0 bytes) as well as files with data rows are being copied to the Archive folder after successful load in SQL.
  • In case of Blank file without header row, Data factory will succeed and proceed to the next task. No data is written in the database and data factory pipeline is successful.


No comments:

Post a Comment