Import data to Azure Cosmos DB using Azure DocumentDB Data Migration Tool

Today we will learn how to import data from our SQL Server database to Cosmos DB on Azure.
We will use "AdventureWorks2016" sample database for this article. Here we are now going to migrate our SQL Database from MS SQL Server database to Document DB on Azure using Azure DocumentDB Data Migration Tool.

Step-1 Get the database backup file ready(.Bak)

  • Now I have restored this database to my on-prem SQL server database. I did this by using restore and move the SQL query. You can do this by restore the database wizard also.
  • So, I have AdventureWorks database available in my database list and I am querying some data to be imported on the Cosmos. You can choose the data as per your business need which means in a real environment you can do the same with your real database. For this article, we have taken the sample AdventureWorks database.
  • Here we see several tables and views of AdventureWorks database, I am going to use View “vStoreWithAddresses”
  • You might have noticed that the names of the views and tables are in dot notation where first part denotes the primary entity name.
  • For example, "vSalesPerson" is prefixed by Sales Entity. And therefore, the full name is [Sales.vSalesPerson]
  • This view joins many tables and returns 712 rows on selecting.
  • Similarly, if we fetch those addresses where adresstype = ‘Main office’, we will get 701 rows. But we need to modify this query in dot notation so that we can use partition key in cosmos DB to display the denormalized data in JSON format.
USE [AdventureWorks2016] 
GO 
SELECT CAST([BusinessEntityID] as varchar) as [id] 
,[Name] as [name] ,[AddressType] as [address.addressType] 
,[AddressLine1] as [address.addressLine1] 
,[AddressLine2] as [address.addressLine2] 
,[City] as [address.location.city] 
,[StateProvinceName] as [address.location.stateProvinceName] 
,[PostalCode] as [address.postalCode] 
,[CountryRegionName] as [address.countryRegionName] 
FROM [Sales].[vStoreWithAddresses] 
WHERE [AddressType] = 'Main Office' 
GO 
  • In the above query Address Type, line1, Line2, postalCode, countryRegionName are part of the Address and therefore prefixed by address, and City and StateProvineName are part of the Location. 
  • We will import this to cosmos Db and will see how cosmos DB leverages this notation and gives us the nested format of data.
  • So as the result of the above query we get 701 filtered rows as below:
Step-2 Download the Document DB data Migration tool
  • You will get a compressed folder in your download file location. Extract it and run the dtui.exe for GUI based installation.
Step -3 Run the Document DB Data Migration Tool
  • Once you extract the files, you will see two executables files dt.exe and dtui, select dtui.exe for windows using wizard as below
  • Double click on this setup and start the configuration settings.
  • It will pop up a dialog box, click on next, and select SQL from “import from” dropdown because our data source is SQL. We enter a dot(.) as the nested separator. because we want to use it to determine the nested data. Here .(dot) will work as a delimiter for us.
  • We click next, and on the next screen, we have to enter the connection string to connect with our SQL server.
  • I am trying to connect with AdventureWorks2016 database from my local SQL server instance. You can try with any database you want to query with cosmos DB.
  • My connection string is data source=.;initial catalog=AdventureWorks2016;integrated security=true
  • Now copy-paste the select view query as below:
  • Click on Next and on the next screen, we will enter the details of our cosmos DB account.
  • Go to your cosmos DB account on the Azure portal. Search Key and copy the Primary Key and paste in the data migration tool.
  • Copy-paste this connection string in Data migration tool and remember to append this connection string with the database name. Because this string points to your cosmos DB account, not the database. So, you need to specify the database name explicitly in which you want to import the data.
  • Here I am adding my connection string with Database=mydb;
  • This DB does not exist, so Cosmos DB creates it for us. Similarly, I am entering mystore for the collection. This collection does not exist so it will be generated by cosmos DB .
  • We specify a partition key as : /address/postalCode
  • If you expand Advanced Options, you will see a Text area to enter index policy.
  • Right-click on it and you will see two options – default and Range. We will use Default for now.
  • Click Next few screens and Click on Import, the tool will start importing the result of the query to the Cosmos DB in JSON document format.
  • Now go to your Cosmos DB account on Azure Portal and refresh it. In the overview blade, you can see a new collection. 

  • Clicking on this collection, you can see our data imported in the form of documents.
  • By clicking on every document, you can see the entire JSON formatted document on the right side area, you can easily observe the structure with nested Address and Location properties and this is because we used dot limiter for our column alias in our SQL server view.
  • For example, location is further nested into city and stateprovincename. This looks similar to a normalized separate table in SQL, but here denormalized data managed in the form of nested elements, as we can see in the below example.
  • So, all set up now, we get successfully imported the SQL data to cosmos DB.
  • Now we are ready to query with SQL API.
  • Happy Learning with CosmosDB !! 😊

Comments