A few months ago, I was involved in an interesting project where a large customer (not to be named due to NDA) needed to migrate their entire Azure cloud subscription to another subscription. This was a difficult and arduous process that involved several PaaS technologies, besides SQL Server, that I didn't have experience with before.
But it presented very interesting challenges and opportunities to learn new things.
One of these was the need to migrate an entire Azure Cosmos DB with Table Storage API account from one subscription to another.
Normally, migrating an Azure Cosmos DB account would be relatively straightforward. Most of our migrations involved accounts using NoSQL API and MongoDB API.
These two APIs are natively supported as Linked Service types in Azure Data Factory. So, it was fairly simple setting up pipelines for those.
However, ADF doesn't offer a native Linked Service type for an Azure Cosmos DB using the Tables API.
The closest that it offers is the simple Azure Tables Storage account. But unfortunately, it didn't work with the Cosmos DB account (despite claims from Microsoft staff that it's supposed to be compatible).
After some more research done by Daniel Kleiman (the talented), we found a little-known open-source tool developed by Microsoft staff. This tool is called "The Data Migration Desktop Tool", known as "DMT" for short.
It's a relatively bare-bones migration tool but it works well with multiple data sources and destination types, including Cosmos DB Tables API accounts (amongst others).
However, using this tool as-is out-of-the-box was not good enough. While it supports migration from one Azure Cosmos DB Tables API account to another, it's only able to do so per one specific table at a time. It's even capable of doing incremental upserts automatically, which is a huge plus.
Setting up the tool involves editing a JSON-formatted settings file where the source and destination are specified. It's usually located in the path "C:\windows-package\migrationsettings.json".
An example settings file for migrating from one CosmosDB TablesAPI account to another would look like this:
{
"Source": "AzureTableAPI",
"Sink": "AzureTableAPI",
"SourceSettings": {
"ConnectionString": "DefaultEndpointsProtocol=https;AccountName=<CosmosDBAccountName>;AccountKey=<Tables API Key>;TableEndpoint=https://<CosmosDBAccountName>.table.cosmos.azure.com:443/;",
"Table": "SourceTableName"
},
"SinkSettings": {
"ConnectionString": "DefaultEndpointsProtocol=https;AccountName=<CosmosDBAccountName>;AccountKey=<Tables API Key>;TableEndpoint=https://<CosmosDBAccountName>.table.cosmos.azure.com:443/;",
"Table": "SinkTableName"
}
}
We would need to edit this settings file and generate valid settings per each table, execute the migration, and then repeat the same thing for each table.
But we had hundreds of tables to migrate... So what could we do?
Luckily, that's exactly the sort of thing for which Powershell was specially made for!
So, using the experience I gathered in Powershell over the years, I wrote a script that does exactly that!
Executes the Get-AzCosmosDBTable cmdlet to retrieve the list of all tables in the source CosmosDB account.
Executes the same Get-AzCosmosDBTable cmdlet to retrieve the list of existing tables in the destination CosmosDB account.
For each source table that doesn't exist in the destination, run the New-AzCosmosDBTable cmdlet to create it at the destination.
Generate the migration settings for the current table and save it in the settings file using the cmdlets ConvertTo-Json (to convert an object to a JSON doc) and Out-File (to store the JSON in a file).
Run the Start-Process cmdlet to run the DMT executable file and perform the migration.
You can find the example script file in the Madeira Toolbox GitHub repository here:
Note: You would need the Powershell modules Az.Accounts and Az.CosmosDB for this purpose, but the example script provided above already includes automatically installing these modules if they don't exist already.
Conclusion
The example script, available in the Madeira Toolbox GitHub repository, serves as a practical resource for anyone facing a similar migration task. It's also a demonstration of how PowerShell can be a powerful tool for managing complex migrations and orchestrating data transfer between Azure Cosmos DB accounts.
Overall, this project not only provided valuable experience with PaaS technologies and Azure services but also highlighted the importance of exploring alternative tools and leveraging scripting capabilities to overcome limitations and simplify complex tasks. By combining the right tools and techniques, we successfully achieved the migration of the entire Azure Cosmos DB with Table Storage API account.
To learn more about DMT, check out these additional resources:
Comments