
Cosmos DB Data Migration Between Containers using Databricks and PySpark

Updated: Nov 18, 2021


What is Azure Cosmos DB?

Azure Cosmos DB is a fully managed, fast, and cost-effective NoSQL database with multi-write data distribution to any Azure region. The cost of every database operation in Cosmos DB is measured in Request Units. A Request Unit (RU) is an abstraction of system resources such as CPU, IO, and memory. When you create a Cosmos DB database, you can provision throughput (RUs) at the database level or for each container (the data organization units, similar to what we call tables in an RDBMS). If RUs are provisioned at the database level, they are shared among all of the database's containers. Once you provision throughput, it doesn't matter whether you actually run fewer queries: you pay on an hourly basis, in increments of 100 RU/s, 24 hours a day, 7 days a week.
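
As a minimal sketch of what provisioning at both levels looks like with the azure-cosmos Python SDK (the account endpoint, key, database, container, and partition key below are placeholders, not values from this article):

```python
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient(
    "https://<your-account>.documents.azure.com:443/",
    credential="<your-key>",
)

# Shared throughput: RUs provisioned on the database are shared by all its containers.
shared_db = client.create_database_if_not_exists("SharedDb", offer_throughput=400)

# Dedicated throughput: RUs provisioned directly on a single container.
dedicated_db = client.create_database_if_not_exists("DedicatedDb")
orders = dedicated_db.create_container_if_not_exists(
    id="Orders",
    partition_key=PartitionKey(path="/customerId"),
    offer_throughput=400,  # billed hourly in 100 RU/s increments, used or not
)
```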


Suppose your container usage grows and your read and write operations consume more RUs than allocated. In that case, your operations will either be throttled (if you use manual throughput) or your throughput will be scaled up automatically (if you use autoscale). However, if you want to avoid unexpected bills from Azure, it's better to skip the autoscale option and use Azure Functions to change the throughput based on activity, time of day, or day of the week.
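
For illustration only (this is not the exact function from this article), a scheduled job or a timer-triggered Azure Function could adjust a container's manual throughput with the same SDK; the names and RU values are placeholders:

```python
from azure.cosmos import CosmosClient

client = CosmosClient(
    "https://<your-account>.documents.azure.com:443/",
    credential="<your-key>",
)
container = client.get_database_client("DedicatedDb").get_container_client("Orders")

current = container.get_throughput()            # read the current offer
print("Current RU/s:", current.offer_throughput)

container.replace_throughput(400)               # e.g. scale down for the night
```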


Note that there is no way to switch from database-level (shared) throughput to container-dedicated throughput once a container has been created.

You will need to create a new container, provision it with dedicated throughput, and migrate your data from the old container to the new one.


You will face the same issue if you want to change a container's partition key, the field that divides the data into logical subsets and distributes it across physical partitions. Here you can read more on partitioning in Cosmos DB. You will also need to create a new container and transfer the data if you need to change the partition key.
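
To give a rough idea of what such a migration looks like on Databricks, here is a hedged sketch that copies one container into another with the Azure Cosmos DB Spark 3 OLTP connector and PySpark. The endpoint, key, and container names are placeholders, and the full walkthrough lives in the article linked below:

```python
# Runs in a Databricks notebook where `spark` is already defined and the
# com.azure.cosmos.spark connector library is installed on the cluster.
read_cfg = {
    "spark.cosmos.accountEndpoint": "https://<your-account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<your-key>",
    "spark.cosmos.database": "DedicatedDb",
    "spark.cosmos.container": "OrdersOld",
    "spark.cosmos.read.inferSchema.enabled": "true",
}

write_cfg = {
    "spark.cosmos.accountEndpoint": "https://<your-account>.documents.azure.com:443/",
    "spark.cosmos.accountKey": "<your-key>",
    "spark.cosmos.database": "DedicatedDb",
    "spark.cosmos.container": "OrdersNew",   # new container with the new partition key / throughput
    "spark.cosmos.write.strategy": "ItemOverwrite",
}

# Read everything from the old container and write it into the new one.
df = spark.read.format("cosmos.oltp").options(**read_cfg).load()
df.write.format("cosmos.oltp").options(**write_cfg).mode("append").save()
```

The ItemOverwrite write strategy upserts documents by id, so the copy can be re-run safely if it is interrupted.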


What is the fastest and easiest way to migrate container data? Read my article on mssqltips.com to find out.
