Some tips & tricks we learnt while migrating 225 million #Azure Storage Table entities to an @AzureCosmosDB (SQL API) container using @cerebratasoft Cerulean. A thread...
1/ Co-locate your resources - make sure your #Azure Storage account, your @AzureCosmosDB account and the VM running the migration are in the same region. This not only reduces latency but also saves you egress charges.
2/ Turn off indexing on your @AzureCosmosDB container while the migration process is running. This saves some of the allocated RU/s for the writes themselves.
3/ Allocate enough throughput on your @AzureCosmosDB container while the migration process is running. This will prevent throttling errors and speed up the migration considerably.
4/ Choose "Inserts" over "Upserts" when creating documents in your @AzureCosmosDB container. I'm told that insert operations are cheaper than upsert operations.
5/ Handle "Throttling" errors. They are transient and should be retried: when you encounter one, try creating the affected documents in your @AzureCosmosDB container again.
6/ Use bulk operations. Instead of creating documents one at a time in your @AzureCosmosDB container, group them into batches of at most 100 documents and upload each batch.
7/ Parallelize what you can. Upload document batches to your @AzureCosmosDB container in parallel. We were able to migrate those 225 million #Azure Storage Table entities in less than 20 hours when batches were uploaded in parallel (the sequential process took 55 hours!).
8/ A detailed blog post on this is coming soon. More information about @cerebratasoft Cerulean can be found on our website at https://cerebrata.com .
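
In the meantime, here is a rough sketch of how tips 5/, 6/ and 7/ could fit together. It is not how Cerulean does it and it does not call the @AzureCosmosDB SDK: `upload_batch` is a hypothetical callable you would supply (it takes a list of documents and returns how many it wrote), `ThrottledError` is a hypothetical stand-in for a throttling response, and the worker count and backoff delays are arbitrary assumptions.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling ("request rate too large") error."""


def chunked(items, size=100):
    """Yield lists of at most `size` items (tip 6: batches of up to 100)."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def upload_with_retry(upload_batch, batch, max_attempts=5, base_delay=0.5):
    """Call upload_batch(batch), retrying throttled attempts with backoff (tip 5)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_batch(batch)
        except ThrottledError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Exponential backoff with jitter; the delay values are assumptions.
            time.sleep(base_delay * 2 ** (attempt - 1) * random.uniform(0.5, 1.0))


def migrate(entities, upload_batch, workers=8, batch_size=100):
    """Upload batches in parallel (tip 7); return the total reported as written."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Note: this queues every batch up front. For a very large source you
        # would want to bound the number of in-flight batches instead.
        futures = [
            pool.submit(upload_with_retry, upload_batch, batch)
            for batch in chunked(entities, batch_size)
        ]
        return sum(f.result() for f in futures)
```

Calling `migrate(entities, upload_batch)` with your own `upload_batch` would split the entities into batches of 100, push them through 8 worker threads, and retry any batch that gets throttled.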