- (Exam Topic 2)
You are implementing an Azure Data Factory data flow that will use an Azure Cosmos DB (SQL API) sink to write a dataset. The data flow will use 2,000 Apache Spark partitions.
You need to ensure that the ingestion from each Spark partition is balanced to optimize throughput. Which sink setting should you configure?
Correct Answer:
C
Batch size: An integer that represents how many objects are written to the Cosmos DB collection in each batch. Usually, starting with the default batch size is sufficient. To tune this value further, note:
Azure Cosmos DB limits the size of a single request to 2 MB. The formula is Request Size = Single Document Size * Batch Size. If you hit an error saying "Request size is too large", reduce the batch size value.
The larger the batch size, the better the throughput the service can achieve, but make sure you allocate enough RUs to power your workload.
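For illustration, a minimal Python sketch of the request-size arithmetic above; the helper name and the 4 KB sample document size are hypothetical:

# A minimal sketch of the batch-size arithmetic described above.
MAX_REQUEST_BYTES = 2 * 1024 * 1024  # Cosmos DB caps a single request at 2 MB

def max_batch_size(avg_doc_size_bytes: int) -> int:
    # Request Size = Single Document Size * Batch Size, solved for Batch Size.
    return MAX_REQUEST_BYTES // avg_doc_size_bytes

print(max_batch_size(4 * 1024))  # 512 documents per batch for 4 KB documents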
Reference: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-cosmos-db
- (Exam Topic 2)
You are designing an Azure Cosmos DB Core (SQL) API solution to store data from IoT devices. Writes from the devices will occur every second.
The following is a sample of the data.
You need to select a partition key that meets the following requirements for writes:
- Minimizes partition skew
- Avoids capacity limits
- Avoids hot partitions
What should you do?
Correct Answer:
D
Use a partition key with a random suffix. One way to distribute the workload more evenly is to append a random number to the end of the partition key value. When you distribute items in this way, you can perform parallel write operations across partitions.
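A minimal sketch of the random-suffix technique, assuming the azure-cosmos Python SDK, a container partitioned on /pk, and hypothetical account, database, and container names:

import random
import uuid
from azure.cosmos import CosmosClient

client = CosmosClient("https://myaccount.documents.azure.com:443/", credential="<account-key>")
container = client.get_database_client("iot").get_container_client("telemetry")

def write_reading(device_id: str, reading: dict) -> None:
    # Append a random suffix so a chatty device's writes spread across
    # several logical partitions instead of concentrating on one.
    reading.setdefault("id", str(uuid.uuid4()))
    reading["pk"] = f"{device_id}-{random.randint(0, 99)}"
    container.create_item(reading)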
Reference: https://docs.microsoft.com/en-us/azure/cosmos-db/sql/synthetic-partition-keys
- (Exam Topic 2)
You have an Azure Cosmos DB Core (SQL) API account that is used by 10 web apps.
You need to analyze the data stored in the account by using Apache Spark to create machine learning models. The solution must NOT affect the performance of the web apps.
Which two actions should you perform? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
Correct Answer:
AD
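The referenced lab solves this with Azure Synapse Link: Spark reads from the container's analytical store, so no request units are consumed from the transactional store that serves the web apps. A minimal PySpark sketch, assuming analytical store is enabled on the container and a Synapse linked service named CosmosDbIoT exists (both hypothetical):

# Inside a Synapse notebook, where `spark` is predefined, Spark reads the
# analytical store directly, leaving the web apps' transactional RUs untouched.
df = (spark.read
      .format("cosmos.olap")
      .option("spark.synapse.linkedService", "CosmosDbIoT")
      .option("spark.cosmos.container", "telemetry")
      .load())
df.printSchema()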
Reference:
https://github.com/microsoft/MCW-Cosmos-DB-Real-Time-Advanced-Analytics/blob/main/Hands-on lab/H
- (Exam Topic 2)
You need to configure an Apache Kafka instance to ingest data from an Azure Cosmos DB Core (SQL) API account. The data from a container named telemetry must be added to a Kafka topic named iot. The solution must store the data in a compact binary format.
Which three configuration items should you include in the solution? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
Correct Answer:
CDF
C: Avro is a binary format, whereas JSON is text.
F: Kafka Connect for Azure Cosmos DB is a connector for reading data from and writing data to Azure Cosmos DB. The Azure Cosmos DB sink connector allows you to export data from Apache Kafka topics to an Azure Cosmos DB database. The connector polls data from Kafka to write to containers in the database based on the topic subscription.
D: Create the Azure Cosmos DB sink connector in Kafka Connect. The following JSON body defines the config for the sink connector.
Extract:
"connector.class": "com.azure.cosmos.kafka.connect.sink.CosmosDBSinkConnector",
"key.converter": "io.confluent.connect.avro.AvroConverter",
"connect.cosmos.containers.topicmap": "hotels#kafka"
Reference:
https://docs.microsoft.com/en-us/azure/cosmos-db/sql/kafka-connector-sink
https://www.confluent.io/blog/kafka-connect-deep-dive-converters-serialization-explained/
- (Exam Topic 2)
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Cosmos DB Core (SQL) API account named account1 that uses autoscale throughput. You need to run an Azure function when the normalized request units per second for a container in account1 exceeds a specific value.
Solution: You configure an Azure Monitor alert to trigger the function. Does this meet the goal?
Correct Answer:
A
You can set up alerts from the Azure Cosmos DB pane or the Azure Monitor service in the Azure portal.
Note: Alerts are used to set up recurring tests to monitor the availability and responsiveness of your Azure Cosmos DB resources. Alerts can send you a notification in the form of an email, or execute an Azure Function, when one of your metrics reaches the threshold or when a specific event is logged in the activity log.
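A minimal sketch of the receiving side, assuming an HTTP-triggered Azure Function (Python) that an Azure Monitor action group invokes; the field names follow the common alert schema:

import logging
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    # The action group posts the common alert schema; "essentials" carries
    # the rule name and severity of the fired alert.
    essentials = req.get_json().get("data", {}).get("essentials", {})
    logging.info("Alert %s fired (severity %s)",
                 essentials.get("alertRule"), essentials.get("severity"))
    return func.HttpResponse("received", status_code=200)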
Reference: https://docs.microsoft.com/en-us/azure/cosmos-db/create-alerts