DP-203 Dumps

DP-203 Free Practice Test

Microsoft DP-203: Data Engineering on Microsoft Azure

QUESTION 61

- (Exam Topic 3)
You are designing an Azure Databricks table. The table will ingest an average of 20 million streaming events per day.
You need to persist the events in the table for use in incremental load pipeline jobs in Azure Databricks. The solution must minimize storage costs and incremental load times.
What should you include in the solution?

Correct Answer: A
The Databricks ABS-AQS connector uses Azure Queue Storage (AQS) to provide an optimized file source that lets you find new files written to an Azure Blob storage (ABS) container without repeatedly listing all of the files.
This provides two major advantages:
- Lower latency: no need to repeatedly list nested directory structures on ABS, which is slow and resource intensive.
- Lower costs: no more costly LIST API requests made to ABS.
Reference:
https://docs.microsoft.com/en-us/azure/databricks/spark/latest/structured-streaming/aqs
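As a rough illustration, a streaming read over the ABS-AQS source in a Databricks notebook might look like the following PySpark sketch. The queue name, secret scope/key, and event schema are placeholders, not values from the question:

```python
# Minimal sketch of an ABS-AQS streaming read (Databricks notebook).
# Queue name, secret scope/key, and schema are placeholders.
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

event_schema = StructType([
    StructField("eventId", StringType()),
    StructField("eventTime", TimestampType()),
    StructField("payload", StringType()),
])

events = (
    spark.readStream
    .format("abs-aqs")                    # Databricks ABS-AQS file source
    .option("fileFormat", "json")         # format of the files landing in ABS
    .option("queueName", "ingest-queue")  # placeholder AQS queue name
    .option("connectionString",
            dbutils.secrets.get("my-scope", "storage-conn"))  # placeholder secret
    .schema(event_schema)
    .load()
)
```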

QUESTION 62

- (Exam Topic 1)
You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.
Which type of integration runtime should you use?

Correct Answer: C
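No explanation accompanies this answer in the dump. Assuming the intended choice is the self-hosted integration runtime (the runtime type designed to reach on-premises data sources), a minimal registration sketch with the azure-mgmt-datafactory SDK could look like this; every resource name below is a placeholder:

```python
# Sketch: register a self-hosted integration runtime with azure-mgmt-datafactory.
# Resource names are placeholders; this assumes the answer is a self-hosted IR.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    IntegrationRuntimeResource,
    SelfHostedIntegrationRuntime,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

ir = client.integration_runtimes.create_or_update(
    resource_group_name="rg-data",           # placeholder
    factory_name="adf-synapse-ingest",       # placeholder
    integration_runtime_name="shir-onprem",  # placeholder
    integration_runtime=IntegrationRuntimeResource(
        properties=SelfHostedIntegrationRuntime(
            description="Reaches on-premises data sources"
        )
    ),
)
print(ir.properties.type)  # "SelfHosted"
```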

QUESTION 63

- (Exam Topic 3)
You manage an enterprise data warehouse in Azure Synapse Analytics.
Users report slow performance when they run commonly used queries. Users do not report performance changes for infrequently used queries.
You need to monitor resource utilization to determine the source of the performance issues. Which metric should you monitor?

Correct Answer: C
Monitor and troubleshoot slow query performance by determining whether your workload is optimally leveraging the adaptive cache for dedicated SQL pools.
Reference:
https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-how-to-monit
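The referenced how-to article centers on the adaptive cache hit and used percentages. As a sketch of pulling those numbers programmatically with azure-monitor-query, where the resource URI and the metric names are assumptions to verify against the Metrics blade:

```python
# Sketch: read adaptive cache metrics for a dedicated SQL pool.
# The resource URI and metric names are assumptions; verify in Azure Monitor.
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient, MetricAggregationType

client = MetricsQueryClient(DefaultAzureCredential())

resource_uri = (
    "/subscriptions/<sub>/resourceGroups/rg-data/providers/"
    "Microsoft.Sql/servers/dw-server/databases/dw-pool"  # placeholder pool
)

response = client.query_resource(
    resource_uri,
    metric_names=["cache_hit_percent", "cache_used_percent"],  # assumed names
    timespan=timedelta(hours=1),
    aggregations=[MetricAggregationType.AVERAGE],
)

for metric in response.metrics:
    for point in metric.timeseries[0].data:
        print(metric.name, point.timestamp, point.average)
```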

QUESTION 64

- (Exam Topic 3)
You create an Azure Databricks cluster and specify an additional library to install. When you attempt to load the library in a notebook, the library is not found.
You need to identify the cause of the issue. What should you review?

Correct Answer: C
Cluster-scoped Init Scripts: Init scripts are shell scripts that run during the startup of each cluster node before the Spark driver or worker JVM starts. Databricks customers use init scripts for various purposes such as installing custom libraries, launching background processes, or applying enterprise security policies.
Logs for Cluster-scoped init scripts are now more consistent with Cluster Log Delivery and can be found in the same root folder as driver and executor logs for the cluster.
Reference:
https://databricks.com/blog/2018/08/30/introducing-cluster-scoped-init-scripts.html
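If cluster log delivery is configured, those init-script logs can be inspected straight from a notebook. A small sketch, where the log root is a placeholder that depends on the cluster's log delivery destination and cluster ID:

```python
# Sketch: browse cluster-scoped init script logs from a Databricks notebook.
# The log root is a placeholder; it depends on the cluster's log delivery setting.
log_root = "dbfs:/cluster-logs/<cluster-id>/init_scripts"  # placeholder path

# Each log file captures stdout/stderr from one init script run on one node.
for entry in dbutils.fs.ls(log_root):
    print(entry.path)
    print(dbutils.fs.head(entry.path, 1024))  # first 1 KB of the log
```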

QUESTION 65

- (Exam Topic 3)
You have a self-hosted integration runtime in Azure Data Factory.
The current status of the integration runtime has the following configurations:
- Status: Running
- Type: Self-Hosted
- Version: 4.4.7292.1
- Running / Registered Node(s): 1/1
- High Availability Enabled: False
- Linked Count: 0
- Queue Length: 0
- Average Queue Duration: 0.00s

The integration runtime has the following node details:
- Name: X-M
- Status: Running
- Version: 4.4.7292.1
- Available Memory: 7697MB
- CPU Utilization: 6%
- Network (In/Out): 1.21KBps/0.83KBps
- Concurrent Jobs (Running/Limit): 2/14
- Role: Dispatcher/Worker
- Credential Status: In Sync
Use the drop-down menus to select the answer choice that completes each statement based on the information presented.
NOTE: Each correct selection is worth one point.
Solution:
Box 1: fail until the node comes back online
We see: High Availability Enabled: False
Note: Enabling high availability means the self-hosted integration runtime is no longer the single point of failure in your big data solution or cloud data integration with Data Factory. Here it is disabled, so pipelines fail while the only node is offline.
Box 2: lowered
We see: Concurrent Jobs (Running/Limit): 2/14 and CPU Utilization: 6%
Note: When the processor and available RAM aren't well utilized, but the execution of concurrent jobs reaches a node's limits, scale up by increasing the number of concurrent jobs that the node can run.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/create-self-hosted-integration-runtime

Does this meet the goal?

Correct Answer: A
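As background for Box 2, the per-node concurrent jobs limit can be read and changed programmatically. A hedged azure-mgmt-datafactory sketch, with all resource names as placeholders (only the node name X-M comes from the scenario):

```python
# Sketch: inspect and adjust a self-hosted IR node's concurrent jobs limit.
# All resource names are placeholders except the node name from the scenario.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

node = client.integration_runtime_nodes.get(
    resource_group_name="rg-data",           # placeholder
    factory_name="adf-synapse-ingest",       # placeholder
    integration_runtime_name="shir-onprem",  # placeholder
    node_name="X-M",                         # node name from the scenario
)
print(node.concurrent_jobs_limit)  # 14 in the scenario above

# Lower (or raise) the per-node limit based on observed utilization.
client.integration_runtime_nodes.update(
    resource_group_name="rg-data",
    factory_name="adf-synapse-ingest",
    integration_runtime_name="shir-onprem",
    node_name="X-M",
    update_integration_runtime_node_request={"concurrent_jobs_limit": 8},
)
```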