[Feb-2022] Verified DP-200 dumps Q&As - DP-200 dumps with Correct Answers
The Best Azure Data Engineer Associate Study Guide for the DP-200 Exam
Microsoft Implementing an Azure Data Solution Exam Certification Details:
| Duration | 120 mins |
| Exam Price | $165 (USD) |
| Sample Questions | Microsoft Implementing an Azure Data Solution Sample Questions |
| Number of Questions | 40-60 |
| Books / Training | Course DP-200T01-A: Implementing an Azure Data Solution |
| Exam Code | DP-200 |
| Exam Name | Microsoft Certified - Azure Data Engineer Associate |
| Passing Score | 700 / 1000 |
| Schedule Exam | Pearson VUE |
NEW QUESTION 53
You need to provision the polling data storage account.
How should you configure the storage account? To answer, drag the appropriate Configuration Value to the correct Setting. Each Configuration Value may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Account type: StorageV2
You must create new storage accounts as type StorageV2 (general-purpose V2) to take advantage of Data Lake Storage Gen2 features.
Scenario: Polling data is stored in one of the two locations:
* An on-premises Microsoft SQL Server 2019 database named PollingData
* Azure Data Lake Gen 2
Data in Data Lake is queried by using PolyBase
Replication type: RA-GRS
Scenario: All services and processes must be resilient to a regional Azure outage.
Geo-redundant storage (GRS) is designed to provide at least 99.99999999999999% (16 9's) durability of objects over a given year by replicating your data to a secondary region that is hundreds of miles away from the primary region. If your storage account has GRS enabled, then your data is durable even in the case of a complete regional outage or a disaster in which the primary region isn't recoverable.
If you opt for GRS, you have two related options to choose from:
* GRS replicates your data to another data center in a secondary region, but that data is available to be read only if Microsoft initiates a failover from the primary to secondary region.
* Read-access geo-redundant storage (RA-GRS) is based on GRS. RA-GRS replicates your data to another data center in a secondary region, and also provides you with the option to read from the secondary region. With RA-GRS, you can read from the secondary region regardless of whether Microsoft initiates a failover from the primary to secondary region.
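The two settings above can be sketched as an ARM-style resource definition built in Python. This is a minimal illustration, not a full deployment: the account name `pollingdata001`, the location, and the helper names are hypothetical, `isHnsEnabled` is included on the assumption that Data Lake Storage Gen2 is wanted, and `apiVersion` is left out.

```python
import json


def polling_storage_account(name: str, location: str) -> dict:
    """Build an ARM-style resource definition for the polling data account."""
    return {
        "type": "Microsoft.Storage/storageAccounts",
        "name": name,  # hypothetical account name supplied by the caller
        "location": location,
        "kind": "StorageV2",  # general-purpose v2 account type
        "sku": {"name": "Standard_RAGRS"},  # read-access geo-redundant storage
        "properties": {"isHnsEnabled": True},  # hierarchical namespace for Data Lake Gen2
    }


def secondary_blob_endpoint(name: str) -> str:
    """RA-GRS exposes a read-only secondary endpoint with a '-secondary' suffix."""
    return f"https://{name}-secondary.blob.core.windows.net"


account = polling_storage_account("pollingdata001", "eastus2")
print(json.dumps(account, indent=2))
print(secondary_blob_endpoint(account["name"]))
```

With plain GRS (`Standard_GRS`) the same definition would apply, but the secondary endpoint would not be readable until a failover occurs.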
References:
https://docs.microsoft.com/bs-cyrl-ba/azure/storage/blobs/data-lake-storage-quickstart-create-account
https://docs.microsoft.com/en-us/azure/storage/common/storage-redundancy-grs
NEW QUESTION 54
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure SQL database named DB1 that contains a table named Table1. Table1 has a field named Customer_ID that is varchar(22).
You need to implement masking for the Customer_ID field to meet the following requirements:
* The first two prefix characters must be exposed.
* The last four suffix characters must be exposed.
* All other characters must be masked.
Solution: You implement data masking and use a custom string function mask.
Does this meet the goal?
- A. Yes
- B. No
Answer: B
Explanation:
Must use Custom Text data masking, which exposes the first and last characters and adds a custom padding string in the middle.
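In T-SQL, the Custom Text mask corresponds to the `partial(prefix, padding, suffix)` masking function. The sketch below emulates its effect in Python for the stated requirements (two prefix characters, four suffix characters). The helper name `partial_mask` and the handling of short values are assumptions for illustration; the DDL string uses the table and column from the question.

```python
# T-SQL that applies the mask to the column from the question.
MASK_DDL = (
    "ALTER TABLE Table1 ALTER COLUMN Customer_ID "
    "ADD MASKED WITH (FUNCTION = 'partial(2,\"XXXX\",4)');"
)


def partial_mask(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Expose the first `prefix` and last `suffix` characters; replace the middle with `padding`."""
    if len(value) <= prefix + suffix:
        # Assumption for this sketch: values too short to mask safely show only the padding.
        return padding
    return value[:prefix] + padding + value[len(value) - suffix:]


print(MASK_DDL)
print(partial_mask("AB1234567890WXYZ", 2, "XXXX", 4))  # prints ABXXXXWXYZ
```

Note that the padding is a fixed string, so the masked value does not reveal the length of the original.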
Reference:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-dynamic-data-masking-get-started
NEW QUESTION 55
You have an Azure SQL data warehouse.
Using PolyBase, you create a table named [Ext].[Items] to query Parquet files stored in Azure Data Lake Storage Gen2 without importing the data to the data warehouse.
The external table has three columns.
You discover that the Parquet files have a fourth column named ItemID.
Which command should you run to add the ItemID column to the external table?
- A. Option A
- B. Option B
- C. Option D
- D. Option C
Answer: A
Explanation:
References:
https://docs.microsoft.com/en-us/sql/t-sql/statements/create-external-table-transact-sql
NEW QUESTION 56
You manage the Microsoft Azure Databricks environment for a company. You must be able to access a private Azure Blob Storage account. Data must be available to all Azure Databricks workspaces. You need to provide the data access.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
Step 1: Create a secret scope
Step 2: Add secrets to the scope
Note: dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>") gets the key that has been stored as a secret in a secret scope.
Step 3: Mount the Azure Blob Storage container
You can mount a Blob Storage container or a folder inside a container through Databricks File System - DBFS. The mount is a pointer to a Blob Storage container, so the data is never synced locally.
Note: To mount a Blob Storage container or a folder inside a container, use the following command:
Python
dbutils.fs.mount(
  source = "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net",
  mount_point = "/mnt/<mount-name>",
  extra_configs = {"<conf-key>": dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>")})
where:
dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>") gets the key that has been stored as a secret in a secret scope.
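As a rough sketch of how the placeholders fit together, the helper below assembles the mount arguments outside a notebook so they can be inspected. The function name `build_mount_args` and the sample container, account, and mount names are hypothetical. The sketch assumes access by storage account key, in which case the configuration key takes the form `fs.azure.account.key.<your-storage-account-name>.blob.core.windows.net`.

```python
def build_mount_args(container: str, account: str, mount_name: str, account_key: str) -> dict:
    """Assemble keyword arguments for dbutils.fs.mount using an account access key."""
    host = f"{account}.blob.core.windows.net"
    return {
        "source": f"wasbs://{container}@{host}",
        "mount_point": f"/mnt/{mount_name}",
        "extra_configs": {f"fs.azure.account.key.{host}": account_key},
    }


# Inside a Databricks notebook the key would come from the secret scope, for example:
#   key = dbutils.secrets.get(scope="<scope-name>", key="<key-name>")
#   dbutils.fs.mount(**build_mount_args("polls", "mystorageacct", "polls", key))
args = build_mount_args("polls", "mystorageacct", "polls", "<key-from-secret-scope>")
print(args["source"])
print(args["mount_point"])
```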
References:
https://docs.databricks.com/spark/latest/data-sources/azure/azure-storage.html
NEW QUESTION 57
Your company uses Azure SQL Database and Azure Blob storage.
All data at rest must be encrypted by using the company's own key. The solution must minimize administrative effort and the impact to applications which use the database.
You need to configure security.
What should you implement? To answer, select the appropriate option in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
References:
https://docs.microsoft.com/en-us/azure/sql-database/transparent-data-encryption-azure-sql
https://docs.microsoft.com/en-us/azure/storage/common/storage-service-encryption
NEW QUESTION 58
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have a container named Sales in an Azure Cosmos DB database. Sales has 120 GB of data. Each entry in Sales has the following structure.
The partition key is set to the OrderId attribute.
Users report that when they perform queries that retrieve data by ProductName, the queries take longer than expected to complete.
You need to reduce the amount of time it takes to execute the problematic queries.
Solution: You change the partition key to include ProductName.
Does this meet the goal?
- A. Yes
- B. No
Answer: B
Explanation:
One option is to have a lookup collection "ProductName" for the mapping of "ProductName" to "OrderId".
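The lookup-collection idea can be sketched with two in-memory dictionaries standing in for the two containers. The document fields other than `OrderId` and `ProductName`, and all function names, are hypothetical.

```python
from collections import defaultdict

# Stand-in for the Sales container, partitioned by OrderId.
sales = {}
# Stand-in for the lookup collection, keyed by ProductName.
product_lookup = defaultdict(set)


def add_order(order: dict) -> None:
    """Write the order and keep the ProductName -> OrderId mapping in sync."""
    sales[order["OrderId"]] = order
    product_lookup[order["ProductName"]].add(order["OrderId"])


def orders_for_product(product_name: str) -> list:
    """Resolve OrderIds from the lookup first, then do point reads by partition key."""
    order_ids = sorted(product_lookup.get(product_name, set()))
    return [sales[order_id] for order_id in order_ids]


add_order({"OrderId": 1, "ProductName": "Widget", "Quantity": 3})
add_order({"OrderId": 2, "ProductName": "Gadget", "Quantity": 1})
add_order({"OrderId": 3, "ProductName": "Widget", "Quantity": 5})
print([o["OrderId"] for o in orders_for_product("Widget")])
```

The query by ProductName then touches only the partitions that hold matching orders instead of fanning out across all of them.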
References:
https://azure.microsoft.com/sv-se/blog/azure-cosmos-db-partitioning-design-patterns-part-1/
NEW QUESTION 59
A company uses Microsoft Azure SQL Database to store sensitive company data. You encrypt the data and only allow access to specified users from specified locations.
You must monitor data usage and data copied from the system to prevent data leakage.
You need to configure Azure SQL Database to email a specific user when data leakage occurs.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
NEW QUESTION 60
You need to mask tier 1 data. Which functions should you use? To answer, select the appropriate option in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
A: Default
Full masking according to the data types of the designated fields.
For string data types, use XXXX or fewer Xs if the size of the field is less than 4 characters (char, nchar, varchar, nvarchar, text, ntext).
B: email
C: Custom text
Custom StringMasking method which exposes the first and last letters and adds a custom padding string in the middle: prefix,[padding],suffix
Tier 1 Database must implement data masking using the following masking logic:
References:
https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking
NEW QUESTION 61
You need to process and query ingested Tier 9 data.
Which two options should you use? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
- A. Azure Stream Analytics
- B. Azure Notification Hub
- C. Apache Kafka statements
- D. Azure Cache for Redis
- E. Transact-SQL statements
- F. Azure Event Grid
Answer: A,F
Explanation:
Event Hubs provides a Kafka endpoint that can be used by your existing Kafka based applications as an alternative to running your own Kafka cluster.
You can stream data into Kafka-enabled Event Hubs and process it with Azure Stream Analytics, in the following steps:
* Create a Kafka enabled Event Hubs namespace.
* Create a Kafka client that sends messages to the event hub.
* Create a Stream Analytics job that copies data from the event hub into an Azure blob storage.
Scenario:
Tier 9 reporting must be moved to Event Hubs, queried, and persisted in the same Azure region as the company's main office.
References:
https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-kafka-stream-analytics
NEW QUESTION 62
You are designing a new Lambda architecture on Microsoft Azure.
The real-time processing layer must meet the following requirements:
Ingestion:
* Receive millions of events per second
* Act as a fully managed Platform-as-a-Service (PaaS) solution
* Integrate with Azure Functions
Stream processing:
* Process on a per-job basis
* Provide seamless connectivity with Azure services
* Use a SQL-based query language
Analytical data store:
* Act as a managed service
* Use a document store
* Provide data encryption at rest
You need to identify the correct technologies to build the Lambda architecture using minimal effort. Which technologies should you use? To answer, select the appropriate options in the answer area.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Box 1: Azure Event Hubs
This portion of a streaming architecture is often referred to as stream buffering. Options include Azure Event Hubs, Azure IoT Hub, and Kafka.
NEW QUESTION 63
You need to develop a pipeline for processing data. The pipeline must meet the following requirements.
* Scale up and down resources for cost reduction.
* Use an in-memory data processing engine to speed up ETL and machine learning operations.
* Use streaming capabilities.
* Provide the ability to code in SQL, Python, Scala, and R.
* Integrate workspace collaboration with Git.
What should you use?
- A. Azure Stream Analytics
- B. Azure SQL Data Warehouse
- C. HDInsight Spark Cluster
- D. HDInsight Hadoop Cluster
Answer: C
NEW QUESTION 64
You implement an Azure SQL Data Warehouse instance.
You plan to migrate the largest fact table to Azure SQL Data Warehouse. The table resides on Microsoft SQL Server on-premises and is 10 terabytes (TB) in size.
Incoming queries use the primary key Sale Key column to retrieve data as displayed in the following table:
You need to distribute the large fact table across multiple nodes to optimize performance of the table.
Which technology should you use?
- A. round robin distributed table with clustered index
- B. hash distributed table with clustered index
- C. round robin distributed table with clustered ColumnStore index
- D. heap table with distribution replicate
- E. hash distributed table with clustered ColumnStore index
Answer: E
Explanation:
Hash-distributed tables improve query performance on large fact tables.
Columnstore indexes can achieve up to 100x better performance on analytics and data warehousing workloads and up to 10x better data compression than traditional rowstore indexes.
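To illustrate why hash distribution suits lookups on a key column, the sketch below contrasts it with round-robin placement across the 60 distributions that Azure SQL Data Warehouse uses. The hashing here uses SHA-256 purely for illustration and is not the service's actual hash function. The table name `dbo.FactSale`, the `Amount` column, and the helper names are hypothetical.

```python
import hashlib
from itertools import count

DISTRIBUTIONS = 60  # Azure SQL Data Warehouse spreads each table over 60 distributions

# Hypothetical DDL for the fact table: hash-distributed on the key, clustered columnstore index.
FACT_SALE_DDL = """
CREATE TABLE dbo.FactSale
(
    SaleKey BIGINT NOT NULL,
    Amount DECIMAL(18, 2) NOT NULL
)
WITH (DISTRIBUTION = HASH(SaleKey), CLUSTERED COLUMNSTORE INDEX);
"""


def hash_distribution(key) -> int:
    """The same key always maps to the same distribution (illustrative hash only)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % DISTRIBUTIONS


_next_row = count()


def round_robin_distribution() -> int:
    """Rows are assigned in turn, so a key's location cannot be derived from its value."""
    return next(_next_row) % DISTRIBUTIONS


print(hash_distribution(1001) == hash_distribution(1001))
print(round_robin_distribution(), round_robin_distribution())
```

Because a given Sale Key always lands in one known distribution, a query filtering on that key does not need to scan every distribution.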
References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-query-performance
NEW QUESTION 65
Your company uses Microsoft Azure SQL Database configured with an elastic pool. You use Elastic Database jobs to run queries across all databases in the pool.
You need to analyze, troubleshoot, and report on components responsible for running Elastic Database jobs.
You need to determine the component responsible for running job service tasks.
Which components should you use for each Elastic pool job services task? To answer, drag the appropriate component to the correct task. Each component may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-job-automation-overview
NEW QUESTION 66
You are developing a solution that will stream to Azure Stream Analytics. The solution will have both streaming data and reference data.
Which input type should you use for the reference data?
- A. Azure IoT Hub
- B. Azure Event Hubs
- C. Azure Blob storage
- D. Azure Cosmos DB
Answer: C
Explanation:
Stream Analytics supports Azure Blob storage and Azure SQL Database as the storage layer for Reference Data.
References:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-use-reference-data
NEW QUESTION 67
Your company uses Microsoft Azure SQL Database configured with an elastic pool. You use Elastic Database jobs to run queries across all databases in the pool.
You need to analyze, troubleshoot, and report on components responsible for running Elastic Database jobs.
You need to determine the component responsible for running job service tasks.
Which components should you use for each Elastic pool job services task? To answer, drag the appropriate component to the correct task. Each component may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.
NOTE: Each correct selection is worth one point.
Answer:
Explanation:
Execution results and diagnostics: Azure Storage
Job launcher and tracker: Job Service
Job metadata and state: Control database
The Job database is used for defining jobs and tracking the status and history of job executions. The Job database is also used to store agent metadata, logs, results, job definitions, and also contains many useful stored procedures, and other database objects, for creating, running, and managing jobs using T-SQL.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-job-automation-overview
NEW QUESTION 68
You implement an Azure SQL Data Warehouse instance.
You plan to migrate the largest fact table to Azure SQL Data Warehouse. The table resides on Microsoft SQL Server on-premises and is 10 terabytes (TB) in size.
Incoming queries use the primary key Sale Key column to retrieve data as displayed in the following table:
You need to distribute the large fact table across multiple nodes to optimize performance of the table.
Which technology should you use?
- A. round robin distributed table with clustered index
- B. hash distributed table with clustered index
- C. round robin distributed table with clustered ColumnStore index
- D. heap table with distribution replicate
- E. hash distributed table with clustered ColumnStore index
Answer: E
Explanation:
Hash-distributed tables improve query performance on large fact tables.
Columnstore indexes can achieve up to 100x better performance on analytics and data warehousing workloads and up to 10x better data compression than traditional rowstore indexes.
References:
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
https://docs.microsoft.com/en-us/sql/relational-databases/indexes/columnstore-indexes-query-performance
NEW QUESTION 69
You are monitoring an Azure Stream Analytics job.
You discover that the Backlogged Input Events metric is increasing slowly and is consistently non-zero.
You need to ensure that the job can handle all the events.
What should you do?
- A. Increase the number of streaming units (SUs).
- B. Change the compatibility level of the Stream Analytics job.
- C. Remove any named consumer groups from the connection and use $default.
- D. Create an additional output stream for the existing input stream.
Answer: A
Explanation:
Backlogged Input Events: Number of input events that are backlogged. A non-zero value for this metric implies that your job isn't able to keep up with the number of incoming events. If this value is slowly increasing or consistently non-zero, you should scale out your job. You should increase the Streaming Units.
Note: Streaming Units (SUs) represent the computing resources that are allocated to execute a Stream Analytics job. The higher the number of SUs, the more CPU and memory resources are allocated for your job.
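The rule of thumb above can be expressed as a small check over recent samples of the Backlogged Input Events metric. This is a simplified sketch: the function name, the sample format, and the exact definition of "slowly increasing" (never decreasing and ending higher than it started) are assumptions, not part of the Stream Analytics API.

```python
from typing import Sequence


def should_scale_out(backlog_samples: Sequence[int]) -> bool:
    """Return True when the backlog is consistently non-zero or steadily growing."""
    if not backlog_samples:
        return False
    consistently_non_zero = all(sample > 0 for sample in backlog_samples)
    pairs = list(zip(backlog_samples, backlog_samples[1:]))
    slowly_increasing = (
        len(pairs) > 0
        and all(later >= earlier for earlier, later in pairs)
        and backlog_samples[-1] > backlog_samples[0]
    )
    return consistently_non_zero or slowly_increasing


print(should_scale_out([0, 0, 0]))   # idle job, no action
print(should_scale_out([5, 7, 9]))   # backlog growing, add streaming units
```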
Reference:
https://docs.microsoft.com/bs-cyrl-ba/azure/stream-analytics/stream-analytics-monitoring
NEW QUESTION 70
You have an Azure SQL database named DB1 in the East US 2 region.
You need to build a secondary geo-replicated copy of DB1 in the West US region on a new server.
Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-active-geo-replication-portal
NEW QUESTION 71
Which counter should you monitor for real-time processing to meet the technical requirements?
- A. Data Conversion Errors
- B. CPU % utilization
- C. SU% Utilization
- D. Concurrent users
Answer: C
Explanation:
Scenario:
* Real-time processing must be monitored to ensure that workloads are sized properly based on actual usage patterns.
* The sales data including the documents in JSON format, must be gathered as it arrives and analyzed online by using Azure Stream Analytics.
Streaming Units (SUs) represent the computing resources that are allocated to execute a Stream Analytics job. The higher the number of SUs, the more CPU and memory resources are allocated for your job. This capacity lets you focus on the query logic and abstracts the need to manage the hardware to run your Stream Analytics job in a timely manner.
References:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-streaming-unit-consumption
Manage and troubleshoot Azure data solutions Question Set 1
NEW QUESTION 72
Use the following login credentials as needed:
Azure Username: xxxxx
Azure Password: xxxxx
The following information is for technical support purposes only:
Lab Instance: 10543936
You need to create an elastic pool that contains an Azure SQL database named db2 and a new SQL database named db3.
To complete this task, sign in to the Azure portal.
Answer:
Explanation:
See the explanation below.
Step 1: Create a new SQL database named db3
1. Select SQL in the left-hand menu of the Azure portal. If SQL is not in the list, select All services, then type SQL in the search box.
2. Select + Add to open the Select SQL deployment option page. Select Single Database. You can view additional information about the different databases by selecting Show details on the Databases tile.
3. Select Create:
4. Enter the required fields if necessary.
5. Leave the rest of the values as default and select Review + Create at the bottom of the form.
6. Review the final settings and select Create. Use db3 as the database name.
On the SQL Database form, select Create to deploy and provision the resource group, server, and database.
Step 2: Create your elastic pool using the Azure portal.
1. Select Azure SQL in the left-hand menu of the Azure portal. If Azure SQL is not in the list, select All services, then type Azure SQL in the search box.
2. Select + Add to open the Select SQL deployment option page.
3. Select Elastic pool from the Resource type drop-down in the SQL Databases tile. Select Create to create your elastic pool.
4. Configure your elastic pool with the following values:
Name: Provide a unique name for your elastic pool, such as myElasticPool.
Subscription: Select your subscription from the drop-down.
Resource group: Select the resource group.
Server: Select the server.
5. Select Configure elastic pool.
6. On the Configure page, select the Databases tab, and then choose to Add database.
7. Add the Azure SQL database named db2, and the new SQL database named db3 that you created in Step 1.
8. Select Review + create to review your elastic pool settings and then select Create to create your elastic pool.
Reference:
https://docs.microsoft.com/bs-latn-ba/azure/sql-database/sql-database-elastic-pool-failover-group-tutorial
NEW QUESTION 73
The data engineering team manages Azure HDInsight clusters. The team spends a large amount of time creating and destroying clusters daily because most of the data pipeline process runs in minutes.
You need to implement a solution that deploys multiple HDInsight clusters with minimal effort.
What should you implement?
- A. Azure Databricks
- B. Azure Traffic Manager
- C. Ambari web user interface
- D. Azure Resource Manager templates
Answer: D
Explanation:
A Resource Manager template makes it easy to create the following resources for your application in a single, coordinated operation:
* HDInsight clusters and their dependent resources (such as the default storage account).
* Other resources (such as Azure SQL Database to use Apache Sqoop).
In the template, you define the resources that are needed for the application. You also specify deployment parameters to input values for different environments. The template consists of JSON and expressions that you use to construct values for your deployment.
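A skeleton of such a template, together with per-environment parameter files, might look like the following. This is a deliberately incomplete sketch: a real HDInsight cluster resource also needs storage, compute profile, and credential settings, which are omitted. The parameter names, cluster names, helper names, and the `apiVersion` value are assumptions for illustration.

```python
import json


def hdinsight_template() -> dict:
    """Skeleton ARM template with one parameterized HDInsight cluster resource."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "clusterName": {"type": "string"},
            "clusterType": {"type": "string", "defaultValue": "spark"},
        },
        "resources": [
            {
                "type": "Microsoft.HDInsight/clusters",
                "apiVersion": "2018-06-01-preview",  # assumed version for this sketch
                "name": "[parameters('clusterName')]",
                "location": "[resourceGroup().location]",
                # Required storage, compute, and credential properties are omitted here.
                "properties": {"clusterDefinition": {"kind": "[parameters('clusterType')]"}},
            }
        ],
    }


def parameter_file(cluster_name: str, cluster_type: str) -> dict:
    """Deployment parameters: one file per cluster reuses the same template."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {
            "clusterName": {"value": cluster_name},
            "clusterType": {"value": cluster_type},
        },
    }


template = hdinsight_template()
daily_clusters = [parameter_file(f"etl-cluster-{i}", "spark") for i in range(1, 4)]
print(json.dumps(template["parameters"], indent=2))
print([p["parameters"]["clusterName"]["value"] for p in daily_clusters])
```

The same template is deployed repeatedly with different parameter values, which is what removes the manual effort of creating each cluster by hand.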
References:
https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-create-linux-clusters-arm-templates
NEW QUESTION 74
You have an Azure subscription that contains an Azure Databricks environment and an Azure Storage account.
You need to implement secure communication between Databricks and the storage account.
You create an Azure key vault.
Which four actions should you perform in sequence? To answer, move the actions from the list of actions to the answer area and arrange them in the correct order.
Answer:
Explanation:
Step 1: Mount the storage account
Step 2: Retrieve an access key from the storage account.
Step 3: Add a secret to the key vault.
Step 4: Add a secret scope to the Databricks environment.
Managing secrets begins with creating a secret scope.
To reference secrets stored in an Azure Key Vault, you can create a secret scope backed by Azure Key Vault.
References:
https://docs.microsoft.com/en-us/azure/azure-databricks/store-secrets-azure-key-vault
NEW QUESTION 75
......
DP-200 certification guide Q&A from Training Expert Prep4sureExam: https://www.prep4sureexam.com/DP-200-dumps-torrent.html