Associate-Data-Practitioner: Google Cloud Associate Data Practitioner (ADP Exam) Exam Questions

Question # 4

Your organization’s business analysts require near real-time access to streaming data. However, they are reporting that their dashboard queries are loading slowly. After investigating BigQuery query performance, you discover the slow dashboard queries perform several joins and aggregations.

You need to improve the dashboard loading time and ensure that the dashboard data is as up-to-date as possible. What should you do?

Disable BigQuery query result caching.

Modify the schema to use parameterized data types.

Create a scheduled query to calculate and store intermediate results.

Create materialized views.

Question # 5

You are designing an application that will interact with several BigQuery datasets. You need to grant the application’s service account permissions that allow it to query and update tables within the datasets, and list all datasets in a project within your application. You want to follow the principle of least privilege. Which pre-defined IAM role(s) should you apply to the service account?

roles/bigquery.jobUser and roles/bigquery.dataOwner

roles/bigquery.connectionUser and roles/bigquery.dataViewer

roles/bigquery.admin

roles/bigquery.user and roles/bigquery.filteredDataViewer

Answer: A. roles/bigquery.jobUser and roles/bigquery.dataOwner

Explanation:

roles/bigquery.jobUser:

This role allows a user or service account to run BigQuery jobs, including queries. This is necessary for the application to interact with and query the tables.

From Google Cloud documentation: "BigQuery Job User can run BigQuery jobs, including queries, load jobs, export jobs, and copy jobs."

roles/bigquery.dataOwner:

This role grants full control over BigQuery datasets and tables. It allows the service account to update tables, which is a requirement of the application.

From Google Cloud documentation: "BigQuery Data Owner can create, delete, and modify BigQuery datasets and tables. BigQuery Data Owner can also view data and run queries."

Why other options are incorrect:

B. roles/bigquery.connectionUser and roles/bigquery.dataViewer:

roles/bigquery.connectionUser is used for external connections, which is not required for this task. roles/bigquery.dataViewer only allows viewing data, not updating it.

C. roles/bigquery.admin:

roles/bigquery.admin grants excessive permissions. Following the principle of least privilege, this role is too broad.

D. roles/bigquery.user and roles/bigquery.filteredDataViewer:

roles/bigquery.user grants the ability to run queries, but not the ability to modify data. roles/bigquery.filteredDataViewer only provides permission to view filtered data, which is not sufficient for updating tables.

Principle of Least Privilege:

The principle of least privilege is a security concept that states that a user or service account should be granted only the permissions necessary to perform its intended tasks.

By assigning roles/bigquery.jobUser and roles/bigquery.dataOwner, we provide the application with the exact permissions it needs without granting unnecessary access.
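As a rough illustration of the assignment described above, the following Python sketch builds the IAM policy bindings that would attach the two roles to a service account. The service account email and the helper name build_bindings are hypothetical and chosen only for this example; a real policy also carries fields such as an etag and is applied through the Google Cloud console, the gcloud CLI, or the IAM APIs, none of which is shown here.

```python
import json


def build_bindings(service_account_email, roles):
    """Return an IAM policy fragment with one binding per role.

    Hypothetical helper for illustration; it only builds the data
    structure and does not call any Google Cloud API.
    """
    member = f"serviceAccount:{service_account_email}"
    return {"bindings": [{"role": role, "members": [member]} for role in roles]}


# Assumed (made-up) service account for the application.
policy = build_bindings(
    "my-app@my-project.iam.gserviceaccount.com",
    ["roles/bigquery.jobUser", "roles/bigquery.dataOwner"],
)

print(json.dumps(policy, indent=2))
```

Keeping the role list explicit makes it easy to review exactly which roles the application receives and to spot anything broader than intended.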

Google Cloud Documentation References:

BigQuery IAM roles: https://cloud.google.com/bigquery/docs/access-control-basic-roles

IAM best practices: https://cloud.google.com/iam/docs/best-practices-for-using-iam

Question # 6

Your team wants to create a monthly report to analyze inventory data that is updated daily. You need to aggregate the inventory counts by using only the most recent month of data, and save the results to be used in a Looker Studio dashboard. What should you do?

Create a materialized view in BigQuery that uses the SUM( ) function and the DATE_SUB( ) function.

Create a saved query in the BigQuery console that uses the SUM( ) function and the DATE_SUB( ) function. Re-run the saved query every month, and save the results to a BigQuery table.

Create a BigQuery table that uses the SUM( ) function and the _PARTITIONDATE filter.

Create a BigQuery table that uses the SUM( ) function and the DATE_DIFF( ) function.

Question # 7

You need to transfer approximately 300 TB of data from your company's on-premises data center to Cloud Storage. You have 100 Mbps internet bandwidth, and the transfer needs to be completed as quickly as possible. What should you do?

Use Cloud Client Libraries to transfer the data over the internet.

Use the gcloud storage command to transfer the data over the internet.

Compress the data, upload it to multiple cloud storage providers, and then transfer the data to Cloud Storage.

Request a Transfer Appliance, copy the data to the appliance, and ship it back to Google.
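The figures in this question can be checked with quick arithmetic. The sketch below estimates how long 300 TB would take over a 100 Mbps link, assuming decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and, optimistically, that the link is fully available for the whole transfer. The function name and the efficiency parameter are assumptions made for this example.

```python
def transfer_days(size_tb, bandwidth_mbps, efficiency=1.0):
    """Estimate online transfer time in days.

    efficiency is the assumed fraction of the nominal bandwidth that
    is actually usable (1.0 means the full link, which is optimistic).
    """
    total_bits = size_tb * 1e12 * 8            # terabytes -> bits
    usable_bps = bandwidth_mbps * 1e6 * efficiency
    seconds = total_bits / usable_bps
    return seconds / 86_400                    # seconds -> days


days = transfer_days(300, 100)
print(f"{days:.0f} days")  # prints "278 days"
```

Even under ideal conditions the online transfer takes roughly nine months, which is the kind of result that makes an offline, shipped-device transfer worth comparing.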

Question # 8

Your organization is building a new application on Google Cloud. Several data files will need to be stored in Cloud Storage. Your organization has approved only two specific cloud regions where these data files can reside. You need to determine a Cloud Storage bucket strategy that includes automated high availability. What should you do?

Create a dual-region bucket, and upload the files to this bucket.

Create a single-region bucket in each of the two regions, and use the gcloud storage command to replicate the data across the buckets in both regions.

Create a multi-region bucket, and upload the files to this bucket.

Create a single-region bucket in each of the two regions, and use Storage Transfer Service to replicate the data across the buckets in both regions.

Question # 9

You need to create a new data pipeline. You want a serverless solution that meets the following requirements:

• Data is streamed from Pub/Sub and is processed in real-time.

• Data is transformed before being stored.

• Data is stored in a location that will allow it to be analyzed with SQL using Looker.

Which Google Cloud services should you recommend for the pipeline?

1. Dataproc Serverless
2. Bigtable

1. Cloud Composer
2. Cloud SQL for MySQL

1. BigQuery
2. Analytics Hub

1. Dataflow
2. BigQuery

Question # 10

Your organization has decided to move their on-premises Apache Spark-based workload to Google Cloud. You want to be able to manage the code without needing to provision and manage your own cluster. What should you do?

Migrate the Spark jobs to Dataproc Serverless.

Configure a Google Kubernetes Engine cluster with Spark operators, and deploy the Spark jobs.

Migrate the Spark jobs to Dataproc on Google Kubernetes Engine.

Migrate the Spark jobs to Dataproc on Compute Engine.

Question # 11

You work for a home insurance company. You are frequently asked to create and save risk reports with charts for specific areas using a publicly available storm event dataset. You want to be able to quickly create and re-run risk reports when new data becomes available. What should you do?

Export the storm event dataset as a CSV file. Import the file to Google Sheets, and use cell data in the worksheets to create charts.

Copy the storm event dataset into your BigQuery project. Use BigQuery Studio to query and visualize the data in Looker Studio.

Reference and query the storm event dataset using SQL in BigQuery Studio. Export the results to Google Sheets, and use cell data in the worksheets to create charts.

Reference and query the storm event dataset using SQL in a Colab Enterprise notebook. Display the table results and document with Markdown, and use Matplotlib to create charts.

Question # 12

Your company’s ecommerce website collects product reviews from customers. The reviews are loaded as CSV files daily to a Cloud Storage bucket. The reviews are in multiple languages and need to be translated to Spanish. You need to configure a pipeline that is serverless, efficient, and requires minimal maintenance. What should you do?

Load the data into BigQuery using Dataproc. Use Apache Spark to translate the reviews by invoking the Cloud Translation API. Set BigQuery as the sink.

Use a Dataflow templates pipeline to translate the reviews using the Cloud Translation API. Set BigQuery as the sink.

Load the data into BigQuery using a Cloud Run function. Use the BigQuery ML create model statement to train a translation model. Use the model to translate the product reviews within BigQuery.

Load the data into BigQuery using a Cloud Run function. Create a BigQuery remote function that invokes the Cloud Translation API. Use a scheduled query to translate new reviews.

Question # 13

You used BigQuery ML to build a customer purchase propensity model six months ago. You want to compare the current serving data with the historical serving data to determine whether you need to retrain the model. What should you do?

Compare the two different models.

Evaluate the data skewness.

Evaluate data drift.

Compare the confusion matrix.

Question # 14

Your company is building a near real-time streaming pipeline to process JSON telemetry data from small appliances. You need to process messages arriving at a Pub/Sub topic, capitalize letters in the serial number field, and write results to BigQuery. You want to use a managed service and write a minimal amount of code for underlying transformations. What should you do?

Use a Pub/Sub to BigQuery subscription, write results directly to BigQuery, and schedule a transformation query to run every five minutes.

Use a Pub/Sub to Cloud Storage subscription, write a Cloud Run service that is triggered when objects arrive in the bucket, performs the transformations, and writes the results to BigQuery.

Use the “Pub/Sub to BigQuery” Dataflow template with a UDF, and write the results to BigQuery.

Use a Pub/Sub push subscription, write a Cloud Run service that accepts the messages, performs the transformations, and writes the results to BigQuery.

Question # 15

Your organization needs to implement near real-time analytics for thousands of events arriving each second in Pub/Sub. The incoming messages require transformations. You need to configure a pipeline that processes, transforms, and loads the data into BigQuery while minimizing development time. What should you do?

Use a Google-provided Dataflow template to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.

Create a Cloud Data Fusion instance and configure Pub/Sub as a source. Use Data Fusion to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.

Load the data from Pub/Sub into Cloud Storage using a Cloud Storage subscription. Create a Dataproc cluster, use PySpark to perform transformations in Cloud Storage, and write the results to BigQuery.

Use Cloud Run functions to process the Pub/Sub messages, perform transformations, and write the results to BigQuery.

Question # 16

Your team is building several data pipelines that contain a collection of complex tasks and dependencies that you want to execute on a schedule, in a specific order. The tasks and dependencies consist of files in Cloud Storage, Apache Spark jobs, and data in BigQuery. You need to design a system that can schedule and automate these data processing tasks using a fully managed approach. What should you do?

Use Cloud Scheduler to schedule the jobs to run.

Use Cloud Tasks to schedule and run the jobs asynchronously.

Create directed acyclic graphs (DAGs) in Cloud Composer. Use the appropriate operators to connect to Cloud Storage, Spark, and BigQuery.

Create directed acyclic graphs (DAGs) in Apache Airflow deployed on Google Kubernetes Engine. Use the appropriate operators to connect to Cloud Storage, Spark, and BigQuery.

Question # 17

You are using your own data to demonstrate the capabilities of BigQuery to your organization’s leadership team. You need to perform a one-time load of the files stored on your local machine into BigQuery using as little effort as possible. What should you do?

Write and execute a Python script using the BigQuery Storage Write API library.

Create a Dataproc cluster, copy the files to Cloud Storage, and write an Apache Spark job using the spark-bigquery-connector.

Execute the bq load command on your local machine.

Create a Dataflow job using the Apache Beam FileIO and BigQueryIO connectors with a local runner.

Question # 18

You are a database administrator managing sales transaction data by region stored in a BigQuery table. You need to ensure that each sales representative can only see the transactions in their region. What should you do?

Add a policy tag in BigQuery.

Create a row-level access policy.

Create a data masking rule.

Grant the appropriate IAM permissions on the dataset.

Question # 19

You need to create a data pipeline for a new application. Your application will stream data that needs to be enriched and cleaned. Eventually, the data will be used to train machine learning models. You need to determine the appropriate data manipulation methodology and which Google Cloud services to use in this pipeline. What should you choose?

ETL; Dataflow -> BigQuery

ETL; Cloud Data Fusion -> Cloud Storage

ELT; Cloud Storage -> Bigtable

ELT; Cloud SQL -> Analytics Hub

Question # 20

Your team needs to analyze large datasets stored in BigQuery to identify trends in user behavior. The analysis will involve complex statistical calculations, Python packages, and visualizations. You need to recommend a managed collaborative environment to develop and share the analysis. What should you recommend?

Create a Colab Enterprise notebook and connect the notebook to BigQuery. Share the notebook with your team. Analyze the data and generate visualizations in Colab Enterprise.

Create a statistical model by using BigQuery ML. Share the query with your team. Analyze the data and generate visualizations in Looker Studio.

Create a Looker Studio dashboard and connect the dashboard to BigQuery. Share the dashboard with your team. Analyze the data and generate visualizations in Looker Studio.

Connect Google Sheets to BigQuery by using Connected Sheets. Share the Google Sheet with your team. Analyze the data and generate visualizations in Google Sheets.

Question # 21

You manage a BigQuery table that is used for critical end-of-month reports. The table is updated weekly with new sales data. You want to prevent data loss and reporting issues if the table is accidentally deleted. What should you do?

Configure the time travel duration on the table to be exactly seven days. On deletion, re-create the deleted table solely from the time travel data.

Schedule the creation of a new snapshot of the table once a week. On deletion, re-create the deleted table using the snapshot and time travel data.

Create a clone of the table. On deletion, re-create the deleted table by copying the content of the clone.

Create a view of the table. On deletion, re-create the deleted table from the view and time travel data.

Question # 22

You are a data analyst working with sensitive customer data in BigQuery. You need to ensure that only authorized personnel within your organization can query this data, while following the principle of least privilege. What should you do?

Enable access control by using IAM roles.

Update dataset privileges by using the SQL GRANT statement.

Export the data to Cloud Storage, and use signed URLs to authorize access.

Encrypt the data by using customer-managed encryption keys (CMEK).

Question # 23

You are migrating data from a legacy on-premises MySQL database to Google Cloud. The database contains various tables with different data types and sizes, including large tables with millions of rows and transactional data. You need to migrate this data while maintaining data integrity, and minimizing downtime and cost. What should you do?

Set up a Cloud Composer environment to orchestrate a custom data pipeline. Use a Python script to extract data from the MySQL database and load it to MySQL on Compute Engine.

Export the MySQL database to CSV files, transfer the files to Cloud Storage by using Storage Transfer Service, and load the files into a Cloud SQL for MySQL instance.

Use Database Migration Service to replicate the MySQL database to a Cloud SQL for MySQL instance.

Use Cloud Data Fusion to migrate the MySQL database to MySQL on Compute Engine.

Question # 24

You work for an online retail company. Your company collects customer purchase data in CSV files and pushes them to Cloud Storage every 10 minutes. The data needs to be transformed and loaded into BigQuery for analysis. The transformation involves cleaning the data, removing duplicates, and enriching it with product information from a separate table in BigQuery. You need to implement a low-overhead solution that initiates data processing as soon as the files are loaded into Cloud Storage. What should you do?

Use Cloud Composer sensors to detect files loading in Cloud Storage. Create a Dataproc cluster, and use a Composer task to execute a job on the cluster to process and load the data into BigQuery.

Schedule a directed acyclic graph (DAG) in Cloud Composer to run hourly to batch load the data from Cloud Storage to BigQuery, and process the data in BigQuery using SQL.

Use Dataflow to implement a streaming pipeline using an OBJECT_FINALIZE notification from Pub/Sub to read the data from Cloud Storage, perform the transformations, and write the data to BigQuery.

Create a Cloud Data Fusion job to process and load the data from Cloud Storage into BigQuery. Create an OBJECT_FINALIZE notification in Pub/Sub, and trigger a Cloud Run function to start the Cloud Data Fusion job as soon as new files are loaded.

Question # 25

You work for a financial services company that handles highly sensitive data. Due to regulatory requirements, your company is required to have complete and manual control of data encryption. Which type of keys should you recommend to use for data storage?

Use customer-supplied encryption keys (CSEK).

Use a dedicated third-party key management system (KMS) chosen by the company.

Use Google-managed encryption keys (GMEK).

Use customer-managed encryption keys (CMEK).

Question # 26

Your company has several retail locations. Your company tracks the total number of sales made at each location each day. You want to use SQL to calculate the weekly moving average of sales by location to identify trends for each store. Which query should you use?

Option A

Option B

Option C

Option D
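The query text for the four options was not preserved here, so the following is only a general sketch of how a weekly moving average per location is commonly written with a SQL window function; it does not reproduce any of the options. It runs against an in-memory SQLite database from Python's standard library rather than BigQuery, and the table and column names (daily_sales, location, sale_date, total_sales) are invented for the example. It assumes exactly one row per location per day with no gaps, because a ROWS frame counts rows, not calendar days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_sales (location TEXT, sale_date TEXT, total_sales INTEGER)"
)

# Made-up data: store_a sells 10, 20, ..., 80 over eight days;
# store_b sells a constant 5 per day.
rows = [("store_a", f"2024-01-{d:02d}", d * 10) for d in range(1, 9)]
rows += [("store_b", f"2024-01-{d:02d}", 5) for d in range(1, 9)]
conn.executemany("INSERT INTO daily_sales VALUES (?, ?, ?)", rows)

# Average over the current day and the six preceding days,
# computed separately for each location.
QUERY = """
SELECT location,
       sale_date,
       AVG(total_sales) OVER (
           PARTITION BY location
           ORDER BY sale_date
           ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
       ) AS weekly_moving_avg
FROM daily_sales
ORDER BY location, sale_date
"""

results = conn.execute(QUERY).fetchall()
for location, sale_date, avg in results:
    print(location, sale_date, round(avg, 2))
```

PARTITION BY keeps each store's average independent, ORDER BY fixes the day sequence, and the seven-row frame is what makes the average "weekly". For the first six days of each store the frame contains fewer than seven rows, so the average covers only the days available so far.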
