Associate Data Practitioner Exam: Free Online Practice Questions
Your organization has several datasets in BigQuery. The datasets need to be shared with your external partners so that they can run SQL queries without needing to copy the data to their own projects. You have organized each partner’s data in its own BigQuery dataset. Each partner should be able to access only their data. You want to share the data while following Google-recommended practices.
What should you do?
- A . Use Analytics Hub to create a listing on a private data exchange for each partner dataset. Allow each partner to subscribe to their respective listings.
- B . Create a Dataflow job that reads from each BigQuery dataset and pushes the data into a dedicated Pub/Sub topic for each partner. Grant each partner the pubsub.subscriber IAM role.
- C . Export the BigQuery data to a Cloud Storage bucket. Grant the partners the storage.objectUser IAM role on the bucket.
- D . Grant the partners the bigquery.user IAM role on the BigQuery project.
A
Explanation:
Using Analytics Hub to create a listing on a private data exchange for each partner dataset is the Google-recommended practice for securely sharing BigQuery data with external partners. Analytics Hub allows you to manage data sharing at scale, enabling partners to query datasets directly without needing to copy the data into their own projects. By creating separate listings for each partner dataset and allowing only the respective partner to subscribe, you ensure that partners can access only their specific data, adhering to the principle of least privilege. This approach is secure, efficient, and designed for scenarios involving external data sharing.
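For illustration, here is a minimal sketch of scripting this setup with the google-cloud-bigquery-analyticshub Python client library (the same setup can be done entirely in the console). The project, location, exchange, listing, and dataset IDs are placeholders, and the exact type and field names should be verified against the current library reference.

```python
# Hedged sketch using the google-cloud-bigquery-analyticshub client
# library (pip install google-cloud-bigquery-analyticshub). All IDs
# below are hypothetical.
from google.cloud import bigquery_analyticshub_v1 as analyticshub

client = analyticshub.AnalyticsHubServiceClient()
parent = "projects/my-project/locations/us"

# A single private data exchange can hold one listing per partner.
exchange = client.create_data_exchange(
    parent=parent,
    data_exchange_id="partner-exchange",
    data_exchange=analyticshub.DataExchange(display_name="Partner exchange"),
)

# One listing per partner dataset; each partner is then allowed to
# subscribe to their own listing only (principle of least privilege).
listing = client.create_listing(
    parent=exchange.name,
    listing_id="partner-a-sales",
    listing=analyticshub.Listing(
        display_name="Partner A sales",
        bigquery_dataset=analyticshub.Listing.BigQueryDatasetSource(
            dataset="projects/my-project/datasets/partner_a"
        ),
    ),
)
print(listing.name)
```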
Your data science team needs to collaboratively analyze a 25 TB BigQuery dataset to support the development of a machine learning model. You want to use Colab Enterprise notebooks while ensuring efficient data access and minimizing cost.
What should you do?
- A . Export the BigQuery dataset to Google Drive. Load the dataset into the Colab Enterprise notebook using Pandas.
- B . Use BigQuery magic commands within a Colab Enterprise notebook to query and analyze the data.
- C . Create a Dataproc cluster connected to a Colab Enterprise notebook, and use Spark to process the data in BigQuery.
- D . Copy the BigQuery dataset to the local storage of the Colab Enterprise runtime, and analyze the data using Pandas.
B
Explanation:
For a 25 TB dataset, efficient and cost-effective analysis means minimizing data movement and leveraging BigQuery’s serverless scalability from within Colab Enterprise.
Option A: Exporting 25 TB to Google Drive and loading it with Pandas is impractical (Drive size limits, transfer costs) and slow.
Option B: BigQuery magic commands (%%bigquery) in Colab Enterprise query the data in place, keeping processing in BigQuery, reducing cost, and enabling collaboration.
Option C: Dataproc with Spark adds cluster cost and operational complexity that are unnecessary when BigQuery can handle the workload itself.
Option D: Copying 25 TB to the runtime’s local storage is infeasible due to size and cost.
Extract from Google Documentation: From "Using BigQuery with Colab Enterprise"
(https://cloud.google.com/colab/docs/bigquery): "You can use BigQuery magic commands (%%bigquery) in Colab Enterprise to execute SQL queries directly against BigQuery datasets, providing efficient access to large-scale data without moving it."
Reference: Google Cloud Documentation – "Colab Enterprise with BigQuery"
(https://cloud.google.com/colab/docs).
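As a concrete illustration, the cell magic ships with the google-cloud-bigquery Python library, which is preinstalled on Colab Enterprise runtimes. A first cell loads the extension:

```python
%load_ext google.cloud.bigquery
```

Then, in its own cell, the magic sends the SQL to BigQuery and binds the result to a DataFrame; the project, table, and column names here are placeholders:

```python
%%bigquery daily_sales --project my-project
SELECT DATE(order_ts) AS day, SUM(amount) AS revenue
FROM `my-project.sales.orders`
GROUP BY day
ORDER BY day
```

`daily_sales` is then an ordinary pandas DataFrame in the notebook, so only the aggregated result, not 25 TB of raw data, ever reaches the runtime.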
You have a BigQuery dataset containing sales data. This data is actively queried for the first 6 months. After that, the data is not queried but needs to be retained for 3 years for compliance reasons. You need to implement a data management strategy that meets access and compliance requirements, while keeping cost and administrative overhead to a minimum.
What should you do?
- A . Use BigQuery long-term storage for the entire dataset. Set up a Cloud Run function to delete the data from BigQuery after 3 years.
- B . Partition a BigQuery table by month. After 6 months, export the data to Coldline storage. Implement a lifecycle policy to delete the data from Cloud Storage after 3 years.
- C . Set up a scheduled query to export the data to Cloud Storage after 6 months. Write a stored procedure to delete the data from BigQuery after 3 years.
- D . Store all data in a single BigQuery table without partitioning or lifecycle policies.
B
Explanation:
Partitioning the BigQuery table by month allows efficient querying of recent data for the first 6 months, reducing query costs. After 6 months, exporting the data to Coldline storage minimizes storage costs for data that is rarely accessed but needs to be retained for compliance. Implementing a lifecycle policy in Cloud Storage automates the deletion of the data after 3 years, ensuring compliance while reducing administrative overhead. This approach balances cost efficiency and compliance requirements effectively.
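As a hedged sketch of how these pieces fit together (all project, dataset, and bucket names below are placeholders, and in practice the export step would be driven by a scheduled query or orchestration tool rather than run by hand):

```python
from google.cloud import bigquery, storage

bq = bigquery.Client()

# 1. Month-partitioned table: queries over the active 6 months scan
#    only the partitions they touch, keeping query costs down.
bq.query("""
CREATE TABLE IF NOT EXISTS `my-project.sales.orders`
(order_id STRING, order_ts TIMESTAMP, amount NUMERIC)
PARTITION BY TIMESTAMP_TRUNC(order_ts, MONTH)
""").result()

# 2. Export a partition older than 6 months to a Coldline bucket
#    (the bucket is assumed to have the COLDLINE storage class).
bq.query("""
EXPORT DATA OPTIONS (
  uri = 'gs://my-archive-bucket/sales/2024-01/*.parquet',
  format = 'PARQUET'
) AS
SELECT * FROM `my-project.sales.orders`
WHERE TIMESTAMP_TRUNC(order_ts, MONTH) = TIMESTAMP('2024-01-01')
""").result()

# 3. Lifecycle rule on the archive bucket: objects are deleted
#    automatically after roughly 3 years (1095 days), with no further
#    administrative work.
bucket = storage.Client().bucket("my-archive-bucket")
bucket.add_lifecycle_delete_rule(age=1095)
bucket.patch()
```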
Your company’s customer support audio files are stored in a Cloud Storage bucket. You plan to analyze the audio files’ metadata and file content within BigQuery to create inference by using BigQuery ML. You need to create a corresponding table in BigQuery that represents the bucket containing the audio files.
What should you do?
- A . Create an external table.
- B . Create a temporary table.
- C . Create a native table.
- D . Create an object table.
D
Explanation:
To analyze audio files stored in a Cloud Storage bucket and represent them in BigQuery, you should create an object table. Object tables in BigQuery are designed to represent objects stored in Cloud Storage, including their metadata. This enables you to query the metadata of audio files directly from BigQuery without duplicating the data. Once the object table is created, you can use it in conjunction with other BigQuery ML workflows for inference and analysis.
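A minimal sketch of the DDL follows, assuming a Cloud resource connection named `my-project.us.my-gcs-connection` already exists and its service account can read the bucket; all project, dataset, table, and bucket names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()

# object_metadata = 'SIMPLE' is what makes this an object table rather
# than a regular external table over structured files.
client.query("""
CREATE EXTERNAL TABLE `my-project.support.audio_files`
WITH CONNECTION `my-project.us.my-gcs-connection`
OPTIONS (
  object_metadata = 'SIMPLE',
  uris = ['gs://my-audio-bucket/*']
)
""").result()

# The objects' metadata (uri, size, updated, ...) is now queryable in
# SQL, and the table can feed BigQuery ML inference over file content.
for row in client.query(
    "SELECT uri, size, updated FROM `my-project.support.audio_files` LIMIT 5"
).result():
    print(row.uri, row.size)
```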