Practice Free Professional Data Engineer Exam Online Questions
Government regulations in your industry mandate that you have to maintain an auditable record of access to certain types of data. Assuming that all expiring logs will be archived correctly, where should you store data that is subject to that mandate?
- A . Encrypted on Cloud Storage with user-supplied encryption keys. A separate decryption key will be given to each authorized user.
- B . In a BigQuery dataset that is viewable only by authorized personnel, with the Data Access log used to provide the auditability.
- C . In Cloud SQL, with a separate database user name for each user. The Cloud SQL Admin activity logs will be used to provide the auditability.
- D . In a bucket on Cloud Storage that is accessible only by an AppEngine service that collects user information and logs the access before providing a link to the bucket.
Flowlogistic’s CEO wants to gain rapid insight into their customer base so his sales team can be better informed in the field. This team is not very technical, so they’ve purchased a visualization tool to simplify the creation of BigQuery reports. However, they’ve been overwhelmed by all the data in the table, and are spending a lot of money on queries trying to find the data they need. You want to solve their problem in the most cost-effective way.
What should you do?
- A . Export the data into a Google Sheet for visualization.
- B . Create an additional table with only the necessary columns.
- C . Create a view on the table to present to the visualization tool.
- D . Create identity and access management (IAM) roles on the appropriate columns, so only they appear in a query.
You are operating a Cloud Dataflow streaming pipeline. The pipeline aggregates events from a Cloud Pub/Sub subscription source, within a window, and sinks the resulting aggregation to a Cloud Storage bucket. The source has consistent throughput. You want to monitor and alert on the behavior of the pipeline with Cloud Stackdriver to ensure that it is processing data.
Which Stackdriver alerts should you create?
- A . An alert based on a decrease of subscription/num_undelivered_messages for the source and a rate of change increase of instance/storage/used_bytes for the destination
- B . An alert based on an increase of subscription/num_undelivered_messages for the source and a rate of change decrease of instance/storage/used_bytes for the destination
- C . An alert based on a decrease of instance/storage/used_bytes for the source and a rate of change increase of subscription/num_undelivered_messages for the destination
- D . An alert based on an increase of instance/storage/used_bytes for the source and a rate of change decrease of subscription/num_undelivered_messages for the destination
You have enabled the free integration between Firebase Analytics and Google BigQuery. Firebase now automatically creates a new table daily in BigQuery in the format app_events_YYYYMMDD. You want to query all of the tables for the past 30 days in legacy SQL.
What should you do?
- A . Use the TABLE_DATE_RANGE function
- B . Use the WHERE_PARTITIONTIME pseudo column
- C . Use WHERE date BETWEEN YYYY-MM-DD AND YYYY-MM-DD
- D . Use SELECT IF(date >= YYYY-MM-DD AND date <= YYYY-MM-DD)
A
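Explanation:
In legacy SQL, TABLE_DATE_RANGE is a table wildcard function that queries every daily table whose name consists of a shared prefix followed by a YYYYMMDD suffix falling within a given timestamp range. The PARTITIONTIME pseudo column applies to ingestion-time partitioned tables, not to separate date-suffixed tables.
Reference: https://cloud.google.com/blog/products/gcp/using-bigquery-and-firebase-analytics-to-understand-your-mobile-app?hl=am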
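As an illustration of answer A, the following is a minimal sketch that builds a legacy SQL query string using TABLE_DATE_RANGE. The function name `last_30_days_query`, the dataset name `my_dataset`, and the `COUNT(*)` projection are assumptions made for the example; only the `app_events_` prefix comes from the question. The window runs from 30 days ago to now, so adjust the offset if exact day boundaries matter.

```python
def last_30_days_query(dataset: str, prefix: str = "app_events_") -> str:
    """Build a legacy SQL query over daily tables named <prefix>YYYYMMDD.

    `dataset` is a hypothetical dataset name; the prefix matches the
    table naming scheme described in the question.
    """
    return (
        "SELECT COUNT(*) AS event_rows\n"
        f"FROM TABLE_DATE_RANGE([{dataset}.{prefix}],\n"
        "                      DATE_ADD(CURRENT_TIMESTAMP(), -30, 'DAY'),\n"
        "                      CURRENT_TIMESTAMP())"
    )


if __name__ == "__main__":
    # Print the query text; run it with legacy SQL enabled in BigQuery.
    print(last_30_days_query("my_dataset"))
```

The query must be submitted with legacy SQL enabled, because TABLE_DATE_RANGE is not available in standard SQL.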
You are developing an application on Google Cloud that will automatically generate subject labels for users’ blog posts. You are under competitive pressure to add this feature quickly, and you have no additional developer resources. No one on your team has experience with machine learning.
What should you do?
- A . Call the Cloud Natural Language API from your application. Process the generated Entity Analysis as labels.
- B . Call the Cloud Natural Language API from your application. Process the generated Sentiment Analysis as labels.
- C . Build and train a text classification model using TensorFlow. Deploy the model using Cloud Machine Learning Engine. Call the model from your application and process the results as labels.
- D . Build and train a text classification model using TensorFlow. Deploy the model using a Kubernetes Engine cluster. Call the model from your application and process the results as labels.
Your chemical company needs to manually check documentation for customer orders. You use a pull subscription in Pub/Sub so that sales agents get details from the order. You must ensure that you do not process orders twice with different sales agents and that you do not add more complexity to this workflow.
What should you do?
- A . Create a transactional database that monitors the pending messages.
- B . Create a new Pub/Sub push subscription to monitor the orders processed in the agent’s system.
- C . Use Pub/Sub exactly-once delivery in your pull subscription.
- D . Use a Deduplicate PTransform in Dataflow before sending the messages to the sales agents.
C
Explanation:
Pub/Sub exactly-once delivery is a feature that guarantees that subscriptions do not receive duplicate deliveries of messages based on a Pub/Sub-defined unique message ID. This feature is only supported by the pull subscription type, which is what you are using in this scenario. By enabling exactly-once delivery, you can ensure that each order is processed only once by a sales agent, and that no order is lost or duplicated. This also simplifies your workflow, as you do not need to create a separate database or subscription to monitor the pending or processed messages.
Reference: Exactly-once delivery | Cloud Pub/Sub Documentation
Cloud Pub/Sub Exactly-once Delivery feature is now Generally Available (GA)
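The setting described in the explanation can be sketched as follows. This is a minimal example, assuming the `google-cloud-pubsub` Python client, whose `SubscriberClient.create_subscription` accepts a request containing the `enable_exactly_once_delivery` field. The project, topic, and subscription IDs are hypothetical, and `RecordingClient` is a stand-in invented here so the sketch runs without credentials or the library installed.

```python
def create_exactly_once_subscription(subscriber, project_id, topic_id, subscription_id):
    """Create a pull subscription with exactly-once delivery enabled.

    `subscriber` is expected to behave like
    google.cloud.pubsub_v1.SubscriberClient.
    """
    request = {
        "name": f"projects/{project_id}/subscriptions/{subscription_id}",
        "topic": f"projects/{project_id}/topics/{topic_id}",
        # No push_config is set, so this is a pull subscription.
        "enable_exactly_once_delivery": True,
    }
    return subscriber.create_subscription(request=request)


class RecordingClient:
    """Hypothetical stand-in that records requests instead of calling Pub/Sub."""

    def __init__(self):
        self.requests = []

    def create_subscription(self, request):
        self.requests.append(request)
        return request


if __name__ == "__main__":
    # Dry run with the stand-in; in real use pass pubsub_v1.SubscriberClient().
    client = RecordingClient()
    print(create_exactly_once_subscription(client, "my-project", "orders", "orders-agents"))
```

Agents pulling from such a subscription still need to acknowledge each message within its acknowledgement deadline; otherwise the message becomes eligible for redelivery.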
You are designing a system that requires an ACID-compliant database. You must ensure that the system requires minimal human intervention in case of a failure.
What should you do?
- A . Configure a Cloud SQL for MySQL instance with point-in-time recovery enabled.
- B . Configure a Cloud SQL for PostgreSQL instance with high availability enabled.
- C . Configure a Bigtable instance with more than one cluster.
- D . Configure a BigQuery table with a multi-region configuration.
B
Explanation:
The best option to meet the ACID compliance and minimal human intervention requirements is to configure a Cloud SQL for PostgreSQL instance with high availability enabled. Key reasons: Cloud SQL for PostgreSQL provides full ACID compliance, unlike Bigtable, which offers atomicity only for single-row operations. Enabling high availability removes the need for manual failover, because Cloud SQL automatically fails over to a standby instance in another zone if the primary instance goes down. Point-in-time recovery in MySQL protects against data loss but requires a manual restore, so it does not minimize human intervention. BigQuery is an analytics data warehouse and is not designed to serve as a transactional database. Therefore, a Cloud SQL for PostgreSQL instance with high availability best meets both requirements.
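To make the high availability setting concrete, here is a minimal sketch of an instance definition in the shape used by the Cloud SQL Admin API, where `availabilityType` set to `REGIONAL` requests a high availability configuration and `ZONAL` requests a single-zone instance. The function name, instance name, region, machine tier, and PostgreSQL version are assumptions chosen for illustration, not values from the question.

```python
import json


def ha_postgres_instance_body(name, region,
                              tier="db-custom-2-7680",
                              database_version="POSTGRES_15"):
    """Return a Cloud SQL instance definition with high availability enabled.

    All argument values are illustrative; pick the tier and version
    that fit the actual workload.
    """
    return {
        "name": name,
        "region": region,
        "databaseVersion": database_version,
        "settings": {
            "tier": tier,
            # REGIONAL = primary plus standby in another zone (HA).
            "availabilityType": "REGIONAL",
        },
    }


if __name__ == "__main__":
    # Print the body that would be sent in an instances.insert request.
    print(json.dumps(ha_postgres_instance_body("orders-db", "us-central1"), indent=2))
```

The same choice is exposed in the `gcloud` CLI as the `--availability-type` flag when creating an instance.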
