Practice Free PCAD-31-02 Exam Online Questions – Page 2

Question #11

Which actions are valid techniques for handling erroneous categorical values in a dataset? (Choose two)

A . Converting all values to integers
B . Replacing inconsistent labels with a standardized value
C . Removing rows with invalid labels
D . Normalizing using min-max scaling

Correct Answer: BC
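
As a rough illustration of options B and C, the sketch below standardizes inconsistent labels and then drops rows whose label is still invalid. The column name `size`, the sample values, and the allowed label set are all assumptions made up for this example.

```python
import pandas as pd

# Hypothetical data: a "size" column with inconsistent and invalid labels.
df = pd.DataFrame({"size": ["small", "Small", "SMALL ", "large", "??", "medium"]})

# Option B: standardize inconsistent labels (trim whitespace, lower-case).
df["size"] = df["size"].str.strip().str.lower()

# Option C: remove rows whose label is not in the (assumed) allowed set.
valid_labels = {"small", "medium", "large"}
cleaned = df[df["size"].isin(valid_labels)].reset_index(drop=True)

print(cleaned["size"].tolist())
```

Converting to integers or min-max scaling (options A and D) would not repair a mislabeled category, which is why they are not listed as correct.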

Question #12

Which SQL clause is most appropriate when you need to filter records that meet a specific condition during data retrieval in an analytics pipeline?

A . ORDER BY
B . GROUP BY
C . HAVING
D . WHERE

Correct Answer: D
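
A minimal sketch of row filtering with WHERE, run through Python's built-in `sqlite3` module. The `orders` table, its columns, and the sample rows are hypothetical and exist only for this example.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 120.0), (2, "US", 80.0), (3, "EU", 40.0)],
)

# WHERE filters individual rows before any grouping or ordering happens.
rows = conn.execute(
    "SELECT id, amount FROM orders WHERE region = ? AND amount > ? ORDER BY id",
    ("EU", 50),
).fetchall()
conn.close()

print(rows)
```

HAVING, by contrast, filters groups produced by GROUP BY, so it is the wrong tool for a plain per-row condition.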

Question #13

Which element is essential to justify a conclusion drawn from a dataset?

A . The use of color in plots
B . The origin of the dataset
C . Logical reasoning and supported metrics
D . File format of the source data

Correct Answer: C

Question #14

Why is it important to adjust data presentations based on the audience’s background?

A . To avoid using charts altogether
B . To simplify all metrics to percentages only
C . To ensure the data is understood and supports actionable insights
D . To include as many technical terms as possible

Correct Answer: C

Question #15

Which technique would be most appropriate to handle missing numerical values in a dataset intended for machine learning?

A . Replacing with NULL
B . Dropping all columns
C . Imputation using mean or median
D . Filling with random values

Correct Answer: C
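
The following sketch shows one way median imputation might look in pandas. The `age` column and its values are assumed for illustration; whether the mean or the median is the better choice depends on the data (the median is less sensitive to outliers).

```python
import pandas as pd

# Hypothetical numeric column with two missing values.
df = pd.DataFrame({"age": [25.0, None, 31.0, 40.0, None]})

# median() skips NaN by default, so it is computed from 25, 31 and 40.
median_age = df["age"].median()

# Fill the gaps with the median, keeping the original column for comparison.
df["age_imputed"] = df["age"].fillna(median_age)

print(df)
```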

Question #16

What is the most appropriate way to ensure that a column used as a foreign key contains only valid references to a parent DataFrame in Pandas?

A . Use df.dropna() on the foreign key column
B . Use .isin() to compare the foreign key column against the parent key column
C . Use df.sort_values() to sort both columns before merging
D . Use df.merge() with how='outer'

Correct Answer: B
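
A short sketch of the `.isin()` check, assuming a hypothetical parent DataFrame `customers` and child DataFrame `orders` that share a `customer_id` column.

```python
import pandas as pd

# Hypothetical parent and child tables.
customers = pd.DataFrame({"customer_id": [1, 2, 3]})
orders = pd.DataFrame({"order_id": [10, 11, 12], "customer_id": [1, 4, 3]})

# True where the foreign key matches an existing parent key.
is_valid = orders["customer_id"].isin(customers["customer_id"])

# Rows that reference a customer that does not exist in the parent table.
orphans = orders[~is_valid]

print(orphans)
```

Here the order with `customer_id` 4 is flagged, since no such customer exists in the parent table.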

Question #17

Which operations are valid when working with Pandas Series? (Choose two)

A . Arithmetic vector operations
B . Merging by index using merge()
C . Applying NumPy universal functions
D . Using .columns to rename

Correct Answer: AC
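
To make options A and C concrete, here is a small sketch with an assumed three-element Series. It shows element-wise arithmetic and a NumPy universal function applied directly to the Series.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with a labeled index.
s = pd.Series([1.0, 4.0, 9.0], index=["a", "b", "c"])

# Option A: arithmetic is applied element-wise (vectorized).
doubled = s * 2

# Option C: a NumPy ufunc returns a Series and keeps the index.
roots = np.sqrt(s)

print(doubled.tolist(), roots.tolist())
```

A Series has no `.columns` attribute, which is why option D does not apply.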

Question #18

When performing bootstrapping on a dataset with 500 observations, what is a typical procedure?

A . Creating samples by removing all duplicates
B . Generating multiple datasets of the same size by randomly sampling with replacement
C . Scaling all values between 0 and 1 before resampling
D . Drawing one sample and calculating the mean only once

Correct Answer: B
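
The procedure in option B can be sketched as follows. The 500 observations are simulated here as a stand-in for real data, and the number of resamples (1000) and the choice of the mean as the statistic are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-in for the 500 observations (simulated, not real data).
data = rng.normal(loc=50, scale=10, size=500)

n_resamples = 1000  # assumed; any reasonably large number works
boot_means = np.empty(n_resamples)

for i in range(n_resamples):
    # Each resample has the same size as the data and is drawn with replacement.
    resample = rng.choice(data, size=data.size, replace=True)
    boot_means[i] = resample.mean()

# Percentile interval for the mean, taken from the bootstrap distribution.
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
print(ci_low, ci_high)
```

The spread of `boot_means` gives an estimate of the sampling variability of the mean without assuming a particular distribution.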

Question #19

Which techniques can be used to select a subset of rows and columns from a DataFrame using labels? (Choose two)

A . df.loc[:, 'col1']
B . df.iloc[0:5, 'col2']
C . df.loc[2:7, ['col1', 'col2']]
D . df['col1', 'col2']

Correct Answer: AC
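
A brief sketch of the two label-based selections, using a hypothetical ten-row DataFrame with a default integer index.

```python
import pandas as pd

# Hypothetical DataFrame; the default index labels are 0..9.
df = pd.DataFrame(
    {"col1": range(10), "col2": range(10, 20), "col3": range(20, 30)}
)

# Option A: all rows, one column by label (returns a Series).
one_col = df.loc[:, "col1"]

# Option C: rows labeled 2 through 7 and two columns by label.
# Unlike positional slicing, a .loc label slice includes the end label.
block = df.loc[2:7, ["col1", "col2"]]

print(block)
```

Option B fails because `.iloc` accepts only integer positions, and option D fails because `df['col1', 'col2']` is looked up as a single tuple key; selecting two columns needs a list, as in `df[['col1', 'col2']]`.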

Question #20

Which term best describes the process of combining customer data from multiple systems into a single unified dataset?

A . Data binning
B . Data warehousing
C . Data normalization
D . Data integration

Correct Answer: D
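
As one possible illustration of data integration, the sketch below combines customer records from two hypothetical source systems (`crm` and `billing`) into a single table. The column names and values are invented for the example, and real integration work usually also involves resolving conflicting or duplicated records.

```python
import pandas as pd

# Two hypothetical source systems that share a customer_id key.
crm = pd.DataFrame({"customer_id": [1, 2], "name": ["Ana", "Ben"]})
billing = pd.DataFrame({"customer_id": [2, 3], "balance": [50.0, 75.0]})

# An outer merge keeps every customer seen in either system.
unified = (
    crm.merge(billing, on="customer_id", how="outer")
    .sort_values("customer_id")
    .reset_index(drop=True)
)

print(unified)
```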
