Practice Free PCAD-31-02 Exam Online Questions
Why is it important to use a test dataset separate from the training dataset when evaluating a machine learning model?
- A . To reduce the number of iterations during training
- B . To optimize the model’s training accuracy
- C . To evaluate the model’s performance on unseen data
- D . To ensure the loss function is minimized
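As a rough illustration of the idea behind this question, the sketch below splits a toy dataset and scores a deliberately naive "memorizing" model on both parts. The helper `train_test_split`, the toy data, and the memorizing model are hypothetical and use only the standard library; they are not taken from the exam material.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Toy data: y = 2x + 1 for 20 distinct x values.
data = [(x, 2 * x + 1) for x in range(20)]
train, test = train_test_split(data)

# A model that only memorizes the training pairs and learns no rule.
memory = dict(train)

def predict(x):
    return memory.get(x, 0)  # falls back to 0 for inputs it never saw

def accuracy(rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)

train_acc = accuracy(train)  # scored on data the model has already seen
test_acc = accuracy(test)    # scored on held-out data
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

Scoring on the training rows alone would make this model look perfect; only the held-out rows show that it does not generalize.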
Which practices are typically part of the data integration process? (Choose two)
- A . Schema alignment
- B . Data encryption
- C . Format standardization
- D . Neural network training
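To make the integration vocabulary concrete, here is a minimal pandas sketch that combines two sources whose column names and date formats differ. The frames `crm` and `shop`, their columns, and the date formats are invented for illustration.

```python
import pandas as pd

# Two hypothetical sources describing the same kind of record.
crm = pd.DataFrame({"CustomerID": [1, 2],
                    "SignupDate": ["2024-01-05", "2024-02-10"]})
shop = pd.DataFrame({"cust_id": [3], "signup": ["03/15/2024"]})

# Map both sources onto one shared set of column names.
crm_aligned = crm.rename(columns={"CustomerID": "customer_id",
                                  "SignupDate": "signup_date"})
shop_aligned = shop.rename(columns={"cust_id": "customer_id",
                                    "signup": "signup_date"})

# Parse each source's date strings into one common datetime type.
crm_aligned["signup_date"] = pd.to_datetime(
    crm_aligned["signup_date"], format="%Y-%m-%d")
shop_aligned["signup_date"] = pd.to_datetime(
    shop_aligned["signup_date"], format="%m/%d/%Y")

# With names and formats unified, the sources can be stacked.
combined = pd.concat([crm_aligned, shop_aligned], ignore_index=True)
print(combined)
```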
Which approaches help tailor data visualizations for non-technical stakeholders? (Choose two)
- A . Use jargon-rich annotations
- B . Focus on trend and summary metrics
- C . Add context through narrative or titles
- D . Use raw code outputs for clarity
Which method is typically used in Pandas to check if a column contains only values within an expected range?
- A . df.sort_values()
- B . df.clip()
- C . df.isin()
- D . df.sample()
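For context, the sketch below shows two common ways to validate column values in pandas: `Series.isin()` for membership in a set of allowed values, and `Series.between()` for a numeric interval. Note that `between()` is not among the options listed above and is included only for comparison; the column names and limits are made up.

```python
import pandas as pd

df = pd.DataFrame({"age": [25, 41, 130, 17],
                   "grade": ["A", "B", "E", "C"]})

# Numeric interval check; both bounds are inclusive by default.
in_range = df["age"].between(0, 120)

# Membership check against an explicit set of allowed values.
allowed = df["grade"].isin(["A", "B", "C", "D"])

# True only if every value in the column passes the check.
all_ages_valid = bool(in_range.all())

# Rows that fail at least one of the two checks.
bad_rows = df[~(in_range & allowed)]
print(bad_rows)
```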
Which best practices help enhance database security when building Python data analysis pipelines? (Choose two)
- A . Limit database user permissions to only necessary operations
- B . Construct queries using f-strings for readability
- C . Sanitize input by replacing dangerous characters with asterisks
- D . Store credentials securely using environment variables
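The following sketch contrasts a parameterized query with string-built SQL and shows a credential being read from the environment. It uses the standard-library `sqlite3` module with an in-memory database; the variable name `DB_PASSWORD` and the `users` table are hypothetical, and SQLite itself needs no password, so the value is read only to show the pattern.

```python
import os
import sqlite3

# Read the secret from the environment rather than hard-coding it.
# DB_PASSWORD is an assumed variable name; it may be unset here.
db_password = os.environ.get("DB_PASSWORD")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)",
                 [("alice",), ("bob",)])

# A classic injection attempt supplied as user input.
user_input = "alice' OR '1'='1"

# Parameterized query: the driver passes the value separately from
# the SQL text, so the input is compared as a literal string.
rows = conn.execute(
    "SELECT id, name FROM users WHERE name = ?", (user_input,)
).fetchall()

# The same query with a legitimate value.
safe_rows = conn.execute(
    "SELECT id, name FROM users WHERE name = ?", ("alice",)
).fetchall()

print(rows, safe_rows)
conn.close()
```

Had the query been assembled with an f-string, the injected `OR '1'='1'` clause would have become part of the SQL and matched every row.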
Which methods are typically used to assess relationships between variables in exploratory data analysis? (Choose two)
- A . Correlation matrix
- B . Histogram analysis
- C . Scatter plots
- D . Box plots
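As a small illustration of measuring how variables move together, this sketch computes a Pearson correlation matrix with `DataFrame.corr()`. The columns and values are invented; a pairwise plot is mentioned only in a comment because it would require matplotlib.

```python
import pandas as pd

# Hypothetical study data: three numeric variables per student.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [52, 58, 65, 70, 79],
    "absences": [9, 7, 6, 4, 2],
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()
print(corr.round(2))

# A visual check of one pair could use (requires matplotlib):
# df.plot.scatter(x="hours", y="score")
```

Values near +1 or -1 in the matrix indicate a strong linear relationship between two columns; values near 0 indicate little linear association.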
What is the primary reason for converting all categorical labels to lowercase during the data cleaning process?
- A . To reduce memory usage in the dataset
- B . To improve data visualization aesthetics
- C . To avoid treating the same category as different due to case differences
- D . To make string comparison operations slower
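A quick pandas sketch of what case normalization does to category counts; the sample labels are made up.

```python
import pandas as pd

labels = pd.Series(["Red", "red", "RED", "Blue", "blue"])

# Before cleaning, each spelling counts as its own category.
before = labels.nunique()

# Normalize case so equivalent labels compare equal.
cleaned = labels.str.lower()
after = cleaned.nunique()

counts = cleaned.value_counts()
print(before, after)
print(counts)
```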
Which methods can be used to validate data types and structure in a Pandas DataFrame? (Choose two)
- A . df.dtypes
- B . df.info()
- C . df.to_csv()
- D . df.memory_usage()
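For reference, the sketch below inspects a DataFrame's column types and compares them with an expected schema. The frame, the `expected` mapping, and the `mismatches` check are hypothetical; the exact dtype name reported for a text column can vary between pandas versions.

```python
import pandas as pd

# "price" contains a non-numeric entry, so it is not read as float.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "price": ["9.99", "12.50", "n/a"],
    "qty": [1, 2, 3],
})

# Per-column dtypes as a Series.
print(df.dtypes)

# Summary of columns, non-null counts, dtypes, and memory use.
df.info()

# Compare the actual dtypes with an assumed expected schema.
expected = {"id": "int64", "price": "float64", "qty": "int64"}
mismatches = {
    col: str(df[col].dtype)
    for col in expected
    if str(df[col].dtype) != expected[col]
}
print(mismatches)
```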
What may occur if a model is evaluated using the same data it was trained on?
- A . Overgeneralization
- B . Data leakage
- C . Overfitting
- D . Underfitting
Which transformation method adjusts the mean of the dataset to 0 and the standard deviation to 1?
- A . One-hot encoding
- B . Log transformation
- C . Z-score normalization
- D . Min-max normalization
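The two rescaling formulas named in the options can be compared side by side in a few lines of NumPy. The sample array is arbitrary, and the population standard deviation (`ddof=0`, NumPy's default) is assumed here; pandas' `Series.std()` defaults to `ddof=1` instead.

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Z-score: subtract the mean, divide by the standard deviation.
z = (x - x.mean()) / x.std()

# Min-max: rescale linearly onto the interval [0, 1].
minmax = (x - x.min()) / (x.max() - x.min())

print("z-score mean/std:", z.mean(), z.std())
print("min-max min/max:", minmax.min(), minmax.max())
```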
