Practice Free D-DS-FN-23 Exam Online Questions
You submit a MapReduce job to a Hadoop cluster. However, you notice that although the job was successfully submitted, it is not completing.
What should be done to identify the issue?
- A . Ensure TaskTracker is running
- B . Ensure JobTracker is running
- C . Ensure NameNode is running
- D . Ensure DataNode is running
In time series analysis, which statement describes an MA(q) process?
- A . Current deviation from the time series mean depends on the q previous deviations
- B . Current deviation from the time series mean depends on the quotient q
- C . Current time series value depends on the q previous values
- D . Current time series value depends on the fitted polynomial of order q
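For reference, the usual textbook form of a moving-average process of order q writes each value as the series mean plus the current random shock plus a weighted sum of the q previous shocks. The sketch below only illustrates that form. The function name `ma_process`, the coefficients, and the shock values are hypothetical, and shocks before the start of the series are assumed to be zero to keep the example short.

```python
def ma_process(mu, thetas, shocks):
    """Return X_t = mu + e_t + sum_{i=1..q} thetas[i-1] * e_{t-i} for each t.

    Assumption for this sketch: shocks before t = 0 are treated as zero.
    """
    q = len(thetas)
    series = []
    for t, e_t in enumerate(shocks):
        value = mu + e_t
        # Add the weighted contribution of up to q previous shocks.
        for i in range(1, q + 1):
            if t - i >= 0:
                value += thetas[i - 1] * shocks[t - i]
        series.append(value)
    return series


# Hypothetical MA(2) example with hand-picked shocks.
example = ma_process(mu=10.0, thetas=[0.5, 0.25], shocks=[1.0, -2.0, 4.0, 0.0])
print(example)
```

Note that a shock stops influencing the series once it is more than q steps in the past, which is what distinguishes this form from one that depends on the q previous values of the series itself.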
When creating a project sponsor presentation, what is the main objective?
- A . Show that you met the project goals
- B . Show how you met the project goals
- C . Show how well the model will meet the SLA (service level agreement)
- D . Clearly describe the methods and techniques used
In which lifecycle stage are test and training data sets created?
- A . Model building
- B . Model planning
- C . Discovery
- D . Data preparation
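Whichever stage is chosen, the mechanics of creating the two data sets are simple. The following is a minimal sketch of a random split, assuming the data fits in memory as a list of rows. The helper name `train_test_split`, the 25% test fraction, and the seed are illustrative choices, not something the question specifies.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and cut it into (train, test) lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data set of 20 row identifiers.
train, test = train_test_split(range(20), test_fraction=0.25)
print(len(train), len(test))
```

Keeping the test rows out of model fitting is what allows them to give an honest estimate of how the model performs on unseen data.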
Consider a scale that has five (5) values that range from “not important” to “very important”.
Which data classification best describes this data?
- A . Ordinal
- B . Nominal
- C . Real
- D . Ratio
You have been assigned to perform a study of the daily revenue effect of a pricing model of online transactions. All data currently available to you has been loaded into your analytics database. This includes revenue data, pricing data, and online transaction data.
You discover that all data comes in different levels of granularity. The transaction data has timestamps consisting of day, hour, minutes, and seconds. Pricing is stored at the daily level and revenue data is only reported monthly.
What is the next step?
- A . Report back to the business owner that the current data model does not support the business question.
- B . Interpolate a daily model for revenue from the monthly revenue data.
- C . Aggregate all data to the monthly level in order to create a monthly revenue model.
- D . Disregard revenue as the key reason in the pricing model and create a daily model based on pricing and transactions only.
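One of the options above involves rolling the finer-grained data up to the coarsest level present. As a rough illustration of what such a roll-up looks like, the sketch below sums timestamped transactions per calendar month. The tuple layout, the timestamp format, and the sample amounts are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def monthly_totals(transactions):
    """Sum transaction amounts per (year, month) key."""
    totals = defaultdict(float)
    for timestamp, amount in transactions:
        ts = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        totals[(ts.year, ts.month)] += amount
    return dict(totals)


# Hypothetical second-level transaction records.
transactions = [
    ("2023-01-03 09:15:42", 20.0),
    ("2023-01-17 18:02:10", 5.5),
    ("2023-02-01 00:00:01", 12.0),
]
monthly = monthly_totals(transactions)
print(monthly)
```

Aggregating upward loses detail but invents nothing, whereas moving in the other direction (for example, deriving daily figures from monthly ones) requires an explicit modeling assumption.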
You are provided with the following list of window functions.
cume_dist()
dense_rank()
rank()
percent_rank()
first_value()
last_value()
lag()
lead()
ntile()
Which window function is missing?
- A . row_preceding()
- B . row_number()
- C . median()
- D . cumulative_sum()
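To make the listed functions more concrete, here is a small sketch that runs three of them (`rank()`, `dense_rank()`, and `lag()`) against an in-memory SQLite database from Python. It assumes the bundled SQLite is version 3.25 or later, which is when SQLite gained window function support. The table name `scores` and its contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 90), ("c", 80), ("d", 70)],
)

# rank() leaves gaps after ties, dense_rank() does not,
# and lag() returns the value from the previous row in the window order.
rows = conn.execute(
    """
    SELECT name,
           score,
           rank()       OVER (ORDER BY score DESC)       AS rnk,
           dense_rank() OVER (ORDER BY score DESC)       AS dense_rnk,
           lag(score)   OVER (ORDER BY score DESC, name) AS prev_score
    FROM scores
    ORDER BY score DESC, name
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```

The two tied rows share rank 1, after which `rank()` jumps to 3 while `dense_rank()` continues with 2.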
Based on the graphic, what should be done to begin addressing chart junk?
- A . Remove the vertical gridlines
- B . Remove the legend
- C . Reduce the font size in the axis
You have been assigned to do a study of the daily revenue effect of a pricing model of online transactions. When have you completed the analytics lifecycle?
- A . You have written documentation, and the code has been handed off to the database administrator and business operations.
- B . You have a completely developed model, and the results have shown statistically acceptable results.
- C . You have presented the results of the model to both the internal analytics team and the business owner of the project.
- D . You have a completely developed model based on both a sample of the data and the entire set of data available.
Which analytic technique would be appropriate to estimate blood pressure based on age and weight?
- A . Naïve Bayesian classification
- B . Linear regression
- C . K-means clustering
- D . Time series analysis
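Estimating a continuous outcome from numeric inputs can be sketched as an ordinary least squares fit. The example below is a minimal pure-Python version that solves the normal equations for an intercept plus two predictors. The function names are hypothetical, and the age, weight, and blood pressure figures are synthetic numbers generated from a made-up linear rule, not clinical data.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of predictor rows; an intercept column is added here.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coeffs, features):
    """Apply the fitted intercept and coefficients to one feature row."""
    return coeffs[0] + sum(c * f for c, f in zip(coeffs[1:], features))


# Synthetic (age, weight) rows; outcomes follow 60 + 0.5*age + 0.4*weight exactly.
X = [(30, 70), (40, 80), (50, 65), (60, 90), (35, 100)]
y = [103.0, 112.0, 111.0, 126.0, 117.5]

coeffs = fit_linear_regression(X, y)
estimate = predict(coeffs, (45, 75))
print(coeffs, estimate)
```

Because the synthetic outcomes lie exactly on a plane, the fit recovers the generating coefficients up to floating-point error. Real measurements would be noisy, and the fitted values would then be estimates with residual error.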