Practice Free D-DS-FN-23 Exam Online Questions
Which phase of the data analytic lifecycle includes conducting project sponsor interviews and drafting a problem statement?
- A . Operationalize
- B . Model planning
- C . Model building
- D . Discovery
You have created a Logistic Regression model to predict customer churn for your company. The company’s Marketing department wants to use your model to identify at-risk customers and offer incentives to keep them from leaving.
Using two different thresholds for the model provides the two confusion matrices shown in the graphic. Marketing understands the relative costs of missing at-risk customers versus offering incentives to customers who are not at risk. Therefore, you need their advice on how to set the appropriate threshold on the churn model.
You are meeting with the Marketing team. In the meeting, you plan to state: “Raising the threshold from 0.5 to 0.75 reduces the number of unnecessary incentives offered, at the cost of missing more of the customers who churned.”
What is the most appropriate visual to reinforce this statement?
[Images for options A) through D) are not reproduced here.]
- A . Option A
- B . Option B
- C . Option C
- D . Option D
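The trade-off described in the statement can be sketched with a small, made-up example. The scores and churn labels below are invented purely for illustration (they are not the data behind the graphic), and `confusion_counts` is a hypothetical helper name.

```python
# Hypothetical model scores and true churn labels (1 = churned), invented for illustration.
scores = [0.95, 0.85, 0.80, 0.70, 0.65, 0.60, 0.55, 0.40, 0.30, 0.10]
churned = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]


def confusion_counts(scores, labels, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are flagged as at-risk."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label == 1:
            tp += 1  # at-risk customer correctly offered an incentive
        elif flagged and label == 0:
            fp += 1  # unnecessary incentive
        elif not flagged and label == 1:
            fn += 1  # churner the model missed
        else:
            tn += 1
    return tp, fp, fn, tn


low = confusion_counts(scores, churned, 0.5)
high = confusion_counts(scores, churned, 0.75)
print("threshold 0.50 (tp, fp, fn, tn):", low)
print("threshold 0.75 (tp, fp, fn, tn):", high)
```

On this toy data, raising the threshold removes the false positives (unnecessary incentives) while the number of missed churners rises, which is the pattern the statement describes.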
You are using MADlib for Linear Regression analysis.
Which value does the statement return?
SELECT (linregr(depvar, indepvar)).r2 FROM zeta1;
- A . Goodness of fit
- B . Coefficients
- C . Standard error
- D . P-value
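For context, the `r2` field is the coefficient of determination (R-squared) of the fitted regression. The sketch below is not MADlib code; it is a plain-Python illustration of how R-squared can be computed for a one-variable least-squares fit, using invented data and a hypothetical `r_squared` helper.

```python
def r_squared(x, y):
    """R-squared of a simple least-squares line fitted to (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Invented, nearly linear data.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(x, y)
print(round(r2, 4))
```

For a least-squares fit with an intercept, this value lies between 0 and 1, with values near 1 indicating that the line explains most of the variance in the dependent variable.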
Variable D is not significantly impacting the dependent variable.
After seeing your findings, the majority of your team agreed that variable B should be positively impacting the dependent variable.
What is a possible reason the coefficient for variable B was negative and not positive?
- A . Variable B is interacting with another variable due to correlated inputs
- B . Variable B needs a quadratic transformation due to its relationship to the dependent variable
- C . The information gain from variable B is already provided by another variable
- D . Variable B needs a logarithmic transformation due to its relationship to the dependent variable
Refer to the Exhibit.
You are creating an OLAP query that outputs summary rows of subtotals and grand totals in addition to regular rows that may contain NULL, as shown in the exhibit.
Which function can you use in your query to distinguish a regular row from a subtotal row?
- A . GROUPING
- B . RANK
- C . GROUP_ID
- D . ROLLUP
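The ambiguity behind this question is that a NULL in a grouped column can mean either "the data value was NULL" or "this is a summary row". A grouping indicator (0 for a regular row, 1 for a summary row) resolves it. The sketch below emulates that idea in plain Python rather than SQL; the `sales` data and the row layout are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical (region, amount) rows; one region is genuinely NULL (None).
sales = [("East", 10), ("East", 5), (None, 7), ("West", 3)]

# Regular group totals, one per distinct region value (including None).
by_region = defaultdict(int)
for region, amount in sales:
    by_region[region] += amount

# Each output row is (region, grouping_flag, total).
# grouping_flag = 0 marks a regular row, 1 marks the grand-total row.
rows = [(region, 0, total) for region, total in by_region.items()]
rows.append((None, 1, sum(amount for _, amount in sales)))

for row in rows:
    print(row)
```

Two rows in the output have `None` in the region column, and only the flag tells them apart.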
In a Logistic Regression, the coefficient for “age” equals -3.
What is the correct interpretation of the Logistic Regression coefficient, holding all other variables constant?
- A . When age decreases by 3 units, the odds of response are multiplied by e^(-3), or about 0.05
- B . When age increases by 1 unit, the odds of response are multiplied by e^(-3), or about 0.05
- C . For every 1 unit increase in age, the dependent variable is multiplied by -3
- D . For every 1 unit increase in age, the dependent variable is reduced by 3 units
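The arithmetic behind the odds interpretation can be checked directly. In a logistic regression the log-odds are linear in the inputs, so a one-unit change in a variable multiplies the odds by the exponential of its coefficient. In the sketch below, the intercept and the starting age are arbitrary assumptions; only the coefficient of -3 comes from the question.

```python
import math

coef_age = -3.0    # coefficient from the question
intercept = 1.5    # arbitrary assumption, for illustration only


def odds(age):
    """Odds of response implied by log-odds = intercept + coef_age * age."""
    return math.exp(intercept + coef_age * age)


age = 2.0          # arbitrary starting value
ratio = odds(age + 1) / odds(age)
odds_multiplier = math.exp(coef_age)

print(round(ratio, 4), round(odds_multiplier, 4))
```

The ratio does not depend on the intercept or on the starting age, which is why the interpretation is stated "holding all other variables constant".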
In a user-defined aggregate function, what is FFUNC?
- A . Optional final calculation function
- B . Window function
- C . State transition function
- D . Segment-level calculation function
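The roles of the functions that make up a user-defined aggregate can be sketched in plain Python. This is not database code: the function names `sfunc`, `merge`, and `ffunc`, the `(total, count)` state layout, and the two "segments" of data are all assumptions chosen for illustration, here implementing an average.

```python
from functools import reduce


def sfunc(state, value):
    """State transition: fold one input value into the running (total, count)."""
    total, count = state
    return (total + value, count + 1)


def merge(left, right):
    """Combine two partial states, e.g. ones computed on separate segments."""
    return (left[0] + right[0], left[1] + right[1])


def ffunc(state):
    """Final calculation: turn the accumulated state into the result."""
    total, count = state
    return total / count if count else None


# Hypothetical data split across two segments.
segments = [[1.0, 2.0, 3.0], [4.0, 5.0]]
partials = [reduce(sfunc, segment, (0.0, 0)) for segment in segments]
result = ffunc(reduce(merge, partials))
print(partials, result)
```

The final step is only needed when the accumulated state is not already the answer; a plain sum, for example, could return its state unchanged.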
You have run a Linear Regression model on the data shown in the graphic.
Which value is a reasonable guess for R-squared?
- A . -0.8
- B . 0.8
- C . 0.25
- D . 1.25
A data scientist wants to predict the probability of death from heart disease based on three risk factors: age, gender, and blood cholesterol level.
What is the most appropriate method for this project?
- A . Logistic regression
- B . Linear regression
- C . K-means clustering
- D . Apriori algorithm
In time series analysis, what function is examined to identify the order of the moving average component of an ARIMA model?
- A . Exponential function
- B . Arithmetic mean function
- C . Autocorrelation function
- D . Geometric mean function
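As a rough illustration of how the sample autocorrelation behaves for a moving average process, the sketch below simulates an MA(1) series and computes its autocorrelation at the first few lags. The coefficient `theta = 0.8`, the series length, the random seed, and the `sample_acf` helper are assumptions chosen for the example.

```python
import random


def sample_acf(series, max_lag):
    """Sample autocorrelation of a series at lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + lag] for t in range(n - lag)) / c0
        for lag in range(max_lag + 1)
    ]


random.seed(0)
theta = 0.8  # assumed MA(1) coefficient
noise = [random.gauss(0, 1) for _ in range(5001)]
# MA(1): each value is the current shock plus theta times the previous shock.
ma1 = [noise[t] + theta * noise[t - 1] for t in range(1, 5001)]

acf = sample_acf(ma1, 4)
print([round(value, 3) for value in acf])
```

For an MA(q) process the theoretical autocorrelation is zero beyond lag q, so in this simulation the lag-1 value is clearly non-zero while the values at lags 2 and above stay close to zero.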