Practice Free D-DS-FN-23 Exam Online Questions
Imagine you are trying to hire a Data Scientist for your team. In addition to technical ability and quantitative background, which other essential trait would you look for in people applying for this position?
- A . Communication skill
- B . Scientific background
- C . Domain expertise
- D . Well Organized
Which word or phrase completes the statement: “Discovering relationships is to Association Rules as generating forecasts is to __________.”?
- A . Clustering
- B . Classification
- C . Text Analysis
- D . Time Series Analysis
You received 100,000 home loan records and want to quickly determine if there is any correlation between mortgage age and mortgage amount before conducting advanced analysis.
Which tool should be used for the preliminary analysis?
- A . Scatter plot
- B . Stacked Bar chart
- C . Box and Whisker plot
- D . Histogram
You are using a Logistic Regression model to determine if an applicant’s gender is a factor in determining whether or not they receive a bank loan. When you plot the results, you notice that the regression coefficient is zero.
What can be determined?
- A . Sample size of the data is too small
- B . Applicant’s gender influences the loan decision
- C . Sample size of the data is too large
- D . Applicant’s gender does not influence the loan decision
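As a rough illustration of what a zero coefficient means in a logistic regression, the sketch below computes predicted probabilities and the odds ratio for a binary predictor. The intercept value and the function names are hypothetical and chosen only for the example. In practice an estimated coefficient is rarely exactly zero, so its standard error or p-value is usually examined as well.

```python
import math

def predicted_probability(intercept, coefficient, x):
    """Logistic model: p = 1 / (1 + exp(-(intercept + coefficient * x)))."""
    z = intercept + coefficient * x
    return 1.0 / (1.0 + math.exp(-z))

def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in x."""
    return math.exp(coefficient)

# Hypothetical fitted values: intercept -0.4, coefficient on the binary predictor 0.0
p_group0 = predicted_probability(-0.4, 0.0, 0)
p_group1 = predicted_probability(-0.4, 0.0, 1)

print(odds_ratio(0.0))        # odds are unchanged between the two groups
print(p_group0 == p_group1)   # predicted probabilities are identical
```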
Only three variables (A, B, and C) have a significant correlation with sales.
You build a linear regression model on the dependent variable of sales with the independent variables of A, B, and C. The results of the regression are seen in the exhibit.
Which interpretation is supported by the analysis?
- A . Variables A, B, and C are significantly impacting sales, but are not effectively estimating sales
- B . Variables A, B, and C are significantly impacting sales and are effectively estimating sales
- C . Due to the R² of 0.10, the model is not valid; the linear regression should be rerun with all 15 variables forced into the model to increase the R²
- D . Due to the R² of 0.10, the model is not valid; a different analytical model should be attempted
What should be subtracted to remove a simple linear trend from a time series?
- A . Least-squares-fit line
- B . Expected squared deviation
- C . Expected absolute deviation
- D . Cubic-spline
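For readers who want to see linear detrending in practice, here is a minimal sketch that fits a straight line by ordinary least squares and subtracts it from the series. The synthetic series and the helper names `fit_line` and `detrend` are assumptions made for illustration, not part of the exam material.

```python
def fit_line(ys):
    """Ordinary least-squares fit of ys against t = 0, 1, ..., n-1."""
    n = len(ys)
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept

def detrend(ys):
    """Subtract the fitted line from each observation."""
    slope, intercept = fit_line(ys)
    return [y - (intercept + slope * t) for t, y in enumerate(ys)]

# Synthetic series: linear trend plus an alternating +1/-1 component
series = [2.0 + 0.5 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(10)]
residuals = detrend(series)
print(residuals)
```

Refitting a line to `residuals` should give a slope close to zero, which is a quick sanity check that the linear component has been removed.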
A business colleague who is new to Hadoop approaches you with a question. The colleague wants to know the best approach to access their data. The colleague has previously worked extensively with SQL and databases.
Which query interface should be recommended?
- A . Hive
- B . Pig
- C . Howl
- D . HBase
What is the primary bottleneck in text classification?
- A . The availability of tagged training data.
- B . The ability to parse unstructured text data.
- C . The high dimensionality of text data.
- D . The fact that text corpora are dynamic.
On analyzing your time series data you suspect that the data represented as y_1, y_2, y_3, ..., y_{n-1}, y_n may have a trend component that is quadratic in nature.
Which pattern of data will indicate that the trend in the time series data is quadratic in nature?
- A . (y_3 - y_2) - (y_2 - y_1) = ... = (y_n - y_{n-1}) - (y_{n-1} - y_{n-2})
- B . (y_2 - y_1) = (y_3 - y_2) = ... = (y_n - y_{n-1})
- C . ((y_2 - y_1) / y_1) * 100% = ... = ((y_n - y_{n-1}) / y_{n-1}) * 100%
- D . (y_4 - y_2) - (y_3 - y_1) = ... = (y_n - y_{n-2}) - (y_{n-1} - y_{n-3})
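Difference patterns like those in the options can be inspected numerically. The sketch below computes first and second differences of a synthetic, noise-free quadratic series; the coefficients and the helper name `diff` are hypothetical choices for illustration. Real data with noise would show only approximately constant values.

```python
def diff(ys):
    """Successive differences: ys[1]-ys[0], ys[2]-ys[1], ..."""
    return [b - a for a, b in zip(ys, ys[1:])]

# Synthetic quadratic trend y_t = 3t^2 + 2t + 1 (coefficients chosen arbitrarily)
quadratic = [3 * t * t + 2 * t + 1 for t in range(8)]

first = diff(quadratic)   # first differences change from step to step
second = diff(first)      # second differences of a quadratic are constant

print(first)
print(second)
```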
Your colleague, who is new to Hadoop, approaches you with a question. They want to know how best to access their data. This colleague has a strong background in data flow languages and programming.
Which query interface would you recommend?
- A . Pig
- B . Hive
- C . Howl
- D . HBase