Practice Free MLS-C01 Exam Online Questions
A Machine Learning Specialist uploads a dataset to an Amazon S3 bucket protected with server-side encryption using AWS KMS.
How should the ML Specialist define the Amazon SageMaker notebook instance so it can read the same dataset from Amazon S3?
- A . Define security group(s) to allow all HTTP inbound/outbound traffic and assign those security group(s) tothe Amazon SageMaker notebook instance.
- B . Сonfigure the Amazon SageMaker notebook instance to have access to the VPC. Grant permission in theKMS key policy to the notebook’s KMS role.
- C . Assign an IAM role to the Amazon SageMaker notebook with S3 read access to the dataset.
Grantpermission in the KMS key policy to that role. - D . Assign the same KMS key used to encrypt data in Amazon S3 to the Amazon SageMaker notebookinstance.
A machine learning (ML) engineer is using Amazon SageMaker automatic model tuning (AMT) to optimize a model’s hyperparameters. The ML engineer notices that the tuning jobs take a long time to run. The tuning jobs continue even when the jobs are not significantly improving against the objective metric.
The ML engineer needs the training jobs to optimize the hyperparameters more quickly.
How should the ML engineer configure the SageMaker AMT data types to meet these requirements?
- A . Set Strategy to the Bayesian value.
- B . Set RetryStrategy to a value of 1.
- C . Set ParameterRanges to the narrow range inferred from previous hyperparameter jobs.
- D . Set TrainingJobEarlyStoppingType to the AUTO value.
A Machine Learning Specialist is applying a linear least squares regression model to a dataset with 1 000 records and 50 features Prior to training, the ML Specialist notices that two features are perfectly linearly dependent
Why could this be an issue for the linear least squares regression model?
- A . It could cause the backpropagation algorithm to fail during training
- B . It could create a singular matrix during optimization which fails to define a unique solution
- C . It could modify the loss function during optimization causing it to fail during training
- D . It could introduce non-linear dependencies within the data which could invalidate the linear assumptions of the model
A data scientist wants to use Amazon Forecast to build a forecasting model for inventory demand for a retail company. The company has provided a dataset of historic inventory demand for its products as a .csv file stored in an Amazon S3 bucket. The table below shows a sample of the dataset.
How should the data scientist transform the data?
- A . Use ETL jobs in AWS Glue to separate the dataset into a target time series dataset and an item metadata dataset. Upload both datasets as .csv files to Amazon S3.
- B . Use a Jupyter notebook in Amazon SageMaker to separate the dataset into a related time series dataset and an item metadata dataset. Upload both datasets as tables in Amazon Aurora.
- C . Use AWS Batch jobs to separate the dataset into a target time series dataset, a related time series dataset, and an item metadata dataset. Upload them directly to Forecast from a local machine.
- D . Use a Jupyter notebook in Amazon SageMaker to transform the data into the optimized protobuf recordIO format. Upload the dataset in this format to Amazon S3.
A data scientist stores financial datasets in Amazon S3. The data scientist uses Amazon Athena to query the datasets by using SQL.
The data scientist uses Amazon SageMaker to deploy a machine learning (ML) model. The data scientist wants to obtain inferences from the model at the SageMaker endpoint However, when the data …. ntist attempts to invoke the SageMaker endpoint, the data scientist receives SOL statement failures. The data scientist’s 1AM user is currently unable to invoke the SageMaker endpoint.
Which combination of actions will give the data scientist’s 1AM user the ability to invoke the SageMaker endpoint? (Select THREE.)
- A . Attach the AmazonAthenaFullAccess AWS managed policy to the user identity.
- B . Include a policy statement for the data scientist’s 1AM user that allows the 1AM user to perform the sagemaker: lnvokeEndpoint action,
- C . Include an inline policy for the data scientist’s 1AM user that allows SageMaker to read S3 objects
- D . Include a policy statement for the data scientist’s 1AM user that allows the 1AM user to perform the sagemakerGetRecord action.
- E . Include the SQL statement "USING EXTERNAL FUNCTION ml_function_name" in the Athena SQL query.
- F . Perform a user remapping in SageMaker to map the 1AM user to another 1AM user that is on the hosted endpoint.
A Machine Learning Specialist is assigned to a Fraud Detection team and must tune an XGBoost model, which is working appropriately for test data. However, with unknown data, it is not working as expected.
The existing parameters are provided as follows.
Which parameter tuning guidelines should the Specialist follow to avoid overfitting?
- A . Increase the max_depth parameter value.
- B . Lower the max_depth parameter value.
- C . Update the objective to binary:logistic.
- D . Lower the min_child_weight parameter value.
A company is building a new version of a recommendation engine. Machine learning (ML) specialists need to keep adding new data from users to improve personalized recommendations. The ML specialists gather data from the users’ interactions on the platform and from sources such as external websites and social media.
The pipeline cleans, transforms, enriches, and compresses terabytes of data daily, and this data is stored in Amazon S3. A set of Python scripts was coded to do the job and is stored in a large Amazon EC2 instance. The whole process takes more than 20 hours to finish, with each script taking at least an hour. The company wants to move the scripts out of Amazon EC2 into a more managed solution that will eliminate the need to maintain servers.
Which approach will address all of these requirements with the LEAST development effort?
- A . Load the data into an Amazon Redshift cluster. Execute the pipeline by using SQL. Store the results in Amazon S3.
- B . Load the data into Amazon DynamoDB. Convert the scripts to an AWS Lambda function. Execute the pipeline by triggering Lambda executions. Store the results in Amazon S3.
- C . Create an AWS Glue job. Convert the scripts to PySpark. Execute the pipeline. Store the results in Amazon S3.
- D . Create a set of individual AWS Lambda functions to execute each of the scripts. Build a step function by using the AWS Step Functions Data Science SDK. Store the results in Amazon S3.
A tourism company uses a machine learning (ML) model to make recommendations to customers. The company uses an Amazon SageMaker environment and set hyperparameter tuning completion criteria to MaxNumberOfTrainingJobs.
An ML specialist wants to change the hyperparameter tuning completion criteria. The ML specialist wants to stop tuning immediately after an internal algorithm determines that tuning job is unlikely to improve more than 1% over the objective metric from the best training job.
Which completion criteria will meet this requirement?
- A . MaxRuntimelnSeconds
- B . TargetObjectiveMetricValue
- C . CompleteOnConvergence
- D . MaxNumberOfTrainingJobsNotlmproving
A machine learning specialist needs to analyze comments on a news website with users across the globe. The specialist must find the most discussed topics in the comments that are in either English or Spanish.
What steps could be used to accomplish this task? (Choose two.)
- A . Use an Amazon SageMaker BlazingText algorithm to find the topics independently from language.
Proceed with the analysis. - B . Use an Amazon SageMaker seq2seq algorithm to translate from Spanish to English, if necessary.
Use a SageMaker Latent Dirichlet Allocation (LDA) algorithm to find the topics. - C . Use Amazon Translate to translate from Spanish to English, if necessary. Use Amazon Comprehend topic modeling to find the topics.
- D . Use Amazon Translate to translate from Spanish to English, if necessary. Use Amazon Lex to extract topics form the content.
- E . Use Amazon Translate to translate from Spanish to English, if necessary. Use Amazon SageMaker Neural Topic Model (NTM) to find the topics.
A data scientist has developed a machine learning translation model for English to Japanese by using Amazon SageMaker’s built-in seq2seq algorithm with 500,000 aligned sentence pairs. While testing with sample sentences, the data scientist finds that the translation quality is reasonable for an example as short as five words. However, the quality becomes unacceptable if the sentence is 100 words long.
Which action will resolve the problem?
- A . Change preprocessing to use n-grams.
- B . Add more nodes to the recurrent neural network (RNN) than the largest sentence’s word count.
- C . Adjust hyperparameters related to the attention mechanism.
- D . Choose a different weight initialization type.