Practice Free MLS-C01 Exam Online Questions
A media company is building a computer vision model to analyze images that are on social media. The model consists of CNNs that the company trained by using images that the company stores in Amazon S3. The company used an Amazon SageMaker training job in File mode with a single Amazon EC2 On-Demand Instance.
Every day, the company updates the model by using about 10,000 images that the company has collected in the last 24 hours. The company configures training with only one epoch. The company wants to speed up training and lower costs without the need to make any code changes.
Which solution will meet these requirements?
- A . Instead of File mode, configure the SageMaker training job to use Pipe mode. Ingest the data from a pipe.
- B . Instead of File mode, configure the SageMaker training job to use FastFile mode with no other changes.
- C . Instead of On-Demand Instances, configure the SageMaker training job to use Spot Instances. Make no other changes.
- D . Instead of On-Demand Instances, configure the SageMaker training job to use Spot Instances. Implement model checkpoints.
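For orientation, the options above differ only in a few fields of the training job request. The following is a minimal sketch, not a complete request: it builds a partial `CreateTrainingJob` payload as a plain dictionary and omits required fields such as the role, output location, and instance configuration. The job name, bucket, image URI, timeouts, and the helper `build_training_request` are assumptions made for illustration.

```python
def build_training_request(input_mode="File", use_spot=False, checkpoint_s3_uri=None):
    """Return a partial CreateTrainingJob request as a plain dict (illustrative only)."""
    if input_mode not in ("File", "Pipe", "FastFile"):
        raise ValueError(f"unsupported input mode: {input_mode}")

    request = {
        "TrainingJobName": "daily-image-update",  # hypothetical name
        "AlgorithmSpecification": {
            # Placeholder image URI, not a real repository.
            "TrainingImage": "<account>.dkr.ecr.<region>.amazonaws.com/<image>:latest",
            "TrainingInputMode": input_mode,
        },
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": "s3://example-bucket/images/",  # hypothetical bucket
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
                "InputMode": input_mode,
            }
        ],
        "EnableManagedSpotTraining": use_spot,
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},  # assumed limit
    }

    if use_spot:
        # Managed spot training needs a total wait time at least as long as the runtime limit.
        request["StoppingCondition"]["MaxWaitTimeInSeconds"] = 7200
    if checkpoint_s3_uri:
        request["CheckpointConfig"] = {
            "S3Uri": checkpoint_s3_uri,
            "LocalPath": "/opt/ml/checkpoints",
        }
    return request


fastfile_request = build_training_request(input_mode="FastFile")
spot_request = build_training_request(
    use_spot=True, checkpoint_s3_uri="s3://example-bucket/checkpoints/"
)
```

Changing the input mode or enabling Spot capacity touches only the job configuration. Whether checkpoints are useful depends on the training script writing and restoring them, which is a code-level concern.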
A manufacturing company uses machine learning (ML) models to detect quality issues. The models use images that are taken of the company’s product at the end of each production step. The company has thousands of machines at the production site that generate one image per second on average. The company ran a successful pilot with a single manufacturing machine. For the pilot, ML specialists used an industrial PC that ran AWS IoT Greengrass with a long-running AWS Lambda function that uploaded the images to Amazon S3. The uploaded images invoked a Lambda function that was written in Python to perform inference by using an Amazon SageMaker endpoint that ran a custom model. The inference results were forwarded back to a web service that was hosted at the production site to prevent faulty products from being shipped.
The company scaled the solution out to all manufacturing machines by installing similarly configured industrial PCs on each production machine. However, latency for predictions increased beyond acceptable limits. Analysis shows that the internet connection is at its capacity limit.
How can the company resolve this issue MOST cost-effectively?
- A . Set up a 10 Gbps AWS Direct Connect connection between the production site and the nearest AWS Region. Use the Direct Connect connection to upload the images. Increase the size of the instances and the number of instances that are used by the SageMaker endpoint.
- B . Extend the long-running Lambda function that runs on AWS IoT Greengrass to compress the images and upload the compressed files to Amazon S3. Decompress the files by using a separate Lambda function that invokes the existing Lambda function to run the inference pipeline.
- C . Use auto scaling for SageMaker. Set up an AWS Direct Connect connection between the production site and the nearest AWS Region. Use the Direct Connect connection to upload the images.
- D . Deploy the Lambda function and the ML models onto the AWS IoT Greengrass core that is running on the industrial PCs that are installed on each machine. Extend the long-running Lambda function that runs on AWS IoT Greengrass to invoke the Lambda function with the captured images and run the inference on the edge component that forwards the results directly to the web service.
A Machine Learning team runs its own training algorithm on Amazon SageMaker. The training algorithm requires external assets. The team needs to submit both its own algorithm code and algorithm-specific parameters to Amazon SageMaker.
What combination of services should the team use to build a custom algorithm in Amazon SageMaker? (Choose two.)
- A . AWS Secrets Manager
- B . AWS CodeStar
- C . Amazon ECR
- D . Amazon ECS
- E . Amazon S3
A Machine Learning Specialist deployed a model that provides product recommendations on a company’s website. Initially, the model was performing very well and resulted in customers buying more products on average. However, within the past few months, the Specialist has noticed that the effect of product recommendations has diminished and customers are starting to return to their original habits of spending less. The Specialist is unsure of what happened, as the model has not changed from its initial deployment over a year ago.
Which method should the Specialist try to improve model performance?
- A . The model needs to be completely re-engineered because it is unable to handle product inventory changes
- B . The model’s hyperparameters should be periodically updated to prevent drift
- C . The model should be periodically retrained from scratch using the original data while adding a regularization term to handle product inventory changes
- D . The model should be periodically retrained using the original training data plus new data as product inventory changes
A Machine Learning Specialist prepared the following graph displaying the results of k-means for k = [1:10].
Considering the graph, what is a reasonable selection for the optimal choice of k?
- A . 1
- B . 4
- C . 7
- D . 10
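Graphs of this kind are usually read with the elbow heuristic: choose the k after which the within-cluster sum of squares (WCSS) stops dropping sharply. The sketch below is one simple way to automate that reading, using the largest discrete second difference. The function `elbow_k` and the WCSS values are made up for illustration and are not taken from the graph in the question.

```python
def elbow_k(wcss):
    """Return the k at the sharpest bend of a WCSS curve.

    wcss[i] is the within-cluster sum of squares for k = i + 1.
    The bend is measured by the discrete second difference.
    """
    if len(wcss) < 3:
        raise ValueError("need WCSS values for at least three values of k")

    best_k, best_bend = None, float("-inf")
    for i in range(1, len(wcss) - 1):
        bend = wcss[i - 1] - 2 * wcss[i] + wcss[i + 1]
        if bend > best_bend:
            best_k, best_bend = i + 1, bend
    return best_k


# Made-up WCSS values for k = 1..5, purely illustrative.
example_wcss = [100, 40, 30, 25, 22]
chosen_k = elbow_k(example_wcss)
```

The second difference is only one possible criterion. On noisy curves, visual inspection or a silhouette score may give a different answer.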
A Data Scientist is building a linear regression model and will use resulting p-values to evaluate the statistical significance of each coefficient. Upon inspection of the dataset, the Data Scientist discovers that most of the features are normally distributed. The plot of one feature in the dataset is shown in the graphic.
What transformation should the Data Scientist apply to satisfy the statistical assumptions of the linear regression model?
- A . Exponential transformation
- B . Logarithmic transformation
- C . Polynomial transformation
- D . Sinusoidal transformation
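The graphic is not reproduced here, so the shape of the feature has to be judged from the original plot. As a rough aid, the sketch below measures sample skewness before and after a candidate transformation. The sample values and the helper `skewness` are assumptions for illustration only; the logarithm is shown simply as one candidate to try on a right-skewed, strictly positive feature.

```python
import math


def skewness(values):
    """Population skewness: third central moment over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed, strictly positive sample.
sample = [1, 2, 2, 3, 4, 6, 9, 15, 40, 120]
logged = [math.log(v) for v in sample]

skew_before = skewness(sample)
skew_after = skewness(logged)
```

A skewness close to zero indicates a roughly symmetric distribution. Comparing `skew_before` with `skew_after` shows whether a transformation moved the feature toward symmetry.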
An e-commerce company needs a customized training model to classify images of its shirts and pants products. The company needs a proof of concept in 2 to 3 days with good accuracy.
Which compute choice should the Machine Learning Specialist select to train and achieve good accuracy on the model quickly?
- A . m5.4xlarge (general purpose)
- B . r5.2xlarge (memory optimized)
- C . p3.2xlarge (GPU accelerated computing)
- D . p3.8xlarge (GPU accelerated computing)
A Machine Learning Specialist is implementing a full Bayesian network on a dataset that describes public transit in New York City. One of the random variables is discrete, and represents the number of minutes New Yorkers wait for a bus given that the buses cycle every 10 minutes, with a mean of 3 minutes.
Which prior probability distribution should the ML Specialist use for this variable?
- A . Poisson distribution
- B . Uniform distribution
- C . Normal distribution
- D . Binomial distribution
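To compare the discrete candidates, it can help to compute their probability mass functions and check which ones can reproduce the stated mean of 3. The sketch below uses only the standard library. The binomial parameters (n = 10, p = 0.3) and the uniform support 0 to 9 are assumptions chosen for illustration and are not given in the question.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def mean_of(pmf, support):
    """Expected value of a discrete distribution over the given support."""
    return sum(k * pmf(k) for k in support)


# Poisson with rate 3; the support is truncated at 60, where the tail is negligible.
poisson_mean = mean_of(lambda k: poisson_pmf(k, 3.0), range(60))

# Binomial with assumed n = 10 and p = 0.3, which also has mean n * p = 3.
binomial_mean = mean_of(lambda k: binomial_pmf(k, 10, 0.3), range(11))

# Discrete uniform over the assumed support 0..9.
uniform_mean = mean_of(lambda k: 1 / 10, range(10))
```

Matching the mean is only one criterion. The support of the variable and the process that generates the waiting times also matter when choosing a prior.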
A company wants to classify user behavior as either fraudulent or normal. Based on internal research, a Machine Learning Specialist would like to build a binary classifier based on two features: age of account and transaction month.
The class distribution for these features is illustrated in the figure provided.
Based on this information, which model would have the HIGHEST accuracy?
- A . Long short-term memory (LSTM) model with scaled exponential linear unit (SELU)
- B . Logistic regression
- C . Support vector machine (SVM) with non-linear kernel
- D . Single perceptron with tanh activation function