Practice Free MLS-C01 Exam Online Questions
A machine learning (ML) specialist is building a credit score model for a financial institution. The ML specialist has collected data for the previous 3 years of transactions and third-party metadata that is related to the transactions.
After the ML specialist builds the initial model, the ML specialist discovers that the model has low accuracy for both the training data and the test data. The ML specialist needs to improve the accuracy of the model.
Which solutions will meet this requirement? (Select TWO.)
- A . Increase the number of passes on the existing training data. Perform more hyperparameter tuning.
- B . Increase the amount of regularization. Use fewer feature combinations.
- C . Add new domain-specific features. Use more complex models.
- D . Use fewer feature combinations. Decrease the number of numeric attribute bins.
- E . Decrease the number of training examples. Reduce the number of passes on the existing training data.
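Low accuracy on both the training data and the test data usually indicates underfitting, and the quickest way to confirm that diagnosis before retraining is to compare training and validation scores directly. A minimal sketch, assuming a scikit-learn workflow and a synthetic stand-in for the credit dataset:

```python
# Sketch: diagnose underfitting vs. overfitting by comparing train and test accuracy.
# The dataset below is a synthetic stand-in for the credit transaction data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))

# Low train_acc AND low test_acc  -> underfitting: add features or model capacity.
# High train_acc but low test_acc -> overfitting: add regularization or more data.
print(f"train={train_acc:.3f} test={test_acc:.3f}")
```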
While working on a neural network project, a Machine Learning Specialist discovers that some features in the data have a very high magnitude, causing those features to be weighted more heavily in the cost function.
What should the Specialist do to ensure better convergence during backpropagation?
- A . Dimensionality reduction
- B . Data normalization
- C . Model regularization
- D . Data augmentation for the minority class
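Features on very different scales dominate the gradient updates, so rescaling them to a common range is the standard preparation step before training a neural network. A minimal sketch, assuming scikit-learn's StandardScaler and a small hypothetical feature matrix:

```python
# Sketch: standardize features so no single column dominates the cost function.
# StandardScaler rescales each feature to zero mean and unit variance.
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: one feature in the thousands, one between 0 and 1.
X = np.array([[1200.0, 0.3],
              [3400.0, 0.7],
              [ 560.0, 0.1]])

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # fit on training data only; reuse the scaler for test data

print(X_scaled.mean(axis=0))  # approximately 0 per feature
print(X_scaled.std(axis=0))   # approximately 1 per feature
```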
A company is building a predictive maintenance model based on machine learning (ML). The data is stored in a fully private Amazon S3 bucket that is encrypted at rest with AWS Key Management Service (AWS KMS) CMKs. An ML specialist must run data preprocessing by using an Amazon SageMaker Processing job that is triggered from code in an Amazon SageMaker notebook. The job should read data from Amazon S3, process it, and upload it back to the same S3 bucket. The preprocessing code is stored in a container image in Amazon Elastic Container Registry (Amazon ECR). The ML specialist needs to grant permissions to ensure a smooth data preprocessing workflow.
Which set of actions should the ML specialist take to meet these requirements?
- A . Create an IAM role that has permissions to create Amazon SageMaker Processing jobs, S3 read and write access to the relevant S3 bucket, and appropriate KMS and ECR permissions. Attach the role to the SageMaker notebook instance. Create an Amazon SageMaker Processing job from the notebook.
- B . Create an IAM role that has permissions to create Amazon SageMaker Processing jobs. Attach the role to the SageMaker notebook instance. Create an Amazon SageMaker Processing job with an IAM role that has read and write permissions to the relevant S3 bucket, and appropriate KMS and ECR permissions.
- C . Create an IAM role that has permissions to create Amazon SageMaker Processing jobs and to access Amazon ECR. Attach the role to the SageMaker notebook instance. Set up both an S3 endpoint and a KMS endpoint in the default VPC. Create Amazon SageMaker Processing jobs from the notebook.
- D . Create an IAM role that has permissions to create Amazon SageMaker Processing jobs. Attach the role to the SageMaker notebook instance. Set up an S3 endpoint in the default VPC. Create Amazon SageMaker Processing jobs with the access key and secret key of the IAM user with appropriate KMS and ECR permissions.
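Whichever permission model is chosen, the mechanics of the job itself are the same: it runs under a role that must be able to read and write the S3 bucket, decrypt with the KMS CMK, and pull the container image from Amazon ECR. A minimal sketch of launching such a job from the notebook with the SageMaker Python SDK; the role ARN, image URI, bucket, and key ARN below are placeholders:

```python
# Sketch: run a SageMaker Processing job from a notebook with the SageMaker Python SDK.
# The role, image URI, bucket, and KMS key are hypothetical placeholders.
from sagemaker.processing import Processor, ProcessingInput, ProcessingOutput

processor = Processor(
    role="arn:aws:iam::111122223333:role/SageMakerProcessingRole",  # needs S3, KMS, and ECR permissions
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/preprocess:latest",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    volume_kms_key="arn:aws:kms:us-east-1:111122223333:key/example-key-id",  # encrypt job volumes
    output_kms_key="arn:aws:kms:us-east-1:111122223333:key/example-key-id",  # encrypt S3 output
)

processor.run(
    inputs=[ProcessingInput(source="s3://example-bucket/raw/",
                            destination="/opt/ml/processing/input")],
    outputs=[ProcessingOutput(source="/opt/ml/processing/output",
                              destination="s3://example-bucket/processed/")],
)
```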
A company is using Amazon Polly to translate plaintext documents to speech for automated company announcements. However, company acronyms are being mispronounced in the current documents.
How should a Machine Learning Specialist address this issue for future documents?
- A . Convert current documents to SSML with pronunciation tags.
- B . Create an appropriate pronunciation lexicon.
- C . Output speech marks to guide in pronunciation.
- D . Use Amazon Lex to preprocess the text files for pronunciation.
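Amazon Polly controls pronunciations either through SSML tags embedded in each document or through an uploaded pronunciation lexicon that is applied at synthesis time without editing the documents. A minimal boto3 sketch of the lexicon approach; the lexicon name and the example acronym are illustrative:

```python
# Sketch: upload a pronunciation lexicon so Polly expands an acronym correctly
# in future synthesis calls. The lexicon name and acronym are illustrative.
import boto3

polly = boto3.client("polly")

pls = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0"
    xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
    alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>HQ</grapheme>
    <alias>headquarters</alias>
  </lexeme>
</lexicon>"""

polly.put_lexicon(Name="acronyms", Content=pls)

response = polly.synthesize_speech(
    Text="All staff report to HQ at noon.",
    OutputFormat="mp3",
    VoiceId="Joanna",
    LexiconNames=["acronyms"],  # the lexicon is applied without changing the documents
)
```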
A company wants to predict the classification of documents that are created from an application. New documents are saved to an Amazon S3 bucket every 3 seconds. The company has developed three versions of a machine learning (ML) model within Amazon SageMaker to classify document text. The company wants to deploy these three versions to predict the classification of each document.
Which approach will meet these requirements with the LEAST operational overhead?
- A . Configure an S3 event notification that invokes an AWS Lambda function when new documents are created. Configure the Lambda function to create three SageMaker batch transform jobs, one batch transform job for each model for each document.
- B . Deploy all the models to a single SageMaker endpoint. Treat each model as a production variant. Configure an S3 event notification that invokes an AWS Lambda function when new documents are created. Configure the Lambda function to call each production variant and return the results of each model.
- C . Deploy each model to its own SageMaker endpoint. Configure an S3 event notification that invokes an AWS Lambda function when new documents are created. Configure the Lambda function to call each endpoint and return the results of each model.
- D . Deploy each model to its own SageMaker endpoint. Create three AWS Lambda functions. Configure each Lambda function to call a different endpoint and return the results. Configure three S3 event notifications to invoke the Lambda functions when new documents are created.
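Hosting several model versions behind one endpoint is done with production variants in the endpoint configuration, and a caller can target a specific variant per request. A minimal boto3 sketch, assuming the three models are already registered in SageMaker under hypothetical names:

```python
# Sketch: one endpoint with three production variants, one per model version.
# Model names, endpoint names, and instance types are hypothetical.
import boto3

sm = boto3.client("sagemaker")

sm.create_endpoint_config(
    EndpointConfigName="doc-classifier-config",
    ProductionVariants=[
        {"VariantName": f"variant-{i}", "ModelName": f"doc-classifier-v{i}",
         "InstanceType": "ml.m5.large", "InitialInstanceCount": 1,
         "InitialVariantWeight": 1.0}
        for i in (1, 2, 3)
    ],
)
sm.create_endpoint(EndpointName="doc-classifier", EndpointConfigName="doc-classifier-config")

# A Lambda function could then call each variant explicitly for every new document:
runtime = boto3.client("sagemaker-runtime")
for i in (1, 2, 3):
    result = runtime.invoke_endpoint(
        EndpointName="doc-classifier",
        TargetVariant=f"variant-{i}",  # route this request to one specific model version
        ContentType="text/csv",
        Body=b"example document text",
    )
```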
A machine learning (ML) specialist uploads a dataset to an Amazon S3 bucket that is protected by server-side encryption with AWS KMS keys (SSE-KMS). The ML specialist needs to ensure that an Amazon SageMaker notebook instance can read the dataset that is in Amazon S3.
Which solution will meet these requirements?
- A . Define security groups to allow all HTTP inbound and outbound traffic. Assign the security groups to the SageMaker notebook instance.
- B . Configure the SageMaker notebook instance to have access to the VPC. Grant permission in the AWS Key Management Service (AWS KMS) key policy to the notebook’s VPC.
- C . Assign an IAM role that provides S3 read access for the dataset to the SageMaker notebook. Grant permission in the KMS key policy to the IAM role.
- D . Assign the same KMS key that encrypts the data in Amazon S3 to the SageMaker notebook instance.
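Reading SSE-KMS encrypted objects takes two grants working together: the notebook's execution role needs s3:GetObject on the bucket, and the KMS key policy (or an IAM policy on the key) must allow that role to call kms:Decrypt. A minimal sketch of appending such a key-policy statement with boto3; the account ID, role name, and key ID are placeholders:

```python
# Sketch: add a KMS key-policy statement allowing a notebook execution role to
# decrypt the SSE-KMS dataset. Account ID, role name, and key ID are placeholders.
import json
import boto3

kms = boto3.client("kms")
key_id = "1234abcd-12ab-34cd-56ef-1234567890ab"

policy = json.loads(kms.get_key_policy(KeyId=key_id, PolicyName="default")["Policy"])
policy["Statement"].append({
    "Sid": "AllowNotebookRoleToDecryptDataset",
    "Effect": "Allow",
    "Principal": {"AWS": "arn:aws:iam::111122223333:role/SageMakerNotebookRole"},
    "Action": ["kms:Decrypt", "kms:DescribeKey"],
    "Resource": "*",
})
kms.put_key_policy(KeyId=key_id, PolicyName="default", Policy=json.dumps(policy))
```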
A company sells thousands of products on a public website and wants to automatically identify products with potential durability problems. The company has 1,000 reviews with date, star rating, review text, review summary, and customer email fields, but many reviews are incomplete and have empty fields. Each review has already been labeled with the correct durability result.
A machine learning specialist must train a model to identify reviews expressing concerns over product durability. The first model needs to be trained and ready to review in 2 days.
What is the MOST direct approach to solve this problem within 2 days?
- A . Train a custom classifier by using Amazon Comprehend.
- B . Build a recurrent neural network (RNN) in Amazon SageMaker by using Gluon and Apache MXNet.
- C . Train a built-in BlazingText model using Word2Vec mode in Amazon SageMaker.
- D . Use a built-in seq2seq model in Amazon SageMaker.
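With labeled review text already on hand, a managed text-classification service such as Amazon Comprehend can be trained from a two-column CSV of label and document text. A minimal boto3 sketch; the bucket, role ARN, and classifier name are placeholders:

```python
# Sketch: train an Amazon Comprehend custom classifier from labeled reviews.
# The CSV contains one "label,document text" row per review; names and ARNs are placeholders.
import boto3

comprehend = boto3.client("comprehend")

comprehend.create_document_classifier(
    DocumentClassifierName="durability-concern-classifier",
    LanguageCode="en",
    DataAccessRoleArn="arn:aws:iam::111122223333:role/ComprehendDataAccessRole",
    InputDataConfig={"S3Uri": "s3://example-bucket/reviews/labeled_reviews.csv"},
)

# Once training finishes, new reviews can be classified in real time
# (create_endpoint + classify_document) or in batches (start_document_classification_job).
```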
A financial services company wants to adopt Amazon SageMaker as its default data science environment. The company’s data scientists run machine learning (ML) models on confidential financial data. The company is worried about data egress and wants an ML engineer to secure the environment.
Which mechanisms can the ML engineer use to control data egress from SageMaker? (Choose three.)
- A . Connect to SageMaker by using a VPC interface endpoint powered by AWS PrivateLink.
- B . Use SCPs to restrict access to SageMaker.
- C . Disable root access on the SageMaker notebook instances.
- D . Enable network isolation for training jobs and models.
- E . Restrict notebook presigned URLs to specific IPs used by the company.
- F . Protect data with encryption at rest and in transit. Use AWS Key Management Service (AWS KMS) to manage encryption keys.
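Several of these controls are simply parameters on the training job: network isolation blocks all outbound calls from the training container, and a VPC configuration keeps traffic on private subnets that can reach S3 and KMS only through VPC endpoints. A minimal SageMaker Python SDK sketch; the subnets, security group, image URI, and role are placeholders:

```python
# Sketch: launch a training job with network isolation and VPC-only networking
# to limit data egress. All ARNs, subnets, and the image URI are placeholders.
from sagemaker.estimator import Estimator

estimator = Estimator(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/train:latest",
    role="arn:aws:iam::111122223333:role/SageMakerTrainingRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    enable_network_isolation=True,      # the container gets no outbound network access
    subnets=["subnet-0abc1234"],        # private subnets with S3/KMS VPC endpoints
    security_group_ids=["sg-0abc1234"],
    output_path="s3://example-bucket/artifacts/",
)

estimator.fit({"train": "s3://example-bucket/train/"})
```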
A machine learning (ML) specialist must develop a classification model for a financial services company. A domain expert provides the dataset, which is tabular with 10,000 rows and 1,020 features. During exploratory data analysis, the specialist finds no missing values and a small percentage of duplicate rows. There are correlation scores of > 0.9 for 200 feature pairs. The mean value of each feature is similar to its 50th percentile.
Which feature engineering strategy should the ML specialist use with Amazon SageMaker?
- A . Apply dimensionality reduction by using the principal component analysis (PCA) algorithm.
- B . Drop the features with low correlation scores by using a Jupyter notebook.
- C . Apply anomaly detection by using the Random Cut Forest (RCF) algorithm.
- D . Concatenate the features with high correlation scores by using a Jupyter notebook.
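With 200 highly correlated feature pairs in a 1,020-column table, dimensionality reduction is a common feature-engineering response, and SageMaker ships PCA as a built-in algorithm. A minimal SageMaker Python SDK sketch; the role ARN, bucket, component count, and the random stand-in data are hypothetical:

```python
# Sketch: reduce the 1,020 correlated features with SageMaker's built-in PCA algorithm.
# Role ARN, bucket, and num_components are hypothetical choices.
import numpy as np
from sagemaker import PCA

pca = PCA(
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    num_components=50,  # keep 50 principal components
    output_path="s3://example-bucket/pca-output/",
)

# Hypothetical stand-in for the 10,000 x 1,020 tabular dataset.
train_data = np.random.rand(10000, 1020).astype("float32")
records = pca.record_set(train_data)  # uploads the data as RecordIO-protobuf to S3
pca.fit(records)
```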
A company will use Amazon SageMaker to train and host a machine learning (ML) model for a marketing campaign. The majority of data is sensitive customer data. The data must be encrypted at rest. The company wants AWS to maintain the root of trust for the master keys and wants encryption key usage to be logged.
Which implementation will meet these requirements?
- A . Use encryption keys that are stored in AWS CloudHSM to encrypt the ML data volumes, and to encrypt the model artifacts and data in Amazon S3.
- B . Use SageMaker built-in transient keys to encrypt the ML data volumes. Enable default encryption for new Amazon Elastic Block Store (Amazon EBS) volumes.
- C . Use customer managed keys in AWS Key Management Service (AWS KMS) to encrypt the ML data volumes, and to encrypt the model artifacts and data in Amazon S3.
- D . Use AWS Security Token Service (AWS STS) to create temporary tokens to encrypt the ML storage volumes, and to encrypt the model artifacts and data in Amazon S3.
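Customer managed KMS keys are wired into SageMaker through training-job parameters: one key ID encrypts the attached ML storage volumes and another (or the same) encrypts the model artifacts written to Amazon S3, with every key use logged in AWS CloudTrail. A minimal boto3 sketch; the key ARN, role, image URI, and bucket names are placeholders:

```python
# Sketch: a training job whose ML volumes and S3 output are encrypted with a
# customer managed KMS key; key usage is logged by CloudTrail.
# All names, ARNs, and the image URI are placeholders.
import boto3

sm = boto3.client("sagemaker")
cmk_arn = "arn:aws:kms:us-east-1:111122223333:key/example-key-id"

sm.create_training_job(
    TrainingJobName="campaign-model-training",
    RoleArn="arn:aws:iam::111122223333:role/SageMakerTrainingRole",
    AlgorithmSpecification={
        "TrainingImage": "111122223333.dkr.ecr.us-east-1.amazonaws.com/train:latest",
        "TrainingInputMode": "File",
    },
    ResourceConfig={
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 50,
        "VolumeKmsKeyId": cmk_arn,  # encrypts the ML data volumes at rest
    },
    OutputDataConfig={
        "S3OutputPath": "s3://example-bucket/artifacts/",
        "KmsKeyId": cmk_arn,        # encrypts the model artifacts in S3
    },
    StoppingCondition={"MaxRuntimeInSeconds": 3600},
)
```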