Practice Free AIF-C01 Exam Online Questions
A bank has fine-tuned a large language model (LLM) to expedite the loan approval process. During an external audit of the model, the company discovered that the model was approving loans at a faster pace for a specific demographic than for other demographics.
How should the bank fix this issue MOST cost-effectively?
- A . Include more diverse training data. Fine-tune the model again by using the new data.
- B . Use Retrieval Augmented Generation (RAG) with the fine-tuned model.
- C . Use AWS Trusted Advisor checks to eliminate bias.
- D . Pre-train a new LLM with more diverse training data.
What does inference refer to in the context of AI?
- A . The process of creating new AI algorithms
- B . The use of a trained model to make predictions or decisions on unseen data
- C . The process of combining multiple AI models into one model
- D . The method of collecting training data for AI systems
B
Explanation:
Inference = applying a trained ML model to new, unseen data to make predictions, classifications, or generate outputs.
A is algorithm research, C refers to ensemble learning, D is data collection.
Reference: AWS ML Glossary C Inference
In AWS, which service is used to extract text from scanned documents?
- A . Amazon Transcribe
- B . Amazon Rekognition
- C . Amazon Textract
- D . Amazon Polly
D
Explanation:
Amazon Transcribe is a service that uses machine learning to convert audio data to text. Amazon Polly is a machine learning service that converts text to speech. Amazon Textract is a machine learning service that can extract text from scanned documents. Amazon Rekognition is a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications.
A company wants to extract key insights from large policy documents to increase employee efficiency.
- A . Regression
- B . Clustering
- C . Summarization
- D . Classification
C
Explanation:
Summarization is a natural language processing (NLP) task that condenses long documents into concise, meaningful summaries while retaining the key information.
Regression predicts numerical values.
Clustering groups similar items.
Classification assigns data into predefined categories.
Reference: AWS NLP Use Cases C Summarization
A loan company is building a generative AI-based solution to offer new applicants discounts based on specific business criteria. The company wants to build and use an AI model responsibly to minimize bias that could negatively affect some customers.
Which actions should the company take to meet these requirements? (Select TWO.)
- A . Detect imbalances or disparities in the data.
- B . Ensure that the model runs frequently.
- C . Evaluate the model’s behavior so that the company can provide transparency to stakeholders.
- D . Use the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) technique to ensure that the model is 100% accurate.
- E . Ensure that the model’s inference time is within the accepted limits.
A,C
Explanation:
To build an AI model responsibly and minimize bias, it is essential to ensure fairness and transparency throughout the model development and deployment process. This involves detecting and mitigating data imbalances and thoroughly evaluating the model’s behavior to understand its impact on different groups.
Option A (Correct): "Detect imbalances or disparities in the data": This is correct because identifying and addressing data imbalances or disparities is a critical step in reducing bias. AWS provides tools like Amazon SageMaker Clarify to detect bias during data preprocessing and model training.
Option C (Correct): "Evaluate the model’s behavior so that the company can provide transparency to stakeholders": This is correct because evaluating the model’s behavior for fairness and accuracy is key to ensuring that stakeholders understand how the model makes decisions. Transparency is a crucial aspect of responsible AI.
Option B: "Ensure that the model runs frequently" is incorrect because the frequency of model runs does not address bias.
Option D: "Use the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) technique to ensure that the model is 100% accurate" is incorrect because ROUGE is a metric for evaluating the quality of text summarization models, not for minimizing bias.
Option E: "Ensure that the model’s inference time is within the accepted limits" is incorrect as it relates to performance, not bias reduction.
AWS AI Practitioner
Reference: Amazon SageMaker Clarify: AWS offers tools such as SageMaker Clarify for detecting bias in datasets and models, and for understanding model behavior to ensure fairness and transparency.
Responsible AI Practices: AWS promotes responsible AI by advocating for fairness, transparency, and inclusivity in model development and deployment.
A company stores millions of PDF documents in an Amazon S3 bucket. The company needs to extract the text from the PDFs, generate summaries of the text, and index the summaries for fast searching.
Which combination of AWS services will meet these requirements? (Select TWO.)
- A . Amazon Translate
- B . Amazon Bedrock
- C . Amazon Transcribe
- D . Amazon Polly
- E . Amazon Textract
B, E
Explanation:
Amazon Textract (E) automatically extracts text and structured data from scanned documents, such as PDFs.
Amazon Bedrock (B) offers access to LLMs (such as Amazon Titan or Anthropic Claude) for tasks like summarization and generating embeddings for search.
Workflow:
Amazon Textract extracts text from PDFs in S3.
Amazon Bedrock LLMs summarize the extracted text.
(Optional: Summaries can be indexed using Amazon OpenSearch or another search solution.)
A (Translate) is for language translation, not extraction or summarization.
C (Transcribe) is for audio to text, not PDFs.
D (Polly) is for text-to-speech.
“Amazon Textract extracts text, forms, and tables from scanned documents… Bedrock provides generative AI models to perform summarization and other text generation tasks.”
(Reference: Amazon Textract, Amazon Bedrock, AWS GenAI RAG Reference)
A company trained an ML model on Amazon SageMaker to predict customer credit risk. The model shows 90% recall on training data and 40% recall on unseen testing data.
Which conclusion can the company draw from these results?
- A . The model is overfitting on the training data.
- B . The model is underfitting on the training data.
- C . The model has insufficient training data.
- D . The model has insufficient testing data.
A
Explanation:
The ML model shows 90% recall on training data but only 40% recall on unseen testing data, indicating a significant performance drop. This discrepancy suggests the model has learned the training data too well, including noise and specific patterns that do not generalize to new data, which is a classic sign of overfitting.
Exact Extract from AWS AI Documents:
From the Amazon SageMaker Developer Guide:
"Overfitting occurs when a model performs well on training data but poorly on unseen test data, as it has learned patterns specific to the training set, including noise, that do not generalize. A large gap between training and testing performance metrics, such as recall, is a common indicator of overfitting."
(Source: Amazon SageMaker Developer Guide, Model Evaluation and Overfitting)
Detailed
Option A: The model is overfitting on the training data. This is the correct answer. The significant drop in recall from 90% (training) to 40% (testing) indicates the model is overfitting, as it performs well on training data but fails to generalize to unseen data.
Option B: The model is underfitting on the training data. Underfitting occurs when the model performs poorly on both training and testing data due to insufficient learning. With 90% recall on training data, the model is not underfitting.
Option C: The model has insufficient training data. Insufficient training data could lead to poor performance, but the high recall on training data (90%) suggests the model has learned the training data well, pointing to overfitting rather than a lack of data.
Option D: The model has insufficient testing data. Insufficient testing data might lead to unreliable test metrics, but it does not explain the large performance gap between training and testing, which is more indicative of overfitting.
Reference: Amazon SageMaker Developer Guide: Model Evaluation and Overfitting
(https://docs.aws.amazon.com/sagemaker/latest/dg/model-evaluation.html) AWS AI Practitioner Learning Path: Module on Model Performance and Evaluation
AWS Documentation: Understanding Overfitting and Underfitting (https://aws.amazon.com/machine-learning/)
A company is using supervised learning to train an AI model on a small labeled dataset that is specific to a target task.
Which step of the foundation model (FM) lifecycle does this describe?
- A . Fine-tuning
- B . Data selection
- C . Pre-training
- D . Evaluation
A
Explanation:
Fine-tuning involves training an already pre-trained FM on a smaller, labeled dataset for task specialization.
Data selection is about curating training data.
Pre-training is the initial training phase on massive datasets.
Evaluation happens after training, not during.
Reference: AWS Documentation C Fine-tuning in Amazon Bedrock
A company wants to learn about generative AI applications in an experimental environment.
Which solution will meet this requirement MOST cost-effectively?
- A . Amazon Q Developer
- B . Amazon SageMaker JumpStart
- C . Amazon Bedrock PartyRock
- D . Amazon Q Business
A company is using a large language model (LLM) on Amazon Bedrock to build a chatbot. The chatbot processes customer support requests. To resolve a request, the customer and the chatbot must interact a few times.
Which solution gives the LLM the ability to use content from previous customer messages?
- A . Turn on model invocation logging to collect messages.
- B . Add messages to the model prompt.
- C . Use Amazon Personalize to save conversation history.
- D . Use Provisioned Throughput for the LLM.
B
Explanation:
The company is building a chatbot using an LLM on Amazon Bedrock, and the chatbot needs to use content from previous customer messages to resolve requests. Adding previous messages to the model prompt (also known as providing conversation history) enables the LLM to maintain context across interactions, allowing it to respond coherently based on the ongoing conversation.
Exact Extract from AWS AI Documents:
From the AWS Bedrock User Guide:
"To enable a large language model (LLM) to maintain context in a conversation, you can include previous messages in the model prompt. This approach, often referred to as providing conversation history, allows the LLM to generate responses that are contextually relevant toprior interactions."
(Source: AWS Bedrock User Guide, Building Conversational Applications)
Detailed
Option A: Turn on model invocation logging to collect messages. Model invocation logging records interactions for auditing or debugging but does not provide the LLM with access to previous messages during inference to maintain conversation context.
Option B: Add messages to the model prompt. This is the correct answer. Including previous messages in the prompt gives the LLM the conversation history it needs to respond appropriately, a common practice for chatbots on Amazon Bedrock.
Option C: Use Amazon Personalize to save conversation history. Amazon Personalize is for building
recommendation systems, not for managing conversation history in a chatbot. This option is irrelevant.
Option D: Use Provisioned Throughput for the LLM. Provisioned Throughput in Amazon Bedrock ensures consistent performance for model inference but does not address the need to use previous messages in the conversation.
Reference: AWS Bedrock User Guide: Building Conversational Applications (https://docs.aws.amazon.com/bedrock/latest/userguide/conversational-apps.html)
AWS AI Practitioner Learning Path: Module on Generative AI and Chatbots
Amazon Bedrock Developer Guide: Managing Conversation Context (https://aws.amazon.com/bedrock/)
