
Generative AI foundation models in SageMaker Canvas

Amazon SageMaker Canvas provides generative AI foundation models that you can use to start conversational chats. These content generation models are trained on large amounts of text data to learn the statistical patterns and relationships between words, and they can produce coherent text that is statistically similar to the text on which they were trained. You can use this capability to increase your productivity by doing the following:

  • Generate content, such as document outlines, reports, and blogs

  • Summarize large corpora of text, such as earnings call transcripts, annual reports, or chapters of user manuals

  • Extract insights and key takeaways from large passages of text, such as meeting notes or narratives

  • Improve text and catch grammatical errors or typos

The foundation models are a combination of Amazon SageMaker JumpStart and Amazon Bedrock large language models (LLMs). Canvas offers the following models:


Amazon Titan (Amazon Bedrock model)

Amazon Titan is a powerful, general-purpose language model that you can use for tasks such as summarization, text generation (such as creating a blog post), classification, open-ended Q&A, and information extraction. It is pretrained on large datasets, making it suitable for complex tasks and reasoning. To continue supporting best practices in the responsible use of AI, Amazon Titan foundation models are built to detect and remove harmful content in the data, reject inappropriate content in the user input, and filter model outputs that contain inappropriate content (such as hate speech, profanity, and violence).

Anthropic Claude Instant (Amazon Bedrock model)

Anthropic's Claude Instant is a faster and more cost-effective yet still very capable model. This model can handle a range of tasks including casual dialogue, text analysis, summarization, and document question answering. Just like Claude-2, Claude Instant can support up to 100,000 tokens in each prompt, equivalent to about 200 pages of information.
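Outside the Canvas UI, Bedrock models can also be invoked programmatically. As a minimal sketch (assuming Bedrock's Anthropic text-completions request shape, in which the prompt is wrapped in alternating Human/Assistant turns), the request body for a Claude Instant call can be built like this:

```python
import json

def claude_instant_body(user_prompt: str, max_tokens: int = 500) -> str:
    """Build the JSON request body in the Anthropic text-completions shape
    used by Claude Instant on Amazon Bedrock: the prompt is wrapped in
    Human/Assistant turns, and max_tokens_to_sample caps the response."""
    return json.dumps({
        "prompt": f"\n\nHuman: {user_prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
        "temperature": 0.5,
    })

body = claude_instant_body("Summarize the attached earnings call transcript.")
print(body)
```

The resulting string would be passed as the `body` of a Bedrock Runtime `InvokeModel` request along with the model's ID.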

Anthropic Claude-2 (Amazon Bedrock model)

Claude-2 is Anthropic's most powerful model, which excels at a wide range of tasks from sophisticated dialogue and creative content generation to detailed instruction following. Claude-2 can take up to 100,000 tokens in each prompt, equivalent to about 200 pages of information. It can generate longer responses compared to its prior version. It supports use cases such as question answering, information extraction, removing PII, content generation, multiple-choice classification, roleplay, comparing text, summarization, and document Q&A with citation.

Falcon-7B-Instruct (JumpStart model)

Falcon-7B-Instruct has 7 billion parameters and was fine-tuned on a mixture of chat and instruct datasets. It is suitable as a virtual assistant and performs best when following instructions or engaging in conversation. Since the model was trained on large amounts of English-language web data, it carries the stereotypes and biases commonly found online and is not suitable for languages other than English. Compared to Falcon-40B-Instruct, Falcon-7B-Instruct is a slightly smaller and more compact model.

Falcon-40B-Instruct (JumpStart model)

Falcon-40B-Instruct has 40 billion parameters and was fine-tuned on a mixture of chat and instruct datasets. It is suitable as a virtual assistant and performs best when following instructions or engaging in conversation. Since the model was trained on large amounts of English-language web data, it carries the stereotypes and biases commonly found online and is not suitable for languages other than English. Compared to Falcon-7B-Instruct, Falcon-40B-Instruct is a slightly larger and more powerful model.

Jurassic-2 Mid (Amazon Bedrock model)

Jurassic-2 Mid is a high-performance text generation model trained on a massive corpus of text (current up to mid-2022). It is highly versatile, general-purpose, and capable of composing human-like text and solving complex tasks such as question answering, text classification, and many others. This model offers zero-shot instruction capabilities, allowing it to be directed with only natural language and without the use of examples. It performs up to 30% faster than its predecessor, the Jurassic-1 model.

Jurassic-2 Mid is AI21’s mid-sized model, carefully designed to strike the right balance between exceptional quality and affordability.

Jurassic-2 Ultra (Amazon Bedrock model)

Jurassic-2 Ultra is a high-performance text generation model trained on a massive corpus of text (current up to mid-2022). It is highly versatile, general-purpose, and capable of composing human-like text and solving complex tasks such as question answering, text classification, and many others. This model offers zero-shot instruction capabilities, allowing it to be directed with only natural language and without the use of examples. It performs up to 30% faster than its predecessor, the Jurassic-1 model.

Compared to Jurassic-2 Mid, Jurassic-2 Ultra is a slightly larger and more powerful model.

Llama-2-7b-Chat (JumpStart model)

Llama-2-7b-Chat is a foundation model by Meta that is suitable for engaging in meaningful and coherent conversations, generating new content, and extracting answers from existing notes. Since the model was trained on large amounts of English-language internet data, it carries the biases and limitations commonly found online and is best-suited for tasks in English.

Llama-2-13B-Chat (Amazon Bedrock model)

Llama-2-13B-Chat by Meta was fine-tuned on conversational data after initial training on internet data. It is optimized for natural dialog and engaging chat abilities, making it well-suited as a conversational agent. Compared to the smaller Llama-2-7b-Chat, Llama-2-13B-Chat has nearly twice as many parameters, allowing it to remember more context and produce more nuanced conversational responses. Like Llama-2-7b-Chat, Llama-2-13B-Chat was trained on English-language data and is best-suited for tasks in English.

Llama-2-70B-Chat (Amazon Bedrock model)

Like Llama-2-7b-Chat and Llama-2-13B-Chat, the Llama-2-70B-Chat model by Meta is optimized for engaging in natural and meaningful dialog. With 70 billion parameters, this large conversational model can remember more extensive context and produce highly coherent responses when compared to the more compact model versions. However, this comes at the cost of slower responses and higher resource requirements. Llama-2-70B-Chat was trained on large amounts of English-language internet data and is best-suited for tasks in English.

Mistral-7B (JumpStart model)

Mistral-7B by Mistral.AI is an excellent general-purpose language model suitable for a wide range of natural language processing (NLP) tasks such as text generation, summarization, and question answering. It uses grouped-query attention (GQA), which allows for faster inference, making it perform comparably to models with two or three times as many parameters. It was trained on a mixture of English-language text data including books, websites, and scientific papers, so it is best-suited for tasks in English.
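The grouped-query idea can be illustrated with a toy NumPy sketch (this is an illustration of the mechanism, not Mistral's actual implementation): several query heads share a single key/value head, which shrinks the key/value cache that must be read at each decoding step.

```python
import numpy as np

def grouped_query_attention(q, k, v):
    """Toy grouped-query attention. q has more heads than k/v; each group
    of query heads shares one key/value head, reducing KV-cache size.
    Shapes: q (n_q_heads, seq, d), k and v (n_kv_heads, seq, d)."""
    repeat = q.shape[0] // k.shape[0]
    k = np.repeat(k, repeat, axis=0)  # broadcast each KV head across its group
    v = np.repeat(v, repeat, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ v

rng = np.random.default_rng(0)
out = grouped_query_attention(
    rng.normal(size=(8, 4, 16)),  # 8 query heads
    rng.normal(size=(2, 4, 16)),  # only 2 KV heads: 4 query heads per group
    rng.normal(size=(2, 4, 16)),
)
print(out.shape)  # (8, 4, 16)
```

With 8 query heads but only 2 key/value heads, the cache stores a quarter of the key/value tensors that standard multi-head attention would, which is where the inference speedup comes from.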

Mistral-7B-Chat (JumpStart model)

Mistral-7B-Chat is a conversational model by Mistral.AI based on Mistral-7B. While Mistral-7B is best for general NLP tasks, Mistral-7B-Chat has been further fine-tuned on conversational data to optimize its abilities for natural, engaging chat. As a result, Mistral-7B-Chat generates more human-like responses and remembers the context of previous responses. Like Mistral-7B, this model is best-suited for English language tasks.

MPT-7B-Instruct (JumpStart model)

MPT-7B-Instruct is a model for long-form instruction-following tasks and can assist you with writing tasks such as text summarization and question answering. This model was fine-tuned on instruction-following data and can handle larger inputs, such as complex documents. Use this model when you want to process large bodies of text or want the model to generate long responses.

The foundation models from Amazon Bedrock are currently only available in the US East (N. Virginia) and US West (Oregon) Regions. Additionally, when using foundation models from Amazon Bedrock, you are charged based on the volume of input tokens and output tokens, as specified by each model provider. For more information, see the Amazon Bedrock pricing page. The JumpStart foundation models are deployed on SageMaker AI Hosting instances, and you are charged for the duration of usage based on the instance type used. For more information about the cost of different instance types, see the Amazon SageMaker AI Hosting: Real-Time Inference section on the SageMaker AI pricing page.
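As a rough illustration of how token-based billing adds up (the per-1,000-token rates below are placeholders for illustration only, not actual Amazon Bedrock prices; see the pricing page for real numbers):

```python
def bedrock_request_cost(input_tokens, output_tokens,
                         price_per_1k_input, price_per_1k_output):
    """Estimate the cost of one request from token counts and the
    provider's per-1,000-token input and output rates."""
    return (input_tokens / 1000) * price_per_1k_input \
         + (output_tokens / 1000) * price_per_1k_output

# Hypothetical rates, for illustration only.
cost = bedrock_request_cost(input_tokens=2_000, output_tokens=500,
                            price_per_1k_input=0.0008,
                            price_per_1k_output=0.0024)
print(f"${cost:.4f}")  # $0.0028
```

Note that Bedrock input and output tokens are usually priced at different rates, so a long prompt with a short answer costs differently from a short prompt with a long answer.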

Document querying is an additional feature that you can use to query and get insights from documents stored in indexes using Amazon Kendra. With this functionality, you can generate content from the context of those documents and receive responses that are specific to your business use case, as opposed to responses that are generic to the large amounts of data on which the foundation models were trained. For more information about indexes in Amazon Kendra, see the Amazon Kendra Developer Guide.
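Conceptually, document querying grounds the model in passages retrieved from your index rather than in its general training data. A minimal sketch of that pattern (the excerpts below are stand-ins for results an Amazon Kendra index would return):

```python
def build_grounded_prompt(question: str, excerpts: list) -> str:
    """Combine retrieved document excerpts with the user's question so the
    model answers from the provided context, not general knowledge."""
    context = "\n\n".join(f"[{i + 1}] {text}" for i, text in enumerate(excerpts))
    return (
        "Answer the question using only the excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What was Q3 revenue?",
    ["Q3 revenue was $12.4M, up 8% year over year.",  # stand-in excerpt
     "Operating expenses held flat in Q3."],          # stand-in excerpt
)
print(prompt)
```

In the real feature, Amazon Kendra performs the retrieval step against your index; the sketch only shows why the responses become specific to your documents instead of generic.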

If you would like to get responses from any of the foundation models that are customized to your data and use case, you can fine-tune foundation models. To learn more, see Fine-tune foundation models.

If you'd like to get predictions from an Amazon SageMaker JumpStart foundation model through an application or website, you can deploy the model to a SageMaker AI endpoint. SageMaker AI endpoints host your model, and you can send requests to the endpoint through your application code to receive predictions from the model. For more information, see Deploy your models to an endpoint.
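As a sketch of what that application code might look like (assuming a JumpStart text-generation endpoint that accepts the common `inputs`/`parameters` JSON shape; the endpoint name below is hypothetical), the application serializes a payload and sends it with the SageMaker Runtime `InvokeEndpoint` API:

```python
import json

def jumpstart_payload(prompt: str, max_new_tokens: int = 256) -> bytes:
    """Serialize a request body in the inputs/parameters shape accepted by
    many JumpStart text-generation endpoints."""
    return json.dumps({
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens, "temperature": 0.7},
    }).encode("utf-8")

payload = jumpstart_payload("Draft an outline for a product launch blog post.")

# The payload would then be sent with, e.g., boto3's sagemaker-runtime client:
# response = client.invoke_endpoint(
#     EndpointName="my-falcon-endpoint",  # hypothetical endpoint name
#     ContentType="application/json",
#     Body=payload,
# )
print(len(payload))
```

The exact body schema and generation parameters vary by model, so check the model's JumpStart documentation for the fields your endpoint expects.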