Core concepts

The following core concepts are essential to understanding and getting the most out of AIP Agent Studio. You can learn more about applying these concepts in the Getting started tutorial.

AIP Agents

AIP Agents are interactive assistants equipped with enterprise-specific information and tools.

System prompt

System prompts are instructions for a large language model (LLM), written in natural language. Start with the most important information, such as an overview of the task, followed by the necessary data and guidance on using parameters and tools. Remember, an LLM only has access to the information you specifically provide.
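
The ordering advice above can be sketched in a few lines. This is a minimal illustration only: `build_system_prompt` and its section titles are hypothetical names chosen for this example, not part of AIP Agent Studio.

```python
def build_system_prompt(task_overview: str, data: str, guidance: str) -> str:
    """Join prompt sections in order of importance: task first, then data, then guidance."""
    sections = [
        ("Task", task_overview),
        ("Data", data),
        ("Guidance", guidance),
    ]
    # Skip empty sections so the prompt stays as short as possible.
    return "\n\n".join(
        f"{title}:\n{body.strip()}" for title, body in sections if body.strip()
    )


prompt = build_system_prompt(
    task_overview="Answer questions about open maintenance work orders.",
    data="Work orders are supplied with each user message.",
    guidance="If the answer is not in the supplied data, say so.",
)
print(prompt)
```

Keeping the sections in one ordered list makes it easy to check that the most important information comes first.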

Prompting strategy

Prompting strategy refers to how prompts are composed and structured to guide the LLM toward the desired outcome, whether through a single completion or a more complex, iterative process.

Single completion

Single completion is a strategy where the LLM generates a response in one attempt without iterative steps. This is suitable for straightforward tasks and typically results in a faster time to first token (TTFT) than an iterative strategy.

Chain-of-thought

Chain-of-thought reasoning inserts additional instructions into your prompt to enable the LLM to use tools and take multiple iterative steps to achieve its goal. This approach handles more complex tasks by breaking them down into manageable steps, though it may increase TTFT due to added complexity. In AIP Agent Studio, this iterative process can be viewed using the Inspect reasoning option.
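
A rough sketch of such an iterative loop is shown here. Everything in it is an assumption made for illustration: `fake_llm` is a stand-in for a real model call, `get_order_status` is a hypothetical tool, and the `CALL`/`FINAL` text protocol is invented for this example rather than taken from AIP Agent Studio.

```python
def get_order_status(order_id: str) -> str:
    """Hypothetical tool: look up data the model does not have on its own."""
    return {"A-100": "shipped"}.get(order_id, "unknown")


TOOLS = {"get_order_status": get_order_status}


def fake_llm(transcript: list[str]) -> str:
    """Stand-in for a model call: requests a tool once, then gives a final answer."""
    observations = [line for line in transcript if line.startswith("OBSERVATION:")]
    if not observations:
        return "CALL get_order_status A-100"
    status = observations[-1][len("OBSERVATION:"):].strip()
    return f"FINAL Order A-100 is {status}."


def run_agent(user_message: str, max_steps: int = 5) -> tuple[str, list[str]]:
    """Alternate between model replies and tool calls until a final answer appears."""
    transcript = [f"USER: {user_message}"]
    for _ in range(max_steps):
        reply = fake_llm(transcript)
        transcript.append(f"MODEL: {reply}")
        if reply.startswith("FINAL "):
            return reply[len("FINAL "):], transcript
        # Otherwise the reply has the form "CALL <tool_name> <argument>".
        _, tool_name, argument = reply.split(" ", 2)
        transcript.append(f"OBSERVATION: {TOOLS[tool_name](argument)}")
    raise RuntimeError("no final answer within max_steps")


answer, steps = run_agent("Where is order A-100?")
print(answer)
for step in steps:
    print(step)
```

The extra round trip through the tool is what makes this strategy slower to produce its first answer token than a single completion. The `max_steps` cap is a design choice of this sketch to keep the loop from running indefinitely.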

Retrieval-augmented generation (RAG)

Retrieval-augmented generation leverages external data sources to provide the LLM with relevant information dynamically. This method enhances the LLM's responses by ensuring they are based on the most current and contextually appropriate data.

Retrieval context

Retrieval context refers to the specific information retrieved in response to each message, which is used to generate a response. The process is as follows:

  1. The user sends a new message.
  2. Relevant content is fetched from the configured data sources based on the user's message.
  3. The relevant content is sent to the LLM along with the user's message.

This process is compatible with a single-completion prompting strategy, providing a faster TTFT than chain-of-thought. Review the retrieval context documentation for more detail.
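
The three retrieval steps can be sketched as follows. This is a simplified illustration under stated assumptions: the word-overlap ranking in `retrieve` is a stand-in for whatever search the configured data sources actually perform, and `DOCUMENTS`, `retrieve`, and `build_llm_input` are hypothetical names.

```python
import re

# Hypothetical data source: a handful of short documents.
DOCUMENTS = [
    "Pump P-7 requires a seal inspection every 90 days.",
    "Forklift batteries are charged in bay 3.",
    "Seal inspection reports are filed by the maintenance lead.",
]


def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of simple word tokens."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))


def retrieve(message: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Step 2: rank documents by the number of words shared with the message."""
    query = words(message)
    scored = [(len(query & words(doc)), doc) for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_llm_input(message: str, documents: list[str]) -> str:
    """Step 3: combine the retrieved content with the user's message."""
    context = "\n".join(f"- {doc}" for doc in retrieve(message, documents))
    return f"Relevant content:\n{context}\n\nUser message:\n{message}"


message = "How often does pump P-7 need a seal inspection?"  # Step 1
llm_input = build_llm_input(message, DOCUMENTS)
print(llm_input)
```

Because retrieval happens once per message and before the model is called, the model can answer in a single completion.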

Time to first token (TTFT)

Time to first token (TTFT) measures how quickly the LLM begins generating a response after receiving a prompt. A lower TTFT means the user sees the start of a response sooner, which is crucial for maintaining a smooth user experience. Optimizing TTFT involves minimizing the complexity and length of the prompt and ensuring efficient retrieval of relevant context. Chain-of-thought reasoning can increase TTFT due to the added complexity of its iterative steps.
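
One way to measure TTFT is to time how long a streaming response takes to yield its first token. The sketch below assumes a streaming interface that yields tokens one at a time. `fake_stream` is a hypothetical stand-in for a real model stream, and its delay is simulated.

```python
import time
from typing import Callable, Iterator


def fake_stream(prompt: str, startup_delay: float = 0.05) -> Iterator[str]:
    """Stand-in for a streaming model: waits, then yields tokens one by one."""
    time.sleep(startup_delay)  # simulated time spent before the first token
    for token in ["Hello", ",", " world"]:
        yield token


def measure_ttft(
    stream_factory: Callable[[str], Iterator[str]], prompt: str
) -> tuple[float, str]:
    """Return (seconds until the first token arrived, full response text)."""
    start = time.perf_counter()
    stream = stream_factory(prompt)
    first_token = next(stream)  # blocks until the first token is produced
    ttft = time.perf_counter() - start
    return ttft, first_token + "".join(stream)


ttft, text = measure_ttft(fake_stream, "Say hello")
print(f"TTFT: {ttft:.3f} s, response: {text!r}")
```

The timer stops at the first token rather than at the end of the response, so a long answer does not by itself raise the measured TTFT.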

Parameters

Parameters are variables within prompts that customize and control the LLM's behavior. They allow for dynamic input and can be adjusted based on the task requirements.
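
As a minimal illustration of the idea, the sketch below fills placeholders in a prompt using Python's standard `string.Template`. The `$role`, `$language`, and `$region` parameter names are hypothetical, and the placeholder syntax is an assumption of this example rather than the syntax used by AIP Agent Studio.

```python
from string import Template

# Hypothetical prompt with three parameters.
PROMPT_TEMPLATE = Template(
    "You are assisting a $role. Answer in $language and focus on the $region region."
)


def render_prompt(**parameters: str) -> str:
    """Fill in every placeholder; substitute() raises KeyError if one is missing."""
    return PROMPT_TEMPLATE.substitute(**parameters)


print(render_prompt(role="planner", language="English", region="EMEA"))
```

Failing loudly on a missing parameter avoids sending the LLM a prompt that still contains an unfilled placeholder.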

Tools

Tools are external functionalities or APIs that the LLM can use to perform specific actions or retrieve information beyond its base capabilities. AIP Agents with tools configured will use chain-of-thought reasoning.

Embeddings

Embeddings are numerical representations of text that capture semantic meaning. They are used to compare and retrieve similar pieces of text efficiently. In AIP Agent Studio, embeddings help identify relevant documents and context to provide accurate responses.
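
Similarity between embeddings is commonly scored with cosine similarity. The sketch below uses tiny hand-written three-dimensional vectors as an assumption for illustration. Real embeddings come from an embedding model and have many more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up vectors standing in for the embeddings of three documents.
EMBEDDINGS = {
    "pump maintenance schedule": [0.9, 0.1, 0.0],
    "forklift battery charging": [0.1, 0.9, 0.1],
    "holiday calendar": [0.0, 0.1, 0.9],
}

# Made-up vector standing in for the embedding of a user question about pumps.
query_vector = [0.8, 0.2, 0.1]

best_match = max(
    EMBEDDINGS, key=lambda text: cosine_similarity(query_vector, EMBEDDINGS[text])
)
print(best_match)
```

Documents whose vectors point in nearly the same direction as the query vector score close to 1.0 and are treated as the most relevant.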

Context window

The context window refers to the amount of text (usually measured in tokens) that an LLM can process at one time. In AIP Agent Studio, the context window includes the system prompt, conversation history, and the information injected to assist the LLM (including retrieval context, parameters, and tools). Exceeding the context window will result in an error, prompting users to create a new session to continue their interaction.
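
The budget described above can be sketched as a simple check. This example rests on loud assumptions: `count_tokens` counts whitespace-separated words as a crude stand-in for a real tokenizer, and `ContextWindowExceeded`, `remaining_tokens`, and the limit values are hypothetical.

```python
class ContextWindowExceeded(Exception):
    """Raised when the combined input no longer fits in the context window."""


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def remaining_tokens(
    system_prompt: str, history: list[str], injected: list[str], limit: int
) -> int:
    """Return how many tokens are left, or raise if the limit is exceeded."""
    used = (
        count_tokens(system_prompt)
        + sum(count_tokens(message) for message in history)
        + sum(count_tokens(item) for item in injected)
    )
    if used > limit:
        raise ContextWindowExceeded(f"{used} tokens exceeds the limit of {limit}")
    return limit - used


left = remaining_tokens(
    system_prompt="Answer questions about work orders.",
    history=["Which orders are open?", "Orders 12 and 14 are open."],
    injected=["Order 12: pump seal. Order 14: valve check."],
    limit=50,
)
print(left)
```

Because conversation history grows with every message, a long session eventually uses up the budget even when the system prompt and injected information stay the same size.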