Getting started

Before you configure an evaluation suite, the associated Logic function must be published. Note that Evaluations does not yet support Logic functions whose final output is Ontology edits.

  1. Begin by selecting the 🔨 icon in the right-side toolbar, then select Set up tests.

Evaluations side panel in an AIP Logic function.

  2. Select Add evaluation function from the configuration panel on the left side of the evaluation suite.

A new evaluation suite.

  3. Choose from a series of built-in or Marketplace-deployed Functions, or select a custom Evaluation function.

Evaluation function selection window.

Built-in Evaluation functions

Examples of built-in Evaluation functions include:

  • Exact string match: Checks if the actual string is exactly equal to the expected string.
  • Integer range: Checks if the actual value lies within the range of expected values. Only integers are supported.
  • Exact boolean match: Checks if the actual boolean is exactly equal to the expected boolean.
  • Exact object match: Checks if the actual object is exactly equal to the expected object.
  • Floating-point range: Checks if the actual value lies within the range of expected values. All numeric types are supported as parameters.
  • Temporal range: Checks if the actual value lies within the range of expected values. Only Date and Timestamp values are supported.
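These built-ins ship with the platform, so there is nothing to implement yourself; the sketch below merely illustrates the comparison semantics of the first two checks. The function names and signatures are hypothetical, not the platform's implementation:

```typescript
// Illustrative sketches only: hypothetical names, not the platform's
// actual built-in implementations.

// Exact string match: passes only when the actual string strictly
// equals the expected string.
function exactStringMatch(actual: string, expected: string): boolean {
  return actual === expected;
}

// Integer range: passes when the actual value is an integer lying
// within the inclusive [min, max] range.
function integerRange(actual: number, min: number, max: number): boolean {
  return Number.isInteger(actual) && actual >= min && actual <= max;
}
```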

Marketplace-deployed Functions

Selecting a Marketplace-deployed Function opens a set-up wizard that guides you through the installation process. Below is an example of a Marketplace Function, with more to come:

  • Rubric grader: A general purpose LLM-backed evaluator for grading generated text based on a dynamic marking rubric.
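The grader's internals are not exposed in the suite itself, but its contract is simple: take generated text and a rubric, return a score. The sketch below is purely hypothetical; it substitutes a deterministic keyword check for the LLM judgment so it stays self-contained, and none of these names come from the Marketplace package:

```typescript
// Hypothetical stand-in for an LLM-backed rubric grader. A real grader
// would ask an LLM whether each criterion is satisfied; a keyword check
// is substituted here purely for illustration.
interface RubricCriterion {
  description: string;
  keyword: string; // deterministic stand-in for "criterion satisfied"
}

// Returns the fraction of rubric criteria satisfied, as a score in [0, 1].
function gradeAgainstRubric(generatedText: string, rubric: RubricCriterion[]): number {
  if (rubric.length === 0) {
    return 0;
  }
  const satisfied = rubric.filter(c =>
    generatedText.toLowerCase().includes(c.keyword.toLowerCase())
  ).length;
  return satisfied / rubric.length;
}
```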

Custom Evaluation functions

Custom Evaluation functions allow you to select previously published Functions. These can be Functions on Objects written in Code Repositories or other AIP Logic functions. Currently, custom Evaluation functions must return either boolean or numeric types.
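As a concrete illustration, a minimal custom Evaluation function might look like the following. This is a hedged sketch assuming the standard TypeScript Functions authoring setup in Code Repositories; the class name, function name, and normalization logic are illustrative choices, not platform requirements:

```typescript
import { Function } from "@foundry/functions-api";

export class ClassificationEvaluators {
  /**
   * Compares the Logic function's actual output against the expected
   * label. Returns a boolean, one of the two return types that custom
   * Evaluation functions currently support.
   */
  @Function()
  public classificationIsCorrect(actual: string, expected: string): boolean {
    // Trimming and lowercasing makes the check tolerant of incidental
    // whitespace and casing differences; drop this for a strict match.
    return actual.trim().toLowerCase() === expected.trim().toLowerCase();
  }
}
```

Once published, a function like this can be selected as a custom Evaluation function, with actual and expected mapped in the configuration step described next.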

Configure the Evaluation function by selecting parameters for the actual Logic function output value and the expected output value. Depending on the Evaluation function, you may need to configure other parameters.

The Produced metrics field allows you to name the metric displayed in the evaluations metrics dashboard. For example, instead of the default "isExactMatch", you may choose to rename the metric to something more semantically meaningful to your use case, like "classificationIsCorrect."

Evaluation function configuration panel with function parameters.

Write test cases by selecting Add test case. Give each test case a name, then specify the input value(s) and the corresponding expected output value(s). The actual output value is automatically included as part of the test case and does not need to be configured.

Test case configuration screen.

After saving, you can run test cases in the evaluation suite and begin collecting metrics to view in the evaluations metrics dashboard. Additionally, you can run test cases directly from the main Logic page by selecting the 🔨 icon and choosing Run all tests.

Metrics in AIP Logic's Evaluation side panel.