evaluate()
Run an evaluation experiment on a dataset.

Signature
The target function to evaluate. Can be:
- An async function: (input: TInput, config?: TargetConfigT) => Promise<TOutput>
- A sync function: (input: TInput, config?: TargetConfigT) => TOutput
- An object with an invoke method
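As a sketch of these three target shapes (the types and echo logic below are illustrative, not from the SDK):

```typescript
// The three target shapes, sketched with illustrative types and echo logic.
type Inputs = { question: string };
type Outputs = { answer: string };

// 1. An async function
const asyncTarget = async (input: Inputs): Promise<Outputs> => ({
  answer: `echo: ${input.question}`,
});

// 2. A sync function
const syncTarget = (input: Inputs): Outputs => ({
  answer: `echo: ${input.question}`,
});

// 3. An object with an invoke method (e.g. a LangChain runnable)
const invokableTarget = {
  invoke: async (input: Inputs): Promise<Outputs> => ({
    answer: `echo: ${input.question}`,
  }),
};
```

Any of the three can be passed as the first argument; the sync and async function forms receive each example's inputs directly.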
Evaluation options
EvaluateOptions properties
The dataset to evaluate on. Can be:
- Dataset name (string)
- Array of examples
- Async iterable of examples
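For instance, assuming LangSmith's usual { inputs, outputs } example shape, the array and async-iterable forms might look like:

```typescript
// Illustrative records; the { inputs, outputs } field names follow
// LangSmith's usual example shape but are assumptions here.
const examples = [
  { inputs: { question: "What is 2 + 2?" }, outputs: { answer: "4" } },
  { inputs: { question: "Capital of France?" }, outputs: { answer: "Paris" } },
];

// An async iterable suits datasets too large to hold in memory.
async function* streamExamples() {
  for (const example of examples) {
    yield example;
  }
}
```

Passing a dataset name (string) instead fetches the examples from LangSmith.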
A list of evaluators to run on each example.
A list of summary evaluators to run on the entire dataset.
A prefix for your experiment name.
A free-form description of the experiment.
Metadata to attach to the experiment.
The maximum concurrency for predictions/evaluations.
The maximum number of concurrent predictions to run. Defaults to maxConcurrency when set.

The maximum number of concurrent evaluators to run. Defaults to maxConcurrency when set.

The LangSmith client to use.
The number of repetitions to perform. Each example will be run this many times.
Whether to use attachments for the experiment.
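Putting the properties above together, an options object might look like the sketch below. The property names (data, evaluators, summaryEvaluators, experimentPrefix, description, metadata, maxConcurrency, numRepetitions) are assumptions to verify against your SDK version's EvaluateOptions type:

```typescript
// Property names below mirror the descriptions above; treat them as
// assumptions and check your SDK version's EvaluateOptions type.
const options = {
  data: "my-dataset",           // dataset name (or an array / async iterable)
  evaluators: [],               // row-level evaluators
  summaryEvaluators: [],        // dataset-level evaluators
  experimentPrefix: "baseline", // prefix for the experiment name
  description: "Nightly regression run on the QA dataset",
  metadata: { model: "hypothetical-model" }, // free-form experiment metadata
  maxConcurrency: 4,            // cap on concurrent predictions/evaluations
  numRepetitions: 2,            // run each example twice
};
```

Only data is essential; the remaining properties refine naming, bookkeeping, and throughput.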
Example evaluators
Row-level evaluator
Evaluators run on each example and return feedback:

Summary evaluator

Summary evaluators run on the entire dataset:

Evaluator return types
Evaluators can return:
- A single evaluation result
- Multiple results in one object
- An array of results
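Hedged sketches of the pieces above: a row-level evaluator, a summary evaluator, and the three return shapes. The parameter shapes (outputs, referenceOutputs, and the simplified summary input) are assumptions for this sketch; check your SDK version's evaluator signatures:

```typescript
// Illustrative result type; the SDK's EvaluationResult has more fields.
type EvaluationResult = { key: string; score: number | boolean };

// Row-level evaluator: scores a single example's output against its
// reference. The { outputs, referenceOutputs } shape is an assumption.
function exactMatch(args: {
  outputs: { answer: string };
  referenceOutputs: { answer: string };
}): EvaluationResult {
  return {
    key: "exact_match",
    score: args.outputs.answer === args.referenceOutputs.answer,
  };
}

// Summary evaluator: aggregates over the whole experiment. Simplified
// here to take the row-level results directly.
function passRate(rowResults: EvaluationResult[]): EvaluationResult {
  const passed = rowResults.filter((r) => r.score === true).length;
  return {
    key: "pass_rate",
    score: rowResults.length === 0 ? 0 : passed / rowResults.length,
  };
}

// The three return shapes:
const single: EvaluationResult = { key: "correctness", score: 1 };
const multiple = {
  results: [
    { key: "correctness", score: 1 },
    { key: "conciseness", score: 0.5 },
  ],
};
const asArray: EvaluationResult[] = [{ key: "correctness", score: 1 }];
```

Each result pairs a feedback key with a score, so one evaluator can emit several distinct feedback metrics per example.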