Examples
This section provides practical examples of using Viteval for different scenarios.
Basic Text Evaluation
ts
import { evaluate, scorers } from 'viteval';

evaluate('Simple QA', {
  data: async () => [
    { input: "What is the capital of France?", expected: "Paris" },
    { input: "What is 2+2?", expected: "4" },
  ],
  // callYourLLM is a placeholder for your own model call.
  task: async (input) => {
    return await callYourLLM(input);
  },
  scorers: [scorers.levenshtein],
  threshold: 0.8,
});
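To build intuition for what a threshold of 0.8 means with an edit-distance scorer, the sketch below computes a normalized Levenshtein similarity by hand. It assumes the score is `1 - distance / longerLength`; this is an illustration only, and `levenshteinDistance` and `levenshteinSimilarity` are hypothetical helpers, not Viteval functions. The library's own scorer may normalize differently.

```typescript
// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshteinDistance(a: string, b: string): number {
  // prev[j] = distance between the first (i - 1) chars of a and the first j chars of b
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost, // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 means identical, 0 means nothing in common.
function levenshteinSimilarity(a: string, b: string): number {
  const longest = Math.max(a.length, b.length);
  if (longest === 0) return 1; // two empty strings are identical
  return 1 - levenshteinDistance(a, b) / longest;
}

console.log(levenshteinSimilarity("Paris", "Paris")); // identical strings
console.log(levenshteinSimilarity("Paris", "paris")); // one substitution in five characters
```

Under this assumption, a verbose answer such as "The capital of France is Paris." would score well below 0.8 against "Paris", so an edit-distance scorer suits tasks where the model is expected to return the bare answer.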
Custom Dataset
ts
import { evaluate, scorers } from 'viteval';
import { defineDataset } from 'viteval/dataset';

const mathDataset = defineDataset({
  name: 'basic-math',
  // Generate 100 single-digit addition problems.
  data: async () => {
    const problems = [];
    for (let i = 0; i < 100; i++) {
      const a = Math.floor(Math.random() * 10);
      const b = Math.floor(Math.random() * 10);
      problems.push({
        input: `What is ${a} + ${b}?`,
        expected: String(a + b),
      });
    }
    return problems;
  },
});

evaluate('Math solver', {
  data: () => mathDataset.data(),
  // solveMath is a placeholder for your own solver or model call.
  task: async (input) => await solveMath(input),
  scorers: [scorers.exactMatch],
  threshold: 0.9,
});
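Because the generator above draws from `Math.random()`, it can produce a different set of problems each time it is called, which makes scores harder to compare across runs. If reproducibility matters, one option is to drive the generator from a seeded pseudo-random source. The sketch below uses the well-known mulberry32 generator; `mulberry32` and `buildMathProblems` are hypothetical helpers for illustration and are not part of Viteval.

```typescript
type Problem = { input: string; expected: string };

// mulberry32: a small, fast 32-bit seeded PRNG returning values in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Same seed, same problems: the dataset is stable from run to run.
function buildMathProblems(count: number, seed: number): Problem[] {
  const random = mulberry32(seed);
  const problems: Problem[] = [];
  for (let i = 0; i < count; i++) {
    const a = Math.floor(random() * 10);
    const b = Math.floor(random() * 10);
    problems.push({
      input: `What is ${a} + ${b}?`,
      expected: String(a + b),
    });
  }
  return problems;
}

const problems = buildMathProblems(100, 42);
console.log(problems.length, problems[0]);
```

A function like `buildMathProblems(100, 42)` could then be returned from the dataset's `data` callback in place of the `Math.random()` loop.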
Multiple Scorers
ts
import { evaluate, scorers } from 'viteval';

// loadQuestions and generateContent are placeholders for your own code.
evaluate('Content quality', {
  data: async () => loadQuestions(),
  task: async (input) => await generateContent(input),
  scorers: [
    scorers.factual,          // Must be factually correct
    scorers.answerSimilarity, // Must be semantically similar
    scorers.moderation,       // Must be safe content
  ],
  threshold: 0.8,
});
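With several scorers and a single threshold, the way individual scores are combined matters. This section does not say how Viteval aggregates them, so the sketch below only contrasts two hypothetical policies, the mean and the minimum, to show why the choice matters. All names here (`ScoreResult`, `meanScore`, `minScore`, `passes`) and the sample scores are illustrative assumptions, not Viteval's API or real results.

```typescript
type ScoreResult = { name: string; score: number };

// Hypothetical policy 1: average all scorer results.
function meanScore(results: ScoreResult[]): number {
  if (results.length === 0) return 0;
  const total = results.reduce((sum, r) => sum + r.score, 0);
  return total / results.length;
}

// Hypothetical policy 2: the weakest scorer decides.
function minScore(results: ScoreResult[]): number {
  if (results.length === 0) return 0;
  return Math.min(...results.map((r) => r.score));
}

function passes(aggregate: number, threshold: number): boolean {
  return aggregate >= threshold;
}

// Made-up scores for one output: accurate and on topic, but flagged by moderation.
const results: ScoreResult[] = [
  { name: "factual", score: 1 },
  { name: "answerSimilarity", score: 1 },
  { name: "moderation", score: 0.5 },
];

console.log("mean policy passes:", passes(meanScore(results), 0.8));
console.log("min policy passes:", passes(minScore(results), 0.8));
```

Under a mean policy, two strong scores can mask one weak score, while a minimum policy fails the output as soon as any single check falls short. Check the Viteval documentation for the actual behavior before relying on either interpretation.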
Runnable Examples
There are a number of examples available in the Viteval GitHub repository.
Simple Evaluation: a simple text-based evaluation
Complex Evaluation: a complex, real-world evaluation