fore-ai

0.0.3 • Public • Published

The fore client package

The foresight library within the fore SDK lets you easily evaluate the performance of your LLM system on a variety of metrics.

You can sign up as a beta tester at https://foreai.co.

Quick start

  1. Install the package using npm:

    npm install fore-ai

  2. Get started with the following lines (run the await calls inside an async function, since top-level await is not available in CommonJS modules loaded with require):
    const { Foresight } = require("fore-ai");
    
    const foresight = new Foresight({ apiToken: "<YOUR_API_TOKEN>" });
    
    await foresight.log({
    	query: "What is the easiest programming language?",
    	response: "Python",
    	contexts: ["Python rated the easiest programming language"],
    	tag: "my_awesome_experiment",
    });
    
    // You can add more such queries using foresight.log
    // ....
    
    await foresight.flush();
  3. Alternatively, to curate your evalsets and run regular evals against them:
    const { Foresight, MetricType } = require("fore-ai");
    
    const foresight = new Foresight({ apiToken: "<YOUR_API_TOKEN>" });
    
    const evalset = await foresight.createSimpleEvalset({
    	evalsetId: "programming-languages",
    	queries: [
    		"hardest programming language?",
    		"easiest programming language?",
    	],
    	referenceAnswers: ["Malbolge", "Python"],
    });
    
    const runConfig = {
    	evalsetId: "programming-languages",
    	experimentId: "my-smart-llm",
    	metrics: [MetricType.GROUNDEDNESS, MetricType.SIMILARITY],
    };
    
    const myGenerateFn = (query) => {
    	// Do the LLM processing with your model...
    	// Here is some demo code:
    
    	return {
    		generatedResponse: query.includes("hardest")
    			? "Malbolge"
    			: "Python",
    		contexts: [
    			"Malbolge is the hardest language",
    			"Python is the easiest language",
    		],
    	};
    };
    
    await foresight.generateAnswersAndRunEval({
    	generateFn: myGenerateFn,
    	runConfig,
    });

Metrics

Groundedness

Depends on:

  • LLM's generated response;
  • Context used for generating the answer.

The metric answers the question: Is the response based on the context and nothing else?

This metric estimates the fraction of facts in the generated response that can be found in the provided context.

Example:

  • Context: The front door code has been changed from 1234 to 7945 due to security reasons.
  • Q: What is the current front door code?
  • A1: 7945. [groundedness score = 0.9]
  • A2: 0000. [groundedness score = 0.0]
  • A3: 1234. [groundedness score = 0.04]
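The `log` API from the quick start is how examples like this one reach the groundedness scorer. The sketch below builds the payload for the door-code example; the tag name is a made-up placeholder, and the lines that talk to the API are commented out because they need a real token:

```javascript
// Build the payload for foresight.log() from the door-code example
// above; the field names match the quick-start log() call.
function buildLogEntry(query, response, contexts, tag) {
  return { query, response, contexts, tag };
}

const entry = buildLogEntry(
  "What is the current front door code?",
  "7945",
  ["The front door code has been changed from 1234 to 7945 due to security reasons."],
  "door-code-groundedness" // hypothetical tag name
);

// With a real API token, the entry is sent exactly as in the quick start:
// const { Foresight } = require("fore-ai");
// const foresight = new Foresight({ apiToken: "<YOUR_API_TOKEN>" });
// await foresight.log(entry);
// await foresight.flush();
```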

Similarity

Depends on:

  • LLM's generated response;
  • A reference response to compare the generated response with.

The metric answers the question: Is the generated response semantically equivalent to the reference response?

Example:

  • Question: Is Python an easy programming language to learn?
  • Reference response: Python is an easy programming language to learn
  • Response 1: It is easy to be proficient in python [similarity score = 0.72]
  • Response 2: Python is widely recognized for its simplicity. [similarity score = 0.59]
  • Response 3: Python is not an easy programming language to learn [similarity score = 0.0]
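To score responses like these automatically, the quick-start eval flow can be run with only the SIMILARITY metric. A minimal sketch, assuming made-up evalset and experiment IDs, with the network calls commented out since they need a real token:

```javascript
// Canned responses standing in for a real LLM; keys are the evalset queries.
const cannedResponses = {
  "Is Python an easy programming language to learn?":
    "It is easy to be proficient in python",
};

// Same shape as the generate function in the quick start.
const generateFn = (query) => ({
  generatedResponse: cannedResponses[query],
  contexts: [],
});

// With a real API token:
// const { Foresight, MetricType } = require("fore-ai");
// const foresight = new Foresight({ apiToken: "<YOUR_API_TOKEN>" });
// await foresight.createSimpleEvalset({
//   evalsetId: "python-difficulty",          // hypothetical ID
//   queries: Object.keys(cannedResponses),
//   referenceAnswers: ["Python is an easy programming language to learn"],
// });
// await foresight.generateAnswersAndRunEval({
//   generateFn,
//   runConfig: {
//     evalsetId: "python-difficulty",
//     experimentId: "similarity-demo",       // hypothetical ID
//     metrics: [MetricType.SIMILARITY],
//   },
// });
```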

Relevance (coming soon)

Depends on:

  • LLM's generated response;
  • User query/question.

The metric answers the question: Does the response answer the question and only the question?

This metric checks that the answer given by the LLM addresses the given question precisely and does not include irrelevant information.

Example:

  • Q: At which temperature does oxygen boil?
  • A1: Oxygen boils at -183 °C. [relevance score = 1.0]
  • A2: Oxygen boils at -183 °C and freezes at -219 °C. [relevance score = 0.5]

Completeness (coming soon)

Depends on:

  • LLM's generated response;
  • User query/question.

The metric answers the question: Are all aspects of the question answered?

Example:

  • Q: At which temperature does oxygen boil and freeze?
  • A1: Oxygen boils at -183 °C. [completeness score = 0.5]
  • A2: Oxygen boils at -183 °C and freezes at -219 °C. [completeness score = 1.0]
