@basementuniverse/tf-idf

0.0.3 • Public • Published

TF-IDF

Search for terms in an array of documents, scoring each document using TF-IDF (term frequency-inverse document frequency).

Installation

npm install @basementuniverse/tf-idf

Usage

import { Corpus } from '@basementuniverse/tf-idf';

const corpus = new Corpus([
  'This is a document',
  'Here is another document',
]);

const results = corpus.search('document');

The results array will look something like this:

[
  {
    "document": "This is a document",
    "score": 0.5
  },
  {
    "document": "Here is another document",
    "score": 0.5
  }
]
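For readers unfamiliar with the scoring, here is a minimal sketch of a textbook TF-IDF calculation over pre-tokenized documents. It is not taken from this package: the function names (termFrequency, inverseDocumentFrequency, tfIdf) and the smoothed IDF variant are assumptions chosen for illustration, and the package's own weighting and normalization may differ, so the numbers it produces need not match the scores shown above.

```typescript
// Textbook TF-IDF sketch. All names here are hypothetical and the smoothed
// IDF formula is an assumption; the package's actual weighting may differ.

// Term frequency: the share of a document's tokens that equal the term.
function termFrequency(term: string, tokens: string[]): number {
  if (tokens.length === 0) return 0;
  const count = tokens.filter(token => token === term).length;
  return count / tokens.length;
}

// Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1,
// where N is the number of documents and df the number containing the term.
function inverseDocumentFrequency(term: string, corpus: string[][]): number {
  const containing = corpus.filter(tokens => tokens.includes(term)).length;
  return Math.log((1 + corpus.length) / (1 + containing)) + 1;
}

// TF-IDF score of a term for one document within a corpus.
function tfIdf(term: string, tokens: string[], corpus: string[][]): number {
  return termFrequency(term, tokens) * inverseDocumentFrequency(term, corpus);
}

// The two example documents, already lowercased and split into terms.
const docs: string[][] = [
  ['this', 'is', 'a', 'document'],
  ['here', 'is', 'another', 'document'],
];

const scores = docs.map(tokens => tfIdf('document', tokens, docs));
console.log(scores);
```

Because "document" appears once in each four-term document, both documents receive the same score in this sketch, which is consistent with the equal scores in the example output.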

By default, the documents passed to the Corpus constructor are treated as strings: each one is converted to lowercase and split on non-word characters to produce its terms.
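The default behaviour described above can be approximated with a few lines of code. This is a sketch only: defaultTokenize is a hypothetical name, and the use of the \W+ pattern and the removal of empty strings are assumptions about what "split by non-word characters" means in practice.

```typescript
// Hypothetical approximation of the default string handling:
// lowercase the text, split on runs of non-word characters,
// and drop any empty strings left at the edges.
function defaultTokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/\W+/)
    .filter(token => token.length > 0);
}

const terms = defaultTokenize('Here is another document!');
console.log(terms); // [ 'here', 'is', 'another', 'document' ]
```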

However, you can pass in values of any type, as long as you also provide, as the second constructor argument, a function that converts each value to an array of strings. For example:

const corpus = new Corpus(
  [
    {
      id: '1234',
      name: 'John Doe',
    },
    {
      id: '2345',
      name: 'Jane Doe',
    },
  ],
  document => [document.id, ...document.name.toLowerCase().split(' ')],
);
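To make the converter's output concrete, the same function can be run on its own, outside the Corpus. In this sketch the Person type and the personToTerms name are introduced purely for illustration; the function body is the one from the example above.

```typescript
// Hypothetical type describing the objects used in the example above.
interface Person {
  id: string;
  name: string;
}

// The same conversion as in the Corpus example, written as a named function:
// the id is kept as-is, and the name is lowercased and split on spaces.
function personToTerms(person: Person): string[] {
  return [person.id, ...person.name.toLowerCase().split(' ')];
}

const people: Person[] = [
  { id: '1234', name: 'John Doe' },
  { id: '2345', name: 'Jane Doe' },
];

const personTerms = people.map(personToTerms);
console.log(personTerms);
// [ [ '1234', 'john', 'doe' ], [ '2345', 'jane', 'doe' ] ]
```

Since this converter lowercases the name, a lowercase query such as 'doe' lines up with the generated terms.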

Partial term matching can be enabled by passing true as the second argument to search():

const results = corpus.search('doe', true);
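The difference between exact and partial matching can be illustrated with a small standalone sketch. The termMatches function is hypothetical, and treating "partial" as a substring test is an assumption; the package may use a different rule, such as prefix matching.

```typescript
// Hypothetical illustration of exact versus partial term matching.
// Assumption: a partial match means the query occurs anywhere inside the term.
function termMatches(term: string, query: string, partial: boolean): boolean {
  return partial ? term.includes(query) : term === query;
}

const documentTerms = ['1234', 'john', 'doe'];

// Exact matching: 'do' is not one of the terms.
const exactHit = documentTerms.some(term => termMatches(term, 'do', false));

// Partial matching: 'do' occurs inside 'doe'.
const partialHit = documentTerms.some(term => termMatches(term, 'do', true));

console.log(exactHit, partialHit); // false true
```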

Package details

Weekly downloads: 4
Version: 0.0.3
License: MIT
Unpacked size: 5.71 kB
Total files: 6
Collaborators: basementuniverse