streamsummary-stream

Stream-based implementation of the StreamSummary data structure described in this paper.

Pipe in buffers or strings to get the approximate top-K most frequent elements.

var StreamSummary = require('streamsummary-stream');
var ss = new StreamSummary(50);
 
//...
 
myDataSource.pipe(ss);
 
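ss.on('finish', function() {
  // approximate count for one element, or null if it is not tracked
  console.log(ss.frequency('42'));
  // the tracked elements, least frequent first
  console.log(ss.top());
});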
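
To give a feel for why the counts are approximate, here is a minimal sketch of a Space-Saving style counter wrapped in a writable stream. It assumes the referenced data structure follows that general approach; `SpaceSavingSketch` and its `add` method are hypothetical names, and this naive Map scan is not how this module is implemented.

```javascript
var Writable = require('stream').Writable;
var util = require('util');

// Hypothetical stand-in, NOT this package's implementation: a naive
// Space-Saving counter that tracks at most `size` elements in a Map.
function SpaceSavingSketch(size, streamOpts) {
  Writable.call(this, streamOpts);
  this.size = size;
  this.counts = new Map(); // element (string) -> approximate count
}
util.inherits(SpaceSavingSketch, Writable);

// Count one occurrence of `element`.
SpaceSavingSketch.prototype.add = function(element) {
  var counts = this.counts;
  if (counts.has(element)) {
    counts.set(element, counts.get(element) + 1);
  } else if (counts.size < this.size) {
    counts.set(element, 1);
  } else {
    // All slots are taken: evict the element with the smallest count.
    // The newcomer inherits that count plus one, so it may be overestimated.
    var minKey = null;
    var minCount = Infinity;
    counts.forEach(function(count, key) {
      if (count < minCount) {
        minCount = count;
        minKey = key;
      }
    });
    counts.delete(minKey);
    counts.set(element, minCount + 1);
  }
};

// Each chunk written to the stream is treated as one element.
SpaceSavingSketch.prototype._write = function(chunk, encoding, callback) {
  this.add(chunk.toString());
  callback();
};

// Approximate count, or null when the element is not tracked.
SpaceSavingSketch.prototype.frequency = function(element) {
  return this.counts.has(element) ? this.counts.get(element) : null;
};

// Tracked [element, count] pairs, least frequent first.
SpaceSavingSketch.prototype.top = function() {
  var entries = [];
  this.counts.forEach(function(count, key) {
    entries.push([key, count]);
  });
  return entries.sort(function(a, b) {
    return a[1] - b[1];
  });
};
```

With a size of 2 and the input a, b, a, c, a, this sketch evicts b when c arrives and reports c with a count of 2 even though it appeared once. That overestimate is the price of bounded memory.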

Requires ES6 Map

This module uses ES6 Map, so you need a runtime that provides it, such as Node.js >= 0.12 or io.js.
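
If you are unsure whether your runtime qualifies, a quick check before requiring the module can fail early with a clearer message. This is only a sketch; `hasNativeMap` is a hypothetical helper and not part of this package.

```javascript
// Hypothetical helper (not part of this package): reports whether the
// runtime exposes a usable ES6 Map constructor.
function hasNativeMap() {
  return typeof Map === 'function' &&
    typeof Map.prototype.forEach === 'function';
}

if (!hasNativeMap()) {
  throw new Error('streamsummary-stream needs a runtime with ES6 Map support');
}
```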

API

StreamSummary(size, streamOpts)

Construct a new writable StreamSummary to track the size most frequent elements (extends Stream.Writable).

  • size - the number of elements to track
  • streamOpts - the options to pass to the Stream constructor

StreamSummary.frequency(element)

Get the approximate frequency of element. Returns null if element is not among the top size elements being tracked.

  • element - the value in question

StreamSummary.top()

Get the top size most frequent elements in ascending order of frequency.
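
Because the result is in ascending order, the most frequent element comes last. If you would rather read it most frequent first, a small wrapper can flip it. This sketch assumes top() returns an array; the shape of each entry is not specified here, so entries are passed through untouched. `mostFrequentFirst` is a hypothetical helper.

```javascript
// Hypothetical helper: `summary` is any object whose top() returns an
// array ordered from least to most frequent (assumed, see above).
function mostFrequentFirst(summary, limit) {
  var ascending = summary.top();
  var descending = ascending.slice().reverse(); // copy, then flip the order
  return typeof limit === 'number' ? descending.slice(0, limit) : descending;
}
```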

StreamSummary.export()

Export the StreamSummary data as an object. The exported object looks like:

{
  size: 42,
  numUsedBuckets: 40,
  trackedElements: {...},
  registers: [...]
}

StreamSummary.import(data)

Import a StreamSummary data object (expects same format as export() returns).

  • data - object containing StreamSummary data
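
One possible use of export() and import() is carrying a summary across process restarts as a string. The sketch below assumes the exported object survives a JSON round trip, which the format above suggests but does not guarantee. `serializeSummary` and `restoreSummary` are hypothetical helpers that work on any object exposing these two methods.

```javascript
// Hypothetical helpers: `summary` is any object with export()/import().
// Assumes the exported data is plain enough to survive JSON serialization.
function serializeSummary(summary) {
  return JSON.stringify(summary.export());
}

function restoreSummary(summary, json) {
  summary.import(JSON.parse(json));
  return summary;
}
```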

StreamSummary.merge(ss)

Merge another StreamSummary with this one. Returns a new StreamSummary of size equal to the combined sizes of the two.

  • ss - another StreamSummary instance
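
When several summaries are built separately, for example one per input file, they can be folded into one by chaining merge(). A minimal sketch; `mergeAll` is a hypothetical helper that relies only on merge() returning a new summary as described above.

```javascript
// Hypothetical helper: folds a non-empty array of summaries into one
// by calling merge() pairwise from left to right.
function mergeAll(summaries) {
  if (summaries.length === 0) {
    throw new Error('mergeAll needs at least one summary');
  }
  return summaries.reduce(function(merged, next) {
    return merged.merge(next);
  });
}
```

Since each merge returns a summary whose size is the sum of its inputs, the final size is the sum of all the individual sizes.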

Install

npm i streamsummary-stream

Version

0.4.0

License

MIT

Collaborators

  • b3nj4m