@urschrei/ckmeans

Ckmeans

Documentation

Ckmeans clustering is an improvement on 1-dimensional (univariate) heuristic-based clustering approaches such as Jenks. The algorithm was developed by Haizhou Wang and Mingzhou Song (2011) as a dynamic programming approach to the problem of clustering numeric data into groups with the least within-group sum-of-squared-deviations.

Minimizing the difference within groups – what Wang & Song refer to as withinss, or within sum-of-squares – means that each group is as homogeneous as possible and the data is split into representative groups. This is very useful for visualization, where one may wish to represent a continuous variable in discrete colour or style groups. This function can provide groups that emphasize differences between data.
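To make the withinss quantity concrete, here is a minimal TypeScript sketch that computes it for a given grouping. The helper names (sumSquaredDeviations, withinss) are hypothetical and are not part of this package's API; they only illustrate the quantity being minimized.

```typescript
// Sum of squared deviations from the mean for a single group.
function sumSquaredDeviations(group: number[]): number {
  if (group.length === 0) return 0;
  const mean = group.reduce((acc, x) => acc + x, 0) / group.length;
  return group.reduce((acc, x) => acc + (x - mean) ** 2, 0);
}

// Total within-cluster sum of squares for a whole clustering.
function withinss(clusters: number[][]): number {
  return clusters.reduce((acc, g) => acc + sumSquaredDeviations(g), 0);
}

// Two ways of splitting the same six values into two groups.
const good = [[1, 2, 3], [10, 11, 12]];
const bad = [[1, 2, 3, 10], [11, 12]];

console.log(withinss(good)); // 4
console.log(withinss(bad)); // 50.5
```

The first grouping has the lower total, which is why an optimal 1-dimensional clustering would prefer it.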

Being a dynamic programming approach, this algorithm is based on two matrices that store incrementally computed values: one for squared deviations and one for backtracking indexes.
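The two-matrix idea can be sketched in TypeScript as follows. This is a simple O(k·n²) illustration written for readability, not the optimized implementation shipped in this package, and the name ckmeansSketch is hypothetical. It assumes 1 ≤ k ≤ data.length.

```typescript
function ckmeansSketch(data: number[], k: number): number[][] {
  const n = data.length;
  if (k < 1 || k > n) throw new RangeError("k must be between 1 and data.length");
  const x = [...data].sort((a, b) => a - b);

  // Prefix sums give the squared deviations of any run x[i..j] in O(1).
  const sum = new Array<number>(n + 1).fill(0);
  const sumSq = new Array<number>(n + 1).fill(0);
  for (let i = 0; i < n; i++) {
    sum[i + 1] = sum[i] + x[i];
    sumSq[i + 1] = sumSq[i] + x[i] * x[i];
  }
  const ssd = (i: number, j: number): number => {
    const s = sum[j + 1] - sum[i];
    const m = j - i + 1;
    // Clamp at 0 to absorb tiny negative values from rounding.
    return Math.max(0, sumSq[j + 1] - sumSq[i] - (s * s) / m);
  };

  // cost[q][j]: minimal withinss when x[0..j] is split into q + 1 clusters.
  // start[q][j]: index where the last cluster of that optimal split begins.
  const cost = Array.from({ length: k }, () => new Array<number>(n).fill(Infinity));
  const start = Array.from({ length: k }, () => new Array<number>(n).fill(0));

  for (let j = 0; j < n; j++) cost[0][j] = ssd(0, j);
  for (let q = 1; q < k; q++) {
    for (let j = q; j < n; j++) {
      for (let i = q; i <= j; i++) {
        const c = cost[q - 1][i - 1] + ssd(i, j);
        if (c < cost[q][j]) {
          cost[q][j] = c;
          start[q][j] = i;
        }
      }
    }
  }

  // Backtrack from the last element to recover the cluster boundaries.
  const clusters: number[][] = [];
  let end = n - 1;
  for (let q = k - 1; q >= 0; q--) {
    const s = start[q][end];
    clusters.unshift(x.slice(s, end + 1));
    end = s - 1;
  }
  return clusters;
}

console.log(ckmeansSketch([10, 1, 51, 2, 11, 3, 50, 12], 3));
// [[1, 2, 3], [10, 11, 12], [50, 51]]
```

The cost matrix holds the incrementally computed squared deviations, and the start matrix holds the backtracking indexes used to rebuild the clusters once the table is full.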

Unlike the original implementation, this implementation does not include any code to automatically determine the optimal number of clusters: this information needs to be explicitly provided. It does provide the roundbreaks method to produce nclusters - 1 breaks for labelling, however.
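As a rough illustration of why nclusters clusters yield nclusters - 1 breaks, the sketch below places one break halfway between each pair of adjacent clusters. This is a simplified stand-in, not the package's roundbreaks: it performs no rounding, so its values will generally differ from what roundbreaks returns. The name midpointBreaks is hypothetical, and the function assumes the clusters are non-empty, sorted internally, and ordered from lowest to highest.

```typescript
// One break between each pair of neighbouring clusters, so k clusters give k - 1 breaks.
function midpointBreaks(clusters: number[][]): number[] {
  const breaks: number[] = [];
  for (let i = 0; i + 1 < clusters.length; i++) {
    const upper = clusters[i][clusters[i].length - 1]; // largest value in the lower cluster
    const lower = clusters[i + 1][0]; // smallest value in the upper cluster
    breaks.push((upper + lower) / 2);
  }
  return breaks;
}

console.log(midpointBreaks([[1, 2, 7], [12, 16], [78, 82]])); // [9.5, 47]
```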

How To Use

Browser as ES Module

// preliminary ritual
import _initCkmeansWasm, { ckmeans_wasm, roundbreaks_wasm } from "@urschrei/ckmeans";
const CKMEANS_WASM_VERSION = "1.0.5";
const CKMEANS_WASM_CDN_URL = `https://cdn.jsdelivr.net/npm/@urschrei/ckmeans@${CKMEANS_WASM_VERSION}/ckmeans_bg.wasm`;
let WASM_READY = false;

export async function initCkmeansWasm() {
  if (WASM_READY) {
    return;
  }
  await _initCkmeansWasm(CKMEANS_WASM_CDN_URL);
  console.log(`got wasm from ${CKMEANS_WASM_CDN_URL}`);
  WASM_READY = true;
}
await initCkmeansWasm();

// Now let's calculate some clusters and breaks

let data = [3.0, 12.0, 13.0, 14.0, 15.0, 16.0, 2.0, 2.0, 3.0,
            5.0, 7.0, 1.0, 2.0, 5.0, 7.0,
            1.0, 5.0, 82.0, 1.0, 1.3, 1.1, 78.0]
let nclusters = 3;
try {
    // ckmeans_wasm is the named import from above; no `wasm` object is defined here
    let clusters = ckmeans_wasm(data, nclusters);
    // [
    // [1.0, 1.0, 1.0, 1.1, 1.3, 2.0, 2.0, 2.0, 3.0, 3.0, 5.0,
    //  5.0, 5.0, 7.0, 7.0],
    // [12., 13., 14., 15., 16.],
    // [78., 82.]
    // ]
    console.info(clusters);
} catch (error) {
    console.error("Error:", error);
}
try {
    // roundbreaks_wasm is the named import from above; no `wasm` object is defined here
    let breaks = roundbreaks_wasm(data, nclusters);
    // [9.0, 40.0]
    console.info(breaks);
} catch (error) {
    console.error("Error:", error);
}
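Once breaks are available, they can be used to assign each value to a discrete class, for example to pick a colour for a map or chart. The following is a small sketch under stated assumptions: classIndex, colourFor and the palette are hypothetical and not part of this package, and the choice to put a value equal to a break into the upper class is an arbitrary convention.

```typescript
// Returns the 0-based class of `value`, given ascending breaks.
// With k - 1 breaks there are k classes: 0 .. k - 1.
function classIndex(value: number, breaks: number[]): number {
  let idx = 0;
  while (idx < breaks.length && value >= breaks[idx]) idx++;
  return idx;
}

// Breaks as shown in the example above; the palette is illustrative only.
const exampleBreaks = [9.0, 40.0];
const palette = ["#fee8c8", "#fdbb84", "#e34a33"];
const colourFor = (v: number): string => palette[classIndex(v, exampleBreaks)];

console.log(classIndex(3, exampleBreaks)); // 0
console.log(classIndex(14, exampleBreaks)); // 1
console.log(classIndex(80, exampleBreaks)); // 2
console.log(colourFor(14)); // "#fdbb84"
```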

Observable

ckmeans_wasm = {
  const wasm_module = await import(
    'https://unpkg.com/@urschrei/ckmeans@1.0.5/ckmeans.js'
  );
  await wasm_module.default();
  return wasm_module.ckmeans_wasm;
}

Perf

Clustering 100,000 floats into 5 clusters takes roughly 38 ms.

References

  1. Wang, H., & Song, M. (2011). Ckmeans.1d.dp: Optimal k-means Clustering in One Dimension by Dynamic Programming. The R Journal, 3(2), 29.

Install

npm i @urschrei/ckmeans

License

MIT OR Apache-2.0

Collaborators

  • urschrei