software

ANNchor – Accelerating k-NN Graphs For Complex Metrics With Machine Learning

ANNchor is a python library which constructs approximate k-nearest neighbour graphs for slow metrics.

The k-NN graph is an extremely useful data structure that appears in a wide variety of applications, for example: clustering, dimensionality reduction, visualisation and exploratory data analysis (EDA).

However, if we want to use a slow metric, these k-NN graphs can take an exceptionally long time to compute.

Typical slow metrics include the Wasserstein metric (Earth Mover’s distance) applied to images, and Levenshtein (Edit) distance on long strings, where the time taken to compute these distances is significantly longer than a typical Euclidean distance.

ANNchor uses Machine Learning methods to infer true distances between points in a data set from a variety of features derived from anchor points (aka landmarks/waypoints).

In practice, this means that ANNchor does not make as many calls to the underlying metric as other state of the art k-NN graph generation techniques.

This translates to quicker run times, especially when the metric is slow.

Results from ANNchor can easily be combined with other popular libraries in the Data Science community.

In the docs we give examples of how to use ANNchor in an EDA pipeline alongside UMAP and HDBSCAN.

Installation

Clone this repo and install with pip:

pip install git+https://github.com/gchq/annchor.git

Basic Usage

import numpy as np
import annchor

X =          #your data, list/np.array of items
distance =   #your distance function, distance(X[i],X[j]) = d

ann = annchor.Annchor(X,
                      distance,
                      n_anchors=15,
                      n_neighbors=15,
                      p_work=0.1)
ann.fit()

print(ann.neighbor_graph)
Varshini

Varshini is a Cyber Security expert in Threat Analysis, Vulnerability Assessment, and Research. Passionate about staying ahead of emerging Threats and Technologies.

Recent Posts

Best OSINT Tools for Journalists 2026: Verify Sources, Images and Claims

Journalists use OSINT to verify public information before publishing. In 2026, misinformation, AI-generated images, fake…

6 hours ago

Install Docker on Ubuntu 20.04: Complete Step-by-Step Guide

Docker is an open-source platform that lets you package and run applications inside containers. Each container…

16 hours ago

Install PostgreSQL on Ubuntu: Database Setup and Admin Guide

PostgreSQL (often called Postgres) is an open-source relational database system. It supports advanced features like JSON…

17 hours ago

Install Xrdp Remote Desktop on Ubuntu: Setup and Connect

Xrdp is an open-source server that lets you connect to your Ubuntu machine from another computer…

17 hours ago

Tomcat 9 on Ubuntu 20.04: Install, Configure, and Start

Apache Tomcat is an open-source web server and Java servlet container. It is one of the…

17 hours ago

Automatic Updates on Ubuntu: Set Up unattended-upgrades

Keeping your Ubuntu system updated is one of the best ways to protect it. Security…

19 hours ago