software

ANNchor – Accelerating k-NN Graphs For Complex Metrics With Machine Learning

ANNchor is a python library which constructs approximate k-nearest neighbour graphs for slow metrics.

The k-NN graph is an extremely useful data structure that appears in a wide variety of applications, for example: clustering, dimensionality reduction, visualisation and exploratory data analysis (EDA).

However, if we want to use a slow metric, these k-NN graphs can take an exceptionally long time to compute.

Typical slow metrics include the Wasserstein metric (Earth Mover’s distance) applied to images, and Levenshtein (Edit) distance on long strings, where the time taken to compute these distances is significantly longer than a typical Euclidean distance.

ANNchor uses Machine Learning methods to infer true distances between points in a data set from a variety of features derived from anchor points (aka landmarks/waypoints).

In practice, this means that ANNchor does not make as many calls to the underlying metric as other state of the art k-NN graph generation techniques.

This translates to quicker run times, especially when the metric is slow.

Results from ANNchor can easily be combined with other popular libraries in the Data Science community.

In the docs we give examples of how to use ANNchor in an EDA pipeline alongside UMAP and HDBSCAN.

Installation

Clone this repo and install with pip:

pip install git+https://github.com/gchq/annchor.git

Basic Usage

import numpy as np
import annchor

X =          #your data, list/np.array of items
distance =   #your distance function, distance(X[i],X[j]) = d

ann = annchor.Annchor(X,
                      distance,
                      n_anchors=15,
                      n_neighbors=15,
                      p_work=0.1)
ann.fit()

print(ann.neighbor_graph)
Tamil S

Tamil has a great interest in the fields of Cyber Security, OSINT, and CTF projects. Currently, he is deeply involved in researching and publishing various security tools with Kali Linux Tutorials, which is quite fascinating.

Recent Posts

ShadowDumper – Advanced Techniques For LSASS Memory Extraction

Shadow Dumper is a powerful tool used to dump LSASS (Local Security Authority Subsystem Service)…

20 hours ago

Shadow-rs : Harnessing Rust’s Power For Kernel-Level Security Research

shadow-rs is a Windows kernel rootkit written in Rust, demonstrating advanced techniques for kernel manipulation…

2 weeks ago

ExecutePeFromPngViaLNK – Advanced Execution Of Embedded PE Files via PNG And LNK

Extract and execute a PE embedded within a PNG file using an LNK file. The…

3 weeks ago

Red Team Certification – A Comprehensive Guide To Advancing In Cybersecurity Operations

Embark on the journey of becoming a certified Red Team professional with our definitive guide.…

3 weeks ago

CVE-2024-5836 / CVE-2024-6778 : Chromium Sandbox Escape via Extension Exploits

This repository contains proof of concept exploits for CVE-2024-5836 and CVE-2024-6778, which are vulnerabilities within…

4 weeks ago

Rust BOFs – Unlocking New Potentials In Cobalt Strike

This took me like 4 days (+2 days for an update), but I got it…

4 weeks ago