Coreax – Revolutionizing Data Reduction With Coreset Algorithms In JAX

For n points in d dimensions, a coreset algorithm takes an n×d data set and reduces it to m≪n points whilst attempting to preserve the statistical properties of the full data set.

The algorithm maintains the dimension of the original data set. Thus the m points, referred to as the coreset, are also d-dimensional.

The m points need not be in the original data set. We refer to the special case where all selected points are in the original data set as a coresubset.
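As a minimal sketch of the coresubset idea, the snippet below picks m rows uniformly at random from a toy n×d data set. Uniform sampling is only the simplest possible baseline, not one of the algorithms coreax provides, and the helper name random_coresubset is hypothetical. It shows that the reduced set keeps the dimension d and that every selected point comes from the original data.

```python
import random


def random_coresubset(data, m, seed=0):
    """Pick m distinct rows of `data` uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    # Sampling indices without replacement guarantees m distinct original points.
    indices = sorted(rng.sample(range(len(data)), m))
    return indices, [data[i] for i in indices]


# Toy data set: n = 1000 points in d = 2 dimensions.
data_rng = random.Random(42)
data = [(data_rng.gauss(0.0, 1.0), data_rng.gauss(0.0, 1.0)) for _ in range(1000)]

indices, coresubset = random_coresubset(data, m=50)
print(len(coresubset), len(coresubset[0]))  # 50 2
```

A general coreset would drop the restriction that each returned point appears in data, for example by returning synthetic points.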

Some algorithms return the m points with weights, so that importance can be attributed to each point in the coreset. The weights, w_i for i = 1, …, m, are often chosen from the simplex. In this case, they are non-negative and sum to 1: w_i ≥ 0 ∀i and ∑_i w_i = 1.
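To make the simplex constraint concrete, here is a small sketch in plain Python that rescales arbitrary non-negative scores so they sum to 1 and then uses them to form a weighted mean of the coreset points. The function names and the toy numbers are assumptions chosen for illustration; they are not part of coreax.

```python
def normalise_to_simplex(scores):
    """Rescale non-negative scores so they are valid simplex weights."""
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    total = sum(scores)
    if total <= 0:
        raise ValueError("at least one score must be positive")
    return [s / total for s in scores]


def weighted_mean(points, weights):
    """Coordinate-wise mean of d-dimensional points under the given weights."""
    d = len(points[0])
    return [sum(w * p[k] for p, w in zip(points, weights)) for k in range(d)]


# Three coreset points in d = 2 dimensions with unnormalised importance scores.
coreset = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
weights = normalise_to_simplex([1.0, 1.0, 2.0])

print(weights)                          # [0.25, 0.25, 0.5]
print(weighted_mean(coreset, weights))  # [0.25, 1.0]
```

With uniform weights 1/m this reduces to the ordinary mean of the coreset.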

Please see the documentation for some in-depth examples.

Example Applications

Choosing Pixels From An Image

In the example below, we reduce the original 180×215 pixel image (38,700 pixels in total) to a coreset approximately 20% of this size.

(Left) Original image. (Centre) 8,000 coreset points chosen using Stein kernel herding, with point size a function of weight. (Right) 8,000 points chosen randomly. Run examples/david_map_reduce_weighted.py to replicate.
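The figures quoted above can be checked with a few lines of Python, and the random baseline from the right-hand panel is easy to imitate. This is a sketch only: treating 180 as the number of rows and 215 as the number of columns is an assumption, and it does not reproduce the Stein kernel herding result or the referenced example script.

```python
import random

rows, cols = 180, 215   # assumption: which side is height is not stated in the text
n_pixels = rows * cols
m = 8000                # number of points kept

print(n_pixels)              # 38700
print(f"{m / n_pixels:.1%}")  # 20.7%

# Random baseline: choose m distinct pixels uniformly, then map each flat
# index back to a (row, col) coordinate on the image grid.
rng = random.Random(0)
flat_indices = rng.sample(range(n_pixels), m)
pixels = [divmod(i, cols) for i in flat_indices]
```

Each entry of pixels is a (row, col) pair, so the selection is a coresubset of the pixel locations with implicit uniform weights.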

Setup

Before installing coreax, make sure JAX is installed, choosing the version of JAX suited to your system.

Install JAX, noting that there are (currently) different setup paths for CPU and GPU use:

$ python3 -m pip install jax

For more information click here.
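After installing, a quick check along the following lines can confirm that JAX is importable and which backend it selected. The helper name jax_status is hypothetical; the sketch is written so that it also runs, and simply reports the absence, on a machine where JAX is not installed.

```python
import importlib.util


def jax_status():
    """Report whether JAX is importable and, if so, its version and backend."""
    if importlib.util.find_spec("jax") is None:
        return "JAX is not installed"
    import jax

    # default_backend() is typically "cpu", "gpu" or "tpu".
    return f"JAX {jax.__version__}, backend: {jax.default_backend()}"


print(jax_status())
```

A reported backend of cpu on a machine with a GPU usually means the CPU-only setup path was followed.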

Varshini
