Relatude.DB with TurboQuant from Google Research

This experiment indexes 30 000 Wikipedia articles in a vector database, using 3072-dimensional embeddings from OpenAI (text-embedding-3-large). The example runs two database instances side by side, comparing a basic flat vector index with the new TurboQuant algorithm from Google Research.
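For a sense of scale, here is a back-of-the-envelope calculation (my own arithmetic, not a figure reported by the experiment) of the raw vector data a flat index must hold for this corpus:

```python
# Raw size of a flat float32 index for the setup described above:
# 30 000 vectors, 3072 dimensions each, 4 bytes per float.
vectors = 30_000
dims = 3072
bytes_per_float = 4

size_mb = vectors * dims * bytes_per_float / (1024 ** 2)
print(f"{size_mb:.0f} MB")  # → 352 MB of raw vector data, before any index overhead
```

This is only the vector payload; the actual index size shown by the demo also includes bookkeeping structures.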

Relatude.DB is an open-source native C# database under development. Click here for the code of this example. Click here for the TurboQuant implementation.

The TurboQuant algorithm reduces memory usage by a factor of around 3x while maintaining most of the speed. On my local machine, TurboQuant search is faster than the flat index; on this hosting server it is slower for some reason. The search is a "brute force" implementation: all 30 000 vectors are compared with the query vector, with no partitioning or clustering. This is just a first attempt and further optimization is possible, but it is already impressive how much memory can be saved with this approach while maintaining good search quality. It runs with 4096 dimensions (the 3072-dimensional vectors are padded with 1024 extra dimensions), 3 bits per dimension, and an initial training size of 200 000. (The query cache is off, but run a few searches to allow for warmup.)
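To make the idea concrete, here is a minimal sketch of low-bit scalar quantization combined with brute-force search. This is my own simplified illustration in Python, not the C# TurboQuant implementation linked above: TurboQuant proper uses a more sophisticated scheme, and a real index would bit-pack the 3-bit codes (3/8 of a byte per dimension) rather than store one code per byte as done here for clarity.

```python
import numpy as np

BITS = 3               # bits per dimension, matching the demo's setting
LEVELS = 2 ** BITS     # 8 quantization levels

def train(vectors):
    # Learn a per-dimension [min, max] range from training vectors.
    # (A simplified stand-in for TurboQuant's actual training step.)
    return vectors.min(axis=0), vectors.max(axis=0)

def quantize(vectors, lo, hi):
    # Map each float to one of 8 levels; packed, this would be
    # 3 bits/dim instead of 32 bits/dim for raw float32.
    scale = np.where(hi > lo, hi - lo, 1.0)
    codes = np.round((vectors - lo) / scale * (LEVELS - 1))
    return codes.clip(0, LEVELS - 1).astype(np.uint8)

def dequantize(codes, lo, hi):
    scale = np.where(hi > lo, hi - lo, 1.0)
    return codes.astype(np.float32) / (LEVELS - 1) * scale + lo

def brute_force_search(query, codes, lo, hi, k=5):
    # Compare the query against every stored vector -- no partitioning
    # or clustering, mirroring the brute-force search described above.
    approx = dequantize(codes, lo, hi)
    dists = ((approx - query) ** 2).sum(axis=1)
    return np.argsort(dists)[:k]

rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 64)).astype(np.float32)
lo, hi = train(db)
codes = quantize(db, lo, hi)

hits = brute_force_search(db[42], codes, lo, hi)
print(hits[0])  # → 42: the stored vector's own reconstruction ranks first
```

Even this crude per-dimension quantizer preserves nearest-neighbor ranking well, because the quantization error per vector is small compared with the distance between distinct database vectors.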

Results

(Live results appear here after running a search: search time in ms, vector index size in MB, and the matching articles with full content.)