This continues from the previous post.

Removing FAISS and Introducing NumPy Calculations

In the previous post, I said I had built a RAG system using LangChain’s FAISS. While studying the internals, I realized there was no real reason to use FAISS. Because similarity search in this case can be handled with simple NumPy calculations, I changed the implementation.

First, I wondered whether NumPy could perform the same role as FAISS. So I ran a small experiment.

Experiment: Comparing FAISS and NumPy Similarity Search Speed

FAISS was the fastest, but as the results below show, it beat NumPy by only about 0.005 seconds on a top-5 search. That is not a meaningful bottleneck for this project, so I judged that FAISS was unnecessary.
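For reference, a brute-force top-k search in NumPy can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `top_k_squared_l2`, the random data, and the 3000 x 384 shape are all assumptions made for the example.

```python
import numpy as np


def top_k_squared_l2(query: np.ndarray, vectors: np.ndarray, k: int = 5):
    """Return (indexes, squared L2 distances) of the k rows nearest to `query`.

    Hypothetical helper for illustration; assumes `vectors` is a non-empty
    2-D array and `query` is a 1-D array of matching dimension.
    """
    # Broadcasting subtracts the query from every row at once.
    diff = vectors - query
    # Row-wise dot product of diff with itself is the squared L2 distance.
    sq_dists = np.einsum("ij,ij->i", diff, diff)
    k = min(k, len(sq_dists))
    # argpartition selects the k smallest without sorting the whole array;
    # only those k candidates are then sorted by distance.
    candidates = np.argpartition(sq_dists, k - 1)[:k]
    order = candidates[np.argsort(sq_dists[candidates])]
    return order, sq_dists[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Made-up embeddings standing in for the real document vectors.
    vectors = rng.random((3000, 384), dtype=np.float32)
    indexes, distances = top_k_squared_l2(vectors[0], vectors, k=5)
    print(indexes, distances)
```

Since the query here is one of the stored vectors, the first hit is the vector itself at distance 0, the same pattern as index 0 in the results below.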

Performance Comparison (Top-5 Search)

Method	Elapsed Time	Speed Rank
FAISS	0.002095 sec	1st
NumPy (Pure Python)	0.007358 sec	2nd
Scikit-learn (Brute Force)	0.023916 sec	3rd

Detailed Results

[A] FAISS

Elapsed time: 0.002095 sec
Indexes: [0, 160, 2260, 1256, 307]
Distances (Squared L2): [0.0, 0.364, 0.397, 0.413, 0.415]

[B] Scikit-learn (Brute Force)

Elapsed time: 0.023916 sec
Indexes: [0, 160, 2260, 1256, 307]
Distances (L2): [0.0, 0.603, 0.630, 0.643, 0.644]

[C] NumPy (Pure Python)

Elapsed time: 0.007358 sec
Indexes: [0, 160, 2260, 1256, 307]
Distances (Squared L2): [0.0, 0.364, 0.397, 0.413, 0.415]
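All three methods return the same indexes, and the FAISS and NumPy distance values are the squares of the scikit-learn ones. A quick check using only the numbers listed above (the variable names are just for illustration):

```python
import numpy as np

# Distance values reported by FAISS and NumPy above.
squared = np.array([0.0, 0.364, 0.397, 0.413, 0.415])
# Distance values reported by scikit-learn above.
reported_l2 = np.array([0.0, 0.603, 0.630, 0.643, 0.644])

# Taking the square root converts squared L2 into plain L2.
l2 = np.sqrt(squared)
print(np.round(l2, 3))

# The square root is monotonic, so the ranking does not change.
assert np.array_equal(np.argsort(squared), np.argsort(l2))
# The converted values agree with scikit-learn's within rounding.
assert np.allclose(l2, reported_l2, atol=1e-3)
```

Because the ranking is the same either way, skipping the square root is a reasonable shortcut when only the top-k order matters.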

Reducing GitHub Actions Runtime

Workflow Runtime

The workflow became almost one minute shorter, so I think this was the right choice.

[Screenshots: workflow runtime before and after the change]

Closing

Besides ongoing fixes as I use the system, the remaining tasks are the four items I mentioned in the previous post:

Make it usable without initial environment variable setup.
Build it in a form other people can use.
Visualize the built LeetCode problems as a growing tree: when I solve a problem, the tree grows; solved problems become glowing leaves; difficulty is represented with different colors.
Graph network visualization.

Hun-Bot

Algorithm RAG Revision