Algorithm RAG Revision
This continues from the previous post.
Removing FAISS and Introducing NumPy Calculations
In the previous post, I said I had built a RAG system using LangChain’s FAISS. While studying the internals, I realized there was no real reason to use FAISS here. At this project's scale, similarity search can be done with simple NumPy calculations, so I changed the implementation.
First, I wondered whether NumPy could perform the same role as FAISS. So I ran a small experiment.
Experiment: Comparing FAISS and NumPy Similarity Search Speed
FAISS was the fastest, but as the results below show, it beat NumPy by only about 0.005 seconds on a top-5 search. That is not a meaningful bottleneck for this project, so I judged that FAISS was unnecessary.
Performance Comparison (Top-5 Search)
| Method | Elapsed Time | Speed Rank |
|---|---|---|
| FAISS | 0.002095 sec | 1st |
| NumPy (Pure Python) | 0.007358 sec | 2nd |
| Scikit-learn (Brute Force) | 0.023916 sec | 3rd |
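The post does not say how these times were measured, so the following is only a sketch of one way to do it. Timings of a few milliseconds are noisy, which is why it takes the best of several runs with `time.perf_counter`. The helper name `time_call`, the function `numpy_top5`, and the random data are all hypothetical and not taken from the project.

```python
import time

import numpy as np


def time_call(fn, *args, repeats: int = 20):
    """Return (best elapsed seconds, last result) over several runs of fn(*args)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def numpy_top5(vectors: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Brute-force top-5 search by squared L2 distance (illustrative only)."""
    sq_dists = ((vectors - query) ** 2).sum(axis=1)
    return np.argsort(sq_dists)[:5]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical stand-in for the real embeddings: 3000 vectors of dimension 384.
    vectors = rng.normal(size=(3000, 384)).astype(np.float32)
    elapsed, top5 = time_call(numpy_top5, vectors, vectors[0])
    print(f"NumPy top-5: {elapsed:.6f} sec, indexes {top5.tolist()}")
```

The same `time_call` wrapper could be pointed at the FAISS and scikit-learn searches so that all three methods are measured the same way.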
Detailed Results
[A] FAISS
- Elapsed time: 0.002095 sec
- Indexes: [0, 160, 2260, 1256, 307]
- Distances (squared L2): [0.0, 0.364, 0.397, 0.413, 0.415]
[B] Scikit-learn (Brute Force)
- Elapsed time: 0.023916 sec
- Indexes: [0, 160, 2260, 1256, 307]
- Distances (L2): [0.0, 0.603, 0.630, 0.643, 0.644]
[C] NumPy (Pure Python)
- Elapsed time: 0.007358 sec
- Indexes: [0, 160, 2260, 1256, 307]
- Distances (Squared L2): [0.0, 0.364, 0.397, 0.413, 0.415]
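All three methods return the same indexes, and the distance values differ only by a square root: the scikit-learn values are the square roots of the others (for example, 0.603² ≈ 0.364). A minimal sketch of a NumPy search that produces squared L2 distances might look like the following. The function name `top_k_squared_l2` and the random data are assumptions for illustration, not the project's actual code.

```python
import numpy as np


def top_k_squared_l2(vectors: np.ndarray, query: np.ndarray, k: int = 5):
    """Return (indexes, squared L2 distances) of the k vectors nearest to query.

    Assumes k >= 1 and that vectors has shape (n, d) with query of shape (d,).
    """
    # Squared L2 distance from the query to every stored vector.
    diffs = vectors - query
    sq_dists = np.einsum("ij,ij->i", diffs, diffs)

    k = min(k, len(sq_dists))
    # argpartition selects the k smallest without a full sort;
    # only those k candidates are then sorted by distance.
    candidates = np.argpartition(sq_dists, k - 1)[:k]
    order = candidates[np.argsort(sq_dists[candidates])]
    return order, sq_dists[order]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 3000 unit-length embeddings of dimension 384.
    vectors = rng.normal(size=(3000, 384)).astype(np.float32)
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

    idx, sq = top_k_squared_l2(vectors, vectors[0], k=5)
    print(idx, sq)      # squared L2 distances
    print(np.sqrt(sq))  # plain L2 distances, the scale scikit-learn reports
```

Since the square root is monotonic, skipping it does not change the ranking, which is why the indexes agree across all three methods.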
Reducing GitHub Actions Runtime
Workflow Runtime
The workflow became almost one minute shorter, so I think this was the right choice.
[Screenshots: GitHub Actions workflow runtime, before and after]
Closing
The remaining tasks are the four items I mentioned in the previous post, plus ongoing fixes as I use the system.
- Make it usable without initial environment variable setup.
- Build it in a form other people can use.
- Visualize the built LeetCode problems as a growing tree: when I solve a problem, the tree grows; solved problems become glowing leaves; difficulty is represented with different colors.
- Graph network visualization.