What I learned fixing duplicate embeddings in a product search index
I had a product search project where vector results started repeating the same item under slightly different titles. The business complaint was simple: buyers searched for a replacement part and saw four nearly identical cards before any alternative appeared. At first I thought the embedding model was weak, but the model was only exposing a data hygiene problem. I pulled the raw product feed, emb…