How to version LLM prompts and test data in a real project

Prompt changes feel lightweight until the first production bug is impossible to reproduce. I have seen teams edit a prompt in an admin screen, improve one demo case, and quietly make five normal cases worse. The model had not changed and the code had not changed, yet nobody could say which prompt version had answered the customer. I now treat prompts like configuration with a release history. Every…
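One way to make "which prompt version answered the customer" answerable is to derive a version id from the prompt text itself and write it into every response log. The sketch below is only an illustration under my own assumptions: `PromptVersion`, `build_log_record`, the `support_reply` name, the `example-model` label, and the log fields are all hypothetical, not taken from any particular tool.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptVersion:
    """A named prompt template whose version id is derived from its text."""
    name: str        # hypothetical prompt name, e.g. "support_reply"
    template: str    # the full prompt text as sent to the model

    @property
    def version_id(self) -> str:
        # Hash the exact template text: any edit, even whitespace,
        # produces a different id, so silent changes become visible.
        digest = hashlib.sha256(self.template.encode("utf-8")).hexdigest()
        return f"{self.name}@{digest[:12]}"


def build_log_record(prompt: PromptVersion, model: str,
                     user_input: str, output: str) -> str:
    """Serialize one model call together with the prompt version that produced it."""
    return json.dumps(
        {
            "prompt_version": prompt.version_id,
            "model": model,
            "input": user_input,
            "output": output,
        },
        sort_keys=True,
    )


# Two edits of the same prompt get two distinct ids.
v1 = PromptVersion("support_reply",
                   "Answer the customer politely.\n\nQuestion: {question}")
v2 = PromptVersion("support_reply",
                   "Answer the customer politely and briefly.\n\nQuestion: {question}")

# "example-model" and the output string are placeholders, not real values.
record = build_log_record(v1, "example-model", "Where is my order?", "(model output)")
```

Hashing the content rather than bumping a number by hand means an edit made in an admin screen cannot reuse an old version id by accident. The trade-off is that the id says nothing about ordering, so a real setup would probably also keep the templates in version control or a table with timestamps.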
