How to version LLM prompts and test data in a real project

Prompt changes feel lightweight until the first production bug is impossible to reproduce. I have seen teams edit a prompt in an admin screen, improve one demo case, and quietly make five normal cases worse. The model did not change and the code did not change, but nobody could say which prompt version answered the customer. I now treat prompts like configuration with a release history. Every…