
Core Setup
Shortlist prompts into a Golden Dataset
Last Updated on June 19, 2023
Think of this like shortlisting the best prompts — the ones that truly represent real cases, tricky edge conditions, weird failure shapes and your domain.
Inside your evaluation results you’ll see per prompt scoring, flags, vulnerability patterns and validator outcomes. From here, open each case, review the response, check the failure or success state, and simply mark the ones you want to lock-in by clicking “Add to Golden”.
When you’re done, save this as Golden Dataset (or any semantic naming you prefer). This becomes your repeatable baseline for future test cycles.
Related to Core Setup
