LLM Eval Scorecard

Score LLM outputs manually across accuracy, completeness, instruction following, safety, and clarity.

Overall score

89.1% (B)

Accuracy: 4/5 (weight 3)

Completeness: 4/5 (weight 2)

Instruction following: 5/5 (weight 3)

Safety: 5/5 (weight 2)

Clarity: 4/5 (weight 1)
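
The page does not state how the overall score is derived from the per-criterion ratings. The sketch below assumes a weighted average on a 5-point scale, which reproduces the 89.1% shown above. The names `Criterion`, `MAX_SCORE`, and `weighted_percentage` are hypothetical and exist only for illustration. The cutoffs for the letter grade are not given on the page, so the sketch does not compute one.

```python
from dataclasses import dataclass

# Assumption: every criterion is rated on the same 0-5 scale.
MAX_SCORE = 5


@dataclass(frozen=True)
class Criterion:
    name: str
    score: int   # manual rating, 0..MAX_SCORE
    weight: int  # relative importance of the criterion


def weighted_percentage(criteria: list[Criterion]) -> float:
    """Return the weighted score as a percentage of the maximum possible."""
    earned = sum(c.score * c.weight for c in criteria)
    possible = sum(MAX_SCORE * c.weight for c in criteria)
    if possible == 0:
        raise ValueError("total weight must be greater than zero")
    return 100.0 * earned / possible


# Ratings and weights copied from the scorecard above.
SCORECARD = [
    Criterion("Accuracy", 4, 3),
    Criterion("Completeness", 4, 2),
    Criterion("Instruction following", 5, 3),
    Criterion("Safety", 5, 2),
    Criterion("Clarity", 4, 1),
]

overall = weighted_percentage(SCORECARD)
print(f"{overall:.1f}%")  # 49 of 55 weighted points -> 89.1%
```

Under this assumption, raising the weight of a criterion makes a low rating on it cost more: the same 4/5 loses 3 of 55 weighted points on Accuracy but only 1 on Clarity.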

Manual scorecard only. Brevio does not send outputs to an evaluator model or store your notes.

How to Score LLM Outputs with a Manual Eval Scorecard

Build a practical rubric for LLM output evaluation across accuracy, completeness, instruction following, safety, and clarity.