https://x.com/mattpocockuk/status/1887943661419130974
Matt Pocock @mattpocockuk
LLM-as-a-judge in a few lines of code.
Important points:
- LLMs are bad at numeric scales, so we ask the model to return a text enum, which we then convert to a scale.
- Super useful for situations where deterministic tests just can't work.
Feb 7, 2025
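The code from the original post is not reproduced here, so the following is only a minimal sketch of the idea in the first bullet: the judge replies with a text label, and the label is mapped to a number in ordinary code. The label names, the score values, the prompt wording, and the `LabelModel` type are all assumptions made for illustration; `LabelModel` stands in for whatever LLM client is actually used.

```typescript
// Hypothetical sketch: labels, scores, prompt wording, and the LabelModel
// signature are illustrative assumptions, not taken from the original post.

// The judge is only ever asked for one of these text labels.
const LABELS = ["incorrect", "partially_correct", "correct"] as const;
type Label = (typeof LABELS)[number];

// The numeric scale lives in code, not in the model's reply.
const SCORE_BY_LABEL: Record<Label, number> = {
  incorrect: 0,
  partially_correct: 0.5,
  correct: 1,
};

// Any function that sends a prompt to an LLM and resolves with its raw text reply.
type LabelModel = (prompt: string) => Promise<string>;

// Build a grading prompt that restricts the reply to the known labels.
function buildPrompt(question: string, answer: string): string {
  return [
    "You are grading an answer to a question.",
    `Question: ${question}`,
    `Answer: ${answer}`,
    `Reply with exactly one of: ${LABELS.join(", ")}.`,
  ].join("\n");
}

// Normalize the raw reply and reject anything outside the enum.
function parseLabel(raw: string): Label {
  const normalized = raw.trim().toLowerCase();
  const match = LABELS.find((label) => label === normalized);
  if (match === undefined) {
    throw new Error(`Unexpected judge label: ${raw}`);
  }
  return match;
}

// Convert a raw reply straight to its numeric score.
function scoreFromReply(raw: string): number {
  return SCORE_BY_LABEL[parseLabel(raw)];
}

// Ask the model for a label, then convert it to a score.
async function judge(
  model: LabelModel,
  question: string,
  answer: string,
): Promise<{ label: Label; score: number }> {
  const raw = await model(buildPrompt(question, answer));
  const label = parseLabel(raw);
  return { label, score: SCORE_BY_LABEL[label] };
}
```

In a test suite, `model` could be a stub such as `async () => "correct"`, which keeps the parsing and scoring logic checkable without calling a real LLM. Throwing on an unexpected label is one possible design choice; mapping unknown replies to a default score would be another.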
Friday, February 7, 2025 Software
