You have chosen to self-assess
Evaluation against the Rubric framework can be conducted by AI, or you can conduct it yourself. You have chosen to conduct the evaluation yourself.
What this is

The purpose

Most people using AI have a sense of when something feels wrong with what they got back. Fewer have a reliable way of knowing why it feels off, or a way of explaining it to someone else.

This tool gives you a structure for that. It walks you through seven dimensions of quality, asks you to score the output you're looking at, and produces a record of your evaluation. No specialist knowledge required. The framework does the framing while you bring the judgement.

The evaluation covers two things: the quality of what the AI produced, and the quality of how you engaged with it to get there. Both matter.

Before you start

What to have ready

You'll need two things. First, the AI-generated output you want to evaluate — a piece of writing, a response to a question, a structured document, or anything else produced by an AI tool.

Second, your conversation notes or the exchange itself. The evaluation asks you to reflect not just on what the AI gave you, but on how you worked with it. How you prompted, whether you challenged the output, how you refined it. The AI-generated version of this evaluation does that analysis automatically. When you evaluate yourself, you need to bring that thinking consciously.

This is a reflection exercise, not a test. The evaluation takes around fifteen to twenty minutes.

How it works

The process

The rubric has seven dimensions. Each one looks at a different aspect of quality: not just whether the output reads well, but whether it's actually doing what it should.

For each dimension, you'll find a short description of what it's assessing. Read it, consider your output against it, and select a score. There's a notes field for each dimension if you want to capture your thinking. That's optional, but it makes the final record more useful.

Work through all seven in order. At the end, there's a short overall summary field before you generate your report.

Scoring

The scale

Each dimension is scored across five bands. The descriptions are intentionally honest.

Insufficient 0–2
Partial 3–4
Adequate 5–6
Capable 7–8
Exemplary 9–10

Adequate means it passed a minimum threshold and nothing more. Capable is genuinely strong. Exemplary means nothing more could reasonably be asked of it. Score what you actually see, not what you were hoping for.

The output

What you get

At the end, you can download a structured PDF of your evaluation. It records your scores, your notes, and a summary, formatted for reference or sharing.

Once you've completed a self-evaluation, you may want to see how an AI-generated assessment of the same output compares with your own. That's available as a separate tool, and the differences between the two evaluations can be as instructive as the scores themselves.

Step 1
About the output

Before scoring, tell us a little about the output you're evaluating and how it was produced.


Step 2
Seven dimensions

Work through each dimension in order. Select the band that best describes what you see. Add notes if they'd be useful to you later.

Dimension 1 of 7
Fit to Context
Does the output demonstrate genuine understanding of the specific situation, audience, and purpose, or does it read as generically applicable?
Dimension 2 of 7
Evidence and Grounding
Are claims, assertions, and recommendations supported by concrete evidence, examples, or reasoning, or are they asserted without foundation?
Dimension 3 of 7
Analytical Depth
Does the output generate insight and implication, or does it describe and summarise without advancing understanding?
Dimension 4 of 7
Purposeful Structure
Does the structure serve the purpose of the output, guiding the reader efficiently toward understanding or decision, or does it impose form without function?
Dimension 5 of 7
Appropriate Register
Is the tone, language level, and relational posture calibrated correctly to the audience, domain, and moment, and does it remain consistent?
Dimension 6 of 7
Critical Integrity
Does the output reflect honest, proportionate, and developmentally useful assessment, free from both approval-seeking affirmation and adversarial challenge?
Dimension 7 of 7
Evaluative Judgement
Does the output, and the process that produced it, demonstrate that the right level of critical engagement was applied at the right points?

Step 3
Holistic measure

After scoring the seven dimensions, consider the output as a whole for this final measure.

Holistic Measure
Level of AI Voice
To what degree does the output sound generated rather than authored? Where is AI voice present, where is it absent, and what is driving it?
Note: for this measure, lower is better. A score of 1–2 means the output reads as human-authored. A score of 9–10 means it is unmistakably AI-generated.

Step 4
Overall summary

In your own words, what are the two or three most significant things you found? What single change would most improve the output?

Your summary

Draw on your dimensional scores and notes. You do not need to repeat every detail, just the things that matter most.


Score summary
Dimension Band Score
Fit to Context
Evidence and Grounding
Analytical Depth
Purposeful Structure
Appropriate Register
Critical Integrity
Evaluative Judgement
AI Voice (holistic)

When you're satisfied with your evaluation, generate your PDF record.

Please score all seven dimensions and the holistic measure before generating the report.