How We Achieve Human-Level Marking
Discover the technology behind the first AI feedback engine calibrated specifically for Singapore's SEAB 8881 syllabus.
The Simple 3-Step Flow
From submission to actionable feedback in under 45 seconds.
Submit
Paste your GP essay question and text (min. 50 characters).
Process
Our 6 specialised AI agents analyse your work in parallel to ensure speed and objectivity.
Feedback
Receive a holistic grade (A-U), a detailed mark breakdown (/50), and actionable improvement steps.
The 6-Agent Marking Pipeline
Each agent is a specialist, calibrated to a specific task. Together, they replicate the rigour of a human examination team.
Question Analyser
Dissects the question's demands, key terms, and common pitfalls before marking begins. This agent ensures every subsequent evaluation is grounded in what the question actually asks for.
Content Evaluator
Assesses argument quality, evidence depth, and relevance to the question—completely independent of language quality. Prevents bias by evaluating content in isolation, exactly as SEAB examiners are trained to do.
Language Evaluator
Evaluates expression, grammar, vocabulary range, and academic tone—without knowledge of content scores. This parallel evaluation ensures fairness and prevents one dimension from overshadowing the other.
Holistic Marker
Synthesises all data into a final grade, generating a 100-200 word examiner report. This agent applies the official marking rubric and produces the grade justification you'd receive from a human examiner.
Language Corrector
Provides inline "strikethrough" corrections for grammar, punctuation, and register. Shows exactly what to fix and how—no vague suggestions.
Content Improver
Identifies weak arguments and rewrites them into "Band 5" quality passages. Shows you not just what's wrong, but exactly how to elevate your reasoning to top-tier level.
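At its core, "parallel and independent" means the content and language evaluators each see the essay but never each other's scores. A minimal sketch of that idea using Python's asyncio (the function names, mark splits, and placeholder scores here are illustrative, not the production pipeline):

```python
import asyncio

async def evaluate_content(essay: str) -> dict:
    # Placeholder for the real content-evaluator call (argument quality,
    # evidence, relevance). Assumed split: content marked out of 30.
    return {"content_mark": 22}

async def evaluate_language(essay: str) -> dict:
    # Placeholder for the real language-evaluator call (expression,
    # grammar, tone). Assumed split: language marked out of 20.
    return {"language_mark": 15}

async def mark_essay(essay: str) -> dict:
    # Both evaluators run concurrently; neither result is visible to the
    # other, so one dimension cannot bias the other.
    content, language = await asyncio.gather(
        evaluate_content(essay), evaluate_language(essay)
    )
    # The merged scores are what the holistic marker would synthesise.
    return {**content, **language}

result = asyncio.run(mark_essay("Sample essay text..."))
print(result)  # {'content_mark': 22, 'language_mark': 15}
```

The design choice this illustrates: because each evaluator is an isolated task, a beautifully written but shallow essay cannot borrow marks from its prose, and a strong argument in clumsy English is scored on its merits.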
Calibration & Accuracy
Strict Schema Enforcement
Every score is validated via Pydantic schemas to ensure Gemini never produces out-of-range marks. If a mark doesn't fit the official rubric, the system rejects it automatically.
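A sketch of what that Pydantic validation layer might look like. The field names, the 30/20 mark split, and the grade list (A/B/C/D/E/S/U for the A-U scale) are illustrative assumptions, not the production schema:

```python
from typing import Literal
from pydantic import BaseModel, Field, ValidationError

class EssayMarks(BaseModel):
    # Assumed split: content out of 30, language out of 20, total /50.
    content_mark: int = Field(ge=0, le=30)
    language_mark: int = Field(ge=0, le=20)
    # Only grades on the official scale are accepted.
    grade: Literal["A", "B", "C", "D", "E", "S", "U"]

# An in-range model response validates cleanly...
marks = EssayMarks(content_mark=24, language_mark=16, grade="B")

# ...while an impossible mark raises a ValidationError, so it never
# reaches the student.
try:
    EssayMarks(content_mark=45, language_mark=16, grade="B")
    rejected = False
except ValidationError:
    rejected = True
print(rejected)  # True
```

In practice the model's JSON output would be parsed straight into the schema; any response that fails validation is discarded and retried rather than shown to the student.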
Confidence Check
If the AI is unsure (confidence < 0.7), it flags the essay for human review. This ensures that borderline or unusual essays receive the nuance they deserve, rather than an overconfident guess.
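The routing rule itself is simple, and a sketch makes the behaviour concrete (the function name, status labels, and return shape below are hypothetical):

```python
CONFIDENCE_THRESHOLD = 0.7  # Below this, the grade is not released as-is

def route_result(grade: str, confidence: float) -> dict:
    # Low-confidence marks are flagged for a human examiner rather than
    # returned as an overconfident guess.
    if confidence < CONFIDENCE_THRESHOLD:
        return {"grade": grade, "status": "needs_human_review"}
    return {"grade": grade, "status": "final"}

print(route_result("B", 0.85))  # {'grade': 'B', 'status': 'final'}
print(route_result("C", 0.55))  # {'grade': 'C', 'status': 'needs_human_review'}
```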