Wrong answers, decoded into a personal review set.
Every wrong answer is processed across three signals — sub-skill mastery, mistake type, and category test results — to generate a Final Review set tailored to this one student. No two students see the same review.
Measure time-per-problem on every answer
The platform times every question, from the moment it appears to the moment the student commits an answer. That elapsed time, alongside sub-skill, prior accuracy, and position in session, becomes the signal that drives every downstream decision.
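As a rough illustration of the render-to-commit measurement described above, here is a minimal Python sketch. The class name `QuestionTimer` and its `on_render` / `on_commit` hooks are hypothetical names chosen for this example, not the platform's actual API. It assumes a monotonic clock so that system clock adjustments cannot distort the measurement.

```python
import time
from typing import Callable, Optional


class QuestionTimer:
    """Measures elapsed seconds from first render to answer commit (illustrative only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        # The clock is injectable so the timer can be tested without sleeping.
        self._clock = clock
        self._rendered_at: Optional[float] = None

    def on_render(self) -> None:
        # Only the first render starts the clock; re-renders are ignored.
        if self._rendered_at is None:
            self._rendered_at = self._clock()

    def on_commit(self) -> float:
        # Returns the elapsed time for this question in seconds.
        if self._rendered_at is None:
            raise RuntimeError("on_commit() called before on_render()")
        return self._clock() - self._rendered_at
```

In use, the caller would invoke `on_render()` when the question is shown and store the value returned by `on_commit()` next to the answer record.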
Auto-classify the mistake — no student input required
Each wrong answer is silently tagged as Careless, Timing, Concept, or Method, based on the student's time versus their personal median, current mastery, and session position. The student never sees a popup; classification runs in the background.
Aggregate weakness signals
Severity-weighted ranking across three signals (Mastery × Category test × Diagnostic). When the same weak spot surfaces from all three independent signals, confidence is high.
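One possible way to picture this aggregation is sketched below in Python. The text does not give the actual scoring formula, so everything here is an assumption made for illustration: each signal is taken to report a weakness score between 0 and 1 per skill, severity is their plain average, and "agreement" counts how many signals reach a hypothetical `flag_at` threshold.

```python
from typing import Dict, List, Tuple

# Signal names follow the three sources listed in the text.
SIGNALS = ("mastery", "category_test", "diagnostic")


def rank_weak_skills(
    weakness: Dict[str, Dict[str, float]],
    flag_at: float = 0.5,  # assumed threshold, not from the source
) -> List[Tuple[str, int, float]]:
    """Return (skill, agreement, severity) rows, weakest first."""
    rows = []
    for skill, by_signal in weakness.items():
        # A signal that has no data for the skill contributes 0.0.
        scores = [by_signal.get(name, 0.0) for name in SIGNALS]
        agreement = sum(score >= flag_at for score in scores)
        severity = sum(scores) / len(SIGNALS)
        rows.append((skill, agreement, severity))
    # Skills flagged by more signals rank first; severity breaks ties.
    rows.sort(key=lambda row: (row[1], row[2]), reverse=True)
    return rows
```

Sorting on agreement first reflects the idea that a weak spot confirmed by all three signals deserves more confidence than one flagged by a single source.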
Generate the Final Review set
290 final-review problems are filtered to this student's top weak skills and mistake patterns. Set size is adjusted to the schedule budget. The same flow runs nightly as new data lands.
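The filtering step might look roughly like the Python sketch below. It is illustrative only: the `Problem` fields, the idea that each problem is tagged with the mistake type it targets, and a budget expressed in minutes are all assumptions, since the text does not specify how problems are labeled or how the schedule budget is measured.

```python
from dataclasses import dataclass
from typing import List, Sequence, Set


@dataclass(frozen=True)
class Problem:
    problem_id: str
    skill: str
    targets_mistake: str  # assumed tag, e.g. "Concept"
    minutes: float        # assumed estimated time to solve


def build_review_set(
    pool: Sequence[Problem],
    weak_skills: Sequence[str],   # ordered, weakest skill first
    mistake_types: Set[str],
    budget_minutes: float,
) -> List[Problem]:
    """Pick problems matching the student's weak skills and mistake patterns."""
    priority = {skill: rank for rank, skill in enumerate(weak_skills)}
    candidates = [
        p for p in pool
        if p.skill in priority and p.targets_mistake in mistake_types
    ]
    # Stable sort: weakest skills first, pool order preserved within a skill.
    candidates.sort(key=lambda p: priority[p.skill])
    chosen: List[Problem] = []
    used = 0.0
    for p in candidates:
        # Skip any problem that would push the set past the time budget.
        if used + p.minutes <= budget_minutes:
            chosen.append(p)
            used += p.minutes
    return chosen
```

Rerunning this function each night with fresh weak-skill rankings would yield an updated set as new answer data arrives.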
4 mistake types — auto-classified from time
Each rule compares the student's time on a problem against their personal median for that question, then combines the result with skill mastery and position in the session.
Careless
time < 0.4× median
AND skill mastered
Timing
time > 2.0× median
AND late in session (>70%)
Concept
time > 1.5× median
AND low prior accuracy (<50%)
Method
time 0.7–1.5× median
AND skill known
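The four rules above can be sketched as a single Python function. The thresholds come straight from the rules; the rest is assumed for illustration: the `WrongAnswer` field names, treating "skill mastered" and "skill known" as precomputed booleans, and the order in which the rules are checked. Timing is checked before Concept here because a time above 2.0× the median also exceeds 1.5×, and the text does not say which rule wins. Answers matching no rule (for example, 0.4 to 0.7× the median) return `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WrongAnswer:
    seconds: float           # time spent on this question
    median_seconds: float    # student's reference median for the question
    skill_mastered: bool     # assumed precomputed flag
    skill_known: bool        # assumed precomputed flag
    prior_accuracy: float    # 0.0 to 1.0
    session_position: float  # 0.0 (start) to 1.0 (end)


def classify(answer: WrongAnswer) -> Optional[str]:
    """Apply the four time-based rules; rule order is an assumption."""
    if answer.median_seconds <= 0:
        raise ValueError("median_seconds must be positive")
    ratio = answer.seconds / answer.median_seconds
    if ratio < 0.4 and answer.skill_mastered:
        return "Careless"
    if ratio > 2.0 and answer.session_position > 0.70:
        return "Timing"
    if ratio > 1.5 and answer.prior_accuracy < 0.50:
        return "Concept"
    if 0.7 <= ratio <= 1.5 and answer.skill_known:
        return "Method"
    return None  # no rule matched
```

For example, a wrong answer given in 10 seconds against a 40-second median on a mastered skill has a ratio of 0.25 and would be tagged Careless.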
Why time-per-problem is the headline signal
Other platforms ask the student to self-classify mistakes — students lie, skip, or pick the wrong tag. We measure actual time on every question from first render to commit, compare it against the student's own median for that exact problem (or tier default), and classify mechanically. No survey, no honesty problem — just the clock.
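The "own median or tier default" comparison could be handled along the lines of the Python sketch below. The function name `reference_median` and the `min_samples` cutoff are hypothetical; the text does not say when the tier default takes over, so this sketch assumes it applies whenever the student has too few recorded times for the problem.

```python
from statistics import median
from typing import Sequence


def reference_median(
    own_times: Sequence[float],
    tier_default: float,
    min_samples: int = 3,  # assumed cutoff, not from the source
) -> float:
    """Use the student's own median when enough data exists, else the tier default."""
    if len(own_times) >= min_samples:
        return median(own_times)
    return tier_default
```

A student with recorded times of 30, 50, and 40 seconds would be compared against 40 seconds, while a student with a single recorded time would fall back to the tier default.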