Behind the Build

Today's AI Specialist: The Spoken Feedback Agent. The Agent That Gives Feedback in the Voice of a Coach, Not a Marker.

Sophie

Every practice session on the platform ends with a summary. The summary is short. It names three things the learner did well, names the single highest-leverage move for the learner to practise next, and tucks the grammar corrections underneath as a list. The total reading time is about ninety seconds.

The agent that writes that summary is the Spoken Feedback Agent. She is the slow side of Sophie, the one who runs after the practice session ends, when the latency budget is no longer two seconds. She has time to think about what the learner needs to hear, and the tone she delivers it in is the most carefully tuned thing in the system.

She writes in the voice of a coach, not a marker. The distinction sounds small. It is not. It is the choice that determines whether the learner comes back for another session.

This is the build story of the Spoken Feedback Agent: why the tone matters more than the content, what she actually surfaces in a typical summary, and what the tone choice costs.

The problem the agent solves

Adult professional learners do not lack information about their English. They lack the motivation to practise. The reason they lack motivation is almost always emotional, not intellectual. They have been told, in school, in exams, by managers, by themselves, that their English is full of mistakes. The accumulated weight of that messaging is what keeps them from booking the next session.

A practice partner that delivers feedback in a marker tone ("you made these errors, here is the corrected version") adds to that accumulated weight. The learner reads the summary. The summary confirms what they already feared. They close the tab. They do not come back tomorrow.

The Spoken Feedback Agent was designed to break that loop. The summary she writes is allowed to flag errors, but it is not allowed to lead with them. The first thing the learner reads is what they did well. The second thing they read is what to practise next. The errors are present, but they are not the headline.

This is a retention decision, not a stylistic one. Learners who receive coaching-tone feedback book the next session at materially higher rates than learners who receive marker-tone feedback. The data is unambiguous, and it is unambiguous specifically for the adult professional cohort the platform serves.

What the agent actually surfaces

The summary opens with specific praise drawn from actual phrases the learner used in the session, names the single highest-leverage move to practise next (one move, not five), gives the learner a short drill the next Sophie session can run, and lists corrections at the bottom rather than leading with them. The praise is tied to a specific moment, which keeps it from sounding generic. The next move is constrained to one focus, because the next session should have one focus. The drill is sized so the learner will actually do it. The corrections are visible but not foregrounded.

Taken together, this produces a summary the learner can read in ninety seconds and act on in the next session. The acting-on is the point.
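The ordering described above can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the platform's implementation: the class name `SessionSummary`, its field names, the section labels, and the sample strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSummary:
    """Hypothetical container for one post-session summary."""

    praise: list[str]   # specific praise, each item tied to a moment in the session
    next_move: str      # exactly one focus for the next session
    drill: str          # a short drill the next session can run
    corrections: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the summary text with corrections last, never first."""
        lines = ["What you did well:"]
        lines += [f"- {item}" for item in self.praise]
        lines += ["", "Practise next:", self.next_move]
        lines += ["", "Drill for next session:", self.drill]
        if self.corrections:
            lines += ["", "Corrections:"]
            lines += [f"- {item}" for item in self.corrections]
        return "\n".join(lines)


# Illustrative values only; none of these strings come from a real session.
example = SessionSummary(
    praise=["placeholder praise 1", "placeholder praise 2", "placeholder praise 3"],
    next_move="Open sentences with a verb.",
    drill="A 90-second drill on opening sentences with a verb.",
    corrections=["placeholder correction"],
)
print(example.render())
```

Making `next_move` a single string rather than a list is the design point: the structure itself cannot hold five moves, so the one-focus rule does not depend on the writer remembering it.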

The tone constraints

The tone of the writing itself is the most carefully tuned thing about the agent. Praise that lands and praise that sounds performative are separated by very little; the same is true of forward-looking next-move framing versus retrospective fault-finding, and of matter-of-fact corrections versus the apologetic-or-chiding registers that adult learners read as condescension. The agent has to stay on the right side of all three lines, every time, or the affective filter rises and the feedback stops landing. The affective filter is the technical term for the emotional defensiveness that closes a learner to feedback. Raise it, and the learner stops absorbing. Keep it low, and the feedback lands.
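One crude way to picture the three lines is a phrase check run over a draft summary. The sketch below is an assumption for illustration only: the article does not say how the tone is enforced, the function name `tone_flags` is hypothetical, and the four patterns are taken from the example sentences in the Key Vocabulary section. Keyword matching would miss most real tone problems, so treat this as a picture of the constraints, not a method.

```python
import re

# Hypothetical patterns, one list per failure mode. The third constraint
# (matter-of-fact corrections) has two failure modes: apologetic and chiding.
TONE_PATTERNS = {
    "performative praise": [r"\bgood job\b"],
    "retrospective framing": [r"\byou didn't\b"],
    "apologetic correction": [r"\byou might consider\b"],
    "chiding correction": [r"\bactually, the correct\b"],
}


def tone_flags(draft: str) -> list[str]:
    """Return the name of every failure mode whose pattern appears in the draft."""
    lowered = draft.lower()
    return [
        name
        for name, patterns in TONE_PATTERNS.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    ]


# A draft that breaks two constraints, and one that breaks none.
print(tone_flags("Good job! You didn't do enough linking."))
print(tone_flags("Next session, practise opening sentences with a verb."))
```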

What the tone choice costs

The tone choice has two real costs.

The first is that some learners initially feel the feedback is "too soft." Learners from exam-trained academic backgrounds in particular, who have been graded on errors for years, read the coaching-tone summary as evasive. They want the marker tone they are used to. For those users, when the preference is set explicitly, the corrections list is surfaced more prominently and the summary tone shifts to be slightly more direct. But the underlying coaching frame stays: the agent does not switch into marker mode.

This is a calculated bet. The same learners who initially prefer marker tone show better twelve-week retention on coaching tone. They are objecting at week one to the thing that keeps them practising at week twelve. The data is clear enough that we serve the long-term outcome, even when the short-term preference disagrees.
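The preference handling described above can be pictured as a mapping from one explicit setting to rendering options. This is a sketch under assumptions: the names `SummaryStyle` and `style_for`, the field names, and the string values are hypothetical, and the article does not specify how prominence or directness is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SummaryStyle:
    """Hypothetical rendering settings for one learner."""

    frame: str                   # "coach" in every case; no marker mode exists
    directness: str              # "standard" or "more_direct"
    corrections_prominent: bool  # surface the corrections list more prominently


def style_for(prefers_direct_feedback: bool) -> SummaryStyle:
    """Map an explicit learner preference to rendering settings.

    The preference changes emphasis only. Both branches keep the coach frame.
    """
    if prefers_direct_feedback:
        return SummaryStyle(
            frame="coach", directness="more_direct", corrections_prominent=True
        )
    return SummaryStyle(
        frame="coach", directness="standard", corrections_prominent=False
    )
```

Neither branch returns a marker frame, which mirrors the rule that the preference adjusts emphasis without replacing the coaching voice.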

The second cost is computational. The agent runs on a stronger model than the rest of the practice session pipeline, because the sentence-level tone work is harder than the structural work. Generating a coaching-tone summary takes longer and costs more per session than generating a marker-tone summary would. The cost is part of the deal.

TL;DR

The Spoken Feedback Agent writes the summary every practice session ends with. She writes in the voice of a coach, not a marker. The summary opens with specific praise drawn from what the learner actually did, names the single highest-leverage move to practise next, gives a short drill the next Sophie session can run, and lists corrections at the bottom rather than leading with them. The tone of the writing itself is the most carefully tuned thing in the system. Coaching-tone feedback produces materially higher retention than marker-tone feedback among adult professional learners. The tone choice costs more compute and is initially disliked by exam-trained learners, who later show better twelve-week outcomes on it than on the tone they preferred at week one.

Learning Materials

Key Vocabulary

feedback · noun · B2

Information or comments given to a learner about how well they have performed, intended to guide improvement.

The Spoken Feedback Agent writes the feedback summary at the end of every practice session.

marker (in marker tone) · noun · C1

A person who grades or marks a learner's work; in this post, used metaphorically for a feedback voice that focuses on errors and scoring.

A marker tells you what you got wrong; a coach tells you what to practise next.

coach (verb) · verb · B2

To guide a learner forward by naming strengths and the next practice move, rather than only judging past performance.

The agent coaches the learner toward the next session rather than grading the last one.

highest-leverage · adjective · C1

Producing the greatest improvement for the smallest effort; the single move worth focusing on next.

The summary names the single highest-leverage move for the learner to practise next.

drill · noun · B2

A short, focused practice exercise designed to build a specific skill through repetition.

The next session can run a 90-second drill on opening sentences with a verb.

foreground · verb · C1

To place something at the front of attention so the reader notices it first.

The summary foregrounds three strengths and a next move; corrections are not foregrounded.

tuck · verb · C1

To place something quietly underneath or out of the way, available but not prominent.

The grammar corrections are tucked under the active sections rather than leading them.

affective filter · noun phrase · C1

In second-language acquisition, the emotional defensiveness that blocks a learner from absorbing input when anxiety or self-doubt is high.

The three constraints together keep the learner's affective filter low so the feedback lands.

retention (in learning) · noun · C1

The rate at which learners continue to engage with a programme over time, especially across weeks or months.

Coaching-tone feedback produces materially higher retention than marker-tone feedback.

disengage · verb · C1

To stop participating or paying attention to something, often quietly rather than dramatically.

Learners receiving marker-tone feedback tend to disengage from practice within four to six sessions.

defensiveness · noun · C1

An emotional reaction that protects a person from criticism and closes them off from new input.

The affective filter is the technical term for the emotional defensiveness that closes a learner to feedback.

performative · adjective · C1

Done for effect or to be seen doing it, rather than as genuine response; in praise, sounding scripted rather than felt.

Generic praise like 'good job!' is worse than no praise — the learner reads it as performative.

chiding · adjective · C1

Gently scolding or reproaching; a tone that conveys 'you should have known better'.

Saying 'actually, the correct version is...' sounds chiding and is avoided in the corrections.

apologetic · adjective · B2

Showing or expressing regret, often by softening a statement so much that it loses authority.

Phrasing like 'you might consider trying...' sounds apologetic and undermines the correction.

retrospective vs forward-looking · adjective pair · C1

Retrospective framing looks back at what was done; forward-looking framing names what to do next. The same fact can be delivered either way.

'You didn't do enough of X' is retrospective; 'next session, practise X' is forward-looking.

Grammar Notes

Contrast structures with 'not X. Y.' to differentiate two roles or modes

Two short, parallel sentences (or one sentence with a comma + 'not Y') are the cleanest English way to draw a sharp distinction between two stances. The pattern works because the first clause carries the positive claim and the 'not' clause names the rejected alternative — both stay in the reader's working memory and the contrast does the teaching. Avoid expanding the 'not' clause with hedges ('not exactly a marker, more of a...') — the rhetorical force depends on the bare opposition.

'She writes in the voice of a coach, not a marker.' / 'A marker tells you what you got wrong; a coach tells you what to practise next.'

Common mistake: Burying the contrast inside a longer clause ('She writes in a voice that is more like a coach than a marker, though not entirely...'). The hedging dilutes the distinction. English contrast structures are at their strongest when the two roles are named cleanly and the second is rejected without qualification.

'must be X' for prescriptive design rules

'Must be' marks a non-negotiable design rule — a constraint built into the system, not a recommendation. It is stronger than 'should be' (advisable), 'needs to be' (required for a purpose), or 'is X' (descriptive). Use 'must be' when stating the rule a system or product is bound by, especially in spec writing or design documentation.

'The praise must be specific.' / 'The next-move framing must be forward-looking, not retrospective.' / 'The corrections must be matter-of-fact, not apologetic and not chiding.'

Common mistake: Softening prescriptive rules to 'should be' or 'tends to be': 'The praise should be specific' reads as a suggestion the writer may ignore. In design or product writing where the rule is architectural, 'must be' is the right register — and the contrast with 'not Y' that often follows ('must be X, not Y') reinforces the rule's non-negotiability.

Comparative-with-cost framing: 'X costs more but produces better Y'

This pattern — name the cost in the same sentence (or adjacent sentence) as the benefit, with an explicit comparative — is the honest English register for trade-offs. It acknowledges what is given up rather than hiding it, and uses 'than' to make the comparison precise. The structure is: 'X takes/costs more than Y, but produces better Z.' The 'but' is sometimes implied rather than written, as long as the two clauses sit close enough to read as one trade-off.

'Generating a coaching-tone summary takes longer and costs more per session than generating a marker-tone summary would.' / 'The same learners who initially prefer marker tone show better twelve-week retention on coaching tone.'

Common mistake: Stating only the benefit and omitting the cost: 'Coaching tone produces better retention.' This reads as marketing rather than design honesty. The English convention in technical and product writing is to name the cost in the same breath as the benefit — readers trust writers who do, and distrust writers who don't.

Comprehension Questions

  1. What four sections make up the summary the Spoken Feedback Agent writes, and which three are foregrounded?
  2. Why is the tone choice described as a retention decision rather than a stylistic one?
  3. What are the three tone constraints built into the agent's prompt, and what does each one prevent?
  4. What are the two real costs of the coaching-tone approach, and why does the team accept them?
  5. Think about a piece of feedback you received recently in your own learning, work, or study (a teacher's comment, a manager's review, a coach's note). Was it written in marker tone (foregrounding what you got wrong) or coach tone (foregrounding what you did well and the next move)? How did it land emotionally, and what did you do next?
