Today's AI Specialist: The Conversation Partner. The Agent That Decides What Sophie Will Talk About.
Sophie is the practice partner the learner sees and hears. She is also, by design, not the agent who decides what conversation she is going to have.
The Conversation Partner is the back-of-house agent who picks the topic. She runs between sessions, in the long latency budget that Sophie cannot afford in the moment, and her job is to decide, for each learner, what the next conversation should be. By the time the learner opens Sophie, the topic is already chosen. Sophie just runs it.
This split was not obvious when I started building the practice surface. The instinct was to let Sophie pick her own topics. The reasons it does not work are the build story of this agent.
The problem the Conversation Partner solves
Sophie has a two-second latency budget. That budget rules out everything that takes more than one model call per turn, including looking up the learner's full history, scanning what they have not practised recently, or matching the candidate topic to their stated goals. Inside a session, Sophie is fast and warm. She is not deliberative.
When I let Sophie pick her own topics, the topics were either generic ("let's talk about your weekend") or repetitive: the same three topics across five sessions, because at the moment of choosing Sophie does not remember what she chose last time. Either way the sessions felt interchangeable, and the learners noticed.
The fix was to move topic selection out of the session entirely. The Conversation Partner runs between sessions, with no latency constraint, and picks the next-best topic. By the time the learner opens Sophie, the topic is already loaded into Sophie's session context, and Sophie just runs the conversation. It is the same split that separates fast-Sophie from slow-Sophie in the feedback layer: real-time work for Sophie, deliberative work for the back-of-house agents.
What the Conversation Partner knows
The Conversation Partner has read access to the long-term memory store that Sophie does not see in real time.
She sees the learner's full assessment history, including all four sub-scores at each assessment and the trajectory between them. She sees every previous practice session: the topics, the durations, the moments flagged by the Spoken Feedback Agent as difficult. She sees the learner's stated goals, captured at onboarding and refreshed in periodic check-ins. She sees the upcoming real-world contexts the learner has told the platform about: a job interview next month, a board presentation in three weeks, a quarterly review with a German client.
She does not see things the learner has not given the platform: calendar contents, email contents, location. The boundary is the same boundary Sophie respects in her real-time view: the platform sees what the learner has told it, nothing else.
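The read boundary described above can be pictured as a plain record type. What follows is a minimal sketch, not the platform's actual schema: every class and field name here (LearnerView, Assessment, PracticeSession, UpcomingContext) is a hypothetical chosen for illustration, and because the post names only some of the sub-scores, they are kept as a free-form mapping.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Assessment:
    taken_on: date
    sub_scores: dict[str, float]  # illustrative, e.g. {"fluency": 5.5, "coherence": 6.0}

    def lowest_sub_score(self) -> str:
        # Name of the weakest sub-score; ties go to the first one listed.
        return min(self.sub_scores, key=self.sub_scores.get)


@dataclass(frozen=True)
class PracticeSession:
    topic_id: str
    minutes: int
    difficult_moments: tuple[str, ...] = ()  # flagged by the Spoken Feedback Agent


@dataclass(frozen=True)
class UpcomingContext:
    description: str  # e.g. "job interview"
    on: date


@dataclass(frozen=True)
class LearnerView:
    """Everything the agent may read: only what the learner has told the platform."""

    assessments: tuple[Assessment, ...] = ()
    sessions: tuple[PracticeSession, ...] = ()
    stated_goals: tuple[str, ...] = ()
    upcoming: tuple[UpcomingContext, ...] = ()
    # Deliberately no calendar, email, or location fields.

    def latest_assessment(self) -> Assessment | None:
        return max(self.assessments, key=lambda a: a.taken_on, default=None)
```

The point of the sketch is the absence of fields: if the record has no slot for calendar or email data, the boundary is enforced by the shape of the data rather than by a policy the agent has to remember.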
What the Conversation Partner has that Sophie does not is time. She is allowed to look at all of this, deliberate, and choose. Sophie is not.
How the agent decides what to pick
Three criteria, applied in order.
First, what the learner has explicitly asked to practise. If the learner has told the platform that they have an interview next Tuesday and they want to rehearse, that topic goes to the top of the queue. Stated-goal topics always win. The reasoning is that an explicit request is the strongest motivation signal the agent has, and motivation drives retention. A learner practising a topic they care about retains the gain. A learner practising a topic the agent chose for them retains less.
Second, what the learner has avoided. Adult learners consistently steer around certain topics in practice: the kind of negotiation they find hard, the kind of stakeholder they freeze in front of, the kind of question that triggers their L1 to come back. The Conversation Partner notes these patterns and, when the queue allows, surfaces a gentle version of the avoided topic. Occasionally, not every session. The exposure is calibrated to expand the learner's comfort zone without breaking it.
Third, what would close the lowest sub-score on the most recent assessment. If fluency is the lowest sub-score, the agent picks topics that require sustained turns and recovery: open-ended questions, multi-part scenarios, situations where the learner needs to keep talking. If coherence is the lowest, the agent picks topics that require structured argument: opinion pieces, debate scenarios, situations where the learner needs to organise their thinking before delivering it. If lexical range is the lowest, the agent picks topics that pull the learner into vocabulary they have not used recently.
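The third criterion is essentially a lookup from the weakest sub-score to the kind of topic that exercises it. A minimal sketch of that lookup, assuming sub-scores arrive as a name-to-number mapping; the table name TOPIC_TRAITS, the function traits_for, and the exact trait labels are hypothetical, paraphrased from the paragraph above.

```python
# Hypothetical lookup: which topic traits exercise which sub-score.
TOPIC_TRAITS = {
    "fluency": ["open-ended question", "multi-part scenario", "sustained turn"],
    "coherence": ["opinion piece", "debate scenario", "structured argument"],
    "lexical range": ["vocabulary not used recently"],
}


def traits_for(sub_scores):
    """Return (lowest sub-score name, topic traits that target it)."""
    lowest = min(sub_scores, key=sub_scores.get)  # ties go to the first listed
    # A sub-score with no entry in the table yields no traits rather than a guess.
    return lowest, TOPIC_TRAITS.get(lowest, [])
```

Calling `traits_for({"fluency": 5.0, "coherence": 6.5})` would point the agent at sustained-turn topics, since fluency is the lower of the two.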
The three criteria together produce a queue, not a single topic. The Conversation Partner maintains a short queue of candidate topics for each learner, and Sophie picks from the top of the queue when the learner opens a session.
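One way to read "three criteria, applied in order" is as a priority tier on each candidate topic. The sketch below is an assumption about how that ordering could be expressed, not the agent's actual code: Candidate, build_queue, the allow_avoided switch (standing in for "occasionally, not every session"), and the default queue size of five are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    topic_id: str
    requested: bool = False       # criterion 1: the learner explicitly asked for it
    avoided: bool = False         # criterion 2: the learner has steered around it
    targets_sub_score: str = ""   # criterion 3: the sub-score this topic exercises


def build_queue(candidates, lowest_sub_score, allow_avoided, size=5):
    """Return topic ids ordered by the three criteria, highest priority first."""

    def tier(c):
        if c.requested:
            return 0  # stated-goal topics always win
        if c.avoided:
            # Avoided topics surface only when this session allows it.
            return 1 if allow_avoided else 3
        if c.targets_sub_score == lowest_sub_score:
            return 2
        return 3  # matches no criterion: left out of the queue

    ranked = sorted(candidates, key=tier)  # sorted() is stable, so input order breaks ties
    return [c.topic_id for c in ranked if tier(c) < 3][:size]
```

With a requested interview topic, an avoided negotiation topic, and a fluency-targeting storytelling topic, `build_queue(..., "fluency", allow_avoided=True)` would order them interview, negotiation, storytelling; with `allow_avoided=False` the negotiation topic drops out for that session.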
Decision one: the learner never sees the Conversation Partner
The agent runs invisibly. The learner sees only Sophie. The topic appears as Sophie opening the conversation; the learner has no UI surface for editing the queue or seeing what is in it.
This was a deliberate choice. We tested a version of the interface that surfaced the topic queue to the learner: "today Sophie will talk to you about your upcoming interview; here are the other topics queued up next." The version performed worse. Learners who could see the queue started managing it instead of practising, picking the topics they felt ready for and avoiding the ones they were not. The avoided-topic exposure mechanic collapsed.
The fix was to keep the queue hidden. The learner can ask Sophie at the start of a session to change the topic (Sophie will then surface a different option from the queue), but the queue itself is not visible. The friction is deliberate. It keeps the avoidance behaviour from optimising the agent against the learner's actual progress.
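The "ask Sophie for a different topic" behaviour can be sketched as an object that hands out one topic at a time and never exposes the list behind it. This is an illustration under stated assumptions: the class name SessionTopics is hypothetical, and sending a skipped topic to the back of the queue (rather than dropping it) is a guess at the policy, not something the post specifies.

```python
from collections import deque


class SessionTopics:
    """Hands Sophie one topic at a time; the queue itself is never exposed."""

    def __init__(self, queue):
        self._pending = deque(queue)
        self.current = self._pending.popleft() if self._pending else None

    def change_topic(self):
        """The learner asked for something else: offer the next topic, if any."""
        if not self._pending:
            return self.current  # nothing else queued; keep the current topic
        skipped = self.current
        self.current = self._pending.popleft()
        # Assumption: a skipped topic moves to the back so it can come round again.
        self._pending.append(skipped)
        return self.current
```

Because the learner only ever sees `current`, they can decline a topic but cannot browse and cherry-pick, which is the friction the paragraph above describes.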
Decision two: the agent does not generate topics, it selects them
The second decision was where the candidate topics come from.
The naive approach is to let the agent generate topics freely against the learner's profile. This produces topics that are technically tailored but feel inauthentic: they read like an AI's best guess at what the learner needs, not like a real conversation.
The fix was to give the agent a topic library, a curated catalogue of about 600 conversation prompts, written by a human (me, mostly), organised by skill, level, and scenario. The agent selects from the library rather than generating from scratch. The library is the human voice of the platform; the agent is the matching layer that picks the right prompt for the right learner at the right moment.
The library grows. About twenty prompts are added per month, written by me after I notice a gap in the topic coverage while reading the Conversation Partner's logs. The agent never generates a prompt that has not been hand-written.
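Selecting rather than generating means the matching layer is, at its core, a filter over existing records. A minimal sketch, assuming each prompt carries the three organising attributes named above; the Prompt type, the select_prompts function, the level labels, and the exclude_ids parameter (for prompts used recently) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prompt:
    prompt_id: str
    skill: str     # e.g. "fluency", "coherence", "lexical range"
    level: str     # e.g. "B2"; the level scheme is an assumption
    scenario: str  # e.g. "interview", "negotiation"
    text: str      # always hand-written, never generated


def select_prompts(library, skill, level, scenario=None, exclude_ids=()):
    """Filter the library; returns existing prompts only, possibly an empty list."""
    seen = set(exclude_ids)
    return [
        p
        for p in library
        if p.skill == skill
        and p.level == level
        and (scenario is None or p.scenario == scenario)
        and p.prompt_id not in seen
    ]
```

An empty result is informative rather than an error: it marks a coverage gap of the kind that would prompt a new hand-written entry, and the function has no path by which it could invent a prompt to fill the hole.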
TL;DR
The Conversation Partner is the back-of-house agent who decides, for each learner, what Sophie will talk about next. Sophie has a two-second latency budget that rules out picking topics deliberately; the Conversation Partner runs between sessions, with no latency constraint, and picks the next-best topic. She uses three criteria in order: what the learner explicitly asked to practise, what they have avoided, and what would close the lowest sub-score. The learner never sees the queue. Surfacing it caused learners to manage avoidance rather than practise through it. The agent does not generate topics from scratch; she selects from a hand-written library of about 600 prompts, which preserves the platform's voice in the topic selection itself. The Conversation Partner is one of 123 specialist agents in the team report at nigelcasey.com/agent-team-report.html, working alongside Sophie's real-time stack and the Spoken Feedback Agent in the practice cluster.
Learning Materials
Key Vocabulary
queue
An ordered list of items waiting to be processed or used, with the next item taken from the top.
“The Conversation Partner maintains a queue of three to five candidate topics per learner.”
library (in topic library)
A curated catalogue or collection of resources organised for retrieval — here, a stock of pre-written conversation prompts the agent selects from.
“The agent selects from a topic library of about 600 hand-written prompts.”
back-of-house
Operating behind the scenes, out of view of the end user; borrowed from restaurant kitchens to mean infrastructure-side rather than customer-facing.
“The Conversation Partner is a back-of-house agent the learner never sees.”
deliberative
Involving careful, slow, considered thought rather than fast, instinctive response.
“Sophie is fast and warm; she is not deliberative.”
real-time
Happening at the same speed as the events it responds to, with no perceptible delay.
“Sophie does the real-time conversation; the Conversation Partner does the deliberative work between sessions.”
generic
Not specific to any particular person or situation; general, lacking distinctive character.
“When Sophie picked her own topics, the sessions felt generic.”
repetitive
Repeating the same thing over and over, often to the point of becoming dull or ineffective.
“Sophie's own choices were repetitive — the same three topics across five sessions.”
prompt
A short written cue or instruction that opens a conversation, task, or AI response.
“The library holds about 600 conversation prompts organised by skill and level.”
avoidance
The behaviour of consistently steering away from a topic, situation, or task that feels uncomfortable or difficult.
“The agent gently surfaces topics the learner has steered around — the avoidance pattern is part of the design signal.”
comfort zone
The range of situations a person handles without anxiety; the boundary beyond which they begin to feel stretched.
“The exposure is calibrated to expand the learner's comfort zone without breaking it.”
gentle exposure
A small, calibrated dose of an avoided or challenging stimulus, delivered in a low-stakes context so the person can build tolerance.
“Gentle exposure to avoided topics produces durable transfer to the real-world context.”
motivation
The internal drive that makes someone want to do something; the reason behind sustained effort.
“Explicit request is the strongest motivation signal the agent has.”
retention
The continued holding of something — here, the degree to which a learner keeps a practice gain rather than losing it.
“Motivation drives retention; learners practising on topics they care about retain the gain.”
calibrate
To adjust a setting precisely so it matches the conditions or the target — neither too much nor too little.
“The exposure is calibrated to stretch the learner without breaking them.”
hand-written
Composed by a human rather than generated by software; in agent design, used to mark content authored by the operator and curated into a library.
“The agent never generates a prompt that has not been hand-written.”
Grammar Notes
Ordered list of criteria with 'First, what... Second, what... Third, what...'
English uses 'First, ... Second, ... Third, ...' (each followed by a comma and the criterion) to lay out sequential design rules or decision steps in priority order. Each item starts with the same grammatical shape — here, 'what + clause' — which signals to the reader that these are parallel criteria, not a loose list. The parallel structure makes the priority clear: item one is applied before item two, item two before item three.
“'First, what the learner has explicitly asked to practise... Second, what the learner has avoided... Third, what would close the lowest sub-score on the assessment.'”
Common mistake: Breaking parallelism by mixing forms: 'First, what the learner asked for. Second, the avoidance pattern. Third, to close the sub-score.' The mix of 'what + clause' / 'noun phrase' / 'infinitive' makes the criteria feel like a loose list rather than an ordered priority. Keep the grammatical shape identical across all three items.
Phrasal verbs in tech writing ('steer around', 'reach for')
Technical and design writing in English leans heavily on phrasal verbs — verb plus particle combinations like 'steer around', 'reach for', 'surface up', 'pull in' — to describe behaviour and motion. These feel more native and less academic than their single-word Latinate equivalents ('avoid', 'use', 'present', 'incorporate'). They are not informal in this register; they are the natural English of system descriptions. Use them when the action involves motion, direction, or selection.
“'Adult learners consistently steer around certain topics in practice.'”
Common mistake: Replacing every phrasal verb with a Latinate single word in an attempt to sound more formal: 'Adult learners consistently avoid certain topics' is grammatically fine but loses the sense of active steering. The phrasal verb 'steer around' implies a deliberate, ongoing manoeuvre, not just an absence of engagement. The choice of verb carries meaning.
Negative-positive contrast for design constraints ('she does not generate; she selects')
English uses the negative-plus-positive contrast pair — 'not X; Y' — to express a design constraint sharply. The negative clause states what the system is not allowed to do; the positive clause states what it does instead. The two clauses are often separated by a comma or semicolon and share the same subject, which makes the contrast feel like a single rule rather than two statements. This is the natural English shape for architectural decisions where the rule is as much about exclusion as inclusion.
“'The agent does not generate topics, it selects them.'”
Common mistake: Stating only the positive ('the agent selects topics') loses the constraint — the reader does not learn that generation was considered and rejected. Stating only the negative ('the agent does not generate topics') leaves the reader wondering what the agent does instead. The pair carries both halves of the design decision.
Comprehension Questions
1. Why is topic selection moved out of Sophie's session into a separate back-of-house agent?
2. Why does the post argue that surfacing the topic queue to the learner made the system worse, not better?
3. What are the three criteria the Conversation Partner uses to pick the next topic, and in what order?
4. Why does the Conversation Partner select from a hand-written library of about 600 prompts rather than generate prompts from scratch?
5. Suppose you were designing a fitness coach app that picks the next workout for each user. How would the Conversation Partner's three-criteria approach translate to that domain?