Why Your BPO Agents Keep Failing in the First 90 Days

You hired them because they passed the test. The score came back clean. Your TA team moved them forward, training signed off, and they hit the floor with everyone else in the cohort.

Ninety days later they were gone.

If that pattern feels familiar, you are not alone, and your hiring team is probably not the problem. The signal you screened them on was.

The first 90 days is where the industry bleeds

Attrition in this sector is not a secret. What gets less attention is when it happens. A large share of the people who leave do so almost immediately. Industry research has found that roughly half of the advisors who leave the profession do so within their first 90 days of employment, and broader 2025 data puts first-year attrition in many centers in the range of 69 to 73 percent.

Sit with that for a second. Most of your turnover is not happening at the two-year mark when someone gets a better offer. It is happening in the first three months, before the agent has returned anything close to what you spent to get them there.

That early window is the most expensive place to lose someone. You have already paid for sourcing, assessment, onboarding, and training. The seat is filled on paper but producing nothing, and you are about to pay for the backfill on top of it. The exact dollar cost varies a lot by region, and it pays to use a number that matches your own market rather than a headline figure from a US center. Replacement cost in offshore hubs runs in the range of roughly 2,200 to 4,500 dollars per agent in the Philippines and India, well below the 10,000 to 20,000 dollars typical of a US seat, but the structure of the loss is identical everywhere. You spend the money up front, and early attrition means you never earn it back before you are spending it again.
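As a rough illustration of why the regional figure matters, the sketch below multiplies a count of early leavers by the replacement-cost ranges quoted above. The figure of 30 early leavers is a hypothetical input, not a number from the research, and the function name is an assumption made for this example.

```python
# Replacement-cost ranges per agent, in US dollars, as quoted in the text.
REPLACEMENT_COST = {
    "Offshore (Philippines/India)": (2_200, 4_500),
    "US": (10_000, 20_000),
}


def backfill_spend(early_leavers: int, cost_range: tuple[int, int]) -> tuple[int, int]:
    """Return the (low, high) spend needed to replace agents who left early."""
    low, high = cost_range
    return early_leavers * low, early_leavers * high


# Hypothetical input: 30 agents from one hiring wave leave inside 90 days.
early_leavers = 30
for region, cost_range in REPLACEMENT_COST.items():
    low, high = backfill_spend(early_leavers, cost_range)
    print(f"{region}: ${low:,} to ${high:,}")
```

With those assumed 30 leavers, the offshore range works out to 66,000 to 135,000 dollars and the US range to 300,000 to 600,000 dollars. The structure of the loss is the same in both cases, but the business case you build from it is very different.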

So the question worth asking is not "how do we reduce attrition" in the abstract. It is "why do so many of the people who passed our screen fail so fast?"

The headline number hides where the loss actually happens

There is a trap in the way attrition gets reported. A site quotes a number like 40 or 50 or even 100 percent annually, and it sounds like the entire floor is walking out and being rebuilt. That is almost never what is happening.

Two figures from the same industry data make the point. First-year attrition runs between 69 and 73 percent, yet average agent tenure across the industry sits at 14 to 15 months. Those two numbers can only be true at the same time if a stable core of agents stays for years while a smaller population cycles through the same seats over and over. If everyone left at the same steady rate, you could not have most of your loss concentrated in the first year and still hold a meaningfully tenured group on the floor.

Picture a hundred seats, purely as a way to think about it. A large share of those agents settle in, build confidence, and stay. The churn is not spread evenly across all hundred. It pools in a subset of seats that never stabilize, where one hire fails, gets replaced, and the replacement fails too. The headline percentage counts every one of those repeat replacements, which is why the number looks like wholesale turnover when the reality is a concentrated, recurring problem in a slice of the floor. To be clear, the hundred-seat picture is an illustration rather than a measured ratio, but the shape of it, a steady core plus a revolving group, is exactly what the tenure and first-year numbers imply.
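The hundred-seat picture can be put into numbers. The sketch below assumes the common definition of annual attrition as separations in a year divided by headcount. The 30-seat revolving group and the four-month average stay are hypothetical values chosen only to show the shape, not measurements.

```python
def headline_attrition(total_seats: int, revolving_seats: int, months_per_hire: int) -> float:
    """Annual attrition rate when only the revolving seats lose people.

    Assumes attrition = separations per year / headcount, and that the
    stable seats have no separations during the year.
    """
    separations_per_seat = 12 / months_per_hire  # hires lost per revolving seat per year
    separations = revolving_seats * separations_per_seat
    return separations / total_seats


# Hypothetical floor: 100 seats, 30 of which turn over every 4 months.
rate = headline_attrition(total_seats=100, revolving_seats=30, months_per_hire=4)
stable_share = (100 - 30) / 100

print(f"Headline attrition: {rate:.0%}")
print(f"Seats that never changed hands: {stable_share:.0%}")
```

Under these assumed inputs the floor reports 90 percent annual attrition even though 70 of the 100 seats never changed hands. Every separation came from the same 30 seats.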

This reframing matters because it changes what kind of problem attrition is. Loss spread evenly across everyone looks like a pay or culture issue, and a hiring assessment cannot fix those. Loss concentrated in a recurring set of seats looks like something else: a fit problem at the point of hire. The agents who cycle in and out are disproportionately the ones who were mismatched to the role on day one, and communication under live conditions is one of the clearest mismatches there is. Seen this way, the headline number does not overstate the case for better screening. It hides it.

A B2 Pass that was never really a pass

Here is the scenario I have watched play out more times than I can count.

A candidate takes the assessment. The result comes back B2. Pass. On the surface, a good hire. But the single score was hiding two very different realities underneath it.

Their writing was genuinely strong. Clean grammar, good vocabulary, well-structured responses. Call it 86 percent. Their voice clarity, on the other hand, was choppy and hard to follow under pressure. Call it 42 percent. Average those two numbers and you land at something respectable. The test reports a pass. The weakness disappears into the math.

Then the agent gets on a live call. A customer says, "I need to change my flight booking." The agent, who scored beautifully on the multiple-choice section, responds with: "Yes so, um, you need to go to the, oh, website and then you are finding the booking section and after that you can, um."

Response time twelve seconds. Customer satisfaction low. Call escalated.

The test did not lie. It just never measured what broke down.

The part nobody scores: confidence

Look closely at that response and you will see something the score sheet has no column for. It is not that the agent did not know the answer. The information was right there in front of them. The problem was confidence. Knowing the words, assembling them into a clear sentence in real time, and doing both in a second language under pressure are three different skills, and the gaps between them are widest in the early days.

This is where the first 90 days matters in a way that has nothing to do with training completion. An agent a year into the job has built that confidence through repetition. They have handled the awkward calls, recovered from the fumbles, and learned that they can get through a tough exchange. In month one, none of that exists yet. And the absence of it gets detected by every ear on the other side of the line.

The research backs this up directly. Studies on second-language speaking have consistently found that more frequent real-world use of the language builds self-confidence and perceived competence, which in turn lowers the anxiety a speaker feels in live situations. Researchers even have a name for the specific fear that shows up in spontaneous, unscripted conversation: communication apprehension. It is at its worst exactly where a contact center lives, in real-time exchange with no chance to prepare the next line.

Now layer the customer onto that. People do not call you when things are going well. They call when something has already broken in their relationship with the product, and they often arrive with a specific outcome already in mind. They are on edge before the agent says a word. When the resolution does not come fast, their mood drops, and a customer in that state is primed to find fault. A wobble in the agent's confidence becomes the thing they latch onto, and from that moment it is a barrier to resolution rather than a path to it.

Here is the irony worth sitting with. The same agent, handling the same issue in their primary language, might resolve it without a second thought. The competence is real. The product knowledge is real. What is missing is the confidence to apply it to the specific situation in front of them, in the moment, in English. This is not about reciting a knowledge base article. It is about interpreting a frustrated customer and adapting on the fly, and that is precisely the thing a multiple-choice test cannot see.

The score has to carry more weight than you think

There is a quiet assumption baked into most hiring processes: that somewhere in the chain, a human will catch what the test misses. In practice, that safety net is thinner than it looks.

Start with the recruiter. TA teams in this sector exist to keep pace with relentless attrition. They are measured on speed and fill rates, and they are moving fast by design. Expecting a recruiter under that kind of pressure to pause, listen past a stated score, and add their own layer of judgment about whether a candidate can really hold a difficult conversation is asking a lot. Many will reasonably trust the number in front of them and move on, because that is what the number is there for.

Then consider who else sits in the chain. Ops supervisors, team leads, and hiring managers often weigh in on candidates too. Each of them brings a different threshold, a different ear, and a different read on what "good enough communication" sounds like. Some are themselves second-language English speakers making a judgment call about another second-language speaker. None of that is a criticism of the people involved. It is just the reality of a multi-handoff process where communication is assessed informally, inconsistently, and at speed.

Put those two things together and the conclusion is unavoidable. When the human judgment around a hire is fast, distributed, and uneven, the assessment cannot afford to be soft. The score is not a backstop to human judgment. For communication specifically, it is often the only objective read anyone in the chain actually has. That is exactly why a single blended number is so dangerous, and why the signal needs to be stronger, sharper, and harder to misread than what most tests deliver today.

Communication is two-way, and the test only watches one side

Most language assessments measure proficiency. Grammar, vocabulary, pronunciation, fluency. All useful, all real. But they measure them in a vacuum: one question, one response, from one person, scored at one moment.

Real communication does not work like that. It is a multi-turn exchange where the customer changes their request mid-sentence, adds a constraint you did not expect, and quietly decides whether they trust the person on the other end. The recipient of the communication is never part of the measurement, yet the recipient is the entire point.

This is where the gap between "knows the language" and "can communicate in the role" becomes a business problem rather than a linguistic one. And customers feel it long before your QA team does. CSA Research found that 74 percent of consumers are more likely to keep buying from a brand that supports them in their own language, and separate research from Qualtrics and ServiceNow found that 80 percent of customers have switched brands because of poor customer experience. Those are your client's churn numbers. They trace straight back to the hire.

The fix is not a harder test. It is a better signal.

None of this means your assessment is too easy. Raising the bar just rejects more people while still measuring the wrong thing. The fix is to measure what actually predicts performance on the floor:

Score voice and writing separately, so a strong written result can never quietly cover for weak verbal clarity. The two skills are not interchangeable, and averaging them into one number is how a 42 percent becomes a pass.

Assess real conversation, not recall. A candidate who has memorized "I will be happy to assist you" can still freeze the moment the conversation shifts. What matters is whether they can follow meaning, handle a change in direction, and stay clear when a second constraint lands. That only shows up in a simulated exchange, not a single scripted answer.

Measure the recipient's experience, not just the speaker's output. The real test of communication is not whether the agent spoke correctly. It is whether the customer on the other end could follow them, felt understood, and got to a resolution.
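The first of these recommendations can be sketched in a few lines. The example uses the 86 and 42 percent scores from the scenario earlier in this article. The pass mark of 60 is a hypothetical threshold, and the function names are assumptions made for illustration rather than any vendor's scoring method.

```python
PASS_MARK = 60  # hypothetical threshold, not taken from any real assessment


def blended_pass(scores: dict[str, float], pass_mark: float = PASS_MARK) -> bool:
    """Pass if the average of all skill scores clears the mark."""
    return sum(scores.values()) / len(scores) >= pass_mark


def separate_pass(scores: dict[str, float], pass_mark: float = PASS_MARK) -> bool:
    """Pass only if every skill clears the mark on its own."""
    return all(score >= pass_mark for score in scores.values())


def weak_skills(scores: dict[str, float], pass_mark: float = PASS_MARK) -> list[str]:
    """List the skills that fall below the mark, so the gap stays visible."""
    return [skill for skill, score in scores.items() if score < pass_mark]


candidate = {"writing": 86, "voice": 42}

print(blended_pass(candidate))   # True: the average is 64
print(separate_pass(candidate))  # False: voice is below the mark
print(weak_skills(candidate))    # ['voice']
```

The same two scores produce a pass under the blended rule and a fail under the separate rule. Reporting the weak skill by name, rather than only the pass or fail, is what gives a fast-moving recruiter something concrete to act on.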

Where this leaves you

If you are seeing early attrition, escalations from agents who looked fine in training, or that nagging sense that your screen is not telling you the whole story, the issue is almost never the people doing the hiring. It is the signal they are being handed.

Cut bad-fit attrition by even a fraction and the savings compound fast, because you stop paying twice for the same seat. But you cannot fix what you cannot see, and a single averaged score is built to hide exactly the thing that breaks down on a real call.

The test measured whether they knew the language. It should have been measuring whether they could actually communicate in the role.

Evala scores voice clarity and written English as separate signals, and assesses how candidates handle real, multi-turn conversations rather than multiple-choice recall. If you want to see what your current screen might be missing, book a demo.

Sources

Call Centre Helper, "What is Attrition Rate and How to Calculate It" (half of advisors who leave do so within their first 90 days)
Insignia Resources, "Call Center Turnover Rates: 2026 Industry Average" (first-year attrition of 69 to 73 percent; regional replacement cost of roughly $2,200 to $4,500 per agent in the Philippines and India versus $10,000 to $20,000 in the US). Note: this is a staffing vendor aggregating secondary data, used here for directional regional comparison rather than as primary research.
CSA Research, "Can't Read, Won't Buy" (74 percent of consumers more likely to keep buying with support in their own language)
Qualtrics and ServiceNow customer experience research (80 percent of customers have switched brands due to poor experience)
Baker & MacIntyre (2000) and related second-language acquisition research, cited in Cambridge University Press, Studies in Second Language Acquisition (increased real-world use of a second language builds self-confidence and lowers anxiety in live conversation)
Horwitz et al., Foreign Language Classroom Anxiety research (communication apprehension as the anxiety specific to spontaneous, real-time conversation)
