Grab Thailand — Bot Performance Report

📊 Executive Summary

113 test calls completed (50 internal + 63 external) across three PTP voicebots — DAX, MEX, PAX
Core system is stable — ASR accuracy at 96.5%, escalation handling at 92%, flow correctness at 87.7%
TTS quality is the primary improvement area (60% of findings) — a solution is identified and being tested (voice profile adjustment)
Payment confirmation flow is too restrictive: customers must explicitly repeat both the amount and the date, which limits natural conversation. This needs discussion with Regional
Label configuration review recommended before the 100-call sample test tomorrow

📈 Key Metrics

Metric	Value	Detail
ASR Accuracy	96.5%	109/113 across all 3 bots
Flow Correctness	87.7%	MEX lowest at 72%
TTS Naturalness	4.0/5	Avg across 113 calls
Escalation Accuracy	92%	11/12, PAX strongest

Metrics computed from 113 test calls (50 internal + 63 external) across DAX, MEX, PAX. TTS score is tester-rated on 1–5 scale.
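As a quick cross-check, the rate metrics above can be reproduced from the raw counts quoted in this report. The snippet below is a minimal sketch: the helper name `pct` is hypothetical, and only the figures whose counts are stated here (ASR 109/113, escalation 11/12) are recomputed.

```python
def pct(hits: int, total: int, digits: int = 1) -> float:
    """Return hits/total as a percentage rounded to `digits` decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * hits / total, digits)


# Counts quoted in the report.
total_calls = 50 + 63                 # internal + external test calls
asr_accuracy = pct(109, total_calls)  # 96.5
escalation_accuracy = pct(11, 12, 0)  # 92.0

print(total_calls, asr_accuracy, escalation_accuracy)
```

Flow correctness and the TTS score are not recomputed because the report does not state their underlying counts or individual ratings.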

⚙ Functional Status

Bot / Component	Status	Notes
DAX (Driver)	Stable	ASR 97%, Flow 95%, TTS 4.1/5
MEX (Merchant)	In Progress	ASR 93%, Flow 72%, TTS 4.0/5 — flow refinements identified, solutions ready
PAX (Passenger)	Stable	ASR 100%, Flow 94%, TTS 4.1/5, Escalation 92%
TTS Engine	In Progress	Date pronunciation issue identified — voice profile adjustment being tested
Escalation Trigger	Stable	Working correctly — hostility + agent request detection confirmed
After-Call Labels	To Be Discussed	Label B/C refinement recommended — see details below

🔬 Quality Analysis

Issue Breakdown by Type (42 occurrences across 113 calls)

Type	Occurrences	Share of issues
TTS	25	60%
Flow	14	33%
ASR	3	7%
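Note that these shares use the 42 issue occurrences as the denominator, not the 113 calls. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Issue occurrences by type, as listed in the breakdown.
issue_counts = {"TTS": 25, "Flow": 14, "ASR": 3}

total_occurrences = sum(issue_counts.values())  # 42

# Share of each type among all occurrences, rounded to whole percent.
shares = {
    issue_type: round(100 * count / total_occurrences)
    for issue_type, count in issue_counts.items()
}

print(total_occurrences, shares)
```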

Finding	Type	Status	Solution
Bot repeats the word "วันที่" (Thai for "date") when reading payment dates, which sounds unnatural to customers	TTS	In Progress	Voice profile adjustment identified; alternatives are being tested. Also covers name pronunciation and pacing refinements.
Inconsistent speech pacing — some phrases sound rushed or uneven, affecting conversation naturalness	TTS	In Progress	Continuous R&D improvement — pacing quality will improve incrementally through ongoing voice profile tuning
[PAX] When a customer asks about their payment deadline, the bot currently doesn't provide a clear response — needs a defined handling approach	Flow	To Be Discussed	Guardrail is working as intended (payment date should not be disclosed). Need to define how the bot should respond when asked — to be coordinated with Regional
Payment confirmation requires the customer to explicitly repeat both the exact amount and a specific date — a simple "yes, I'll pay" or "I'll pay next Monday" isn't accepted. This creates an unnatural conversation and also limits the bot's ability to clarify vague dates (e.g., "next week" → ask which day)	Flow	To Be Discussed	Current strict confirmation was set by Regional to ensure full payment amount and a date within the acceptable window. Recommend reviewing whether implicit confirmation (e.g., "yes, I'll pay on the date you mentioned") can be accepted while still meeting these requirements — to be coordinated with Regional

Based on 113 test calls (50 internal + 63 external). A single call can surface multiple issue types. Flow-level items require coordination with the Regional team for implementation.
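To make the payment confirmation finding concrete for the Regional discussion, the sketch below contrasts strict and implicit confirmation. It is an illustration under stated assumptions, not the bot's actual logic: `Reply`, `confirm_payment`, `proposed_date`, and `allow_implicit` are hypothetical names, and the implicit mode assumes a bare "yes" inherits the amount and date the bot just proposed. The two checks (full past-due amount, date within the acceptable window) follow the requirements described in the finding.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Reply:
    """Hypothetical parsed customer reply; None means 'not stated'."""
    agreed: bool
    amount: Optional[float] = None
    pay_date: Optional[date] = None


def confirm_payment(
    reply: Reply,
    total_past_due: float,
    max_payment_date: date,
    proposed_date: Optional[date] = None,
    allow_implicit: bool = False,
) -> bool:
    """Return True when the reply counts as a confirmed promise to pay."""
    if not reply.agreed:
        return False
    amount, pay_date = reply.amount, reply.pay_date
    if allow_implicit:
        # Assumption: a bare "yes" inherits what the bot just proposed.
        if amount is None:
            amount = total_past_due
        if pay_date is None:
            pay_date = proposed_date
    if amount is None or pay_date is None:
        return False  # strict mode: both values must be stated explicitly
    # Both Regional requirements still apply in either mode.
    return amount >= total_past_due and pay_date <= max_payment_date


# Illustrative values only.
deadline = date(2030, 1, 10)
bare_yes = Reply(agreed=True)
strict_result = confirm_payment(bare_yes, 1500.0, deadline, date(2030, 1, 8))
implicit_result = confirm_payment(
    bare_yes, 1500.0, deadline, date(2030, 1, 8), allow_implicit=True
)
late_result = confirm_payment(
    Reply(agreed=True, pay_date=date(2030, 1, 15)),
    1500.0, deadline, allow_implicit=True,
)
print(strict_result, implicit_result, late_result)
```

In this sketch, implicit mode accepts a simple "yes" without relaxing either requirement, because the inherited values are still checked against the full amount and the payment window.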

💡 Insights & Analysis

What's Working

Escalation handling is production-ready — 92% accuracy across all bots. Hostility, agent requests, and technical issues all route correctly.
ASR is strong at 96.5% — speech recognition performs well across all three customer segments.
Core conversation flow is solid — payment reminders, identity verification, and closing sequences work as designed.

Key Observations

TTS accounts for the majority of findings — date pronunciation, speech pacing, and name handling. Voice profile solution is being tested, with pacing improving through ongoing R&D.
Payment confirmation flow is too strict: customers must explicitly repeat both the amount and the date, which feels unnatural and also limits the bot's ability to handle vague dates. The rule was set by Regional, so any change needs discussion with them.

Decision Needed

Label B/C configuration — if current definitions are unchanged for tomorrow's 100-call test, the RA analysis may show inflated B/C counts. Decision on whether to adjust labels should ideally be made today.
TTS voice profile selection — alternative profiles will be provided tomorrow for review. Selecting a profile enables deployment to address the date pronunciation issue.

🎯 Recommendations

Recommendation 1

Review TTS voice profile options — selection to finalize tomorrow
Alternative voice profiles are being prepared. AI Rudder will provide a shortlist tomorrow for Grab Local to review and select. The selected profile will address the date pronunciation issue along with name handling and pacing refinements.

Recommendation 2

Refine label B/C configuration before the 100-call test
Cleaner label definitions produce cleaner RA analysis. If the change can't be confirmed before tomorrow, we'll run with current labels and provide a supplementary view showing results under the proposed schema.

Recommendation 3

Consolidate flow-level items for a single Regional session
Two findings require flow-level decisions (PAX payment date handling, strict payment confirmation). Presenting them as a single, documented request makes it easy for Regional to review and decide in one pass.

🏷 After-Call Label Configuration Review

The AICalling system classifies each completed call using intent labels (A–G). After reviewing the current configuration, we identified an opportunity to improve classification accuracy ahead of the 100-call sample test.

Recommendation: Refine Labels B and C

Label B currently combines "refuse to pay" with "uncooperative" (e.g., hung up after identity verification) — these are different intents
Label C acts as a catch-all for anything unmapped, not just "silence"
This overlap may cause inaccurate classification in the 100-call RA analysis

Label	Current Definition	Recommended Change
A	Promise to Pay (amount ≥ totalPastDue, date ≤ maxPaymentDate)	No change
B	Refuse to Pay / Uncooperative — broad definition including refusals, wrong party, callbacks, and mid-conversation hang-ups	Narrow to "Refuse to Pay" only — clear, explicit refusal
C	Keep Silent — plus catch-all for unmapped conversations	Narrow to "Pure Silent" only — picked up but no interaction
D	Call Transferred	No change
F	Failed to Connect	No change
G	Voice Mail	No change
H (new)	—	Add "Unknown Intention" — user interacted but no clear payment commitment or refusal (hung up mid-conversation, ambiguous responses)

This recommendation is subject to coordination with the Regional team. If unchanged, AI Rudder will proceed with current labels and note classification considerations in the RA analysis.
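The proposed schema can be read as an ordered set of rules. The sketch below is one possible reading, not the AICalling implementation: the `CallOutcome` fields, the `classify` function, and the order in which the rules are checked are all assumptions made for illustration. Only the Label A condition (amount ≥ totalPastDue, date ≤ maxPaymentDate) is taken directly from the table.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CallOutcome:
    """Hypothetical summary of a completed call."""
    connected: bool = True
    voicemail: bool = False
    transferred: bool = False
    customer_spoke: bool = False
    explicit_refusal: bool = False
    promised_amount: Optional[float] = None
    promised_date: Optional[date] = None


def classify(call: CallOutcome, total_past_due: float, max_payment_date: date) -> str:
    """Assign a label under the proposed schema (rule order is an assumption)."""
    if call.voicemail:
        return "G"  # Voice Mail
    if not call.connected:
        return "F"  # Failed to Connect
    if call.transferred:
        return "D"  # Call Transferred
    if not call.customer_spoke:
        return "C"  # Pure Silent: picked up but no interaction
    if (
        call.promised_amount is not None
        and call.promised_date is not None
        and call.promised_amount >= total_past_due
        and call.promised_date <= max_payment_date
    ):
        return "A"  # Promise to Pay
    if call.explicit_refusal:
        return "B"  # Refuse to Pay: clear, explicit refusal only
    return "H"  # Unknown Intention: interacted, no clear commitment or refusal


# Illustrative values only.
deadline = date(2030, 1, 10)
hung_up_mid_call = CallOutcome(customer_spoke=True)
print(classify(hung_up_mid_call, 1500.0, deadline))
```

Under the current definitions, a mid-conversation hang-up like the one in the example would fall into Label B. Under the proposed schema it moves to H, which keeps B limited to explicit refusals.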

✅ Next Steps

Today (Feb 26)

Grab Local: Review label B/C refinement recommendation

Grab Local: Coordinate with Regional on flow-level decisions that need discussion (2 items identified)

Tomorrow (Feb 27)

AI Rudder: Provide TTS voice profile options for Grab Local to select

Grab Regional: Execute 100-call sample test with real customers

Following Week

AI Rudder: Conduct RA analysis on 100-call results (label accuracy review)

Grab Local: Confirm TTS voice profile selection

AI Rudder: Amend bot persona based on selected voice profile

Upcoming — Regional Session

Grab Regional: Review consolidated flow-change request (2 items with proposed solutions)

Grab Regional: Confirm label B/C configuration (if not decided before 100-call test)