📊 Executive Summary
- 113 test calls completed (50 internal + 63 external) across three PTP voicebots — DAX, MEX, PAX
- Core system is stable — ASR accuracy at 96.5%, escalation handling at 92%, flow correctness at 87.7%
- TTS quality is the primary improvement area (60% of findings) — a solution is identified and being tested (voice profile adjustment)
- Payment confirmation flow is too restrictive — customers must repeat both amount and date explicitly, limiting natural conversation. Needs Regional discussion
- Label configuration review recommended before tomorrow's 100-call sample test (Feb 27)
📈 Key Metrics
- ASR Accuracy: 96.5% (109/113 across all 3 bots)
- Flow Correctness: 87.7% (MEX lowest at 72%)
- TTS Naturalness: 4.0/5 (avg across 113 calls)
- Escalation Accuracy: 92% (11/12; PAX strongest)
Metrics computed from 113 test calls (50 internal + 63 external) across DAX, MEX, PAX. TTS score is tester-rated on 1–5 scale.
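As a quick cross-check of the headline figures, the sketch below recomputes the two rates whose raw counts are stated in this report (109/113 for ASR and 11/12 for escalation). The helper name `rate` is illustrative only and is not part of any AI Rudder or Grab tooling.

```python
def rate(passed: int, total: int) -> float:
    """Return a pass rate as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total, 1)


# Counts taken from the Key Metrics section of this report.
asr_accuracy = rate(109, 113)        # reported as 96.5%
escalation_accuracy = rate(11, 12)   # 91.7, reported as 92% after rounding to a whole number

print(f"ASR accuracy: {asr_accuracy}%")
print(f"Escalation accuracy: {escalation_accuracy}% (~{round(escalation_accuracy)}%)")
```

Flow correctness and the TTS score are not recomputed here because the report does not state their underlying counts.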
⚙ Functional Status
| Bot / Component | Status | Notes |
|---|---|---|
| DAX (Driver) | Stable | ASR 97%, Flow 95%, TTS 4.1/5 |
| MEX (Merchant) | In Progress | ASR 93%, Flow 72%, TTS 4.0/5; flow refinements identified, solutions ready |
| PAX (Passenger) | Stable | ASR 100%, Flow 94%, TTS 4.1/5, Escalation 92% |
| TTS Engine | In Progress | Date pronunciation issue identified; voice profile adjustment being tested |
| Escalation Trigger | Stable | Working correctly; hostility and agent-request detection confirmed |
| After-Call Labels | To Be Discussed | Label B/C refinement recommended; see the After-Call Label Configuration Review section |
🔬 Quality Analysis
Issue Breakdown by Type (42 occurrences across 113 calls)
| Finding | Type | Status | Solution |
|---|---|---|---|
| Bot repeats the word "วันที่" ("date") when reading payment dates, which sounds unnatural to customers | TTS | In Progress | Voice profile adjustment identified; alternatives are being tested. Also covers name pronunciation and pacing refinements. |
| Inconsistent speech pacing: some phrases sound rushed or uneven, affecting conversation naturalness | TTS | In Progress | Continuous R&D improvement: pacing quality will improve incrementally through ongoing voice profile tuning |
| [PAX] When a customer asks about their payment deadline, the bot currently doesn't provide a clear response; a defined handling approach is needed | Flow | To Be Discussed | Guardrail is working as intended (payment date should not be disclosed). How the bot should respond when asked still needs to be defined, in coordination with Regional |
| Payment confirmation requires the customer to explicitly repeat both the exact amount and a specific date; a simple "yes, I'll pay" or "I'll pay next Monday" isn't accepted. This creates an unnatural conversation and also limits the bot's ability to clarify vague dates (e.g., "next week" → ask which day) | Flow | To Be Discussed | Current strict confirmation was set by Regional to ensure full payment amount and a date within the acceptable window. Recommend reviewing whether implicit confirmation (e.g., "yes, I'll pay on the date you mentioned") can be accepted while still meeting these requirements, to be coordinated with Regional |
Based on 113 test calls (50 internal + 63 external). A single call can surface multiple issue types. Flow-level items require coordination with the Regional team for implementation.
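To make the strict-versus-implicit confirmation question concrete for the Regional discussion, here is a minimal sketch of how the two modes might differ. It is not the bot's actual implementation: the `Offer` and `Reply` types, the `confirm` function, the `allow_implicit` flag, and the sample amounts and dates are all hypothetical. Only the acceptance rule (amount ≥ totalPastDue, date ≤ maxPaymentDate) comes from the Label A definition in this report.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Offer:
    """What the bot proposed to the customer (hypothetical structure)."""
    amount: float
    pay_by: date


@dataclass
class Reply:
    """What was understood from the customer's answer (hypothetical structure)."""
    affirmed: bool = False            # e.g. "yes, I'll pay"
    amount: Optional[float] = None    # amount stated explicitly, if any
    pay_on: Optional[date] = None     # specific date stated explicitly, if any
    vague_date: bool = False          # e.g. "next week"


def confirm(offer: Offer, reply: Reply, total_past_due: float,
            max_payment_date: date, allow_implicit: bool = False) -> str:
    """Return 'confirmed', 'clarify_date', or 'not_confirmed'."""
    amount, pay_on = reply.amount, reply.pay_on
    if allow_implicit:
        # A vague date triggers a follow-up question instead of a rejection.
        if reply.vague_date and pay_on is None:
            return "clarify_date"
        # An affirmation inherits whatever the customer did not restate.
        if reply.affirmed:
            amount = offer.amount if amount is None else amount
            pay_on = offer.pay_by if pay_on is None else pay_on
    if amount is None or pay_on is None:
        return "not_confirmed"
    # Same requirements in both modes: full amount, date inside the window.
    if amount >= total_past_due and pay_on <= max_payment_date:
        return "confirmed"
    return "not_confirmed"


# Illustrative values only.
offer = Offer(amount=1500.0, pay_by=date(2025, 3, 5))
yes_only = Reply(affirmed=True)
print(confirm(offer, yes_only, 1500.0, date(2025, 3, 10)))                       # strict mode
print(confirm(offer, yes_only, 1500.0, date(2025, 3, 10), allow_implicit=True))  # implicit mode
```

In this sketch the implicit mode relaxes only how the amount and date are collected; the amount and date checks themselves are unchanged, which is the property Regional would need to verify.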
💡 Insights & Analysis
What's Working
- Escalation handling is production-ready — 92% accuracy across all bots. Hostility, agent requests, and technical issues all route correctly.
- ASR is strong at 96.5% — speech recognition performs well across all three customer segments.
- Core conversation flow is solid — payment reminders, identity verification, and closing sequences work as designed.
Key Observations
- TTS accounts for the majority of findings — date pronunciation, speech pacing, and name handling. Voice profile solution is being tested, with pacing improving through ongoing R&D.
- Payment confirmation flow is too strict — customers must explicitly repeat both amount and date, which feels unnatural. This also limits the bot's ability to handle vague dates. Set by Regional — needs discussion.
Decision Needed
- Label B/C configuration — if current definitions are unchanged for tomorrow's 100-call test, the RA analysis may show inflated B/C counts. Decision on whether to adjust labels should ideally be made today.
- TTS voice profile selection — alternative profiles will be provided tomorrow for review. Selecting a profile enables deployment to address the date pronunciation issue.
🎯 Recommendations
Recommendation 1
Review TTS voice profile options — selection to finalize tomorrow
Alternative voice profiles are being prepared. AI Rudder will provide a shortlist tomorrow for Grab Local to review and select. The selected profile will address the date pronunciation issue along with name handling and pacing refinements.
Recommendation 2
Refine label B/C configuration before the 100-call test
Cleaner label definitions produce cleaner RA analysis. If the change can't be confirmed before tomorrow, we'll run with current labels and provide a supplementary view showing results under the proposed schema.
Recommendation 3
Consolidate flow-level items for a single Regional session
Two findings require flow-level decisions (PAX payment date handling, strict payment confirmation). Presenting them as a single, documented request makes it easy for Regional to review and decide in one pass.
🏷 After-Call Label Configuration Review
The AICalling system classifies each completed call using intent labels (A–G). After reviewing the current configuration, we identified an opportunity to improve classification accuracy ahead of the 100-call sample test.
Recommendation: Refine Labels B and C
- Label B currently combines "refuse to pay" with "uncooperative" (e.g., hung up after identity verification) — these are different intents
- Label C acts as a catch-all for anything unmapped, not just "silence"
- This overlap may cause inaccurate classification in the 100-call RA analysis
| Label | Current Definition | Recommended Change |
|---|---|---|
| A | Promise to Pay (amount ≥ totalPastDue, date ≤ maxPaymentDate) | No change |
| B | Refuse to Pay / Uncooperative: broad definition including refusals, wrong party, callbacks, and mid-conversation hang-ups | Narrow to "Refuse to Pay" only (clear, explicit refusal) |
| C | Keep Silent, plus catch-all for unmapped conversations | Narrow to "Pure Silent" only (picked up but no interaction) |
| D | Call Transferred | No change |
| F | Failed to Connect | No change |
| G | Voice Mail | No change |
| H (new) | n/a | Add "Unknown Intention": user interacted but no clear payment commitment or refusal (hung up mid-conversation, ambiguous responses) |
This recommendation is subject to coordination with the Regional team. If unchanged, AI Rudder will proceed with current labels and note classification considerations in the RA analysis.
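The proposed schema can be read as a small decision procedure. The sketch below is one possible reading, not the AICalling system's actual classifier: the `CallOutcome` fields, the `classify` function, and the order in which labels are checked are assumptions made for illustration. The label meanings and the Label A condition (amount ≥ totalPastDue, date ≤ maxPaymentDate) are taken from the table above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CallOutcome:
    """Hypothetical summary of one completed call."""
    connected: bool = True
    voicemail: bool = False
    transferred: bool = False
    customer_spoke: bool = False
    explicit_refusal: bool = False
    promised_amount: Optional[float] = None
    promised_date: Optional[date] = None


def classify(call: CallOutcome, total_past_due: float, max_payment_date: date) -> str:
    """Assign a label under the proposed schema (check order is an assumption)."""
    if not call.connected:
        return "F"  # Failed to Connect
    if call.voicemail:
        return "G"  # Voice Mail
    if call.transferred:
        return "D"  # Call Transferred
    if not call.customer_spoke:
        return "C"  # Pure Silent: picked up but no interaction
    if (call.promised_amount is not None
            and call.promised_date is not None
            and call.promised_amount >= total_past_due
            and call.promised_date <= max_payment_date):
        return "A"  # Promise to Pay
    if call.explicit_refusal:
        return "B"  # Refuse to Pay: clear, explicit refusal only
    return "H"      # Unknown Intention: interacted, but no commitment or refusal


# Illustrative case: customer spoke, then hung up mid-conversation.
hang_up = CallOutcome(customer_spoke=True)
print(classify(hang_up, 1500.0, date(2025, 3, 10)))
```

Under the current definitions a call like `hang_up` would fall into B or C; under the proposed schema it lands in H, which is the separation the recommendation aims for.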
✅ Next Steps
Today (Feb 26)
- Grab Local: Review label B/C refinement recommendation
- Grab Local: Coordinate with Regional on flow-level decisions that need discussion (2 items identified)
Tomorrow (Feb 27)
- AI Rudder: Provide TTS voice profile options for Grab Local to select
- Grab Regional: Execute 100-call sample test with real customers
Following Week
- AI Rudder: Conduct RA analysis on 100-call results (label accuracy review)
- Grab Local: Confirm TTS voice profile selection
- AI Rudder: Amend bot persona based on selected voice profile
Upcoming — Regional Session
- Grab Regional: Review consolidated flow-change request (2 items with proposed solutions)
- Grab Regional: Confirm label B/C configuration (if not decided before 100-call test)