📈 Key Metrics
| Metric | Value | Detail |
|---|---|---|
| Overall Accuracy | 85.7% | 42/49 picked-up calls |
| Calls Analyzed | 49 | 93 blasted, 49 picked up |
| Problem Cases | 7 | 14.3% error rate |
| Fixes Ready | 4 | Ready to deploy |
| Decisions Needed | 4 | Requires Grab Local |
| Regional Items | 2 | New evidence added |
🏷 Intention Distribution & Label Accuracy
Comparing the system's automatic label assignment against our reviewed classification reveals significant overlap issues in Labels B and C — reinforcing the label reconfiguration recommendation from the previous report.
System Labels (Automatic)
Label A 9 calls (18%)
Label B 15 calls (31%)
Label C 25 calls (51%)
Reviewed Labels (RA Analysis)
Label A 14 calls (29%)
Label B 2 calls (4%)
Label E* 25 calls (51%)
Label F* 8 calls (16%)
Key Findings from Label Comparison
- 5 calls were actually Label A (Promise to Pay) but system labeled them as B — missed PTP commitments. Root cause: strict confirmation flow rejected valid payment dates within the acceptable window.
- Label B is over-assigned — only 2 of 15 system-labeled B calls were genuine refusals. The remaining 13 were: 5 actual A (PTP), 7 ambiguous/unknown intent, 1 no-voice.
- Label C captures too many different scenarios — all 25 system-labeled C calls actually split into: unknown/ambiguous intent (18) and pure no-voice/failed (7).
- *Labels E and F are used in this analysis to distinguish: E = unknown/ambiguous intent (interacted but no clear commitment), F = no voice or failed connection. These map to our previously recommended Label H configuration.
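The counts in the findings above can be cross-checked with a small tabulation. This is only a sketch: the `CROSSTAB` structure and the helper names are illustrative and not part of any production system. The (A, A) cell is inferred rather than stated: reviewed Label A has 14 calls, 5 of which came from system Label B and none from Label C, leaving 9.

```python
from collections import Counter

# (system_label, reviewed_label) -> number of calls, taken from the findings above.
# The ("A", "A") cell is inferred: 14 reviewed A calls minus the 5 that came from B.
CROSSTAB = {
    ("A", "A"): 9,
    ("B", "A"): 5,   # missed Promise to Pay commitments
    ("B", "B"): 2,   # genuine refusals
    ("B", "E"): 7,   # ambiguous/unknown intent
    ("B", "F"): 1,   # no voice
    ("C", "E"): 18,  # ambiguous/unknown intent
    ("C", "F"): 7,   # no voice / failed connection
}


def totals(crosstab, axis):
    """Sum call counts by system label (axis=0) or reviewed label (axis=1)."""
    counts = Counter()
    for key, n in crosstab.items():
        counts[key[axis]] += n
    return dict(counts)


def share(n, total):
    """Percentage of total, rounded to a whole number as in the report."""
    return round(100 * n / total)


system = totals(CROSSTAB, 0)
reviewed = totals(CROSSTAB, 1)
total_calls = sum(CROSSTAB.values())

for name, counts in (("System", system), ("Reviewed", reviewed)):
    for label, n in sorted(counts.items()):
        print(f"{name} Label {label}: {n} calls ({share(n, total_calls)}%)")
```

Both marginal totals come to 49 calls, and the rounded percentages match the two distributions listed above.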
🔬 Observations Across All 49 Calls
Beyond the 7 problem cases, we documented quality observations across all 49 calls. These are tracked for continuous improvement but are not blocking go-live (already accepted at Feb 26 meeting).
| Category | Issue | Frequency | Status |
|---|---|---|---|
| TTS | Unnatural pauses in DPD numbers and mid-sentence breaks | ~18 calls (37%) | Accepted |
| TTS | Date repetition: "วันพุธที่วันที่ 27" pattern (roughly "Wednesday the date the 27th", with the date marker repeated) | 6 calls (12%) | Accepted |
| TTS | English name pronunciation errors (names written in the English alphabet) | 5 calls (10%) | Accepted |
✅ Fixes Ready to Deploy
The following prompt improvements have been completed and tested in the AIR OptLanguage bot. They are ready for transfer to Grab's staging version.
D3-001
Silent Handling — Identity Verification Loop Fix
Added a clear protocol to Section 1.2 (Verification Response Handling). Previously, when background noise was detected as voice, the bot looped indefinitely asking whether the customer was on the line, which broke the flow and leaked loan information without identity verification. The fix adds a retry limit and a hang-up rule.
Evidence: calls f4e2f47d (34 rounds, 407s, all [novoice]) and 1bbd4475 (38 rounds, 503s)
D3-002
FAQ Restructure — Payment Instructions Moved to FAQ
All 3 bots (DAX, MEX, PAX) now have a dedicated Part 3: FAQ section. Payment instructions are only provided if the customer explicitly asks "how to pay" — no longer proactively stated. Addresses Grab Local's request from the Feb 26 on-site session.
D3-003
TTS Numbered Step Pronunciation Fix
Fixed Thai number suffix pronunciation logic within the new FAQ section. Addresses the issue reported by Grab Local during on-site testing where numbered steps were read incorrectly.
D3-004
First Name Usage After Identity Verification
Prompt guidance added so the bot calls the customer by first name only (not full name) after identity verification is complete. Current success rate: 3 out of 40 calls — the LLM must judge which part of the full-name field is the first name. Will continue to refine prompt guidance, but results may vary.
Requested by Grab Local: "after verifying customer identity, can we call him/her only first name?"
🚨 Critical Finding: Loan Information Leak (2 of 49 calls, 4.1%)
Compliance Risk — Fixed in Prompt, Awaiting Transfer
In 2 out of 49 calls, the bot disclosed outstanding loan details (amount, number of items, days past due) without successfully verifying the customer's identity. The customer never spoke — background noise was misinterpreted as voice input, causing the bot to proceed through the conversation flow.
| Call ID | Duration | Rounds | What Happened |
|---|---|---|---|
| f4e2f47d | 407 seconds | 34 rounds | All [novoice]. Bot disclosed: 2 items, ฿11,507.62, 72 days overdue. Then asked for payment date 20+ times with no response. |
| 1bbd4475 | 503 seconds | 38 rounds | All [novoice], then background echo from the 5:29 mark. Bot disclosed: 3 items, ฿21,123.23, 86 days overdue. |
Fix Applied (D3-001)
Added identity verification retry limit with hang-up rule. If the bot goes through silent handling (3 attempts) and still cannot verify the customer → end call. Loan information is never disclosed without confirmed identity.
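The intended control flow of this fix can be sketched as follows. The actual fix is written as prompt guidance, not code, so everything here is illustrative: `CallState`, `next_action`, and `may_disclose_loan_info` are hypothetical names, and the sketch assumes the call ends on the third turn that fails to confirm identity, whether that turn was silence or background noise.

```python
from dataclasses import dataclass

MAX_VERIFICATION_ATTEMPTS = 3  # the "3 attempts" mentioned in the fix description


@dataclass
class CallState:
    failed_attempts: int = 0
    identity_verified: bool = False
    ended: bool = False


def next_action(state: CallState, identity_confirmed: bool) -> str:
    """Decide what the bot does after one verification turn."""
    if state.ended:
        return "END_CALL"
    if identity_confirmed:
        state.identity_verified = True
        return "PROCEED"
    # Silence and noise are treated the same: neither confirms identity.
    state.failed_attempts += 1
    if state.failed_attempts >= MAX_VERIFICATION_ATTEMPTS:
        state.ended = True
        return "END_CALL"
    return "REPROMPT"


def may_disclose_loan_info(state: CallState) -> bool:
    """Loan details are gated on confirmed identity, never on turn count."""
    return state.identity_verified and not state.ended


# Replay of an all-[novoice] call: identity is never confirmed.
state = CallState()
actions = [next_action(state, identity_confirmed=False) for _ in range(3)]
print(actions, may_disclose_loan_info(state))
```

Under these assumptions, a call like f4e2f47d would end after three unanswered verification prompts instead of running for 34 rounds, and the disclosure gate would stay closed throughout.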
💬 Requires Grab Local Decision
These 4 items need requirement clarification from Grab Local before we can proceed with the next prompt iteration.
Q1: Payment Date Disclosure Rule (3 of 49 calls, 6.1%)
The FAQ rule says "never disclose maximumPaymentDate" and tells the bot to always respond with "within today." However, the closing script for refuse-to-pay scenarios includes maximumPaymentDate in the message, e.g., "please arrange payment within {maximumPaymentDate}."
This also affects how the bot handles customer questions about payment deadlines. In 3 calls, customers asked "what's the latest date I can pay?" and the bot had no proper response: it deflected with the standard PTP date request instead of answering the question.
Question: Should the bot never mention maximumPaymentDate in any scenario (including closing statements and when the customer asks about the deadline)?
Call references: 1ea697a4 (MEX — closing script contradiction), e6752717, e077fb1c, 94b25a0d (DAX — customer asked about payment deadline, no response)
Q2: Escalation Handling Sub-Categories (1 of 49 calls, 2.0%)
The LLM misinterpreted "just came back from province" as severe hardship, triggering premature escalation. For DAX (driver segment), many customers are in situations where their car is broken, they're temporarily not driving, or they have short-term income disruption. These are not the same as severe hardship.
Question: What scenarios should classify as each escalation category? We need concrete examples to enrich the prompt, especially around "no job / no money" vs "temporary disruption" for drivers.
Call reference: 3e112d2c. The user said "เพิ่งกลับมาจากต่างจังหวัด" ("just came back from another province") then "เดี๋ยวจะลองเริ่มวิ่ง" ("I'll try to start driving soon").
Q3: Loan Type Response Script
The bot currently has 2 different response scripts for PayLater and Cash Loan, but no loan-level data is passed (confirmed by Regional — debtor-level info only). The bot cannot differentiate which loan type the customer has.
Question: Should we merge into one generic response script that works for both loan types? Or keep both and let the bot handle ambiguity based on customer response?
Q4: Non-Target Person Handling Script (2 of 49 calls, 4.1%)
When someone other than the intended debtor answers the call, the bot has no dedicated handling flow. It continues the standard conversation and eventually labels the call as B (refuse to pay). In 2 calls, a family member or colleague picked up, and the bot followed the refuse-to-pay closing script, which is not appropriate.
Question: What script should the bot follow when it identifies that the person who answered is not the target recipient? Suggested approach: acknowledge the wrong party, thank them, and end the call politely, e.g., "We will contact them again at another time. Thank you."
Current Script When Non-Target Person Answers (for reference)
"ขออภัยที่รบกวนนะครับ เราจะเร่งตรวจสอบข้อมูลที่เกี่ยวข้องอีกครั้ง สวัสดีครับ" (English: "Sorry to have disturbed you. We will promptly re-check the relevant information. Goodbye.")
Current Label Rule
"If the user is the wrong party of contact, requests for a call back, or refuse to pay → Choose B (refuse to pay) label"
Call reference: 2944e9fa — non-target person answered, bot followed standard script. Suggestion: create dedicated wrong-party script and consider separate label from refuse-to-pay.
🔔 Regional Items — Previously Reported, New Evidence
These recommendations were raised in the V2 report (Feb 26). The RA analysis provides concrete data reinforcing both items. Final decision requires Regional approval.
Label B/C Reconfiguration (25 of 49 calls, 51% ambiguous)
Recommendation: Add Label H for "Unknown Intention"
The RA analysis found 3 new cases of B→C mislabeling (labeled B, should be C) in which the customer neither committed nor refused to pay:
- f4e2f47d: Pure silent, labeled B (should be C/F)
- 25bfbb64: User said "blocked, can't use app" but didn't explicitly refuse to pay, labeled B
- 0f2f3bd9: User said "don't drive much" then hung up, didn't refuse to pay, labeled B
Across all 49 calls, 25 calls (51%) are "unknown/ambiguous intent" under our reviewed classification. These are currently split across B and C, making both labels unreliable for collection analytics.
Strict Payment Confirmation Flow (5 of 49 calls, 10.2% missed PTP)
Recommendation: Accept implicit confirmation when amount and date are within acceptable range
1 new case: In call bf5f2bbd, the user said "can't pay today, but can in 2 days" (March 2). March 2 falls before the maximumPaymentDate of March 4. The system labeled this as B (Refuse) instead of A (Promise to Pay) because the user didn't initially state the full amount and exact date in the required format.
The user eventually provided the date after multiple back-and-forth prompts — but the strict flow created unnecessary friction and risked the customer hanging up.
5 total calls in this analysis were mislabeled B when they should have been A (Promise to Pay), suggesting the strict flow is systematically missing PTP commitments.
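The date half of the implicit-confirmation recommendation can be illustrated with a short check. This is a sketch under stated assumptions, not the current labeling logic: `label_for_stated_date` and the `NEEDS_FOLLOW_UP` outcome are hypothetical, the year is arbitrary because the report does not give one, and the amount check is left out because the report does not specify an acceptable amount range.

```python
from datetime import date, timedelta
from typing import Optional

ILLUSTRATIVE_YEAR = 2025  # assumption: the report does not state the year


def label_for_stated_date(
    stated: Optional[date], today: date, maximum_payment_date: date
) -> str:
    """Return "A" (Promise to Pay) when the customer names a date inside the
    acceptable window; otherwise hand the turn back for follow-up."""
    if stated is not None and today <= stated <= maximum_payment_date:
        return "A"
    return "NEEDS_FOLLOW_UP"


# Call bf5f2bbd: "can't pay today, but can in 2 days" (March 2), deadline March 4.
maximum_payment_date = date(ILLUSTRATIVE_YEAR, 3, 4)
stated = date(ILLUSTRATIVE_YEAR, 3, 2)
call_day = stated - timedelta(days=2)

label = label_for_stated_date(stated, call_day, maximum_payment_date)
print(label)  # A
```

A date after the deadline, or no date at all, falls through to follow-up rather than being labeled B immediately, which keeps a refusal label reserved for explicit refusals.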
🎤 Female Voice Testing Report
| Approach | Result | Conclusion |
|---|---|---|
| Female voice (Leda) | Inconsistent: some calls fixed date repetition, others didn't | Not Reliable |
| Prompt adjustment | Improved stability, most calls passed, but still intermittent | Best Current Fix |
| Alternative TTS model (Microsoft/Gemini) | Failed: bot fell back to contingency audio, couldn't initiate dialogue | Incompatible |
Recommendation
Continue with the current male voice profile + prompt tuning. The bot personality is already established with the current voice, and extensive testing has been completed. A full voice change would alter the bot's persona and require re-testing all scenarios — not recommended with 6 days to go-live. Future ASR enhancements from R&D will further improve voice quality.
🚀 Next Steps
Immediate (Today, Mar 5)
- Grab Local: Confirm 4 requirement decisions: payment date disclosure, escalation sub-categories, loan type response, non-target person script
- Grab Local: Review label B/C evidence and confirm support for Label H recommendation to Regional
This Week (Mar 5–7)
- AI Rudder: Transfer 4 ready fixes (D3-001 to D3-004) to OptLanguage → notify Grab for staging transfer
- AI Rudder: Implement additional prompt changes based on today's requirement decisions
- Joint: Test updated bot on staging environment
Go-Live (Mar 11)
- Grab Regional: Transfer staging → production + apply config change C3-001
- AI Rudder: On-site monitoring for go-live day
- Grab Regional: Review label reconfiguration + strict confirmation flow recommendations (post go-live)