Grab Thailand — DAX POC RA Analysis Report

📈 Key Metrics

Metric	Value	Detail
Overall Accuracy	85.7%	42/49 picked-up calls
Calls Analyzed	49	93 blasted, 49 picked up
Problem Cases	7	14.3% error rate
Fixes Ready	4	Ready to deploy
Decisions Needed	4	Requires Grab Local
Regional Items	2	New evidence added

🏷 Intention Distribution & Label Accuracy

Comparing the system's automatic label assignment against our reviewed classification reveals significant overlap issues in Labels B and C — reinforcing the label reconfiguration recommendation from the previous report.

System Labels (Automatic)

Label A 9 calls (18%)

Label B 15 calls (31%)

Label C 25 calls (51%)

Reviewed Labels (RA Analysis)

Label A 14 calls (29%)

Label B 2 calls (4%)

Label E* 25 calls (51%)

Label F* 8 calls (16%)

Key Findings from Label Comparison

5 calls were actually Label A (Promise to Pay) but system labeled them as B — missed PTP commitments. Root cause: strict confirmation flow rejected valid payment dates within the acceptable window.
Label B is over-assigned — only 2 of 15 system-labeled B calls were genuine refusals. The remaining 13 were: 5 actual A (PTP), 7 ambiguous/unknown intent, 1 no-voice.
Label C captures too many different scenarios — all 25 system-labeled C calls actually split into: unknown/ambiguous intent (18) and pure no-voice/failed (7).
*Labels E and F are used in this analysis to distinguish: E = unknown/ambiguous intent (interacted but no clear commitment), F = no voice or failed connection. These map to our previously recommended Label H configuration.
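
The counts in these findings can be cross-checked with a small tabulation. The sketch below is illustrative only. The cell counts come from the findings above, with one assumption: all 9 system-labeled A calls are taken as confirmed A in review (the report implies this, since 9 + 5 = 14, but does not state it). The names `CROSS_TAB` and `totals` are ours, not part of any system.

```python
from collections import Counter

# (system_label, reviewed_label) -> number of calls.
# A = Promise to Pay, B = refuse to pay, E = unknown/ambiguous, F = no voice/failed.
CROSS_TAB = {
    ("A", "A"): 9,   # assumption: every system-labeled A call was confirmed as A
    ("B", "A"): 5,   # missed PTP commitments
    ("B", "B"): 2,   # genuine refusals
    ("B", "E"): 7,
    ("B", "F"): 1,
    ("C", "E"): 18,
    ("C", "F"): 7,
}


def totals(cross_tab, axis):
    """Sum call counts by system label (axis=0) or reviewed label (axis=1)."""
    counts = Counter()
    for labels, n in cross_tab.items():
        counts[labels[axis]] += n
    return dict(counts)


system_totals = totals(CROSS_TAB, 0)
reviewed_totals = totals(CROSS_TAB, 1)
total_calls = sum(CROSS_TAB.values())

for label, n in sorted(reviewed_totals.items()):
    print(f"Label {label}: {n} calls ({n / total_calls:.0%})")
```

Under that assumption the row and column sums reproduce both distributions listed above: 9/15/25 for the system labels and 14/2/25/8 for the reviewed labels, over 49 calls.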

🔬 Observations Across All 49 Calls

Beyond the 7 problem cases, we documented quality observations across all 49 calls. These are tracked for continuous improvement but do not block go-live (they were already accepted at the Feb 26 meeting).

Category	Issue	Frequency	Status
TTS	Unnatural pauses in DPD numbers and mid-sentence breaks	~18 calls (37%)	Accepted
TTS	Date repetition: the "วันพุธที่วันที่ 27" pattern (roughly "Wednesday the date the 27th"; the expected form is "วันพุธที่ 27")	6 calls (12%)	Accepted
TTS	English name pronunciation errors (names in English alphabet)	5 calls (10%)	Accepted

✅ Fixes Ready to Deploy

The following prompt improvements have been completed and tested in the AIR OptLanguage bot and are ready for transfer to Grab's staging version.

D3-001

Silent Handling — Identity Verification Loop Fix

Previously, when background noise was detected as voice, the bot looped indefinitely asking whether the customer was on the line, which broke the flow and leaked loan information without identity verification. The fix adds a clear protocol to Section 1.2 (Verification Response Handling) with a retry limit and a hang-up rule.
Evidence: calls f4e2f47d (34 rounds, 407s, all [novoice]) and 1bbd4475 (38 rounds, 503s)

D3-002

FAQ Restructure — Payment Instructions Moved to FAQ

All 3 bots (DAX, MEX, PAX) now have a dedicated Part 3: FAQ section. Payment instructions are only provided if the customer explicitly asks "how to pay" — no longer proactively stated. Addresses Grab Local's request from the Feb 26 on-site session.

D3-003

TTS Numbered Step Pronunciation Fix

Fixed Thai number suffix pronunciation logic within the new FAQ section. Addresses the issue reported by Grab Local during on-site testing where numbered steps were read incorrectly.

D3-004

First Name Usage After Identity Verification

Prompt guidance added so the bot addresses the customer by first name only (not full name) once identity verification is complete. Current success rate: 3 out of 40 calls. The LLM must judge which part of the full-name field is the first name, so results may vary; we will continue to refine the prompt guidance.
Requested by Grab Local: "after verifying customer identity, can we call him/her only first name?"

🚨 Critical Finding: Loan Information Leak (2 of 49 calls, 4.1%)

Compliance Risk — Fixed in Prompt, Awaiting Transfer

In 2 out of 49 calls, the bot disclosed outstanding loan details (amount, number of items, days past due) without successfully verifying the customer's identity. The customer never spoke — background noise was misinterpreted as voice input, causing the bot to proceed through the conversation flow.

Call ID	Duration	Rounds	What Happened
`f4e2f47d`	407 seconds	34 rounds	All [novoice]. Bot disclosed: 2 items, ฿11,507.62, 72 days overdue. Then asked for payment date 20+ times with no response.
`1bbd4475`	503 seconds	38 rounds	All [novoice] then background echo from 5:29 mark. Bot disclosed: 3 items, ฿21,123.23, 86 days overdue.

Fix Applied (D3-001)

Added an identity verification retry limit with a hang-up rule. If the bot goes through silent handling (3 attempts) and still cannot verify the customer, it ends the call. Loan information is never disclosed without confirmed identity.
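
The actual fix lives in the bot prompt, not in code, but the rule can be written out as plain logic to make the intended behaviour easy to review. The sketch below is a simplified illustration under several assumptions: the `[novoice]` marker is treated as it appears in the transcripts, `confirms_identity` is a hypothetical flag from an upstream intent check, every unverified turn uses up one attempt, and "3 attempts" is read as ending the call on the third failed attempt.

```python
from dataclasses import dataclass

MAX_SILENT_ATTEMPTS = 3  # "silent handling (3 attempts)" in the fix description
NO_VOICE = "[novoice]"   # marker as it appears in the call transcripts


@dataclass
class VerificationState:
    failed_attempts: int = 0
    verified: bool = False
    ended: bool = False


def next_action(state: VerificationState, transcript: str, confirms_identity: bool) -> str:
    """Decide the bot's next step for one turn of identity verification."""
    if state.ended:
        return "end_call"
    if state.verified:
        return "continue_flow"  # only here may loan details be mentioned
    heard_speech = transcript.strip() not in ("", NO_VOICE)
    if heard_speech and confirms_identity:
        state.verified = True
        return "continue_flow"
    # Assumption: silence, noise, and unconfirmed replies all use up one attempt.
    state.failed_attempts += 1
    if state.failed_attempts >= MAX_SILENT_ATTEMPTS:
        state.ended = True
        return "end_call"
    return "ask_identity_again"


state = VerificationState()
actions = [next_action(state, NO_VOICE, False) for _ in range(4)]
print(actions)
```

For a fully silent call such as f4e2f47d, this logic would ask twice more and then hang up on the third failed attempt, so the flow never reaches the point where loan details are disclosed.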

💬 Requires Grab Local Decision

These 4 items need requirement clarification from Grab Local before we can proceed with the next prompt iteration.

Q1: Payment Date Disclosure Rule (3 of 49 calls, 6.1%)

The FAQ rule says "never disclose maximumPaymentDate" and always respond with "within today." However, the closing script for refuse-to-pay scenarios includes maximumPaymentDate in the message — e.g., "please arrange payment within {maximumPaymentDate}."

This also affects how the bot handles customer questions about payment deadlines. In 3 calls, customers asked "what's the latest date I can pay?" and the bot had no proper response — it deflected with the standard PTP date request instead of answering the question.

Question: Should the bot never mention maximumPaymentDate in any scenario (including closing statements and when customer asks about the deadline)?
Call references: 1ea697a4 (MEX — closing script contradiction), e6752717, e077fb1c, 94b25a0d (DAX — customer asked about payment deadline, no response)

Q2: Escalation Handling Sub-Categories (1 of 49 calls, 2.0%)

The LLM misinterpreted "just came back from the provinces" as severe hardship, triggering premature escalation. For DAX (driver segment), many customers have a broken car, are temporarily not driving, or face a short-term income disruption; these situations are not the same as severe hardship.

Question: Which scenarios should be classified under each escalation category? We need concrete examples to enrich the prompt, especially around "no job / no money" vs "temporary disruption" for drivers.
Call reference: 3e112d2c. The user said "เพิ่งกลับมาจากต่างจังหวัด" ("just came back from the provinces") and then "เดี๋ยวจะลองเริ่มวิ่ง" (roughly "I'll try to start driving again soon").

Q3: Loan Type Response Script

The bot currently has 2 different response scripts for PayLater and Cash Loan, but no loan-level data is passed (confirmed by Regional — debtor-level info only). The bot cannot differentiate which loan type the customer has.

Question: Should we merge into one generic response script that works for both loan types? Or keep both and let the bot handle ambiguity based on customer response?

Q4: Non-Target Person Handling Script (2 of 49 calls, 4.1%)

When someone other than the intended debtor answers the call, the bot has no dedicated handling flow. It continues the standard conversation and eventually labels the call as B (refuse to pay). In 2 calls, a family member or colleague picked up — the bot followed the refuse-to-pay closing script, which is not appropriate.

Question: What script should the bot follow when it identifies that the person who answered is not the intended debtor? Suggested approach: acknowledge the wrong party, thank them, and end the call politely, e.g., "We will contact them again at another time. Thank you."

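Current Script When Non-Target Person Answers (for reference)

"ขออภัยที่รบกวนนะครับ เราจะเร่งตรวจสอบข้อมูลที่เกี่ยวข้องอีกครั้ง สวัสดีครับ" (English: "Sorry to have disturbed you. We will promptly re-check the relevant information. Goodbye.")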

Current Label Rule

"If the user is the wrong party of contact, requests for a call back, or refuse to pay → Choose B (refuse to pay) label"

Call reference: 2944e9fa — non-target person answered, bot followed standard script. Suggestion: create dedicated wrong-party script and consider separate label from refuse-to-pay.
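
To make the labeling consequence concrete, the current rule and the suggested split can be written as a small mapping. This is a sketch for discussion, not the bot's configuration: the outcome names, the `choose_label` helper, and the placeholder label `"WP"` are all hypothetical, and no separate wrong-party label has been agreed.

```python
# Outcomes named in the current label rule; all three map to B today.
CURRENT_RULE = {
    "wrong_party": "B",
    "callback_request": "B",
    "refuse_to_pay": "B",
}


def choose_label(outcome: str, wrong_party_label: str = "B") -> str:
    """Map a call outcome to a label, optionally splitting out the wrong-party case."""
    if outcome == "wrong_party":
        return wrong_party_label
    return CURRENT_RULE[outcome]


print(choose_label("wrong_party"))                          # current behaviour
print(choose_label("wrong_party", wrong_party_label="WP"))  # hypothetical separate label
```

With the default, a wrong-party call such as 2944e9fa is indistinguishable from a refusal in the label data; passing a separate label keeps the two apart without changing how callback requests or refusals are labeled.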

🔔 Regional Items — Previously Reported, New Evidence

These recommendations were raised in the V2 report (Feb 26). The RA analysis provides concrete data reinforcing both items. Final decision requires Regional approval.

Label B/C Reconfiguration (25 of 49 calls, 51% ambiguous)

Recommendation: Add Label H for "Unknown Intention"

The RA analysis found 3 new cases of B→C mislabeling, where the call was labeled B (refuse to pay) although the customer never refused. One call was pure silence; in the other two the customer interacted but neither committed nor refused:

f4e2f47d: pure silence, labeled B (should be C/F)
25bfbb64: user said "blocked, can't use app" but didn't explicitly refuse to pay, labeled B
0f2f3bd9: user said "don't drive much" then hung up without refusing to pay, labeled B

Across all 49 calls, 25 calls (51%) are "unknown/ambiguous intent" under our reviewed classification. These are currently split across B (7 calls) and C (18 calls), making both labels unreliable for collection analytics.

Strict Payment Confirmation Flow (5 of 49 calls, 10.2% missed PTP)

Recommendation: Accept implicit confirmation when amount and date are within acceptable range

1 new case: Call bf5f2bbd — user said "can't pay today, but can in 2 days" (March 2). March 2 is within the maximumPaymentDate of March 4. The system labeled this as B (Refuse) instead of A (Promise to Pay) because the user didn't initially state the full amount and exact date in the required format.

The user eventually provided the date after multiple back-and-forth prompts — but the strict flow created unnecessary friction and risked the customer hanging up. 5 total calls in this analysis were mislabeled B when they should have been A (Promise to Pay), suggesting the strict flow is systematically missing PTP commitments.
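
The date half of the recommended rule can be sketched as a simple window check. This is an illustration under stated assumptions, not the production confirmation flow: the function name `implicit_ptp_label` is ours, the amount check is left out because the report does not define the acceptable amount range, and the year and the Feb 28 call date in the example are assumed for illustration (the report gives only "in 2 days" and March 2).

```python
from datetime import date
from typing import Optional


def implicit_ptp_label(
    promised: Optional[date], call_day: date, maximum_payment_date: date
) -> Optional[str]:
    """Return "A" if the promised date lies in [call_day, maximum_payment_date], else None."""
    if promised is None:
        return None  # no date offered; fall through to the existing confirmation flow
    if call_day <= promised <= maximum_payment_date:
        return "A"
    return None  # outside the window; leave the decision to the existing flow


# Call bf5f2bbd: customer offered March 2, maximumPaymentDate is March 4.
# The year and the Feb 28 call date are assumptions for illustration.
print(implicit_ptp_label(date(2025, 3, 2), date(2025, 2, 28), date(2025, 3, 4)))
```

Under this check the bf5f2bbd reply would be accepted as a Promise to Pay on the first mention of the date, without the extra back-and-forth described above.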

🎤 Female Voice Testing Report

Approach	Result	Conclusion
Female voice (Leda)	Inconsistent — some calls fixed date repetition, others didn't	Not Reliable
Prompt adjustment	Improved stability, most calls passed — but still intermittent	Best Current Fix
Alternative TTS model (Microsoft/Gemini)	Failed: the bot fell back to contingency audio and couldn't initiate dialogue	Incompatible

Recommendation

Continue with the current male voice profile + prompt tuning. The bot personality is already established with the current voice, and extensive testing has been completed. A full voice change would alter the bot's persona and require re-testing all scenarios — not recommended with 6 days to go-live. Future ASR enhancements from R&D will further improve voice quality.

🚀 Next Steps

Immediate — Today (Mar 5)

Grab Local: Confirm 4 requirement decisions (payment date disclosure, escalation sub-categories, loan type response, non-target person script)

Grab Local: Review label B/C evidence and confirm support for the Label H recommendation to Regional

This Week (Mar 5–7)

AI Rudder: Transfer 4 ready fixes (D3-001 to D3-004) to OptLanguage, then notify Grab for staging transfer

AI Rudder: Implement additional prompt changes based on today's requirement decisions

Joint: Test updated bot on staging environment

Go-Live (Mar 11)

Grab Regional: Transfer staging → production and apply config change C3-001

AI Rudder: On-site monitoring for go-live day

Grab Regional: Review label reconfiguration and strict confirmation flow recommendations (post go-live)