Extract a contact from a message
People volunteer identity in chat. "Hi, this is Jane from Acme — could you …", "ping me at jane@acme.com", "my number is +47 555-1234". That information is the difference between a row called "anonymous visitor #4912" and a real CRM contact you can re-engage later. Your job is to read one closed conversation, extract whatever the end-user actually said about themselves, and write it through to the CRM — no proposal queue, no review step. The data source is the user's own message; if they typed it, that's authoritative enough.
A separate, scheduled skill://crm/clean-contact-data curator runs weekly to merge any duplicates this pass creates (e.g. visitor gives email in conv #1 and phone in conv #2 with no overlap). Don't try to do hygiene's job here — keep the per-conversation pass narrow.
TL;DR
- Read the conversation with conv_get_conversation(<conversationId>). The user prompt names the conversation; do not list or scan others.
- Check the first end-user message's metadata.senderClassification (email channel only). If isMailingList, isAutoReply, or isBounce is true → skip the conversation entirely, no writes. These aren't real people we should be CRM-tracking.
- Extract identifying fields from end_user-authored messages only: email, phone, name, companyId/companyName, title. Ignore agent and system messages; those are operator output, not user-volunteered identity. For email messages, also read metadata.signatureText, the trailing block already pulled out of the body. Signatures are the strongest identity source we have (the sender themselves typed it as their canonical "this is who I am") and should take precedence over inline mentions in the body.
- Skip if nothing identifying was said. If you found neither email nor phone nor a clear self-introduced name, finish silently, with no writes.
- Look up an existing contact with crm_find_contact({ email?, phone? }) before creating. Match keys: extracted email or phone.
- Create or backfill, never overwrite. If crm_find_contact returns null → crm_create_contact with the extracted fields and the conversation's endUserId. If it returns an existing row → crm_update_contact and ONLY fill fields that are currently null/empty on the existing row. Do not overwrite a non-empty field.
- Stop. One crm_create_contact or one crm_update_contact. No further actions.
Role accounts (email only)
When metadata.senderClassification.isRoleAccount is true on the first end-user message, the address is a shared inbox (support@, sales@, info@, noreply@, …) — not a person. Extract email only; do not write name, title, or phone from the signature or the body, because those fields rotate week to week as different humans answer the inbox. Tag with from-chat and role-account so operators can filter or merge later.
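The role-account rule can be sketched as a small filter over the extracted fields. This is a minimal illustration, not the service's implementation: the function name apply_role_account_rule and the shape of the extracted dict are assumptions; only the tag names and the "email only" rule come from the text above.

```python
from typing import Any


def apply_role_account_rule(extracted: dict[str, Any], is_role_account: bool) -> dict[str, Any]:
    """Return the fields to write plus tags (hypothetical helper)."""
    # Drop fields that were not actually extracted (None or empty string).
    fields = {k: v for k, v in extracted.items() if v not in (None, "")}
    tags = ["from-chat"]
    if is_role_account:
        # Shared inbox: keep the address only, since name, title, and phone
        # rotate as different humans answer the inbox.
        fields = {k: v for k, v in fields.items() if k == "email"}
        tags.append("role-account")
    return {**fields, "tags": tags}
```

For a shared inbox such as support@acme.com, a signature name and phone are dropped and both tags are attached; for a personal address, all extracted fields pass through with only the from-chat tag.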
Step 1 — read the conversation
{ "name": "conv_get_conversation", "arguments": { "id": "ccv_…" } }
The response includes messages[]. The conversation summary also has endUserId — note it; you'll pass it on crm_create_contact so the new contact links back to the conversation participant.
Step 2 — extract from end-user messages
Only consider authorType: "end_user" messages. Look in two places, in priority order:
- metadata.signatureText (email channel only). A signature block is the sender's own canonical self-description, which makes it a strong identity claim. Parse it as a structured block: name on its own line (sometimes preceded by a closing like "Best,"), then any combination of title, company, address, phone, email, website. Lines like "Sent from my iPhone" carry no identity; ignore them.
- Inline body mentions ("Hi, I'm Jane from Acme", "ping me at +47 …"). Only self-referential statements count, not third-party routing instructions.
For each field:
- Email: use the From address first (conv_get_conversation returns it on the conversation; it's the From of every end-user email). Inline mentions are secondary. Skip emails that look like third parties they're forwarding to ("send it to legal@partner.com").
- Phone: signature first, then inline. Normalise to E.164 if you can infer the country from context; otherwise keep as typed.
- Name: signature first (the line right after the closing, e.g. "Best,\nJane Doe"). Otherwise only when the user explicitly self-introduces ("I'm Jane", "this is Jane Doe"). Don't infer from the email local-part: j.doe@acme.com doesn't tell you they're "Jane Doe". Don't pick up names of third parties they're discussing.
- Title / company: signature first (typical layout: name → title → company). Otherwise only when self-stated ("I'm head of ops at Acme"). Don't infer company from email domain alone unless they also self-introduce: "I'm Jane from Acme" + jane@acme.com is fine, just jane@acme.com is not.
If a message contains "send to support@example.com please" and nothing self-referential, skip the email — that's a routing instruction, not an identity claim.
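The "normalise to E.164 if you can, otherwise keep as typed" rule for phone numbers could look roughly like the sketch below. It is deliberately conservative and rests on an assumption of this example: the country counts as known only when the user typed a leading "+". The helper name normalise_phone is hypothetical.

```python
import re


def normalise_phone(raw: str) -> str:
    """Return an E.164-style string when safe, else the number as typed."""
    candidate = raw.strip()
    # Without a leading "+" the country is unknown here, so keep as typed.
    # Parentheses often mark a national trunk prefix such as "(0)", which
    # must not end up in the international form, so keep those as typed too.
    if not candidate.startswith("+") or "(" in candidate:
        return candidate
    digits = re.sub(r"\D", "", candidate)
    # E.164 allows at most 15 digits, and country codes never start with 0.
    if digits and len(digits) <= 15 and not digits.startswith("0"):
        return "+" + digits
    return candidate
```

With this sketch, "+1 555 123 4567" becomes "+15551234567", while "555-1234" is returned unchanged because nothing in the string identifies the country.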
Reading the signature block
metadata.signatureText is the raw block as the sender typed it. A typical shape:
Best regards,
Jane Doe
Head of Operations
Acme Corp
+1 555 123 4567
jane@acme.com
Parse line by line. The first non-closing line is the name. Subsequent lines are some mix of title, company, phone, email, address — order varies. Be conservative on ambiguous lines (e.g. "London" alone — is that an address fragment? a city? skip it).
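One way to picture the line-by-line pass is the heuristic sketch below. The closing list, the regular expressions, and the function name parse_signature are assumptions made for illustration; lines that are neither a closing, an email, a phone, nor the name are collected under "other" rather than guessed at, in keeping with the advice to be conservative.

```python
import re
from typing import Any

# Illustrative closings only; a real list would be longer.
CLOSINGS = {"best", "best regards", "regards", "kind regards", "thanks", "cheers", "sincerely"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"^\+?[\d\s().-]+$")
NOISE_RE = re.compile(r"^sent from my ", re.IGNORECASE)


def looks_like_phone(line: str) -> bool:
    # Phone-like characters only, and at least 7 digits among them.
    return bool(PHONE_RE.match(line)) and sum(c.isdigit() for c in line) >= 7


def parse_signature(signature_text: str) -> dict[str, Any]:
    """Classify each signature line; ambiguous lines go to 'other'."""
    result: dict[str, Any] = {"other": []}
    for raw in signature_text.splitlines():
        line = raw.strip()
        if not line or NOISE_RE.match(line):
            continue  # blank line or "Sent from my iPhone" style noise
        if line.rstrip(",").lower() in CLOSINGS:
            continue  # "Best regards," carries no identity
        email = EMAIL_RE.search(line)
        if email:
            result.setdefault("email", email.group(0))
        elif looks_like_phone(line):
            result.setdefault("phone", line)
        elif "name" not in result:
            result["name"] = line  # first non-closing text line is the name
        else:
            result["other"].append(line)  # title, company, address: undecided
    return result
```

Applied to the example block above, this yields the name "Jane Doe", the phone and email as typed, and leaves "Head of Operations" and "Acme Corp" in "other" for a later title/company decision.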
Step 3 — when to skip entirely
- The first end-user message's metadata.senderClassification has isMailingList, isAutoReply, or isBounce set to true → skip. These are not human correspondents.
- No email AND no phone AND no clear self-introduced name → skip, no writes.
- Conversation has no endUserId (channel doesn't track participants) → skip; nothing to attribute the contact to.
- The conversation's endUserId already maps to a crm_contacts row (via endUserId join) AND that row already has email + phone + name populated → skip, nothing to backfill.
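The four skip conditions can be read as a single guard evaluated before any write. The sketch below assumes plain dicts for the classification, the extracted fields, and the already-linked contact row; the function name skip_reason and the reason strings are hypothetical.

```python
from typing import Any, Optional

NON_HUMAN_FLAGS = ("isMailingList", "isAutoReply", "isBounce")
IDENTITY_FIELDS = ("email", "phone", "name")


def skip_reason(
    classification: dict[str, Any],
    end_user_id: Optional[str],
    extracted: dict[str, Any],
    linked_contact: Optional[dict[str, Any]],
) -> Optional[str]:
    """Return why the conversation should be skipped, or None to proceed."""
    if any(classification.get(flag) for flag in NON_HUMAN_FLAGS):
        return "non-human sender"
    if not any(extracted.get(field) for field in IDENTITY_FIELDS):
        return "nothing identifying"
    if not end_user_id:
        return "no endUserId"
    if linked_contact and all(linked_contact.get(field) for field in IDENTITY_FIELDS):
        return "linked contact already complete"
    return None
```

Returning a reason string rather than a bare boolean is a design choice of this sketch; it makes a silent skip easier to trace in logs without adding any CRM write.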
Step 4 — dedupe before creating
{ "name": "crm_find_contact", "arguments": { "email": "jane@acme.com" } }
Or by phone if no email. If crm_find_contact returns null, you'll create a new contact in step 5. Otherwise, you'll backfill.
If you have BOTH email and phone, run crm_find_contact for email first; if no hit, run for phone. The first hit wins.
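The email-then-phone lookup order can be sketched as follows. The find_contact parameter stands in for the crm_find_contact tool and is injected as a plain callable so the logic can be exercised without the CRM; its exact return shape is an assumption.

```python
from typing import Any, Callable, Optional

# Stand-in for the crm_find_contact tool: takes { email? , phone? } and
# returns the matching row or None (assumed shape).
FindContact = Callable[[dict[str, str]], Optional[dict[str, Any]]]


def find_existing(
    find_contact: FindContact, email: Optional[str], phone: Optional[str]
) -> Optional[dict[str, Any]]:
    """Look up by email first, then by phone; the first hit wins."""
    if email:
        hit = find_contact({"email": email})
        if hit is not None:
            return hit
    if phone:
        return find_contact({"phone": phone})
    return None
```

Note that the phone lookup only runs when the email lookup misses, so at most two lookups are issued per conversation.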
Step 5a — create (no existing match)
{
  "name": "crm_create_contact",
  "arguments": {
    "name": "Jane Doe",
    "email": "jane@acme.com",
    "phone": "+47 555-1234",
    "title": "Head of Ops",
    "endUserId": "eu_...",
    "tags": ["from-chat"]
  }
}
Pass only the fields you actually extracted. Always include endUserId (the conversation's participant) so the contact is linked back to the chat surface — the dashboard's conversation view uses that link to show "this person's CRM record".
Add the from-chat tag so operators can filter contacts that came in via this curator.
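Assembling the create call might look like the sketch below. The field whitelist mirrors the identifying fields named in the TL;DR; whether crm_create_contact accepts companyId and companyName under exactly those keys is an assumption, as is the helper name build_create_call.

```python
from typing import Any

# Identifying fields this skill extracts (key names assumed to match the tool).
ALLOWED_FIELDS = ("name", "email", "phone", "title", "companyId", "companyName")


def build_create_call(extracted: dict[str, Any], end_user_id: str) -> dict[str, Any]:
    """Build a crm_create_contact call containing only extracted fields."""
    if not end_user_id:
        # Without a participant there is nothing to attribute the contact to.
        raise ValueError("endUserId is required to link the contact")
    arguments: dict[str, Any] = {
        field: extracted[field]
        for field in ALLOWED_FIELDS
        if extracted.get(field) not in (None, "")
    }
    arguments["endUserId"] = end_user_id
    arguments["tags"] = ["from-chat"]
    return {"name": "crm_create_contact", "arguments": arguments}
```

Empty values and keys outside the whitelist are dropped rather than sent as nulls, so the call never carries a field the user did not actually volunteer.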
Step 5b — backfill (existing match found)
{
  "name": "crm_update_contact",
  "arguments": {
    "id": "cct_existing",
    "patch": { "phone": "+47 555-1234", "title": "Head of Ops" },
    "mode": "fill-null"
  }
}
Always pass mode: "fill-null" from this skill. That tells the service to apply patch keys only where the existing contact's field is null/empty; non-null fields are left untouched server-side. This is defense-in-depth on top of your own filtering — if you accidentally include a field that's already populated, the service refuses to overwrite rather than clobbering operator-curated data.
You should still pre-filter the patch yourself (it's cleaner and avoids a no-op write): read the row from crm_find_contact's response and, for each extracted field:
- existing field is null or empty string → include it in patch
- existing field has a value → leave it out of patch, even if the user's new value looks "better"
If every extracted field is already populated, finish silently without calling crm_update_contact at all.
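The client-side pre-filter described above can be sketched like this. The helper names build_patch and build_update_call are hypothetical, and the existing row is assumed to be the dict returned by crm_find_contact, including its id.

```python
from typing import Any, Optional

EMPTY = (None, "")


def build_patch(existing: dict[str, Any], extracted: dict[str, Any]) -> dict[str, Any]:
    """Keep only extracted values whose field is null/empty on the existing row."""
    return {
        field: value
        for field, value in extracted.items()
        if value not in EMPTY and existing.get(field) in EMPTY
    }


def build_update_call(existing: dict[str, Any], extracted: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return a crm_update_contact call, or None when there is nothing to backfill."""
    patch = build_patch(existing, extracted)
    if not patch:
        return None  # every extracted field is already populated: finish silently
    return {
        "name": "crm_update_contact",
        "arguments": {"id": existing["id"], "patch": patch, "mode": "fill-null"},
    }
```

Even with this filter in place, the call still carries mode: "fill-null", so the server-side guard remains the last line of defence if the local copy of the row is stale.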
What NOT to do
- Don't overwrite human-curated data. The operator typed something, then the user typed something different in chat — trust the operator. Backfill empty fields only.
- Don't extract from agent messages. "Sure, jane@acme.com is on file" is the agent quoting back something it remembers; treating it as an identity claim creates feedback loops.
- Don't infer. j.doe@acme.com does not mean their name is "Jane Doe" or "John Doe". Names come from the signature or an explicit self-introduction; everything else stays null.
- Don't propose merges. That's skill://crm/clean-contact-data's job, and it runs weekly across the whole population. Per-conversation extraction is narrow on purpose.
- Don't open multiple records. One contact per pass: either one create or one update, never both.
Related
- skill://crm/clean-contact-data: population-level merge pass that catches dupes this skill missed (e.g. visitor gave email in one conv, phone in another).
- skill://kb/review-content: the symmetric pattern, per-conversation event-driven extraction, but for KB candidates instead of CRM contacts. Note that KB proposes (because LLM-drafted facts can be wrong); CRM contact extraction auto-applies (because the data source is the user's own typed words).