Auto-enrich new Supersonic contacts with web search data
This script polls Supersonic for recently created contacts, enriches each one using a web search API (Tavily), and writes the enrichment data back to the contact record. Run it every 15 minutes on cron.

How it works

  1. Fetch contacts sorted by created_at descending.
  2. Check which ones were created since the last run (tracked via a timestamp file).
  3. For each new contact, search the web for their name + company.
  4. Extract title, LinkedIn URL, and company info from search results.
  5. Update the contact record in Supersonic.

Prerequisites

You need a Contacts object type with fields for enrichment data:
npx supersonic-cli objects add-field \
  --object-type-slug "contacts" \
  --name "LinkedIn" \
  --field-type "text"

npx supersonic-cli objects add-field \
  --object-type-slug "contacts" \
  --name "Bio" \
  --field-type "text"
You also need a Tavily API key (free tier available). Any search API works — substitute your preferred provider.

The script

#!/usr/bin/env python3
"""
Auto-enrich new contacts with web search data.

Env vars: SUPERSONIC_API_KEY, TAVILY_API_KEY
State file: ~/.supersonic_enrich_last_run
"""

import os
import json
from datetime import datetime, timezone
from pathlib import Path
import httpx

# Supersonic MCP gateway endpoint; every tool call is POSTed here.
API_URL = "https://mcp.supersonic.cv/api/developers/mcp/call/"
API_KEY = os.environ["SUPERSONIC_API_KEY"]  # KeyError if unset: fail fast at import
TAVILY_API_KEY = os.environ["TAVILY_API_KEY"]  # KeyError if unset: fail fast at import
OBJECT_TYPE_SLUG = "contacts"  # object type whose records get enriched
# Holds the ISO timestamp of the previous run; used to filter new contacts.
LAST_RUN_FILE = Path.home() / ".supersonic_enrich_last_run"


def api_call(tool: str, params: dict) -> dict:
    """POST a tool invocation to the Supersonic MCP endpoint.

    Args:
        tool: MCP tool name, e.g. "records.list".
        params: tool-specific parameters.

    Returns:
        The decoded JSON response body.

    Raises:
        httpx.HTTPStatusError: on any non-2xx response.
    """
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    payload = {"tool": tool, "params": params}
    response = httpx.post(API_URL, headers=headers, json=payload, timeout=15.0)
    response.raise_for_status()
    return response.json()


def get_last_run() -> str | None:
    """Return the ISO timestamp saved by the previous run, or None on first run.

    Uses EAFP (read and catch FileNotFoundError) instead of the original
    exists()/read_text() pair, which raced if the file was removed between
    the two calls.
    """
    try:
        return LAST_RUN_FILE.read_text().strip()
    except FileNotFoundError:
        return None


def save_last_run(timestamp: str) -> None:
    """Persist *timestamp* so the next run only processes newer contacts."""
    LAST_RUN_FILE.write_text(timestamp)


def get_new_contacts(since: str | None) -> list[dict]:
    """Return up to 50 of the most recently created contacts, newest first.

    When *since* is a non-empty ISO-8601 timestamp, only contacts created
    strictly after it are kept; ISO-8601 strings compare correctly as plain
    strings, so no datetime parsing is needed.

    NOTE(review): only one page of 50 is fetched — a burst of more than 50
    new contacts between runs would leave some unenriched.
    """
    response = api_call("records.list", {
        "object_type_slug": OBJECT_TYPE_SLUG,
        "sort": "-created_at",
        "limit": 50,
    })
    contacts = response.get("records", [])

    if not since:
        return contacts
    return [c for c in contacts if c.get("created_at", "") > since]


def search_person(name: str, company: str | None) -> dict:
    """Search the web (Tavily) for a person and extract enrichment data.

    Args:
        name: the contact's full name.
        company: optional company name used to narrow the query.

    Returns:
        A dict with up to two keys matching the Supersonic field names:
        "LinkedIn" (first result URL containing "linkedin.com/in/") and
        "Bio" (first snippet mentioning the person's first name, truncated
        to 500 chars). Empty dict when nothing usable is found.

    Raises:
        httpx.HTTPStatusError: on a failed Tavily request.
    """
    query = f"{name} {company}" if company else name

    resp = httpx.post(
        "https://api.tavily.com/search",
        json={
            "api_key": TAVILY_API_KEY,
            "query": query,
            "search_depth": "basic",
            "max_results": 5,
        },
        timeout=15.0,
    )
    resp.raise_for_status()
    results = resp.json().get("results", [])

    # Fix: name.split() is empty for a blank/whitespace-only name, so the
    # original name.split()[0] raised IndexError. Compute the first name
    # once, outside the loop, and skip bio matching when there isn't one.
    name_parts = name.split()
    first_name = name_parts[0].lower() if name_parts else ""

    enrichment = {}

    for result in results:
        url = result.get("url", "")
        content = result.get("content", "")

        # First LinkedIn profile URL wins.
        if "linkedin.com/in/" in url and "LinkedIn" not in enrichment:
            enrichment["LinkedIn"] = url

        # Use the first snippet that mentions the person's first name as a bio.
        if first_name and first_name in content.lower() and "Bio" not in enrichment:
            enrichment["Bio"] = content[:500]

    return enrichment


def update_record(record_id: str, data: dict) -> None:
    """Write enrichment *data* onto the contact identified by *record_id*."""
    params = {
        "object_type_slug": OBJECT_TYPE_SLUG,
        "record_id": record_id,
        "data": data,
    }
    api_call("records.update", params)


def main() -> None:
    """Enrich every contact created since the last run, then record this run.

    The run timestamp is captured *before* fetching, so contacts created
    while this run is in flight are picked up by the next run.
    """
    since = get_last_run()
    run_started_at = datetime.now(timezone.utc).isoformat()

    new_contacts = get_new_contacts(since)
    print(f"Found {len(new_contacts)} new contacts to enrich.")

    for contact in new_contacts:
        record_id = contact["id"]
        fields = contact.get("data", {})
        name = fields.get("Name", "")
        company = fields.get("Company")

        # Contacts without a name can't be searched for.
        if not name:
            continue

        # Don't overwrite data that is already present.
        if fields.get("LinkedIn") or fields.get("Bio"):
            print(f"  Skipping {name} (already enriched)")
            continue

        print(f"  Enriching {name}...")
        try:
            enrichment = search_person(name, company)
            if enrichment:
                update_record(record_id, enrichment)
                print(f"    Updated: {list(enrichment.keys())}")
            else:
                print(f"    No data found")
        except Exception as e:
            # One bad contact must not abort the whole batch.
            print(f"    Error: {e}")

    save_last_run(run_started_at)
    print("Done.")


if __name__ == "__main__":
    main()

Setup

1. Install dependencies

pip install httpx
2. Set environment variables

export SUPERSONIC_API_KEY="supersonic_live_YOUR_KEY"
export TAVILY_API_KEY="tvly-YOUR_KEY"
3. Test with a single run

python auto_enrich.py
Check a contact to verify enrichment data was written:
npx supersonic-cli records get \
  --object-type-slug "contacts" \
  --record-id "RECORD_ID"
4. Schedule with cron

Every 15 minutes:
*/15 * * * * SUPERSONIC_API_KEY=supersonic_live_YOUR_KEY TAVILY_API_KEY=tvly-YOUR_KEY /usr/bin/python3 /path/to/auto_enrich.py >> /var/log/auto_enrich.log 2>&1

Using a different search API

The search_person function is the only part that depends on Tavily. Swap it out for any provider. Here’s the same function using SerpAPI:
def search_person(name: str, company: str | None) -> dict:
    """SerpAPI drop-in replacement with the same contract as the Tavily version.

    Returns a dict with up to two keys: "LinkedIn" (first result link
    containing "linkedin.com/in/") and "Bio" (first snippet mentioning the
    person's first name, truncated to 500 chars).

    Raises:
        httpx.HTTPStatusError: on a failed SerpAPI request.
    """
    query = f"{name} {company}" if company else name

    resp = httpx.get(
        "https://serpapi.com/search",
        params={
            "api_key": os.environ["SERPAPI_KEY"],
            "q": query,
            "num": 5,
        },
        timeout=15.0,
    )
    resp.raise_for_status()
    results = resp.json().get("organic_results", [])

    # Fix: name.split()[0] raises IndexError for a blank/whitespace-only
    # name; compute the first name once and skip bio matching if absent.
    name_parts = name.split()
    first_name = name_parts[0].lower() if name_parts else ""

    enrichment = {}
    for result in results:
        link = result.get("link", "")
        snippet = result.get("snippet", "")

        if "linkedin.com/in/" in link and "LinkedIn" not in enrichment:
            enrichment["LinkedIn"] = link
        if first_name and first_name in snippet.lower() and "Bio" not in enrichment:
            enrichment["Bio"] = snippet[:500]

    return enrichment
This script processes up to 50 contacts per run. If you’re importing large batches, increase the limit parameter or run more frequently. Each records.list and records.update call counts toward the 1,000 calls/min rate limit.
To enrich existing contacts (not just new ones), remove the since filter and the “already enriched” check. Run it once as a backfill, then switch back to incremental mode.