How AI Spam Filtering Works in InputHaven
The problem with keyword-based spam filtering
Traditional spam filters work by matching keywords: "buy now", "click here", "free money". This approach has two fundamental problems:
- False positives — A legitimate message saying "click here to see our pricing" gets flagged
- Easy to bypass — Spammers use Unicode lookalikes, misspellings, and creative phrasing to dodge keyword lists
Spam has evolved. Spam filtering needs to evolve too.
How InputHaven's spam protection works
InputHaven uses five layers of protection, each catching different types of spam:
Layer 1: Honeypot fields
We generate an invisible form field. Real users never see it, but automated bots fill it in. If the honeypot field has a value, the submission is spam. This catches the majority of basic bots with zero impact on legitimate users.
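The check itself is a one-liner. A minimal sketch in Python (the field name `_hp_field` and the dict-shaped submission are illustrative, not InputHaven's actual internals):

```python
def is_honeypot_tripped(submission: dict, honeypot_name: str = "_hp_field") -> bool:
    """A real user never sees the honeypot, so any non-empty value means a bot."""
    return bool(submission.get(honeypot_name, "").strip())

# A browser submission leaves the hidden field empty; a naive bot fills everything in.
human = {"name": "Ada", "message": "Pricing question", "_hp_field": ""}
bot = {"name": "Buy now!", "message": "spam", "_hp_field": "http://spam.example"}
```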
Layer 2: Rate limiting
Each form endpoint has per-IP rate limits. If someone submits 50 times in a minute, that's not a human. Rate limiting prevents brute-force spam attacks and protects your submission quota.
Layer 3: Domain allowlists
You can configure which domains are allowed to submit to your form. If a submission comes from an unauthorized domain, it's rejected. This prevents form hijacking and unauthorized embedding.
Layer 4: Keyword filtering
We maintain a curated list of common spam phrases and patterns. This catches low-effort spam that uses known spam terminology. The keyword filter is fast and runs on every submission at no cost.
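In spirit, this layer is a list of compiled patterns checked against the submission text. The patterns below are illustrative stand-ins; the real curated list is larger and maintained server-side:

```python
import re

# Illustrative patterns only, not InputHaven's actual list.
SPAM_PATTERNS = [
    re.compile(r"\bfree\s+money\b", re.IGNORECASE),
    re.compile(r"\bwork\s+from\s+home\b", re.IGNORECASE),
    re.compile(r"(https?://\S+\s*){5,}"),  # link dumps: five or more URLs in a row
]

def keyword_flagged(text: str) -> bool:
    """Fast check that runs on every submission."""
    return any(p.search(text) for p in SPAM_PATTERNS)
```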
Layer 5: AI classification (Starter+ plans)
This is where it gets interesting. On Starter plans and above, submissions that pass the first four layers are analyzed by Claude — Anthropic's AI model.
How the AI layer works
When a submission reaches the AI layer, here's what happens:
- Data preparation — We extract all form field values and truncate to 2,000 characters (enough for classification, small enough for fast processing)
- Context-aware analysis — Claude analyzes the submission, considering:
  - Is this coherent human language or generated gibberish?
  - Does the content match what a legitimate form submission would contain?
  - Are there subtle spam indicators like excessive URLs, promotional language, or social engineering?
  - Does the "email" field contain a disposable email service?
- Confidence scoring — The AI returns a spam confidence score from 0–100 and a human-readable reason explaining why it made that decision
- Decision — Submissions scoring above the threshold are marked as spam. The threshold is tuned to minimize false positives — we'd rather let the occasional spam message through than block a real submission.
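The steps above can be sketched with the Anthropic Python SDK. The prompt wording, threshold value, and model choice here are illustrative assumptions, not InputHaven's actual configuration:

```python
import json

SPAM_THRESHOLD = 80  # illustrative; the real threshold is tuned to minimize false positives

def prepare(submission: dict) -> str:
    """Step 1: flatten field values and truncate to 2,000 characters."""
    return "\n".join(f"{k}: {v}" for k, v in submission.items())[:2000]

def classify(text: str) -> dict:
    """Steps 2-3: ask Claude for a 0-100 spam score and a human-readable reason."""
    import anthropic  # imported lazily so the deterministic steps run without the SDK
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    msg = client.messages.create(
        model="claude-3-5-haiku-20241022",  # illustrative model choice
        max_tokens=200,
        messages=[{"role": "user", "content": (
            "Score this form submission for spam from 0-100 and explain why. "
            'Reply as JSON: {"score": <int>, "reason": "<text>"}\n\n' + text)}],
    )
    return json.loads(msg.content[0].text)

def decide(result: dict) -> bool:
    """Step 4: mark as spam only above the threshold."""
    return result["score"] > SPAM_THRESHOLD
```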
What makes AI filtering different
Consider these real examples:
Example 1: Sophisticated spam
> "Hi, I noticed your website and thought you might be interested in our SEO services. We've helped companies like yours increase traffic by 300%. Let me know if you'd like a free consultation."
A keyword filter might not catch this — there are no obvious spam words. But Claude recognizes the pattern: unsolicited commercial outreach with inflated claims and a sales pitch.
Example 2: Legitimate message that looks like spam
> "I want to buy your product. Can you send me pricing for 100 units? I need them delivered free to our warehouse by Friday."
A keyword filter might flag "buy" and "free". But Claude understands context: this is a genuine purchase inquiry.
Example 3: Unicode evasion
> "Ⅽheck out this аmazing оffer — frеe сryрto!"
Spammers use Cyrillic lookalikes to bypass keyword filters. Claude reads the rendered text and catches the spam regardless of character encoding tricks.
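You can see the trick with Python's standard unicodedata module: the lookalike letters have entirely different code points from their ASCII twins, so a raw string comparison in a keyword filter never matches:

```python
import unicodedata

ascii_a = "a"
cyrillic_a = "\u0430"  # renders identically to "a" in most fonts

print(unicodedata.name(ascii_a))     # LATIN SMALL LETTER A
print(unicodedata.name(cyrillic_a))  # CYRILLIC SMALL LETTER A

# "amazing" spelled with a Cyrillic first letter is a different string entirely:
assert "amazing" != cyrillic_a + "mazing"
```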
Graceful degradation
The AI layer is designed to fail safely:
- If the Anthropic API is unavailable, the submission passes through (we don't block legitimate users due to third-party downtime)
- If the AI takes too long, we fall back to keyword-only filtering
- AI results are logged but never the sole reason for rejection — the other four layers provide baseline protection
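The fail-safe behavior can be sketched as a timeout with a keyword-only fallback. The timeout value and function names are illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)
AI_TIMEOUT_SECONDS = 2.0  # illustrative

def score_with_fallback(text: str, ai_classify, keyword_flagged) -> bool:
    """Try the AI classifier; on error or timeout, fall back to keyword filtering."""
    try:
        result = _pool.submit(ai_classify, text).result(timeout=AI_TIMEOUT_SECONDS)
        return result["score"] > 80
    except Exception:
        # API unavailable or too slow: keyword-only filtering, so third-party
        # downtime never blocks a legitimate submission.
        return keyword_flagged(text)
```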
The numbers
In our testing:
- Honeypot alone catches ~70% of spam
- Honeypot + keywords catches ~85% of spam
- All five layers catch ~98% of spam
- False positive rate with AI: <0.1%
How to enable it
AI spam filtering is available on Starter plans ($5/mo) and above. To enable it:
- Go to your form settings in the dashboard
- Toggle on "AI Spam Filtering"
- That's it — no configuration needed
Every submission will show its spam score and classification reason in the dashboard, so you can audit the AI's decisions.
Cost
AI spam filtering uses Claude Haiku — Anthropic's fastest, most affordable model. The cost per classification is approximately $0.0003 (three hundredths of a cent). Even at 10,000 submissions/month, AI spam filtering costs about $3 in API calls. This is absorbed into your plan price — there's no per-submission charge.
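The arithmetic behind that figure:

```python
cost_per_classification = 0.0003  # dollars, approximate
monthly_submissions = 10_000

monthly_cost = cost_per_classification * monthly_submissions
print(f"${monthly_cost:.2f}/month")  # $3.00/month
```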