How AI Spam Filtering Works in InputHaven
The problem with keyword-based spam filtering
Traditional spam filters work by matching keywords: "buy now", "click here", "free money". This approach has two fundamental problems:
- False positives — A legitimate message saying "click here to see our pricing" gets flagged
- Easy to bypass — Spammers use Unicode lookalikes, misspellings, and creative phrasing to dodge keyword lists
Spam has evolved. Spam filtering needs to evolve too.
How InputHaven's spam protection works
InputHaven uses five layers of protection, each catching different types of spam:
Layer 1: Honeypot fields
We generate an invisible form field. Real users never see it, but automated bots fill it in. If the honeypot field has a value, the submission is spam. This catches the majority of basic bots with zero impact on legitimate users.
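The check itself is a one-liner. A minimal sketch in Python (the field name `_hp_field` and the dict-shaped submission are illustrative, not InputHaven's actual internals):

```python
def is_honeypot_tripped(submission: dict, honeypot_name: str = "_hp_field") -> bool:
    """A real user never sees the honeypot, so any non-empty value means a bot."""
    return bool(submission.get(honeypot_name, "").strip())

# A browser submission leaves the hidden field empty; a naive bot fills everything in.
human = {"name": "Ada", "message": "Pricing question", "_hp_field": ""}
bot = {"name": "Buy now!", "message": "spam", "_hp_field": "http://spam.example"}
```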
Layer 2: Rate limiting
Each form endpoint has per-IP rate limits. If someone submits 50 times in a minute, that's not a human. Rate limiting prevents brute-force spam attacks and protects your submission quota.
Layer 3: Domain allowlists
You can configure which domains are allowed to submit to your form. If a submission comes from an unauthorized domain, it's rejected. This prevents form hijacking and unauthorized embedding.
Layer 4: Keyword filtering
We maintain a curated list of common spam phrases and patterns. This catches low-effort spam that uses known spam terminology. The keyword filter is fast and runs on every submission at no cost.
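In spirit, this layer is a list of compiled patterns checked against the submission text. The patterns below are illustrative stand-ins; the real curated list is larger and maintained server-side:

```python
import re

# Illustrative patterns only, not InputHaven's actual list.
SPAM_PATTERNS = [
    re.compile(r"\bfree\s+money\b", re.IGNORECASE),
    re.compile(r"\bwork\s+from\s+home\b", re.IGNORECASE),
    re.compile(r"(https?://\S+\s*){5,}"),  # link dumps: five or more URLs in a row
]

def keyword_flagged(text: str) -> bool:
    """Fast check that runs on every submission."""
    return any(p.search(text) for p in SPAM_PATTERNS)
```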
Layer 5: AI classification (Starter+ plans)
This is where it gets interesting. On Starter plans and above, submissions that pass the first four layers are analyzed by Claude — Anthropic's AI model.
How the AI layer works
When a submission reaches the AI layer, here's what happens:
- Data preparation — We extract all form field values and truncate to 2,000 characters (enough for classification, small enough for fast processing)
- Context-aware analysis — Claude analyzes the submission, considering:
  - Is this coherent human language or generated gibberish?
  - Does the content match what a legitimate form submission would contain?
  - Are there subtle spam indicators like excessive URLs, promotional language, or social engineering?
  - Does the "email" field contain a disposable email service?
- Confidence scoring — The AI returns a spam confidence score from 0–100 and a human-readable reason explaining why it made that decision
- Decision — Submissions scoring above the threshold are marked as spam. The threshold is tuned to minimize false positives — we'd rather let the occasional spam message through than block a real submission.
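The steps above can be sketched with the Anthropic Python SDK. The prompt wording, threshold value, and model choice here are illustrative assumptions, not InputHaven's actual configuration:

```python
import json

SPAM_THRESHOLD = 80  # illustrative; the real threshold is tuned to minimize false positives

def prepare(submission: dict) -> str:
    """Step 1: flatten field values and truncate to 2,000 characters."""
    return "\n".join(f"{k}: {v}" for k, v in submission.items())[:2000]

def classify(text: str) -> dict:
    """Steps 2-3: ask Claude for a 0-100 spam score and a human-readable reason."""
    import anthropic  # imported lazily so the deterministic steps run without the SDK
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    msg = client.messages.create(
        model="claude-3-5-haiku-20241022",  # illustrative model choice
        max_tokens=200,
        messages=[{"role": "user", "content": (
            "Score this form submission for spam from 0-100 and explain why. "
            'Reply as JSON: {"score": <int>, "reason": "<text>"}\n\n' + text)}],
    )
    return json.loads(msg.content[0].text)

def decide(result: dict) -> bool:
    """Step 4: mark as spam only above the threshold."""
    return result["score"] > SPAM_THRESHOLD
```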
What makes AI filtering different
Consider these real examples:
Example 1: Sophisticated spam
> "Hi, I noticed your website and thought you might be interested in our SEO services. We've helped companies like yours increase traffic by 300%. Let me know if you'd like a free consultation."
A keyword filter might not catch this — there are no obvious spam words. But Claude recognizes the pattern: unsolicited commercial outreach with inflated claims and a sales pitch.
Example 2: Legitimate message that looks like spam
> "I want to buy your product. Can you send me pricing for 100 units? I need them delivered free to our warehouse by Friday."
A keyword filter might flag "buy" and "free". But Claude understands context: this is a genuine purchase inquiry.
Example 3: Unicode evasion
> "Ⅽheck out this аmazing оffer — frеe сryрto!"
Spammers use Cyrillic lookalikes to bypass keyword filters. Claude reads the rendered text and catches the spam regardless of character encoding tricks.
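You can see the trick with Python's standard unicodedata module: the lookalike letters have entirely different code points from their ASCII twins, so a raw string comparison in a keyword filter never matches:

```python
import unicodedata

ascii_a = "a"
cyrillic_a = "\u0430"  # renders identically to "a" in most fonts

print(unicodedata.name(ascii_a))     # LATIN SMALL LETTER A
print(unicodedata.name(cyrillic_a))  # CYRILLIC SMALL LETTER A

# "amazing" spelled with a Cyrillic first letter is a different string entirely:
assert "amazing" != cyrillic_a + "mazing"
```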
Graceful degradation
The AI layer is designed to fail safely:
- If the Anthropic API is unavailable, the submission passes through (we don't block legitimate users due to third-party downtime)
- If the AI takes too long, we fall back to keyword-only filtering
- AI results are logged but never the sole reason for rejection — the other four layers provide baseline protection
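The fail-safe behavior can be sketched as a timeout with a keyword-only fallback. The timeout value and function names are illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)
AI_TIMEOUT_SECONDS = 2.0  # illustrative

def score_with_fallback(text: str, ai_classify, keyword_flagged) -> bool:
    """Try the AI classifier; on error or timeout, fall back to keyword filtering."""
    try:
        result = _pool.submit(ai_classify, text).result(timeout=AI_TIMEOUT_SECONDS)
        return result["score"] > 80
    except Exception:
        # API unavailable or too slow: keyword-only filtering, so third-party
        # downtime never blocks a legitimate submission.
        return keyword_flagged(text)
```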
The numbers
In our testing:
- Honeypot alone catches ~70% of spam
- Honeypot + keywords catches ~85% of spam
- All five layers catch ~98% of spam
- False positive rate with AI: <0.1%
How to enable it
AI spam filtering is available on Starter plans ($5/mo) and above. To enable it:
- Go to your form settings in the dashboard
- Toggle on "AI Spam Filtering"
- That's it — no configuration needed
Every submission will show its spam score and classification reason in the dashboard, so you can audit the AI's decisions.
Cost
AI spam filtering uses Claude Haiku — Anthropic's fastest, most affordable model. The cost per classification is approximately $0.0003 (three hundredths of a cent). Even at 10,000 submissions/month, AI spam filtering costs about $3 in API calls. This is absorbed into your plan price — there's no per-submission charge.
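The arithmetic behind that figure:

```python
cost_per_classification = 0.0003  # dollars, approximate
monthly_submissions = 10_000

monthly_cost = cost_per_classification * monthly_submissions
print(f"${monthly_cost:.2f}/month")  # $3.00/month
```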