
Improve spam scoring accuracy with real-world email corpus testing #5

@Ouranos27

Context

The spam analyzer models 45+ signals based on SpamAssassin rules, CAN-SPAM requirements, and GDPR patterns, but it hasn't been validated against a large corpus of real emails, so the scoring weights are uncalibrated.

What needs to happen

1. **Build a test corpus** — collect ~100 emails across categories:
   - Legitimate transactional (receipts, shipping notices, password resets)
   - Legitimate marketing (newsletters, promotions)
   - Known spam/phishing examples (available from public datasets)
2. **Run the spam analyzer** against each email and compare its score to the expected classification.
3. **Tune weights** — adjust signal weights so that:
   - Legitimate emails score 80+
   - Spam emails score below 40
   - Edge cases (aggressive marketing) land in the 40–70 range
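The target bands above can be sketched as a small scoring check. Names like `bandFor` and `evaluate` are hypothetical helpers for this issue, not the analyzer's actual API — the real analyzer in `src/analyzers/spam/` may expose something different:

```typescript
// Expected classification bands from this issue's acceptance criteria.
// Higher score = more likely legitimate.
type Band = "legitimate" | "edge" | "spam";

function bandFor(score: number): Band {
  if (score >= 80) return "legitimate"; // legit transactional/marketing target
  if (score < 40) return "spam";        // known spam/phishing target
  return "edge";                        // aggressive marketing: 40-70 range
}

// A corpus entry pairs an email fixture with the band we expect it to land in.
interface CorpusEntry {
  file: string;
  expected: Band;
}

// Run a scoring function over the corpus and collect misclassified entries,
// which are the candidates for weight tuning.
function evaluate(
  entries: CorpusEntry[],
  score: (file: string) => number,
): { total: number; misses: CorpusEntry[] } {
  const misses = entries.filter((e) => bandFor(score(e.file)) !== e.expected);
  return { total: entries.length, misses };
}
```

The idea is that after each weight adjustment, `evaluate` is re-run over the full corpus and the `misses` list shrinks toward zero without regressing previously correct categories.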

How to contribute

This is a great contribution for someone interested in email deliverability. You don't need to write much code — the work is mostly curating test data and running the existing analyzer.

```shell
bun install
bun test -- --grep "spam"   # run the existing spam tests
```

The spam analyzer lives in `src/analyzers/spam/`.
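One possible on-disk layout for the corpus (an assumption for discussion — the repo does not currently prescribe one):

```
test/corpus/
  ham-transactional/   # receipts, shipping notices, password resets
  ham-marketing/       # newsletters, promotions
  spam/                # known spam/phishing from public datasets
```

Grouping by category makes it easy to derive each fixture's expected band from its directory name.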
