MAESTRO: add engine TDD fixtures and tests

2026-02-22 05:23:31 +01:00 · 2026-02-17 23:33:54 +01:00
parent 635d0724ba
commit 199c92ab33
5 changed files with 237 additions and 0 deletions
--- a/Docs/SpecKit-web-header-analyzer-Phase-02-Engine-Refactoring.md
+++ b/Docs/SpecKit-web-header-analyzer-Phase-02-Engine-Refactoring.md
@@ -0,0 +1,73 @@
 # Phase 02: Engine Refactoring
 This phase decomposes the monolithic `decode-spam-headers.py` (6,931 lines, 106 test methods, 3 classes) into independently testable scanner modules that the API can invoke programmatically. This is a prerequisite for all user stories — without modular scanners, the backend cannot expose individual tests or stream progress. TDD Red-Green: write failing tests first, then implement parser, scanner base, registry, 10 vendor-grouped scanner modules, and the analyzer orchestrator.
 ## Spec Kit Context
 - **Feature:** 1-web-header-analyzer
 - **Specification:** .specify/specs/1-web-header-analyzer/spec.md
 - **Plan:** .specify/specs/1-web-header-analyzer/plan.md
 - **Tasks:** .specify/specs/1-web-header-analyzer/tasks.md
 - **Data Model:** .specify/specs/1-web-header-analyzer/data-model.md
 - **Constitution:** .specify/memory/constitution.md (TDD mandate: P6)
 ## Architecture Reference
 The existing monolith structure (read-only reference):
 - `decode-spam-headers.py` lines 209–419: `Logger` class
 - `decode-spam-headers.py` lines 421–439: `Verstring` class
 - `decode-spam-headers.py` lines 441+: `SMTPHeadersAnalysis` class
 - `decode-spam-headers.py` lines 1896–2027: `getAllTests()` defining all 106 tests
 - `decode-spam-headers.py` lines 2437–6504: All test method implementations
 Target modular structure:
 ```
 backend/app/engine/
 ├── __init__.py
 ├── models.py              # AnalysisRequest, AnalysisResult, TestResult, HopChainNode, SecurityAppliance
 ├── logger.py              # Adapted Logger class (Python logging module)
 ├── parser.py              # HeaderParser.parse(raw_text) -> list[ParsedHeader]
 ├── scanner_base.py        # BaseScanner protocol: id, name, run(headers) -> TestResult | None
 ├── scanner_registry.py    # ScannerRegistry: get_all(), get_by_ids(), list_tests()
 ├── analyzer.py            # HeaderAnalyzer orchestrator with progress callback
 └── scanners/
    ├── received_headers.py      # Tests 1–3
    ├── forefront_antispam.py    # Tests 12–16, 63–64
    ├── spamassassin.py          # Tests 18–21, 74
    ├── ironport.py              # Tests 27–29, 38–43, 88–89
    ├── mimecast.py              # Tests 30, 61–62, 65
    ├── trendmicro.py            # Tests 47–59, 97
    ├── barracuda.py             # Tests 69–73
    ├── proofpoint.py            # Tests 66–67
    ├── microsoft_general.py     # Tests 31–34, 80, 83–85, 99–102
    └── general.py               # Remaining tests: 4–11, 17, 22–26, 36–37, 44–46, 68, 75–79, 82, 86–87, 90–96, 98, 103–106
 ```
 ## Tasks
 - [x] T007 Write failing tests (TDD Red) in `backend/tests/engine/test_parser.py` (header parsing with sample EML), `backend/tests/engine/test_scanner_registry.py` (discovery returns 106+ scanners, filtering by ID), and `backend/tests/engine/test_analyzer.py` (full pipeline with reference fixture). Create `backend/tests/fixtures/sample_headers.txt` with representative header set extracted from the existing test infrastructure
 - [ ] T008 Create `backend/app/engine/__init__.py` and `backend/app/engine/models.py` — Pydantic models for `AnalysisRequest`, `AnalysisResult`, `TestResult`, `HopChainNode`, `SecurityAppliance`. Refer to `.specify/specs/1-web-header-analyzer/data-model.md` for field definitions and severity enum values (spam→#ff5555, suspicious→#ffb86c, clean→#50fa7b, info→#bd93f9)
 - [ ] T009 Create `backend/app/engine/logger.py` — extract Logger class from `decode-spam-headers.py` (lines 209–419), adapt to use Python `logging` module instead of direct stdout
 - [ ] T010 Create `backend/app/engine/parser.py` — extract header parsing from `SMTPHeadersAnalysis.collect()` and `getHeader()` (lines ~2137–2270). Expose `HeaderParser.parse(raw_text: str) -> list[ParsedHeader]` including MIME boundary and line-break handling. Verify `test_parser.py` passes (TDD Green)
 - [ ] T011 Create `backend/app/engine/scanner_base.py` — abstract `BaseScanner` (Protocol or ABC) with interface: `id: int`, `name: str`, `run(headers: list[ParsedHeader]) -> TestResult | None`
 - [ ] T012 Create `backend/app/engine/scanner_registry.py` — `ScannerRegistry` with auto-discovery: `get_all()`, `get_by_ids(ids)`, `list_tests()`. Verify `test_scanner_registry.py` passes (TDD Green)
 - [ ] T013 [P] Create scanner modules by extracting test methods from `SMTPHeadersAnalysis` into `backend/app/engine/scanners/`. Each file implements `BaseScanner`:
  - `backend/app/engine/scanners/received_headers.py` (tests 1–3)
  - `backend/app/engine/scanners/forefront_antispam.py` (tests 12–16, 63–64)
  - `backend/app/engine/scanners/spamassassin.py` (tests 18–21, 74)
  - `backend/app/engine/scanners/ironport.py` (tests 27–29, 38–43, 88–89)
  - `backend/app/engine/scanners/mimecast.py` (tests 30, 61–62, 65)
  - `backend/app/engine/scanners/trendmicro.py` (tests 47–59, 97)
  - `backend/app/engine/scanners/barracuda.py` (tests 69–73)
  - `backend/app/engine/scanners/proofpoint.py` (tests 66–67)
  - `backend/app/engine/scanners/microsoft_general.py` (tests 31–34, 80, 83–85, 99–102)
  - `backend/app/engine/scanners/general.py` (remaining tests: 4–11, 17, 22–26, 36–37, 44–46, 68, 75–79, 82, 86–87, 90–96, 98, 103–106)
 - [ ] T014 Create `backend/app/engine/analyzer.py` — `HeaderAnalyzer` orchestrator: accepts `AnalysisRequest`, uses `HeaderParser` + `ScannerRegistry`, runs scanners with per-test timeout, collects results (marking failed tests with error status per FR-25), supports progress callback `Callable[[int, int, str], None]`. Verify `test_analyzer.py` passes (TDD Green)
 ## Completion
 - [ ] `pytest backend/tests/engine/` passes with all tests green
 - [ ] All 106+ tests are registered in the scanner registry (`ScannerRegistry.get_all()` returns 106+ scanners)
 - [ ] Analysis of `backend/tests/fixtures/sample_headers.txt` produces results matching original CLI output
 - [ ] `ruff check backend/` passes with zero errors
 - [ ] Run `/speckit.analyze` to verify consistency
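Note on T011: the task leaves the choice between a `Protocol` and an ABC open. The following is a minimal sketch of the `Protocol` variant, not the project's implementation. The `ParsedHeader` and `TestResult` fields are placeholders (the real definitions live in `data-model.md`, which this commit does not show), and `SpamFlagScanner` with its ID `9001` is a hypothetical scanner used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol, runtime_checkable


# Placeholder records: the real models are specified in data-model.md.
@dataclass(frozen=True)
class ParsedHeader:
    index: int
    name: str
    value: str


@dataclass(frozen=True)
class TestResult:
    test_id: int
    test_name: str
    severity: str  # assumed values: "spam", "suspicious", "clean", "info"
    analysis: str


@runtime_checkable
class BaseScanner(Protocol):
    """Structural interface: any object with these members qualifies."""

    id: int
    name: str

    def run(self, headers: List[ParsedHeader]) -> Optional[TestResult]: ...


class SpamFlagScanner:
    """Hypothetical scanner: reports on X-Spam-Flag when it is present."""

    id = 9001  # made-up ID, outside the 1-106 range used by the monolith
    name = "X-Spam-Flag (illustrative)"

    def run(self, headers: List[ParsedHeader]) -> Optional[TestResult]:
        for header in headers:
            if header.name.lower() == "x-spam-flag":
                value = header.value.strip()
                severity = "spam" if value.upper() == "YES" else "clean"
                return TestResult(
                    self.id, self.name, severity, f"X-Spam-Flag is {value}"
                )
        # Returning None signals that the scanner found nothing to report.
        return None
```

A `Protocol` lets scanner classes satisfy the interface without inheriting from a common base, which may suit modules extracted piecemeal from the monolith. An ABC would instead reject incomplete scanners at instantiation time.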
--- a/backend/tests/engine/test_analyzer.py
+++ b/backend/tests/engine/test_analyzer.py
@@ -0,0 +1,46 @@
 from __future__ import annotations
 from pathlib import Path
 from app.engine.analyzer import HeaderAnalyzer
 from app.engine.models import AnalysisRequest, AnalysisResult, TestResult
 FIXTURES_DIR = Path(__file__).resolve().parents[1] / "fixtures"
 def test_analyzer_runs_selected_tests_and_reports_progress() -> None:
    raw_headers = (FIXTURES_DIR / "sample_headers.txt").read_text(encoding="utf-8")
    request = AnalysisRequest(
        headers=raw_headers,
        config={
            "test_ids": [12, 13],
            "resolve": False,
            "decode_all": False,
        },
    )
    progress_events: list[tuple[int, int, str]] = []
    def on_progress(current_index: int, total_tests: int, test_name: str) -> None:
        progress_events.append((current_index, total_tests, test_name))
    analyzer = HeaderAnalyzer()
    result = analyzer.analyze(request, progress_callback=on_progress)
    assert isinstance(result, AnalysisResult)
    assert len(result.results) == 2
    assert [item.test_id for item in result.results] == [12, 13]
    assert all(isinstance(item, TestResult) for item in result.results)
    assert result.metadata.total_tests == 2
    assert (
        result.metadata.passed_tests
        + result.metadata.failed_tests
        + result.metadata.skipped_tests
    ) == result.metadata.total_tests
    assert progress_events
    assert all(total == 2 for _, total, _ in progress_events)
    assert progress_events[0][0] == 0
    assert progress_events[-1][0] == 1
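Note on the progress contract: the assertions above imply that the callback receives a zero-based index, the total number of selected tests, and the test name, once per test. The following sketch shows one loop that would satisfy that contract, under stated assumptions: `run_scanners`, `Outcome`, `Summary`, and `FakeScanner` are hypothetical names, the status vocabulary is invented, and the per-test timeout required by T014 is omitted.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

ProgressCallback = Callable[[int, int, str], None]


@dataclass
class Outcome:
    test_id: int
    test_name: str
    status: str  # assumed vocabulary: "passed", "failed", "skipped"
    detail: str = ""


@dataclass
class Summary:
    total_tests: int = 0
    passed_tests: int = 0
    failed_tests: int = 0
    skipped_tests: int = 0


@dataclass
class FakeScanner:
    """Stand-in scanner for the sketch; real scanners implement BaseScanner."""

    id: int
    name: str
    behaviour: Callable[[list], Any]

    def run(self, headers: list) -> Any:
        return self.behaviour(headers)


def run_scanners(
    scanners: list,
    headers: list,
    progress_callback: Optional[ProgressCallback] = None,
) -> Tuple[List[Outcome], Summary]:
    total = len(scanners)
    summary = Summary(total_tests=total)
    outcomes: List[Outcome] = []
    for index, scanner in enumerate(scanners):
        # Report before running, so the first event has index 0
        # and the last has index total - 1.
        if progress_callback is not None:
            progress_callback(index, total, scanner.name)
        try:
            detail = scanner.run(headers)
        except Exception as exc:
            # A crashing scanner is recorded as failed; the run continues.
            outcomes.append(Outcome(scanner.id, scanner.name, "failed", str(exc)))
            summary.failed_tests += 1
            continue
        if detail is None:
            outcomes.append(Outcome(scanner.id, scanner.name, "skipped"))
            summary.skipped_tests += 1
        else:
            outcomes.append(Outcome(scanner.id, scanner.name, "passed", str(detail)))
            summary.passed_tests += 1
    return outcomes, summary
```

Every selected scanner produces exactly one outcome, so the three counters always sum to `total_tests`, which is the invariant the test checks on `result.metadata`.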
--- a/backend/tests/engine/test_parser.py
+++ b/backend/tests/engine/test_parser.py
@@ -0,0 +1,52 @@
 from __future__ import annotations
 from pathlib import Path
 import pytest
 from app.engine.parser import HeaderParser
 FIXTURES_DIR = Path(__file__).resolve().parents[1] / "fixtures"
 @pytest.fixture()
 def sample_headers() -> str:
    return (FIXTURES_DIR / "sample_headers.txt").read_text(encoding="utf-8")
 def test_parser_extracts_headers_and_preserves_order(sample_headers: str) -> None:
    parser = HeaderParser()
    headers = parser.parse(sample_headers)
    assert headers, "Expected parsed headers to be non-empty."
    indices = [header.index for header in headers]
    assert indices == list(range(len(headers)))
    names = [header.name for header in headers]
    assert names[:2] == ["Received", "Received"]
    assert "X-Should-Not-Be-Parsed" not in names
 def test_parser_handles_folded_lines(sample_headers: str) -> None:
    parser = HeaderParser()
    headers = parser.parse(sample_headers)
    subject = next(header for header in headers if header.name == "Subject")
    assert "folded line" in subject.value
    assert "\n" in subject.value
    authentication = next(
        header for header in headers if header.name == "Authentication-Results"
    )
    assert "spf=pass" in authentication.value
    assert "\n" in authentication.value
 def test_parser_preserves_content_type_boundary(sample_headers: str) -> None:
    parser = HeaderParser()
    headers = parser.parse(sample_headers)
    content_type = next(header for header in headers if header.name == "Content-Type")
    assert "boundary=\"boundary-123\"" in content_type.value
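Note on the parsing behaviour these tests pin down: headers keep their order and a sequential `index`, folded continuation lines stay in the value together with their newline, and nothing from the message body is parsed. The following is a minimal sketch of that behaviour, assuming the header block ends at the first blank line (as RFC 5322 specifies). `parse_headers` and the `ParsedHeader` fields are hypothetical stand-ins for `HeaderParser.parse`, not an extraction of the monolith's `collect()` logic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ParsedHeader:
    index: int
    name: str
    value: str


def parse_headers(raw_text: str) -> List[ParsedHeader]:
    entries: List[List[str]] = []  # mutable [name, value] pairs while folding
    # Ignore blank lines before the first header, then stop at the first
    # blank line, which separates the header block from the body.
    for line in raw_text.lstrip("\r\n").splitlines():
        if not line.strip():
            break
        if line[0] in " \t":
            # Continuation of a folded header: keep the newline and the
            # original indentation so the value can be shown as received.
            if entries:
                entries[-1][1] += "\n" + line
            continue
        name, separator, value = line.partition(":")
        if not separator:
            continue  # not a "Name: value" line; skipped in this sketch
        entries.append([name.strip(), value.strip()])
    return [
        ParsedHeader(index, name, value)
        for index, (name, value) in enumerate(entries)
    ]


SAMPLE = (
    "Subject: This is a test subject\n"
    "    with a folded line\n"
    'Content-Type: multipart/alternative; boundary="boundary-123"\n'
    "\n"
    "--boundary-123\n"
    "X-Should-Not-Be-Parsed: nope\n"
)
headers = parse_headers(SAMPLE)
```

Here `headers` holds two entries, `Subject` and `Content-Type`. The line after the blank separator is never reached, which is the property `test_parser_extracts_headers_and_preserves_order` relies on.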
--- a/backend/tests/engine/test_scanner_registry.py
+++ b/backend/tests/engine/test_scanner_registry.py
@@ -0,0 +1,32 @@
 from __future__ import annotations
 from app.engine.scanner_registry import ScannerRegistry
 def test_registry_discovers_all_scanners() -> None:
    registry = ScannerRegistry()
    scanners = registry.get_all()
    assert len(scanners) >= 106
    ids = [scanner.id for scanner in scanners]
    assert len(ids) == len(set(ids))
    assert {1, 12, 66}.issubset(set(ids))
 def test_registry_filters_by_ids_and_lists_tests() -> None:
    registry = ScannerRegistry()
    selected = registry.get_by_ids([1, 12, 66])
    assert [scanner.id for scanner in selected] == [1, 12, 66]
    tests = registry.list_tests()
    lookup = {test.id: test for test in tests}
    assert lookup[1].name == "Received - Mail Servers Flow"
    assert lookup[12].name == "X-Forefront-Antispam-Report"
    assert lookup[66].name == "X-Proofpoint-Spam-Details"
    assert lookup[1].category
    assert lookup[12].category
    assert lookup[66].category
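Note on the registry contract: the tests above require unique IDs, `get_by_ids` results in the caller's order, and `list_tests` entries that expose `id`, `name`, and `category`. The sketch below meets those three requirements with explicit registration. It does not implement the auto-discovery that T012 asks for, which could, for example, import each module under `scanners/` with `pkgutil.iter_modules`. `StubScanner`, `ScannerInfo`, `register`, the category strings, and the `KeyError` on unknown IDs are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class StubScanner:
    """Stand-in for a real scanner; only the registry-facing fields."""

    id: int
    name: str
    category: str


@dataclass(frozen=True)
class ScannerInfo:
    id: int
    name: str
    category: str


class ScannerRegistry:
    def __init__(self) -> None:
        self._scanners: Dict[int, StubScanner] = {}

    def register(self, scanner: StubScanner) -> None:
        # Reject duplicates early so ID uniqueness holds by construction.
        if scanner.id in self._scanners:
            raise ValueError(f"duplicate scanner id: {scanner.id}")
        self._scanners[scanner.id] = scanner

    def get_all(self) -> List[StubScanner]:
        return [self._scanners[key] for key in sorted(self._scanners)]

    def get_by_ids(self, ids: Iterable[int]) -> List[StubScanner]:
        # Preserves the caller's order; an unknown ID raises KeyError here.
        return [self._scanners[scanner_id] for scanner_id in ids]

    def list_tests(self) -> List[ScannerInfo]:
        return [ScannerInfo(s.id, s.name, s.category) for s in self.get_all()]


registry = ScannerRegistry()
for stub in (
    StubScanner(66, "X-Proofpoint-Spam-Details", "proofpoint"),
    StubScanner(1, "Received - Mail Servers Flow", "received"),
    StubScanner(12, "X-Forefront-Antispam-Report", "forefront"),
):
    registry.register(stub)
```

Registration order does not matter in this sketch: `get_all()` sorts by ID, while `get_by_ids([12, 1])` would return scanner 12 before scanner 1.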
--- a/backend/tests/fixtures/sample_headers.txt
+++ b/backend/tests/fixtures/sample_headers.txt
@@ -0,0 +1,34 @@
 Received: from mail.example.org (mail.example.org [203.0.113.10])
    by mx.example.com with ESMTPS id 12345
    for <user@example.com>; Tue, 17 Feb 2026 10:00:00 +0000
 Received: from localhost (localhost [127.0.0.1])
    by mail.example.org with SMTP id 67890
    for <user@example.com>; Tue, 17 Feb 2026 09:59:00 +0000
 Authentication-Results: mx.example.com;
    spf=pass smtp.mailfrom=example.org;
    dkim=pass header.d=example.org;
    dmarc=pass
 Subject: This is a test subject
    with a folded line
 From: "Sender Name" <sender@example.org>
 To: user@example.com
 Date: Tue, 17 Feb 2026 10:00:00 +0000
 Message-ID: <1234@example.org>
 Content-Type: multipart/alternative; boundary="boundary-123"
 X-Forefront-Antispam-Report: CIP:203.0.113.10;CTRY:US;LANG:en;SCL:1;SRV:;
    IPV:NLI;SFV:SKI;H:mail.example.org;CAT:NONE;SFTY:0.0;SFS:(0);DIR:INB;
 X-Spam-Status: No, score=-0.1 required=5.0 tests=NONE
 X-Spam-Level: **
 X-Spam-Flag: NO
 X-Spam-Report: Example report line one
    Example report line two
 X-Mimecast-Spam-Score: 1
 X-Proofpoint-Spam-Details: rule=default, score=0
 X-MS-Exchange-Organization-SCL: 1

 --boundary-123
 Content-Type: text/plain; charset=utf-8

 This is the body.
 X-Should-Not-Be-Parsed: nope
 --boundary-123--