Swaylen Hayes

Blog

Writing about AI tooling, Apple Silicon ML, and developer infrastructure.

2026

  • Design test — Typography, components, and patterns
    Visual reference for type scale, callouts, dense data tables, data cards, bento grids, forms, and all markdown elements.
  • Site Launch
    What this site is and what's coming.
  • Hybrid UI Detection: Why We Split Vision and Intelligence
    uitag combines Apple Vision's text detection with a fine-tuned YOLO model to hit 90.8% coverage on ScreenSpot-Pro — faster and cheaper than VLM-only approaches.
  • Markdown & Component Reference
    Every visual element this site supports — typography, tables, callouts, code, cards, grids, and forms.
  • Benchmarking UI Detection on ScreenSpot-Pro
    How we evaluated uitag against 1,581 annotations across 26 professional macOS applications — methodology, results, and what the numbers actually mean.
  • Why Detection and Intelligence Should Be Separate Layers
    The architectural argument for splitting UI perception from UI reasoning — and what happens when you don't.
  • GUI-Specialized Apple Silicon VLM Matrix
    Which vision-language models actually work for UI tasks on M-series chips — tested configurations, latency numbers, and the models worth your time.
  • A Failure Mode Watchlist for Multi-Agent Systems
    Naming the drift patterns that cause multi-agent work to silently degrade — because you can't fix what you haven't named.
  • mlx-triage: Preflight Validation for MLX Models
    A practitioner-facing CLI that answers the question every Apple Silicon ML developer asks before benchmarking: is this model structurally sound?
  • CoT Suppression in Spatial UI Tasks
    Chain-of-thought reasoning helps most LLM tasks. For spatial UI interaction, it actively hurts — here's what we found and why.
  • Building the Operations Layer for a Multi-Agent Development System
    44 projects, 11 agent entities, 3 runtimes — what it takes to keep them coordinated without losing your mind.
  • The Operator Orchestration Workstation
    Every agentic framework assumes a shared runtime. None are designed for the case where the human operator is the coordination mechanism — and that's the case most practitioners are living in.
  • Trust Calibration Without Confidence Scores
    VLMs don't reliably report their own confidence. Leith builds trust from behavioral signals instead — and it works better.
  • Multi-Signal Verification for VLM UI Agents
    One detection method is a guess. Two that agree are evidence. How Leith uses signal redundancy to make UI interaction reliable.
  • Memory Architecture for a Multi-Agent Ecosystem
    How 11 agent entities share memory across sessions and projects without it decaying into noise — drift detection, extraction normalization, and metacognitive reliability scoring.

2025

  • Apple Silicon VLM Benchmark Roundup
    A short public narrative covering what we tested, what we found, and what you should run if you're doing local multimodal inference on M-series hardware.
  • Interagent: A Coordination Protocol for Multi-Agent LLM Systems
    When the human operator is the routing authority — not a bottleneck to be automated away — you need a different kind of protocol.
  • Portable Agent Memory: The Problem That Started Everything
    When you close a Claude session, everything the agent learned disappears. This is the problem that led to the Metacognitive Memory System, the Interagent Protocol, and the entire multi-agent ecosystem.
© 2026 Swaylen Hayes