mlx-triage

What it does

CLI tool for preflight validation of MLX models on Apple Silicon. Answers “is this model structurally sound before I benchmark it?” by checking architecture compatibility, weight shapes, chat templates, and quantization formats.
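As a rough illustration of what a config-level preflight pass could look like, here is a minimal Python sketch. It is not mlx-triage's implementation: the function name `preflight`, the allow-list `SUPPORTED_MODEL_TYPES`, and the exact keys inspected are assumptions made for the example. It assumes a Hugging Face style model directory containing `config.json` and `tokenizer_config.json`.

```python
import json
from pathlib import Path

# Hypothetical allow-list for illustration only; not the tool's real list.
SUPPORTED_MODEL_TYPES = {"llama", "qwen2", "mistral"}


def preflight(model_dir, supported=SUPPORTED_MODEL_TYPES):
    """Return a list of problems found; an empty list means none were found."""
    root = Path(model_dir)
    problems = []

    config_path = root / "config.json"
    if not config_path.is_file():
        return ["config.json is missing"]
    config = json.loads(config_path.read_text(encoding="utf-8"))

    # Architecture check: compare the declared model_type to the allow-list.
    model_type = config.get("model_type")
    if model_type not in supported:
        problems.append(f"unsupported model_type: {model_type!r}")

    # Quantization check. Assumption: a quantized conversion records
    # integer "group_size" and "bits" under a "quantization" key.
    quant = config.get("quantization")
    if quant is not None:
        if not isinstance(quant, dict):
            problems.append("quantization is not a JSON object")
        else:
            for key in ("group_size", "bits"):
                if not isinstance(quant.get(key), int):
                    problems.append(f"quantization.{key} is missing or not an integer")

    # Chat template check. Assumption: the template lives in
    # tokenizer_config.json under "chat_template".
    tok_path = root / "tokenizer_config.json"
    tok = json.loads(tok_path.read_text(encoding="utf-8")) if tok_path.is_file() else {}
    if not tok.get("chat_template"):
        problems.append("no chat_template in tokenizer_config.json")

    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every structural problem in one pass, which suits a sanity check run before a long benchmark.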

Architecture / Key capabilities

  • Architecture compatibility checks: validates that the model architecture is supported by the target MLX runtime before any inference attempt
  • Weight shape validation: inspects tensor dimensions and layer structure to catch shape mismatches that would cause silent failures or crashes during inference
  • Chat template verification: confirms that chat templates are present, well-formed, and compatible with the intended serving configuration
  • Quantization format checks: validates quantization metadata and format compatibility so operators know whether a quantized model will load correctly
  • Practitioner-facing CLI: designed for the person who downloads models from Hugging Face and needs a fast structural sanity check before committing to a full benchmark run
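Weight shape validation can be done without loading any tensor data, because a safetensors file begins with an 8-byte little-endian header length followed by a JSON header listing each tensor's dtype and shape. The sketch below reads only that header and compares it to a caller-supplied table of expected shapes. The helper names `read_safetensors_shapes` and `find_shape_mismatches` are hypothetical, and the source does not say how mlx-triage itself derives expected shapes or inspects weights.

```python
import json
import struct


def read_safetensors_shapes(path):
    """Map tensor name -> shape by parsing only the safetensors JSON header."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to be a safetensors file")
        # The first 8 bytes hold the header size as an unsigned
        # little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", prefix)
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }


def find_shape_mismatches(actual, expected):
    """Compare actual shapes to expected ones; return a list of problems."""
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems.append(f"{name}: missing")
        elif got != want:
            problems.append(f"{name}: expected {want}, found {got}")
    return problems
```

In practice the expected table would have to come from the model's configuration (for example, vocabulary and hidden sizes), and that derivation is left out here.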

Key numbers

MISSING — model count, validation pass/fail rates

Current phase

Workstream B artifacts are live, including the model-directory skeleton at docs/model-directory.md. Core priority: resolving the residual Qwen-specific Tier 2.1 divergence on heterogeneous-length batches (MLX-001).

Status

Active. Next milestones:

  • MLX-002: expand model-directory detail pages
  • MLX-003: README rewrite around the preflight + validation story
  • MLX-004: refresh validation-results.md
  • MLX-005: decide release scope (v0.2.1 vs v0.3.0)

MISSING — Repository URL