Posts tagged with "benchmarking"
- Benchmarking UI Detection on ScreenSpot-Pro
  How we evaluated uitag against 1,581 annotations across 26 professional macOS applications: methodology, results, and what the numbers actually mean.
- GUI-Specialized Apple Silicon VLM Matrix
  Which vision-language models actually work for UI tasks on M-series chips: tested configurations, latency numbers, and the models worth your time.
- Apple Silicon VLM Benchmark Roundup
  A short public narrative covering what we tested, what we found, and what you should run if you're doing local multimodal inference on M-series hardware.