<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>ML Systems Review</title>
    <link>https://mlsystemsreview.com</link>
    <description>Engineering deep-dives into the ML systems that power production AI. Independent, peer-reviewed, no sponsorships.</description>
    <language>en-us</language>
    <lastBuildDate>Thu, 16 Apr 2026 16:03:49 GMT</lastBuildDate>
    <atom:link href="https://mlsystemsreview.com/feed.xml" rel="self" type="application/rss+xml" />
    <item>
      <title><![CDATA[Apple M4 Max first NPU benchmarks: TFLOPS per watt analysis]]></title>
      <link>https://mlsystemsreview.com/apple-m4-max-npu-benchmarks/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/apple-m4-max-npu-benchmarks/</guid>
      <pubDate>Thu, 16 Apr 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[First inference benchmarks on the Apple M4 Max 38 TOPS Neural Engine: ViT-L/16 latency, INT8 quantization impact, and TFLOPS per watt compared with the M3 Max and an RTX 4090.]]></description>
      <category>Benchmarks</category>
      <dc:creator>Lukas Berg</dc:creator>
    </item>
    <item>
      <title><![CDATA[The llama.cpp 2026 rewrite: what changed in the inference engine]]></title>
      <link>https://mlsystemsreview.com/llama-cpp-2026-rewrite/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/llama-cpp-2026-rewrite/</guid>
      <pubDate>Wed, 15 Apr 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[What the 2026 llama.cpp rewrite actually changed — kernel generator, KV cache layout, unified Metal/CUDA backend, and a 2.1x throughput gain on 70B quantised models.]]></description>
      <category>ML Ecosystem</category>
      <dc:creator>Priya Ramachandran</dc:creator>
    </item>
    <item>
      <title><![CDATA[DeepSeek-V3.5 paper notes: what's actually novel]]></title>
      <link>https://mlsystemsreview.com/deepseek-v35-paper-notes/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/deepseek-v35-paper-notes/</guid>
      <pubDate>Mon, 13 Apr 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Reading notes on the DeepSeek-V3.5 release — MoE routing, staggered-horizon MTP, FP8 attention, and which contributions hold up versus repackaged V3.1.]]></description>
      <category>Model Architecture</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[The Hugging Face ecosystem: what changed in 2026]]></title>
      <link>https://mlsystemsreview.com/huggingface-ecosystem-2026/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/huggingface-ecosystem-2026/</guid>
      <pubDate>Fri, 10 Apr 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[A look at how the Hugging Face ecosystem shifted in 2026 — model hubs, inference endpoints, datasets, and what it means for production ML teams.]]></description>
      <category>ML Ecosystem</category>
      <dc:creator>Priya Ramachandran</dc:creator>
    </item>
    <item>
      <title><![CDATA[On-device vs cloud inference: a 2026 economic analysis]]></title>
      <link>https://mlsystemsreview.com/on-device-vs-cloud-inference-2026/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/on-device-vs-cloud-inference-2026/</guid>
      <pubDate>Thu, 05 Mar 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[When does on-device inference actually beat cloud inference in 2026? A per-request cost and latency analysis across vision and language workloads.]]></description>
      <category>MLOps</category>
      <dc:creator>Lukas Berg</dc:creator>
    </item>
    <item>
      <title><![CDATA[Building reliable food databases: USDA FoodData Central as ground truth]]></title>
      <link>https://mlsystemsreview.com/usda-as-ground-truth/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/usda-as-ground-truth/</guid>
      <pubDate>Fri, 20 Feb 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Why USDA FoodData Central is the de facto ground truth for nutrient data in 2026, and how production food-tracking systems cross-reference it with NCCDB.]]></description>
      <category>Data</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[Inside PlateLens's calorie-accuracy claim: a technical replication]]></title>
      <link>https://mlsystemsreview.com/platelens-calorie-accuracy-architecture/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/platelens-calorie-accuracy-architecture/</guid>
      <pubDate>Thu, 12 Feb 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[A three-stage CV pipeline (ViT-L/16 food identification, ZoeDepth-derived portion estimation, and a 1.2M-entry USDA-aligned nutrient database) explains PlateLens's claimed ±1.2% calorie accuracy.]]></description>
      <category>Case Study</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[Rust in production ML pipelines: 2026 adoption trends]]></title>
      <link>https://mlsystemsreview.com/rust-in-production-ml-2026/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/rust-in-production-ml-2026/</guid>
      <pubDate>Tue, 10 Feb 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[Rust is quietly displacing Python in production ML pipelines — tokenization, preprocessing, and serving. A survey of 2026 adoption and benchmarks.]]></description>
      <category>MLOps</category>
      <dc:creator>Lukas Berg</dc:creator>
    </item>
    <item>
      <title><![CDATA[Figma's multiplayer cursor sync: a 2026 architecture update]]></title>
      <link>https://mlsystemsreview.com/figma-multiplayer-2026-update/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/figma-multiplayer-2026-update/</guid>
      <pubDate>Wed, 28 Jan 2026 00:00:00 GMT</pubDate>
      <description><![CDATA[What changed in Figma's multiplayer cursor sync architecture between 2023 and 2026 — CRDTs, server-authoritative state, and regional replication.]]></description>
      <category>Distributed Systems</category>
      <dc:creator>Priya Ramachandran</dc:creator>
    </item>
    <item>
      <title><![CDATA[Why accuracy benchmarks mislead: variance, sample size, methodology]]></title>
      <link>https://mlsystemsreview.com/accuracy-benchmarks-misleading/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/accuracy-benchmarks-misleading/</guid>
      <pubDate>Mon, 01 Dec 2025 00:00:00 GMT</pubDate>
      <description><![CDATA[Most published accuracy benchmarks omit confidence intervals, sample sizes, and methodology. Here's how to read them without being misled.]]></description>
      <category>Methodology</category>
      <dc:creator>Dr. Nadia Volkov</dc:creator>
    </item>
    <item>
      <title><![CDATA[The food recognition problem: a technical overview]]></title>
      <link>https://mlsystemsreview.com/food-recognition-technical-overview/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/food-recognition-technical-overview/</guid>
      <pubDate>Sun, 05 Oct 2025 00:00:00 GMT</pubDate>
      <description><![CDATA[A technical overview of the food recognition problem — taxonomy, mixed-dish segmentation, portion estimation, and why it's harder than ImageNet.]]></description>
      <category>Computer Vision</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[Production-scale vision transformers: cost per inference in 2025]]></title>
      <link>https://mlsystemsreview.com/production-vision-transformers-cost/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/production-vision-transformers-cost/</guid>
      <pubDate>Fri, 22 Aug 2025 00:00:00 GMT</pubDate>
      <description><![CDATA[What ViT-L and ViT-H cost to run per inference in 2025, across A100, H100, and on-device silicon. Measured numbers, not marketing.]]></description>
      <category>Infrastructure</category>
      <dc:creator>Priya Ramachandran</dc:creator>
    </item>
    <item>
      <title><![CDATA[Depth estimation from single RGB images: state of 2025]]></title>
      <link>https://mlsystemsreview.com/depth-estimation-single-rgb-2025/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/depth-estimation-single-rgb-2025/</guid>
      <pubDate>Sat, 10 May 2025 00:00:00 GMT</pubDate>
      <description><![CDATA[A survey of monocular depth estimation in 2025 — MiDaS 3.1, ZoeDepth, Marigold, and what actually works in production.]]></description>
      <category>Computer Vision</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[The rise of AI-first consumer apps: 2025 observations]]></title>
      <link>https://mlsystemsreview.com/ai-first-consumer-apps-2025/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/ai-first-consumer-apps-2025/</guid>
      <pubDate>Sat, 15 Mar 2025 00:00:00 GMT</pubDate>
      <description><![CDATA[A field report on AI-first consumer apps in 2025 — what actually shipped, what worked, and what patterns are consolidating.]]></description>
      <category>Commentary</category>
      <dc:creator>Dr. Nadia Volkov</dc:creator>
    </item>
    <item>
      <title><![CDATA[GPT-4o's multimodal architecture: what we can infer from the paper]]></title>
      <link>https://mlsystemsreview.com/gpt4o-multimodal-arch/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/gpt4o-multimodal-arch/</guid>
      <pubDate>Wed, 20 Nov 2024 00:00:00 GMT</pubDate>
      <description><![CDATA[Reading GPT-4o's technical report closely — tokenizer design, modality fusion, and what the paper implies about inference cost.]]></description>
      <category>Computer Vision</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[Anatomy of a production ML failure: Zillow's iBuy collapse]]></title>
      <link>https://mlsystemsreview.com/zillow-ibuy-ml-failure/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/zillow-ibuy-ml-failure/</guid>
      <pubDate>Mon, 30 Sep 2024 00:00:00 GMT</pubDate>
      <description><![CDATA[A post-mortem of Zillow's iBuy collapse through an ML systems lens — model drift, feedback loops, and inadequate guardrails.]]></description>
      <category>Case Study</category>
      <dc:creator>Dr. Nadia Volkov</dc:creator>
    </item>
    <item>
      <title><![CDATA[Edge ML inference: iPhone vs Android TFLite benchmarks]]></title>
      <link>https://mlsystemsreview.com/edge-ml-ios-android-2024/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/edge-ml-ios-android-2024/</guid>
      <pubDate>Thu, 18 Jul 2024 00:00:00 GMT</pubDate>
      <description><![CDATA[Measured edge ML inference benchmarks across iPhone (Core ML, ANE) and Android (TFLite GPU, NNAPI) for common vision workloads in 2024.]]></description>
      <category>Edge ML</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[Plaid's bank integration API: a system design study]]></title>
      <link>https://mlsystemsreview.com/plaid-bank-api-design/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/plaid-bank-api-design/</guid>
      <pubDate>Thu, 25 Apr 2024 00:00:00 GMT</pubDate>
      <description><![CDATA[A system design study of Plaid's bank integration API — rate limiting, credential vaulting, and the long tail of bank back-ends.]]></description>
      <category>System Design</category>
      <dc:creator>Priya Ramachandran</dc:creator>
    </item>
    <item>
      <title><![CDATA[Discord's architecture: why they're migrating from Elixir to Rust]]></title>
      <link>https://mlsystemsreview.com/discord-elixir-to-rust/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/discord-elixir-to-rust/</guid>
      <pubDate>Mon, 12 Feb 2024 00:00:00 GMT</pubDate>
      <description><![CDATA[What drove Discord's Elixir-to-Rust migration — BEAM scheduling latency, GC tail latencies, and the operational cost of polyglot infrastructure.]]></description>
      <category>Distributed Systems</category>
      <dc:creator>Priya Ramachandran</dc:creator>
    </item>
    <item>
      <title><![CDATA[The MLOps stack of 2023: what's worth adopting]]></title>
      <link>https://mlsystemsreview.com/mlops-stack-2023/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/mlops-stack-2023/</guid>
      <pubDate>Fri, 15 Dec 2023 00:00:00 GMT</pubDate>
      <description><![CDATA[A field-tested opinion on the 2023 MLOps stack — feature stores, experiment tracking, serving, and what was over-hyped vs actually useful.]]></description>
      <category>MLOps</category>
      <dc:creator>Lukas Berg</dc:creator>
    </item>
    <item>
      <title><![CDATA[CRDTs in production: lessons from Figma's multiplayer engine]]></title>
      <link>https://mlsystemsreview.com/figma-crdt-deep-dive/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/figma-crdt-deep-dive/</guid>
      <pubDate>Sun, 05 Nov 2023 00:00:00 GMT</pubDate>
      <description><![CDATA[A deep-dive into Figma's CRDT-based multiplayer engine — convergence properties, conflict resolution, and operational tradeoffs.]]></description>
      <category>Distributed Systems</category>
      <dc:creator>Priya Ramachandran</dc:creator>
    </item>
    <item>
      <title><![CDATA[Deploying Vision Transformers on mobile: a 2023 retrospective]]></title>
      <link>https://mlsystemsreview.com/vit-on-mobile-2023/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/vit-on-mobile-2023/</guid>
      <pubDate>Sun, 10 Sep 2023 00:00:00 GMT</pubDate>
      <description><![CDATA[A 2023 retrospective on deploying Vision Transformers to mobile — quantization, distillation, and when ViT actually beats CNNs on-device.]]></description>
      <category>Computer Vision</category>
      <dc:creator>Dr. Marcus Brennan</dc:creator>
    </item>
    <item>
      <title><![CDATA[A week in the life of a production ML pipeline]]></title>
      <link>https://mlsystemsreview.com/production-ml-pipeline-week/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/production-ml-pipeline-week/</guid>
      <pubDate>Tue, 20 Jun 2023 00:00:00 GMT</pubDate>
      <description><![CDATA[Seven days in a production ML pipeline — retrains, rollbacks, drift alerts, and the unglamorous work that keeps inference honest.]]></description>
      <category>MLOps</category>
      <dc:creator>Lukas Berg</dc:creator>
    </item>
    <item>
      <title><![CDATA[Why we started ML Systems Review]]></title>
      <link>https://mlsystemsreview.com/about/</link>
      <guid isPermaLink="true">https://mlsystemsreview.com/about/</guid>
      <pubDate>Mon, 15 May 2023 00:00:00 GMT</pubDate>
      <description><![CDATA[The founding editorial of ML Systems Review — why production ML deserves a publication that reads more like distill.pub than a marketing blog.]]></description>
      <category>Editorial</category>
      <dc:creator>Dr. Nadia Volkov</dc:creator>
    </item>
  </channel>
</rss>