Clean Specs Beat New Standards: Iddo Gino on APIs, MCPs, and Smarter AI

In this Commit & Push episode, host Damien Filiatrault sits down with Iddo Gino, founder of RapidAPI (acquired by Nokia) and now CEO of Datawizz, to talk about the real blockers to AI integrations, why he’s skeptical of MCPs, and how smaller, specialized models can slash AI costs while improving results.
From “Awesome APIs” to RapidAPI, and Beyond
As a teen in Israel, Iddo learned by building: Flash games, early websites, and, eventually, a GitHub repo called Awesome APIs. That list evolved into RapidAPI—an interactive marketplace where developers could explore, test, and adopt APIs. The company reached unicorn status, scaled into the enterprise, and was later acquired by Nokia. Iddo’s reflection: the developer platform—a simple, open marketplace for publishing and consuming APIs—risked getting overshadowed by larger enterprise ambitions.
The MCP Debate: Don’t Reinvent What Documentation Can Fix
Iddo’s hot take: MCPs look tidy today because they’re new—not because they’re inherently better.
- Early OpenAI plugins hinted at the right idea (models calling APIs via specs) but stumbled due to weaker models at the time and, more importantly, messy, drifting API docs.
- MCPs offer a “clean slate,” but that means duplicating the entire ecosystem—gateways, auth, tooling, governance—and risking the same decay over time.
- The pragmatic path: keep REST (or GraphQL), fix your specs, and maintain documentation so both humans and LLMs can integrate reliably.
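One way to picture "fixing drift" is a check that compares what a spec declares with what a service actually serves. The sketch below is a minimal illustration under assumptions, not something described in the episode: the `spec` dictionary mimics the `paths` object of an OpenAPI document, while the `implemented` route set and the `find_drift` helper are hypothetical names invented for this example.

```python
# HTTP methods that may appear as keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def declared_operations(spec):
    """Collect (METHOD, path) pairs from the `paths` object of an OpenAPI-style dict."""
    ops = set()
    for path, item in spec.get("paths", {}).items():
        for key in item:
            # Path items can also hold non-method keys such as "parameters"; skip those.
            if key.lower() in HTTP_METHODS:
                ops.add((key.upper(), path))
    return ops


def find_drift(spec, implemented):
    """Report routes missing from the spec and spec entries with no matching route."""
    declared = declared_operations(spec)
    return {
        "undocumented": sorted(implemented - declared),  # served but not in the spec
        "stale": sorted(declared - implemented),         # in the spec but not served
    }


# Hypothetical spec and route table, for illustration only.
spec = {
    "openapi": "3.0.3",
    "paths": {
        "/users": {"get": {}, "post": {}},
        "/users/{id}": {"get": {}, "parameters": []},
    },
}
implemented = {("GET", "/users"), ("GET", "/users/{id}"), ("DELETE", "/users/{id}")}

report = find_drift(spec, implemented)
print(report)
```

Run in CI, a check along these lines would flag the undocumented `DELETE` route and the stale `POST` entry before either reaches a human or an LLM reading the docs.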
Datawizz: Smaller Models, Bigger Wins
Datawizz helps teams replace generic LLM calls with tiny, purpose-built models—and route traffic intelligently:
- Router as an OpenAI-compatible endpoint: drop-in URL swap.
- Starts with your existing LLM flow, clusters real usage, then trains narrow models that excel at specific, high-volume tasks.
- Routes requests to the best fit: a specialized model when confident, or your preferred LLM for novel prompts.
- Typical impact: ~85–95% cost reduction versus sending everything to a large, general model.
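The routing decision described above can be sketched as a confidence threshold. This is a toy illustration under stated assumptions, not Datawizz's implementation: the `Specialist` class, the `choose_model` function, the model names, the keyword-based confidence scores, and the 0.8 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Specialist:
    name: str
    # Hypothetical scorer: returns a 0..1 estimate that this model can handle the prompt.
    confidence: Callable[[str], float]


def choose_model(prompt: str, specialists: List[Specialist],
                 fallback: str = "general-llm", threshold: float = 0.8) -> str:
    """Pick the most confident specialist, or the fallback LLM if none clears the threshold."""
    best_name, best_score = fallback, 0.0
    for specialist in specialists:
        score = specialist.confidence(prompt)
        if score > best_score:
            best_name, best_score = specialist.name, score
    return best_name if best_score >= threshold else fallback


# Toy scorers keyed on prompt prefixes; a real router would use a learned classifier.
specialists = [
    Specialist("sentiment-small",
               lambda p: 0.95 if p.lower().startswith("classify the sentiment") else 0.1),
    Specialist("extract-small",
               lambda p: 0.90 if p.lower().startswith("extract the invoice") else 0.1),
]

print(choose_model("Classify the sentiment: great product", specialists))
print(choose_model("Write a haiku about routers", specialists))
```

With the OpenAI Python SDK, for example, a drop-in swap would usually amount to passing a different `base_url` when constructing the client. The actual router URL is not given in these notes, so none is shown here.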
When does this make sense? Once you’re processing hundreds of millions to billions of tokens per month and repeatable patterns have emerged.
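To see how a reduction in the 85–95% range could arise, here is a back-of-the-envelope calculation. Every number in it is an assumption chosen for illustration (the routed share, the price ratio, the token volume, and the per-token price), and it ignores training and hosting costs for the specialized models.

```python
def blended_savings(routed_fraction: float, price_ratio: float) -> float:
    """Fractional cost reduction versus sending all traffic to the large model.

    routed_fraction: share of tokens handled by specialized models (0..1)
    price_ratio: specialized-model price per token / large-model price per token
    """
    # Each routed token saves (1 - price_ratio) of its original cost.
    return routed_fraction * (1 - price_ratio)


# Hypothetical scenario: 1B tokens per month at $10 per million tokens on the large model.
monthly_tokens = 1_000_000_000
large_price_per_million = 10.0
baseline = monthly_tokens / 1_000_000 * large_price_per_million

# Assume 90% of traffic is routed to models that cost 5% as much per token.
savings = blended_savings(routed_fraction=0.90, price_ratio=0.05)
print(f"baseline: ${baseline:,.0f}/month")
print(f"savings:  {savings:.1%}")
print(f"new bill: ${baseline * (1 - savings):,.0f}/month")
```

The formula also shows why volume and repeatability matter: savings scale with the share of traffic that specialized models can absorb, so a workload with few recurring patterns gains little.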
Edge & On-Device: Faster, Cheaper, More Private
Iddo sees inference moving closer to users:
- Edge: many specialized models can run on Cloudflare Workers AI for lower latency.
- On-device: platforms are adding built-in small models and support for adapters (Iddo mentions iOS 26 with a ~3B-parameter model and early Chrome builds with Gemini Nano). Benefits: instant responses, zero per-call compute bills, and strong privacy, which makes this a good fit for well-scoped tasks.
What This Means for Builders
- Prefer clean specs over new standards. Fix docs and drift; don’t multiply integration surfaces unless you must.
- Start broad, then specialize. Begin with a capable LLM to learn real workloads, then carve out high-volume niches for tiny expert models.
- Route with intent. Use logs and clustering to decide when to promote a specialized model.
- Push compute closer to users. Edge and device inference cut latency and costs while boosting privacy.
- Mind the maintenance burden. Adding MCPs often means maintaining two integration flavors over time.
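The "use logs and clustering" advice above can be illustrated with a deliberately crude stand-in for clustering: group logged prompts by a normalized prefix and flag groups that account for a large share of traffic. The `template_key` and `promotion_candidates` helpers, the four-word prefix, the 30% threshold, and the sample logs are all hypothetical. A production system would more likely cluster on embeddings.

```python
import re
from collections import Counter


def template_key(prompt: str, words: int = 4) -> str:
    """Crude grouping key: lowercase, mask digit runs, keep the first few words."""
    normalized = re.sub(r"\d+", "<num>", prompt.lower())
    return " ".join(normalized.split()[:words])


def promotion_candidates(prompts, min_share: float = 0.3):
    """Return (key, count) for prompt groups that make up at least min_share of traffic."""
    counts = Counter(template_key(p) for p in prompts)
    total = len(prompts)
    return [(key, n) for key, n in counts.most_common() if n / total >= min_share]


# Hypothetical request log, for illustration only.
logs = [
    "Classify the sentiment of: great phone",
    "Classify the sentiment of: battery died in 2 days",
    "Classify the sentiment of: okay I guess",
    "Summarize ticket 4812 for the on-call engineer",
    "Write a limerick about DNS",
]

print(promotion_candidates(logs))
```

Here the sentiment prompts make up three of five requests, so that group would be the first candidate for a specialized model, while the one-off prompts stay with the general LLM.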
Final Thoughts
Iddo’s thesis is refreshingly practical: APIs aren’t the problem—undisciplined specs are. Keep your existing interfaces, document them rigorously, and let data guide where you deploy specialized models. The payoff is real: lower costs, faster responses, and more reliable AI features that your team (and your users) can trust.