News Report Technology
October 27, 2025

O.XYZ’s Next Leap: From Wafer-Scale Chips To Routing Intelligence

In Brief

After hitting the limits of single-chip performance, Ahmad Shadid is redefining what “fast” means in AI, transforming O.XYZ’s Cerebras-powered OCEAN engine into an intelligent routing platform serving 100,000 models.

O.XYZ Sets Sights On AGI With OCEAN And ORI, Integrating 100,000 Models Into Unified AI Platform

Earlier this year, independent AI developer O.XYZ introduced OCEAN, a next-generation decentralized AI search engine powered by Cerebras CS-3 wafer-scale processors. Designed to deliver performance up to ten times faster than ChatGPT, OCEAN aimed to redefine both consumer and enterprise AI experiences. With fast response times, integrated voice interaction, and a decentralized framework, the platform marked an advancement in global AI accessibility and performance.

OCEAN’s defining features were speed and real-time responsiveness, both of which stemmed from its hardware design.

Ahmad Shadid, founder of O.XYZ and io.net, noted that Cerebras’s advanced computing architecture played a key role in achieving that performance. The Cerebras CS-3 system is built around the Wafer-Scale Engine (WSE-3), which integrates 900,000 AI-optimized cores and four trillion transistors on a single wafer, enabling scalable performance without the complex distributed programming typical of GPU-based systems. This architecture allowed models ranging from one billion to 24 trillion parameters to run seamlessly without code modification, reducing latency and improving overall efficiency.

With 21 PB/s of memory bandwidth, Cerebras-based computation provided fast, consistent processing that surpassed conventional GPU configurations. As development progressed, however, the O.XYZ team identified a key limitation: while Cerebras hardware excelled in memory capacity and single-model performance, the company’s vision required an architecture capable of serving up to 100,000 models in parallel.

“Initially, we explored leveraging Cerebras’ massive wafer-scale compute for ultra-fast, memory-intensive inference—ideal for a few high-demand models. However, after thorough technical assessments and due diligence conducted by our team on-site at Cerebras’ offices in Palo Alto, we quickly realized the limitation: Cerebras excels at depth, not breadth,” Ahmad Shadid told Mpost.

“While it can run a single large model with extraordinary speed and memory bandwidth, it doesn’t scale economically to host more than one model on a single WSE-3 chip. Although the team initially indicated we would have access to the kernel to customize each of the 900,000 cores to host our desired models, we later discovered that handling models with unique dependencies, quantization schemes, and memory footprints is not feasible on the current Cerebras infrastructure,” he added.

To enable unified access to over 100,000 open models through a single API endpoint, O.XYZ’s architecture was fundamentally redesigned into a hybrid inference infrastructure.

The system moved from a monolithic Cerebras setup to a tiered, multi-cloud inference network. High-demand models, roughly the top 500 by usage, together with the 99,500-plus long-tail models are deployed dynamically on io.net clusters equipped with H100 and H200 GPUs during peak demand periods. Requests are also federated across more than 200 external inference providers, including Together AI, Fireworks, and Anyscale, allowing ORI (O Routing Intelligence) to act as a universal gateway for model access.
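The tiered dispatch described above can be sketched as a simple router. The model names, provider labels, and policy here are illustrative assumptions, not O.XYZ’s actual configuration, which has not been published:

```python
from dataclasses import dataclass

# Illustrative stand-ins: the real "top 500 by usage" set, provider names,
# and dispatch policy are not public.
TOP_MODELS = {"llama-3.1-70b", "qwen-2.5-72b"}

@dataclass
class Route:
    tier: str       # "gpu_cluster" or "federated"
    provider: str   # where the request is dispatched

def route_request(model_id: str, peak_demand: bool) -> Route:
    """Pick an inference tier for a model request."""
    if model_id in TOP_MODELS or peak_demand:
        # High-demand models (and long-tail models under peak load)
        # go to dedicated H100/H200 GPU clusters.
        return Route(tier="gpu_cluster", provider="io.net")
    # Otherwise the request is federated to an external inference provider.
    return Route(tier="federated", provider="together-ai")

print(route_request("obscure-finetune-7b", peak_demand=False))
```

The point of the sketch is the shape of the decision, not the specifics: a single gateway function hides which backend actually serves the model.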

“User behavior forced a fundamental redefinition of what ‘performance’ actually means in AI search,” said Ahmad Shadid.

“Early in development, the OCEAN engine was optimized for raw speed—leveraging Cerebras’ massive on-wafer memory to run a small set of high-performance models with ultra-low latency. On paper, it was impressive: sub-100ms responses, deterministic throughput, and minimal cold-start overhead. But during our public beta, we observed a critical disconnect: users consistently preferred slower, more accurate answers from specialized models over fast, generic responses,” he explained.

O.XYZ’s Plans For Advanced Routing Intelligence

Outlining future plans, O.XYZ noted that it aims to evolve OCEAN into a fully integrated AI platform powered by advanced routing intelligence. The company’s proprietary system, known as O Routing Intelligence (ORI) and developed by its AI research lab, is designed to intelligently distribute computational tasks across the most appropriate models—whether open-source or specialized—depending on the complexity of the request. This approach is intended to optimize operational efficiency and cost while maintaining high standards of speed and accuracy. 
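A toy version of complexity-based routing might look like the following. The scoring heuristic, thresholds, and model names are assumptions made for illustration; ORI’s actual routing signals are proprietary:

```python
# Hypothetical keyword list and thresholds; ORI's real complexity
# signals are not public.
SPECIALIST_KEYWORDS = {"proof", "derivative", "compile", "genome", "regression"}

def complexity_score(query: str) -> int:
    """Crude complexity estimate: length plus weighted technical-keyword hits."""
    words = query.lower().split()
    return len(words) + 5 * sum(w in SPECIALIST_KEYWORDS for w in words)

def select_model(query: str) -> str:
    """Route cheap queries to a fast generalist, hard ones to a specialist."""
    score = complexity_score(query)
    if score < 8:
        return "small-generalist"    # lowest latency and cost
    if score < 20:
        return "large-generalist"
    return "domain-specialist"       # slower but more accurate
```

This mirrors the trade-off Shadid describes: simple queries get the fastest, cheapest model, while complex requests are worth the extra latency of a specialized one.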

ORI represents a foundational step toward an extensive AI library supporting over 100,000 models. Comparable in concept to the unified intelligence systems introduced by major AI developers, ORI will select and route tasks among those open-source models in real time. The evolution of OCEAN into ORI positions the system as the central component of O.XYZ’s vision for multi-model intelligence, where users can access and interact with a wide range of AI capabilities through a single, cohesive environment.

“From the outset, our vision for OCEAN, which has now evolved into ORI, was ambitious: to build the most capable, accurate, and responsive AI search engine on the market. But as a bootstrapped, self-funded startup competing against well-resourced giants like OpenAI, Anthropic, and Perplexity, we knew we couldn’t win on data, scale, or brand alone. Instead, we bet on intelligence over brute force: a routing-first architecture that could dynamically select the best model for every query from a vast universe of open-source AI,” said Ahmad Shadid.

“This multi-model philosophy fundamentally shaped our hardware strategy and taught us hard-won lessons about the trade-offs between compute, memory, and flexibility,” he added.


About The Author

Alisa, a dedicated journalist at the MPost, specializes in cryptocurrency, zero-knowledge proofs, investments, and the expansive realm of Web3. With a keen eye for emerging trends and technologies, she delivers comprehensive coverage to inform and engage readers in the ever-evolving landscape of digital finance.
