By KIM BELLARD
I am a fanboy for AI; I don’t really understand the technical aspects, but I sure am excited about its potential. I’m also a sucker for a catchy phrase. So when I (belatedly) learned about TinyAI, I was hooked.
Now, as it turns out, TinyAI (also known as Tiny AI) has been around for a few years, but with the general surge of interest in AI it is now getting more attention. There is also TinyML and Edge AI, the distinctions between which I won’t attempt to parse. The point is, AI doesn’t have to involve huge datasets run on massive servers somewhere in the cloud; it can happen on about as small a device as you care to imagine. And that’s pretty exciting.
What caught my eye was an overview in Cell by Farid Nakhle, a professor at Temple University, Japan Campus: Shrinking the Giants: Paving the Way for TinyAI. “Transitioning from the landscape of large artificial intelligence (AI) models to the realm of edge computing, which finds its niche in pocket-sized devices, heralds a remarkable evolution in technological capabilities,” Professor Nakhle begins.
AI’s many successes, he believes, “…are demanding a leap in its capabilities, calling for a paradigm shift in the research landscape, from centralized cloud computing architectures to decentralized and edge-centric frameworks, where data can be processed on edge devices near to where they are being generated.” The demands for real-time processing, reduced latency, and enhanced privacy make TinyAI attractive.
Accordingly: “This necessitates TinyAI, here defined as the compression and acceleration of existing AI models or the design of novel, small, yet effective AI architectures and the development of dedicated AI-accelerating hardware to seamlessly ensure their efficient deployment and operation on edge devices.”
Professor Nakhle gives an overview of those compression and acceleration techniques, as well as architecture and hardware designs, all of which I’ll leave as an exercise for the interested reader.
If all this sounds futuristic, here are some current examples of TinyAI models:
- This summer Google launched Gemma 2 2B, a 2 billion parameter model that it claims outperforms OpenAI’s GPT 3.5 and Mistral AI’s Mixtral 8X7B. VentureBeat opined: “Gemma 2 2B’s success suggests that sophisticated training techniques, efficient architectures, and high-quality datasets can compensate for raw parameter count.”
- Also this summer OpenAI introduced GPT-4o mini, “our most cost-efficient small model.” It “supports text and vision in the API, with support for text, image, video and audio inputs and outputs coming in the future.”
- Salesforce recently introduced its xLAM-1B model, which it likes to call the “Tiny Giant.” It supposedly has only 1 billion parameters, yet Marc Benioff claims it outperforms models 7x its size and boldly says: “On-device agentic AI is here.”
- This spring Microsoft launched Phi-3 Mini, a 3.8 billion parameter model, which is small enough for a smartphone. It claims to compare well to GPT 3.5 as well as Meta’s Llama 3.
- H2O.ai offers Danube 2, a 1.8 billion parameter model that Alan Simon of Hackernoon calls the most accurate of the open source, tiny LLM models.
A few billion parameters may not sound so “tiny,” but keep in mind that other AI models may have trillions.
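To put rough numbers on that, here is a back-of-the-envelope sketch of how much storage a model's weights need: parameter count times bytes per parameter. The parameter counts are the ones quoted in the list above; the bytes-per-parameter figures are my own assumptions about common numeric formats (not something the vendors state here), and the trillion-parameter entry is a hypothetical stand-in rather than any specific model. It also ignores everything else a running model needs in memory, so treat it as a floor, not a spec.

```python
# Rough weight-storage estimate: parameters x bytes per parameter.
# Assumption: only the weights are counted; activations, caches, and
# runtime overhead are ignored, so real memory use will be higher.

# Assumed storage cost per parameter for common numeric formats.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Parameter counts as quoted in the article's list. The last entry is
# a hypothetical trillion-parameter model, included only for scale.
MODELS = {
    "xLAM-1B": 1.0e9,
    "Danube 2": 1.8e9,
    "Gemma 2 2B": 2.0e9,
    "Phi-3 Mini": 3.8e9,
    "hypothetical 1T model": 1.0e12,
}


def weight_gb(params: float, precision: str) -> float:
    """Return the approximate weight size in gigabytes (10^9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


def footprint_table() -> dict:
    """Map each model name to its estimated size at every precision."""
    return {
        name: {p: weight_gb(n, p) for p in BYTES_PER_PARAM}
        for name, n in MODELS.items()
    }


if __name__ == "__main__":
    for name, row in footprint_table().items():
        cells = "  ".join(f"{p}: {gb:7.1f} GB" for p, gb in row.items())
        print(f"{name:<22} {cells}")
```

Under those assumptions, a 2 billion parameter model needs about 4 GB for its weights at 16 bits per parameter and about 1 GB at 4 bits, which is phone territory. The hypothetical trillion-parameter model would need about 2,000 GB at 16 bits, which is not.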