Build for
right-sized AI.
Learn to architect systems that prioritize speed, privacy, and capability density. Bridge the gap from robust enterprise logic to ultra-fast edge deployment.
Flagship model families used as concrete teaching examples
Speech case studies for browser and edge experiences
Core idea: use the right-sized model for the task
Neural Networks
The broader architectures behind modern AI.
LLMs are a scaled-up branch of the neural-network family: many learned layers organized into transformer blocks, sometimes alongside Mamba-style sequence layers, trained primarily on next-token prediction. But the wider neural-net family still matters, because many tasks are better served by smaller, purpose-built architectures.
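To make "transformer blocks" concrete, here is a minimal, purely illustrative sketch of the heart of one block: single-head causal self-attention over a toy token sequence. All sizes and weights are made up for the example and belong to no specific model.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                         # 4 tokens, 8-dim embeddings

x = rng.normal(size=(seq_len, d_model))         # toy token embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))

q, k, v = x @ Wq, x @ Wk, x @ Wv
scores = q @ k.T / np.sqrt(d_model)             # scaled dot-product attention

# Causal mask: each position attends only to itself and earlier tokens,
# which is what makes next-token prediction well-posed.
mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
scores[mask] = -np.inf

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
out = weights @ v                               # one updated vector per token

print(out.shape)  # (4, 8)
```

A real block adds multiple heads, residual connections, normalization, and an MLP, then stacks dozens of such blocks; the sketch shows only the attention step.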
Dense networks and MLPs
Use these to introduce compact learnable systems before readers jump to transformers. They are a simple baseline for many structured tasks.
CNNs and perception models
Explain how convolutional models still matter for image and signal tasks, especially where data has strong local structure.
Transformers
Place LLMs inside the wider transformer story, including text, speech, and multimodal systems rather than treating chat as the only endpoint.
Autoencoders and embedding models
Use these to show that representation learning, compression, and anomaly detection are often better served by smaller purpose-built architectures.
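As a concrete taste of that idea, here is a hedged sketch of anomaly detection by reconstruction error, using a linear autoencoder (mathematically equivalent to PCA) so no training loop is needed. The data is synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# "Normal" data lives near a 2-D subspace of a 10-D space (synthetic).
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 10))
normal = latent @ basis + 0.05 * rng.normal(size=(200, 10))

# Fit encoder/decoder: the top-2 principal directions of the data.
mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
encode = lambda x: (x - mean) @ vt[:2].T        # 10-D -> 2-D code
decode = lambda z: z @ vt[:2] + mean            # 2-D code -> 10-D

def reconstruction_error(x):
    return np.linalg.norm(x - decode(encode(x)), axis=-1)

ok = reconstruction_error(normal).max()
weird = reconstruction_error(rng.normal(size=(5, 10)) * 3)
print(ok, weird.min())  # off-subspace points reconstruct far worse
```

Points far from the learned subspace reconstruct poorly, so a simple threshold on reconstruction error flags anomalies with no LLM in sight.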
Flagship Anchors
Five families built for the real world.
These models lead the industry in capability density and hardware efficiency.
Granite 4.0
Enterprise-grade agents and edge devices. A strong case study for hybrid Mamba-Transformer (9:1) architectures that optimize performance and memory.
Gemma 4
Agents and edge devices. A state-of-the-art family for multimodal reasoning with native audio and video processing across all sizes.
Granite Time Series
IBM's specialized time-series family for forecasting, anomaly detection, representation learning, and similarity search. Built to stay compact, practical, and strong on structured temporal data.
Qwen 3.6
A flexible family for agentic reasoning that stays practical at smaller sizes. Best understood through the 27B, 9B, 4B/2B, and 0.8B variants that span servers, workstations, and local devices.
Mistral Family
The premier European AI provider. A practical lineup spanning compact edge models to frontier enterprise reasoning.
Learning Tracks
Across the full
open AI stack.
How to evaluate model families for real product constraints
When local and mobile-first AI changes the design
How to compare efficient open models without hype
How multilingual model families change product coverage
When to pick fine-tuning instead of RAG
How browser AI works with WebGPU and local caching
How speech models fit browser and edge-style flows
How compact TTS works in modern interfaces
When autoencoders or smaller neural nets are enough
Core Competency
Browser AI & Local Inference
Speech and local inference move closer to the user. Lightweight models and WebGPU change the architectural game.
Speech-to-Text
Cohere Transcribe WebGPU
Use the CohereLabs WebGPU transcription demo as a browser-side example for speech AI that runs close to the user.
Speech-to-Text
Granite Speech
Explain how audio chunks, speech inference, and text output work in browser or edge pipelines with privacy-conscious design.
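A hypothetical sketch of the first step in such a pipeline: splitting a raw audio buffer into fixed-size windows with a small overlap, so words that straddle a chunk boundary are not lost before the chunks reach an on-device speech model. The sizes and sample values are made up for illustration.

```python
from typing import Iterator

def chunk_audio(samples: list[float], chunk_size: int,
                overlap: int) -> Iterator[list[float]]:
    """Yield overlapping chunks; the last chunk may be shorter."""
    step = chunk_size - overlap
    for start in range(0, max(len(samples) - overlap, 1), step):
        yield samples[start:start + chunk_size]

# 10 samples, chunks of 4 with 1-sample overlap:
chunks = list(chunk_audio(list(range(10)), chunk_size=4, overlap=1))
print(chunks)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

In a real browser pipeline the chunks would be seconds of PCM audio from the microphone, and each would be fed to a local speech-to-text model so nothing leaves the device.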
Text-to-Speech
Kokoro
Show how compact TTS models can generate natural voice for accessible, low-latency, browser-first interfaces.
Decision Strategy
The Decision Framework
Move from benchmark-chasing to systems thinking. Pick the right level of intelligence for your constraints.
Use a stronger general model
Choose this when the product goal is broad reasoning, flexible conversation, or highly open-ended tasks without tight operational constraints.
Use a smaller or medium model
Choose this when cost, latency, privacy, or local deployment matter more than maximum generality, especially for bounded workflows.
Use a different neural approach
Choose embeddings, classifiers, or autoencoders when generation is not the problem you are solving and a simpler system is more reliable.
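The three choices above can be sketched as a tiny decision function. The predicates and tiers are illustrative, not a prescription; real products weigh more constraints than three booleans.

```python
def pick_approach(open_ended: bool, needs_generation: bool,
                  tight_latency_or_privacy: bool) -> str:
    """Map coarse product constraints to one of the three options."""
    if not needs_generation:
        # Embeddings, classifiers, or autoencoders are often more
        # reliable when generation is not the problem being solved.
        return "different neural approach"
    if open_ended and not tight_latency_or_privacy:
        return "stronger general model"
    # Bounded workflows under cost, latency, or privacy constraints.
    return "smaller or medium model"

print(pick_approach(open_ended=False, needs_generation=True,
                    tight_latency_or_privacy=True))
# smaller or medium model
```

The point is the ordering: rule out generation first, then let operational constraints, not benchmarks, decide the model size.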
Beyond LLMs
Classical neural networks still power real products.
The site now includes a neural networks section covering when smaller, purpose-built architectures are a better fit than LLMs, including autoencoders for compression, denoising, anomaly detection, and representation learning.