Stock Markets April 2, 2026

Google Opens Gemma 4 Model Family Under Apache 2.0 License

Four model sizes aim to span edge devices to large-scale cloud deployments with broad tooling and download options

By Caleb Monroe

Google has released Gemma 4, a set of open-source AI models distributed under an Apache 2.0 license. The family contains four sizes geared for edge and cloud use, supports long context windows and multimodal inputs, and is available through multiple developer platforms and download sites. Google reports more than 400 million downloads of Gemma models since the first generation and over 100,000 variants created by developers.

Key Points

  • Google released Gemma 4 as open-source models under an Apache 2.0 license, with developers downloading prior Gemma models over 400 million times and creating more than 100,000 variants.
  • The family includes Effective 2B, Effective 4B, 26B Mixture of Experts and 31B Dense models, with the 31B ranked third and the 26B sixth on the Arena AI text leaderboard; models derive from the same research as Gemini 3.
  • Gemma 4 targets both edge and cloud use cases: edge models support 128K context windows and offline operation on phones, Raspberry Pi and NVIDIA Jetson Orin Nano, while larger models offer up to 256K context and can run their unquantized bfloat16 weights on a single 80GB NVIDIA H100.

Google has rolled out Gemma 4, a collection of open-source artificial intelligence models provided under an Apache 2.0 license. According to the company, developers have downloaded Gemma models more than 400 million times since the initial Gemma release and have created in excess of 100,000 model variants.

The Gemma 4 lineup comprises four configurations: Effective 2B, Effective 4B, a 26B Mixture of Experts (MoE) model, and a 31B Dense model. Google said the 31B iteration currently ranks third among open models on the Arena AI text leaderboard, while the 26B model sits at sixth. The company also stated these models are derived from the same research and technology base as Gemini 3.

Google described the Gemma 4 family as capable of advanced reasoning and agentic workflows, including function-calling and structured JSON output. The models support code generation and can natively process video, images and audio. Context window sizes vary by model class: edge-oriented models offer a 128K token context window, while the larger variants provide up to a 256K context window. Training data covers more than 140 languages, per Google.
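The structured-output behavior described above can be illustrated with a small validation sketch. This shows the generic function-calling pattern, not Gemma 4's documented API: the tool schema and the `parse_tool_call` helper are hypothetical, standing in for whatever format a given runtime actually uses.

```python
import json

# Hypothetical tool schema in the style commonly used for function calling;
# the exact format Gemma 4 expects is not specified in the article.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

def parse_tool_call(raw: str) -> dict:
    """Check that a model's structured JSON output is a well-formed call
    to the tool above; return its arguments or raise ValueError."""
    call = json.loads(raw)
    if call.get("name") != WEATHER_TOOL["name"]:
        raise ValueError(f"unknown tool: {call.get('name')}")
    args = call.get("arguments", {})
    for field in WEATHER_TOOL["parameters"]["required"]:
        if field not in args:
            raise ValueError(f"missing argument: {field}")
    return args

# A model emitting structured JSON output might return:
args = parse_tool_call('{"name": "get_weather", "arguments": {"city": "Austin"}}')
print(args["city"])  # Austin
```

The point of the agentic loop is that the application, not the model, executes the validated call and feeds the result back into the context.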

On resource requirements, Google said the unquantized bfloat16 weights for both the 26B and 31B models will fit on a single 80GB NVIDIA H100 GPU. By contrast, the Effective 2B and Effective 4B were developed to operate on mobile and Internet of Things hardware, with offline capability on devices including phones, Raspberry Pi boards and the NVIDIA Jetson Orin Nano. Google noted collaboration on the edge models with its Pixel hardware team, Qualcomm Technologies and MediaTek.
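The single-GPU claim is easy to sanity-check with back-of-envelope arithmetic: bfloat16 stores each parameter in 2 bytes, so the weights alone come to roughly 2 GB per billion parameters, before KV cache and activations are added on top.

```python
# Back-of-envelope weight footprint for unquantized bfloat16 (2 bytes
# per parameter). Parameter counts are the nominal sizes from the
# announcement; real serving needs extra memory for KV cache and
# activations, which this sketch ignores.
BYTES_PER_BF16 = 2
GIB = 1024**3  # GiB; the "80GB" in the H100's marketing name may mean GB

def weight_gib(params_billion: float) -> float:
    return params_billion * 1e9 * BYTES_PER_BF16 / GIB

for name, params in [("Gemma 4 26B MoE", 26), ("Gemma 4 31B Dense", 31)]:
    print(f"{name}: ~{weight_gib(params):.1f} GiB of weights")
```

Both figures land comfortably under 80 GB, consistent with Google's statement that either model's unquantized weights fit on a single H100.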

For distribution and developer access, Gemma 4 is available through Google AI Studio, Google AI Edge Gallery and Android Studio. Google said there is day-one support across multiple third-party platforms and runtimes, including Hugging Face, vLLM, llama.cpp, MLX, Ollama and NVIDIA NIM. Model weights can be downloaded from Hugging Face, Kaggle or Ollama, and deployment options include Vertex AI, Cloud Run, Google Kubernetes Engine and Google Cloud’s TPU-accelerated serving.



Risks

  • Hardware dependency: hosting the unquantized bfloat16 weights of the 26B and 31B models calls for an 80GB-class GPU such as the NVIDIA H100, which may constrain deployment choices for some organizations. This affects cloud infrastructure and compute providers.
  • Edge and offline operation: while the Effective 2B and 4B models are optimized for phones, Raspberry Pi and NVIDIA Jetson Orin Nano, real-world performance will depend on device capabilities and power constraints, impacting consumer device and IoT deployments.
  • Ecosystem and tooling reliance: broad functionality depends on day-one support across third-party runtimes and platforms; variability in integration or support could influence developer adoption, affecting cloud, developer tooling and inference service markets.
