The developer guide to Grok-4

July 16, 2025

Grok 4 is the July 2025 flagship model from xAI. It delivers a 256k-token context window, built-in tools, multimodal input, and an optional Heavy multi-agent tier. API pricing is $3 per million input tokens and $15 per million output tokens (the same as Grok 3). This guide covers capabilities, quick API setup, and how Grok 4 compares to current premium reasoning models from OpenAI, Google, Anthropic, and DeepSeek.


Overview of Grok 4

Grok-4 Release & Variants

  • Announced July 10, 2025 as xAI's most capable model, targeting advanced reasoning and real-time intelligence.
  • Two variants: Grok 4 (single agent) and Grok 4 Heavy (parallel multi-agent for complex jobs).

Core API Specs

| Attribute | Grok 4 |
| --- | --- |
| Context window | 256k tokens |
| Pricing (per 1M tokens) | $3 input / $15 output |
| Heavy subscription | $300/month per seat for multi-agent tier |
| Modalities | Text + images (vision endpoint); voice in mobile apps |
| Tool use | Native web/X search & function calling |
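
The published rates make per-request costs easy to estimate. Below is a minimal Python sketch of that arithmetic; the token counts in the example are hypothetical, and real usage should be read from the `usage` field of the API response.

```python
# Estimated cost of a single Grok 4 call at the published rates:
# $3 per 1M input tokens, $15 per 1M output tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for one request."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical example: a 10,000-token prompt with a 2,000-token answer.
print(f"${estimate_cost(10_000, 2_000):.4f}")  # -> $0.0600
```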

Stand-out Features

  • Always-on chain-of-thought: the legacy reasoning_effort toggle is gone and every call runs in full reasoning mode.
  • Live search agent: lets the model fetch current web or X posts during inference—no extra orchestration needed.
  • Structured outputs & function calling: follow the OpenAI JSON schema standard (see the sketch after this list).
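
Because the tool-calling interface follows the OpenAI JSON schema standard, the regular OpenAI SDK request shape works unchanged. The sketch below defines a single hypothetical get_weather tool and lets Grok 4 decide whether to call it; the tool name, schema, and prompt are invented for illustration.

```python
import json
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
)

# Hypothetical tool described in the OpenAI JSON-schema format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="grok-4",
    messages=[{"role": "user", "content": "What's the weather in Tokyo right now?"}],
    tools=tools,
)

# If the model chose to call the tool, print the structured arguments.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```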

Grok API Access

Getting Started with xAI API

Step 1: Create an xAI account at console.x.ai and generate your API key. Store it securely as an environment variable.

Step 2: Make your first API request following the official tutorial.

Step 3: You can also use the OpenAI or Anthropic SDK by pointing its base URL at the xAI API, as shown below.

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
)

# Send a multimodal query: the prompt refers to an image, so attach one
# using the OpenAI-style image_url content part (placeholder URL below).
response = client.chat.completions.create(
    model="grok-4",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/dish.jpg"},  # replace with your image
                },
                {
                    "type": "text",
                    "text": "Which ingredients do you notice on the picture?",
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```
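
For the Anthropic SDK route mentioned in Step 3, here is a minimal sketch. It assumes xAI's Anthropic-compatible endpoint is reachable at https://api.x.ai; confirm the current base URL and model name against the xAI documentation.

```python
import os

from anthropic import Anthropic

# Assumed base URL for xAI's Anthropic-compatible endpoint.
client = Anthropic(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai",
)

message = client.messages.create(
    model="grok-4",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Summarize Grok 4's main API features."}],
)

print(message.content[0].text)
```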

Alternative: Using Keywords AI Gateway

You can also access Grok 4 through the Keywords AI gateway, which provides unified access to all mainstream AI models. We offer seamless compatibility with popular SDKs, including OpenAI, Anthropic, LangChain, Vercel AI SDK, and Mastra, so integration is straightforward regardless of your preferred development stack.

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.keywordsai.co/api/",
    api_key=os.getenv("KEYWORDSAI_API_KEY"),
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # swap in any model the gateway supports, e.g. Grok 4
    messages=[{"role": "user", "content": "Tell me a long story"}],
)

print(response.choices[0].message.content)
```

API-First Model Comparison (July 2025)

| Metric | Grok 4 | o3 | Gemini 2.5 Pro | Claude 4 Sonnet | Claude 4 Opus | DeepSeek R1 |
| --- | --- | --- | --- | --- | --- | --- |
| Context Window | 256k | 200k | 1M (2M planned) | 200k | 200k | 128k |
| Input Price ($/1M) | $3.00 | $2.00 | $1.25 (<200k), $2.50 (>200k) | $3.00 | $15.00 | $0.55 |
| Output Price ($/1M) | $15.00 | $8.00 | $10.00 | $15.00 | $75.00 | $2.19 |
| Reasoning Mode | Always-on CoT | Advanced reasoning | Deep Think mode | Thinking mode | Extended thinking | Built-in reasoning |
| Tool Integration | Live Search, JSON functions | Full tool API | Search, function calling | Tool calling, safety filters | Tool calling, parallel compute | Basic tools, OSS flexibility |
| Multimodal | Text + Image + Voice (mobile) | Text only (API) | Text + Image + Video + Audio | Text + Image | Text + Image | Text only |
| Special Features | Multi-agent Heavy ($300/mo) | Mathematical precision | 1M+ token processing | Fast inference (1.9s) | Industry-leading coding | Open source weights |

When Grok 4 Is the Right Choice

| Prefer Grok 4 when… | Why |
| --- | --- |
| You need ultra-long context (legal transcripts, book-length prompts). | 256k tokens exceeds most competitors except Gemini 2.5 Pro's 1M window, ideal for comprehensive document analysis. |
| Real-time data or social feeds are critical. | Native Live Search pulls current web & X results during inference, with no external API orchestration needed. |
| Advanced mathematical reasoning is required. | The Heavy variant achieves 44.4% on Humanity's Last Exam and 100% on AIME 2025, outperforming most competitors. |
| Multi-agent collaboration is needed. | The Heavy tier ($300/mo) runs parallel AI agents for complex problem-solving that single models can't handle effectively. |
| You want cost-effective reasoning at scale. | $3/$15 pricing matches Claude 4 Sonnet, with always-on chain-of-thought and a larger context window. |

Alternatives

  • Production coding pipelines → Claude 4 Sonnet/Opus
  • Multimodal experiences with video/audio → Gemini 2.5 Pro
  • Budget-conscious deployments → DeepSeek R1
  • Enterprise safety and compliance → Claude 4 models
  • General-purpose versatility → OpenAI o3
