
Claude ups the ante with larger context, better memory

  • Staff Writer
  • Aug 15
  • 3 min read

Anthropic has made its AI model Claude considerably more relevant to business users with two upgrades announced Tuesday.

Key among them is the expansion of Claude’s context window from 200,000 tokens to 1 million tokens, allowing it to handle a much larger volume of text and code in a single session. The AI startup has also added a new memory function, “search past chats,” that allows Claude to recall previous conversations and summarize them.

The memory update is being deployed to Max, Team, and Enterprise subscribers, with Pro users to follow; there’s no word yet on free-tier access. The new capability lets users pick up projects or threads without scrolling through old chats, retrieving relevant past exchanges either on request or when Claude deems them useful. Such memory tools, analysts note, boost engagement and platform “stickiness” by making AI assistants feel more persistent and context-aware. Separately, the San Francisco–based AI firm is reportedly seeking fresh funding at a valuation of about $170 billion.


Meanwhile, the five-fold increase in the number of tokens Claude can process is a big boost for business users and developers. It means the AI chatbot can now analyze 75,000 lines of code or 400–500 pages of a book in one go without losing context.

 

The expanded window also allows developers to use Claude to analyze an entire software codebase: finding bugs, restructuring code, or deciphering the software architecture. The wider context support in Claude Sonnet 4 is currently in public beta and available through Anthropic’s application programming interface (API) as well as through Amazon Bedrock, a service for building and scaling genAI applications. It is also expected to roll out soon on Google Cloud’s Vertex AI, an agent builder platform.
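As a rough sketch of what that codebase-analysis workflow looks like, the whole codebase can be packed into a single user message in a Messages API request. The model ID and helper function below are illustrative assumptions, not confirmed identifiers; check Anthropic’s API documentation for current model names and beta flags:

```python
# Sketch: bundle an entire codebase into one long-context request payload.
# The model ID "claude-sonnet-4-20250514" is an assumption for illustration.

def build_long_context_request(source_files: dict[str, str], question: str) -> dict:
    """Concatenate all source files into a single prompt and return the
    request body a Messages API call would send."""
    codebase = "\n\n".join(
        f"// File: {path}\n{code}" for path, code in source_files.items()
    )
    return {
        "model": "claude-sonnet-4-20250514",  # assumed model ID
        "max_tokens": 4096,
        "messages": [
            {"role": "user", "content": f"{codebase}\n\n{question}"}
        ],
    }

request = build_long_context_request(
    {"main.py": "print('hello')"},
    "Where might this code raise an exception?",
)
```

With a 1-million-token window, the `codebase` string can span tens of thousands of lines before the prompt needs to be split.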


Claude is one of the top genAI models and one of the key competitors to OpenAI’s ChatGPT, which has a much smaller context window. Even the most recent and advanced GPT-5 model, released last week, supports only 196,000 tokens. Google’s Gemini 2.5 Pro is the only other AI model that supports 1 million tokens. 


Though a longer context lets a model take in more information, it doesn’t necessarily make the model more competent. In fact, studies show that even models with longer context windows can struggle to recall information accurately, and their effectiveness can decline as the context grows.


While longer context is a big boost for enterprises, processing up to 1 million tokens can be expensive: for queries exceeding the 200,000-token mark, Anthropic will charge $6 per million input tokens, up from $3 per million. Output charges will also rise, from $15 to $22 per million tokens.
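The tiered pricing translates into a simple per-request cost estimate. A minimal sketch, using the rates quoted above (verify against Anthropic’s current pricing page):

```python
# Rough cost estimate for tiered long-context pricing.
# Rates are the ones quoted in the article, in USD per million tokens.

LONG_CONTEXT_THRESHOLD = 200_000  # tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request under tiered pricing."""
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        input_rate, output_rate = 6.00, 22.00   # long-context tier
    else:
        input_rate, output_rate = 3.00, 15.00   # standard tier
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# A 500,000-token prompt with a 10,000-token answer:
print(round(request_cost(500_000, 10_000), 2))  # → 3.22
```

The same prompt kept under the 200,000-token threshold would cost half as much per input token, which is why splitting work into smaller requests can still pay off.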


One of the biggest advantages of long context in AI models is that it removes the constant need to remind the model of context and repeat information.

 

Claude’s new memory function is an attempt in the same direction: it ensures that business users and developers won’t have to provide context or explanation every time they resume a project. They can ask Claude to refer to previous conversations and take the project forward from there.


According to Collabnix’s “AI Models Comparison 2025” report, published July 1, leading AI models are trying to distinguish themselves through specialized strengths, making each model relevant for specific use cases rather than trying to surpass one another on every benchmark.


For instance, Claude 4 Opus/Sonnet leads in software development with top-tier coding, tool integration, and debugging. Grok 3 (Think Mode) excels in mathematical and scientific research with advanced reasoning, real-time data, and parallel thinking. Gemini 2.5 Pro dominates multimodal work thanks to video understanding, huge context windows, and native audio output.


DeepSeek R1/V3 offers cost-conscious deployment with free, open-source access and competitive performance. Llama 4 is the go-to for open-source development with robust multimodal abilities and commercial licensing, and GPT-4/4.5 remains the balanced, reliable choice for general-purpose use.


The launch of GPT-5 last week, billed as OpenAI’s most advanced model with sharper reasoning and improved safeguards against bias and hallucinations, has only upped the ante. GPT-5 is also expected to be much better at planning and executing multi-step tasks, which means it can backtrack and correct its mistakes. It is also one of the most cost-effective models for high-volume workloads, with much lower API usage fees ($1.25 per million input tokens) compared with rivals such as Claude Opus 4.1 ($15 per million input tokens).

