
Google researchers last week unveiled a new artificial intelligence (AI) architecture that allows large language models (LLMs) to remember the long-term context of events and topics. The Mountain View-based tech giant published a paper on the architecture, in which researchers claim that AI models trained with it display a more “human-like” ability to retain memories. Notably, Google moved away from traditional transformer and recurrent neural network (RNN) architectures to develop a new way of teaching AI models how to remember contextual information.
Titans can expand the context window of AI models to more than 2 million tokens
Ali Behrouz, the project’s lead researcher, posted about the new architecture on X (formerly known as Twitter). He claims that the new architecture provides models with a meta in-context memory and teaches them how to memorise information at test time.
According to Google’s paper, published in the pre-print online journal arXiv, the Titans architecture can expand the context window of AI models to more than two million tokens. Memory has long been a tricky problem for AI developers.
Humans remember information and events with context. If someone asks a person what they wore last weekend, they will be able to recall additional contextual information, such as attending the birthday party of a person they have known for the past 12 years. This way, when asked a follow-up question about why they wore a brown jacket and jeans last weekend, the person can contextualise it with all this short-term and long-term information.
AI models, on the other hand, typically use retrieval-augmented generation (RAG) systems built on transformer and RNN architectures. These store information as neural nodes. So, when an AI model is asked a question, it accesses the particular node containing the main information, as well as nearby nodes that may contain additional or related information. However, once a query is resolved, the information is removed from the system to save processing power.
However, this comes with two downsides. First, AI models cannot remember information in the long run. If a user wants to ask a follow-up question after a session has ended, the entire context has to be provided again (unlike how humans function). Second, AI models do a poor job of retrieving information involving long-term context.
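The retrieval step described above can be sketched with a toy example. This is not Google's or any production RAG pipeline; the bag-of-words embedding and cosine scoring below are simplified stand-ins for the learned dense embeddings real systems use, and all names (`embed`, `retrieve`, the sample documents) are invented for illustration.

```python
# Minimal sketch of RAG-style retrieval: score stored documents against a
# query and surface the closest matches. Toy bag-of-words embeddings only.
from collections import Counter
import math

def embed(text):
    # Hypothetical toy embedding: a word-count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two count vectors (missing words count as 0).
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=2):
    # Rank every stored document against the query and return the top-k.
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

docs = [
    "the birthday party was last weekend",
    "brown jackets pair well with jeans",
    "transformers process tokens in parallel",
]
print(retrieve("what did I wear to the party last weekend", docs))
```

Note that nothing here persists between queries: each call to `retrieve` starts from scratch, which mirrors the first downside described above.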
With Titans AI, Behrouz and other Google researchers sought to build an architecture that enables AI models to develop a long-term memory that can run continually, while forgetting information that is no longer needed so that compute is used optimally.
To this end, the researchers designed an architecture that encodes history into the parameters of a neural network. Three variants were used — Memory as Context (MAC), Memory as Gate (MAG), and Memory as Layer (MAL). Each of these variants is suited to particular tasks.
Additionally, Titans uses a new surprise-based learning system that tells AI models to remember unexpected or key information about a topic. These two changes allow the Titans architecture to show improved memory function in LLMs.
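The surprise-based idea can be illustrated with a small sketch. In the paper, surprise is measured through the model's gradients; the scalar running expectation, threshold, and decay constants below are invented stand-ins for illustration only, not the paper's actual mechanism.

```python
# Illustrative surprise-gated memory with decay: store an input only when it
# deviates strongly from expectation, and let old memories fade over time.
class SurpriseMemory:
    def __init__(self, threshold=1.0, decay=0.9):
        self.threshold = threshold   # minimum surprise needed to store an item
        self.decay = decay           # per-step forgetting factor
        self.items = []              # list of (weight, value) pairs
        self.expectation = 0.0       # running estimate of incoming values

    def observe(self, value):
        surprise = abs(value - self.expectation)  # deviation from expectation
        # Decay existing memories so stale information fades (forgetting).
        self.items = [(w * self.decay, v) for w, v in self.items
                      if w * self.decay > 0.05]
        if surprise > self.threshold:
            self.items.append((surprise, value))  # remember surprising input
        # Update the running expectation (exponential moving average).
        self.expectation = 0.8 * self.expectation + 0.2 * value
        return surprise

mem = SurpriseMemory()
for x in [0.1, 0.0, 0.2, 9.0]:   # 9.0 is the outlier worth remembering
    mem.observe(x)
print([v for _, v in mem.items])  # → [9.0]
```

Unsurprising inputs pass through without being stored, while the outlier is retained with a weight proportional to how unexpected it was — a rough analogue of prioritising “unexpected or key information.”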
In the BABILong benchmark, Titans (MAC) shows outstanding performance, effectively scaling to larger than 2M context windows, outperforming large models like GPT-4, Llama3 + RAG, and Llama3-70B.
— Ali Behrouz (@behrouz_ali) January 13, 2025
In another post, Behrouz claimed that based on internal testing on the BABILong benchmark (a needle-in-a-haystack approach), Titans (MAC) models were able to outperform large AI models such as GPT-4, Llama 3 + RAG, and Llama 3 70B.