The following is a dev blog from our Technical Advisor & Developer Evangelist, Maddy Arvapally – be sure to follow her on Medium!
Large Language Models (LLMs) have revolutionized how we interact with AI, but they are not without limitations. One of the key challenges is ensuring that these models stay relevant in an ever-evolving world. This challenge was the inspiration behind the FreshLLMs paper, which introduces the concept of using search engine augmentation to improve the factual accuracy and real-time relevance of LLMs. In this article, I will discuss the core concepts of the FreshLLMs approach and how Dappier, a platform designed to provide real-time data APIs, has built its real-time model inspired by this groundbreaking research.
The FreshLLMs Paper: A Brief Overview
In the paper FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation, the authors highlight that LLMs are generally trained once and then left static, which makes them increasingly less accurate as world knowledge changes. This becomes especially problematic when answering fast-changing questions about real-time events, stock prices, or recent news.
The FreshQA benchmark introduced in the paper aims to evaluate LLM performance by focusing on questions that require both static (never-changing) and dynamic (fast-changing) knowledge. The key takeaway is that traditional LLMs, including state-of-the-art models like GPT-4, struggle when faced with questions about real-time events or scenarios with false premises. The solution? FRESHPROMPT, a method that augments the LLM’s prompt with relevant information retrieved from a search engine. This incorporation of dynamic, real-time information enables the LLM to respond more accurately to questions that would otherwise lead to hallucinations or outdated responses.
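To make the idea concrete, here is a minimal sketch of FRESHPROMPT-style prompt augmentation. The function name `build_fresh_prompt` and the evidence fields (`source`, `date`, `snippet`) are my own illustrative choices, not the paper's exact format; the key idea it borrows from the paper is listing retrieved evidence before the question, with the most recent evidence placed closest to it:

```python
from datetime import date

def build_fresh_prompt(question, evidences):
    """Assemble a FreshPrompt-style prompt: each piece of retrieved
    evidence (source, date, snippet) is listed before the question,
    with the most recent evidence last, closest to the question."""
    # Sort ascending by date so newer evidence sits nearest the question.
    ordered = sorted(evidences, key=lambda e: e["date"])
    lines = []
    for e in ordered:
        lines.append(f"source: {e['source']}")
        lines.append(f"date: {e['date'].isoformat()}")
        lines.append(f"snippet: {e['snippet']}")
        lines.append("")
    lines.append(f"question: {question}")
    lines.append("answer: ")
    return "\n".join(lines)

# Hypothetical search-engine results, stubbed here for illustration.
evidences = [
    {"source": "news-site.example", "date": date(2024, 5, 1),
     "snippet": "Shares closed at $182 on Wednesday."},
    {"source": "wiki.example", "date": date(2023, 1, 15),
     "snippet": "The company went public in 1980."},
]
prompt = build_fresh_prompt("What is the latest closing price?", evidences)
```

In a real deployment, the stubbed `evidences` list would come from live search-engine results, and the assembled prompt would be passed to the LLM unchanged – no fine-tuning required.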
The paper demonstrates that incorporating real-time data into the model’s decision-making pipeline boosts performance significantly. In particular, FRESHPROMPT improved accuracy by up to 47% over traditional methods on fast-changing questions. The authors emphasize that this approach requires no additional fine-tuning, making it scalable for real-time deployments.
Dappier’s Real-Time Model: Bridging the Gap Between Static Data and Real-Time Needs
Dappier’s real-time model architecture is designed to handle large-scale data ingestion and processing from various sources, ensuring that AI-powered applications receive accurate, up-to-date information. The architecture includes complex pipelines that ingest data from multiple channels, such as RSS feeds, Airtable, and external APIs like Polygon for stock market data. These pipelines ensure that diverse data sources are harmonized and made available for real-time use.
Here’s how it works:
- Data Ingestion Pipelines: Dappier continuously ingests data from a wide variety of sources. Whether it’s live market data from Polygon, structured data from Airtable, or content from RSS feeds, the system ingests and processes these streams at scale. This asynchronous processing ensures that the platform can handle millions of requests per day without bottlenecks.
- RAG (Retrieval-Augmented Generation) Inference Model: Once data is ingested, Dappier’s backend triggers an inference engine that determines the type of RAG search required based on the incoming request. The RAG engine is built to differentiate between static queries (where responses come from previously ingested data) and real-time queries (where fresh data from the web or external APIs is needed).
- For real-time searches, the engine performs an immediate search across internet data or connected APIs to retrieve the latest information, which is then fed into the response pipeline. This ensures that applications get the most up-to-date answers, whether it’s real-time stock prices or breaking news.
- Contextual Data Retrieval: Depending on the user query, Dappier retrieves relevant data and enriches the output with context from real-time sources. For example, if a user needs up-to-the-minute stock market data, the real-time inference engine queries the most recent available information and provides it as part of the response. This contextualization ensures that every query is grounded in accurate, fresh data.
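The ingestion step described above can be sketched with concurrent pipelines that normalize heterogeneous sources into a common record shape. This is a simplified illustration, not Dappier's actual code: the `ingest_rss` and `ingest_stocks` functions are hypothetical placeholders for real fetchers, and `Record` is an assumed common schema:

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Record:
    """An assumed common shape that harmonizes diverse sources."""
    source: str
    fetched_at: datetime
    payload: dict

async def ingest_rss(feed_url):
    # Placeholder: a real pipeline would fetch and parse the feed here.
    return [Record("rss", datetime.now(timezone.utc), {"feed": feed_url})]

async def ingest_stocks(symbol):
    # Placeholder: a real pipeline would call a market-data API here.
    return [Record("stocks", datetime.now(timezone.utc), {"symbol": symbol})]

async def ingest_all():
    # Run pipelines concurrently so one slow source never blocks the rest.
    batches = await asyncio.gather(
        ingest_rss("https://example.com/feed.xml"),
        ingest_stocks("AAPL"),
    )
    return [record for batch in batches for record in batch]

records = asyncio.run(ingest_all())
```

Asynchronous fan-out like this is what lets an ingestion layer scale across many feeds and APIs without serializing on the slowest one.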
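The routing decision between static and real-time queries can be sketched as follows. This is a deliberately crude stand-in for illustration only – a keyword heuristic where a production inference engine would use a trained classifier, and `live_search` / `static_index` are hypothetical names:

```python
REALTIME_HINTS = ("latest", "today", "current", "price", "breaking")

def needs_realtime(query: str) -> bool:
    """Decide whether a query needs fresh data.
    Keyword matching here is a toy stand-in for a real classifier."""
    q = query.lower()
    return any(hint in q for hint in REALTIME_HINTS)

def answer(query: str, static_index: dict, live_search) -> str:
    # Real-time queries go out to a live search or connected API;
    # everything else is served from previously ingested data.
    if needs_realtime(query):
        context = live_search(query)
    else:
        context = static_index.get(query.lower(), "no stored context")
    return f"context: {context}\nquestion: {query}"

static_index = {
    "when was the company founded?": "The company was founded in 1980.",
}
result = answer("What is the latest stock price?", static_index,
                lambda q: "fresh market snapshot")
```

Grounding the response in the retrieved `context` string, rather than the model's parametric memory alone, is what keeps answers current for fast-changing queries.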
Conclusion
Dappier’s real-time data model integrates diverse data sources, applies intelligent inference, and leverages real-time search to ensure accurate, scalable performance under heavy traffic. By utilizing retrieval-augmented generation (RAG) and dynamic data pipelines, Dappier significantly reduces hallucinations by grounding AI responses in up-to-date, reliable data. This ensures that developers can create smarter, real-time applications with confidence.
For more details, explore Dappier’s real-time data API and marketplace.
To see this blog post as it was originally published, please click here.