I've been exploring how LLM applications with automated query scheduling - like cron-based tasks - can generate daily newsletters and curated content updates. The potential here is incredible: staying continuously updated on specific domains without any manual effort.
However, I ran into a significant challenge during my experiments: the system kept generating the same content every single day. After digging deeper, I realised the issue seems to stem from how these applications use Retrieval-Augmented Generation (RAG). When they search for information online, they stop the moment they believe they've gathered enough data, so the output is generated prematurely from a handful of sources, and a scheduled query that runs unchanged every day tends to land on the same sources each time.
Here's what happened in my case: I asked for a daily newsletter on AWS, expecting diverse topics. Instead, I received content about AWS Lambda. Every. Single. Day. When I examined the reasoning process (the thinking section of the output), I noticed the system was stopping its search immediately after hitting an article on AWS Lambda and generating the entire newsletter based on that alone.
Naturally, I tried the obvious fixes. I explicitly instructed the model in the prompt to generate unique topics daily - didn't work, presumably because each scheduled run has no record of what earlier runs produced. I added randomisation elements - but then the topics became inconsistent and often irrelevant. I tried setting time-bound constraints, asking for content from the last 24 hours - this worked occasionally, but not reliably.
So I've been thinking about a solution: What if LLM systems maintained a local cache of their previous outputs? Before generating anything, the system would check this cache to see whether similar content had already been created. If it detects duplication, it generates something fresh instead. This would help ensure we get high-quality, unique outputs consistently.
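To make the idea concrete, here is a minimal Python sketch of such a cache. Everything in it is an assumption for illustration rather than an existing API: the names `TopicCache` and `pick_fresh_topic`, the JSON file used for storage, and the 0.8 similarity threshold are all hypothetical. It compares topic titles with simple string similarity from the standard library; a real system would more likely compare embeddings of the generated content.

```python
import json
from difflib import SequenceMatcher
from pathlib import Path


def normalize(topic: str) -> str:
    """Lowercase and collapse whitespace so trivial variations compare equal."""
    return " ".join(topic.lower().split())


class TopicCache:
    """Hypothetical local cache of previously covered topics, stored as JSON."""

    def __init__(self, path: str, threshold: float = 0.8):
        self.path = Path(path)
        # Assumed cutoff: similarity at or above this counts as a duplicate.
        self.threshold = threshold
        self.topics = json.loads(self.path.read_text()) if self.path.exists() else []

    def is_duplicate(self, topic: str) -> bool:
        candidate = normalize(topic)
        return any(
            SequenceMatcher(None, candidate, seen).ratio() >= self.threshold
            for seen in self.topics
        )

    def add(self, topic: str) -> None:
        self.topics.append(normalize(topic))
        self.path.write_text(json.dumps(self.topics, indent=2))


def pick_fresh_topic(candidates, cache):
    """Return the first candidate not already covered and record it, else None."""
    for topic in candidates:
        if not cache.is_duplicate(topic):
            cache.add(topic)
            return topic
    return None
```

In the AWS example, the scheduled job would pass the topics found during search to `pick_fresh_topic` and only write the newsletter for the topic it returns. A `None` result means every candidate was already covered, which is the signal to keep searching rather than stop at the first article.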
The applications for this are vast: generating daily newsletters, preparing for exams (one topic from the syllabus each day), creating unique motivational quotes, crafting bedtime stories - essentially any use case that requires fresh, relevant content on a recurring basis.


