Chamomile AI

RAG by a Thousand Metrics

Retrieval-Augmented Generation (RAG) pipelines pair large language models (LLMs) with an external retrieval component. By fetching domain-relevant chunks of text, these systems can provide more up-to-date or domain-specific answers than models relying solely on static training data. Yet they also add complexity: the system depends on both retrieval quality and generation fidelity.
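The two stages can be sketched in a few lines. This is a toy illustration only: a keyword-overlap retriever stands in for real vector search, and a prompt-stitching function stands in for the LLM call; `retrieve` and `generate` are names invented for this sketch.

```python
def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Toy retriever: score each chunk by word overlap with the query,
    return the top k. A real pipeline would use embeddings + a vector DB."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def generate(query: str, context: list[str]) -> str:
    """Stand-in for the LLM call: stitch the retrieved context into a
    prompt. A real pipeline would send this to a model for an answer."""
    return f"Answer the question using the context.\nContext: {' '.join(context)}\nQ: {query}"

corpus = [
    "Chamomile tea is brewed from dried chamomile flowers.",
    "RAG pipelines combine retrieval with text generation.",
]
prompt = generate("What are RAG pipelines?",
                  retrieve("What are RAG pipelines?", corpus))
```

Both the retriever and the generator can fail independently, which is exactly why evaluation has to cover each stage.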

Topic Modeling: A Comparative Overview of BERTopic, LDA, and Beyond

Hashtags were a defining innovation of Web 2.0; what started as a user-invented hack on Twitter in 2007 has become entrenched as an organising tool in platforms like Instagram and TikTok, driving community curation and content discovery. This was “folksonomy” in action: bottom-up labeling that adapts faster than top-down taxonomies ever could. But with #AI, can we reinvent the concept of taxonomy to combine the “evolvability” of a bottom-up approach with the “systemisability” of a top-down approach? That is the promise of topic modeling.

Effective RAG evaluation: integrated metrics are all you need

Retrieval-Augmented Generation (RAG) pipelines have revolutionised how we integrate custom data with large language models (LLMs), unlocking new possibilities in AI applications. However, evaluating the effectiveness of these pipelines remains a major challenge for most real-world applications. In this post, we'll take a deeper look at the pain points of RAG pipeline evaluation and explore strategies to overcome them.

Reliable RAG: preprocessing is all you need

Preprocessing content before vector DB ingestion improves the information retrieval performance of Retrieval-Augmented Generation (RAG) with favourable unit economics. Let's take a look at propositional chunking, an effective semantic preprocessing technique that is practical to implement.
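As a rough sketch of where propositional chunking sits in the ingest path: in practice the decomposition step is usually an LLM prompt asking for atomic, self-contained propositions, but here a naive sentence splitter stands in for it; `to_propositions` and `ingest` are names invented for this illustration.

```python
import re

def to_propositions(passage: str) -> list[str]:
    """Placeholder for the decomposition step: one proposition per
    sentence. A real pipeline would prompt an LLM to rewrite the passage
    into atomic, context-independent statements."""
    parts = re.split(r"(?<=[.!?])\s+", passage.strip())
    return [p for p in parts if p]

def ingest(passages: list[str]) -> list[str]:
    """Preprocess each passage into propositions; these, rather than raw
    passages, are what get embedded and written to the vector DB."""
    return [prop for p in passages for prop in to_propositions(p)]
```

The point of the technique is that each stored unit makes sense on its own, so retrieval matches a query against a single clean fact instead of a paragraph of mixed content.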