Curated Intelligence Hub

Master the
AI Frontier.

Discover the high-signal tools, research, and courses driving the future of artificial intelligence.

Tools

Courses

Videos

Research

Latest AI Tools

Hand-picked tools for your workflow

View collection

Verified

New

HubSpot AEO

custom

HubSpot's AI-powered platform for sales teams, enhancing productivity and driving revenue.

sales engagement

ai for sales

crm

+4 MORE

New addition

Verified

New

Jobbyo

freemium

Jobbyo is an AI-powered platform designed to streamline your job search and career development.

job-search

career-development

ai-assistant

+4 MORE

New addition

Verified

New

remio

freemium

AI assistant built on your memory, unifying files, meetings, and web content to help you work smarter and generate reports.

ai-assistant

knowledge-management

productivity

+6 MORE

New addition

Verified

New

IBM Bob

custom

IBM Bob is an internal AI assistant streamlining employee workflows and knowledge access within the IBM ecosystem.

enterprise-ai

internal-tool

workflow-automation

+4 MORE

New addition

Featured Courses

Hand-picked courses for your workflow

View collection

Certificate

beginner

0.5 HOURS

Build with Andrew

Andrew Ng

Learn to build working web applications by describing ideas in words and letting AI transform them into apps, no coding required.

Details

Certificate

beginner

3 HOURS

Generative AI for Everyone

Andrew Ng

Learn how generative AI works, its real-world uses, and how to apply it effectively in your life and career.

Details

Research Papers

Hand-picked research for your workflow

View collection

Research Paper

Pitfalls in Evaluating Interpretability Agents

"Automated interpretability systems aim to reduce the need for human labor and scale analysis to increasingly large models and diverse tasks. Recent efforts toward this goal leverage large language models (LLMs) at increasing levels of autonomy, ranging from fixed one-shot workflows to fully autonomous interpretability agents. This shift creates a corresponding need to scale evaluation approaches to keep pace with both the volume and complexity of generated explanations. We investigate this challenge in the context of automated circuit analysis -- explaining the roles of model components when performing specific tasks. To this end, we build an agentic system in which a research agent iteratively designs experiments and refines hypotheses. When evaluated against human expert explanations across six circuit analysis tasks in the literature, the system appears competitive. However, closer examination reveals several pitfalls of replication-based evaluation: human expert explanations can be subjective or incomplete, outcome-based comparisons obscure the research process, and LLM-based systems may reproduce published findings via memorization or informed guessing. To address some of these pitfalls, we propose an unsupervised intrinsic evaluation based on the functional interchangeability of model components. Our work demonstrates fundamental challenges in evaluating complex automated interpretability systems and reveals key limitations of replication-based evaluation."

Abstract View PDF

Research Paper

Learning Dynamic Belief Graphs for Theory-of-mind Reasoning

"Theory of Mind (ToM) reasoning with Large Language Models (LLMs) requires inferring how people's implicit, evolving beliefs shape what they seek and how they act under uncertainty -- especially in high-stakes settings such as disaster response, emergency medicine, and human-in-the-loop autonomy. Prior approaches either prompt LLMs directly or use latent-state models that treat beliefs as static and independent, often producing incoherent mental models over time and weak reasoning in dynamic contexts. We introduce a structured cognitive trajectory model for LLM-based ToM that represents mental state as a dynamic belief graph, jointly inferring latent beliefs, learning their time-varying dependencies, and linking belief evolution to information seeking and decisions. Our model contributes (i) a novel projection from textualized probabilistic statements to consistent probabilistic graphical model updates, (ii) an energy-based factor graph representation of belief interdependencies, and (iii) an ELBO-based objective that captures belief accumulation and delayed decisions. Across multiple real-world disaster evacuation datasets, our model significantly improves action prediction and recovers interpretable belief trajectories consistent with human reasoning, providing a principled module for augmenting LLMs with ToM in high-uncertainty environments. [this https URL](https://anonymous.4open.science/r/ICML_submission-6373/)"

Abstract View PDF

Top GitHub Repositories

Hand-picked GitHub repositories for your workflow

View collection

Context Engineering

by davidkimai

"A first-principles handbook to mastering context design, orchestration, and optimization for large language models, moving beyond basic prompt engi..."

8,840

983

Python