Sample Pages (Top 50 by confidence)
Open Problems in AIXI Agent Foundations — AI Alignment Forum
https://www.alignmentforum.org/posts/MvfD4tmzyuCYFqB2f/open-problems-in-aixi-age...
1 user
Last: Jan 07, 2026
100% confidence
We need a Science of Evals — AI Alignment Forum
https://www.alignmentforum.org/posts/fnc6Sgt3CGCdFmmgX/we-need-a-science-of-eval...
1 user
Last: Jan 07, 2026
100% confidence
Redwood Research’s current project — AI Alignment Forum
https://www.alignmentforum.org/posts/k7oxdbNaGATZbtEg3/redwood-research-s-curren...
1 user
Last: Jan 07, 2026
100% confidence
Learning-theoretic agenda reading list — AI Alignment Forum
https://www.alignmentforum.org/posts/fsGEyCYhqs7AWwdCe/learning-theoretic-agenda...
1 user
Last: Jan 07, 2026
100% confidence
Toward A Mathematical Framework for Computation in Superposition — AI Alignment Forum
https://www.alignmentforum.org/posts/2roZtSr5TGmLjXMnT/toward-a-mathematical-fra...
1 user
Last: Jan 07, 2026
100% confidence
Introducing Alignment Stress-Testing at Anthropic — AI Alignment Forum
https://www.alignmentforum.org/posts/EPDSdXr8YbsDkgsDG/introducing-alignment-str...
1 user
Last: Jan 07, 2026
100% confidence
Touch reality as soon as possible (when doing machine learning research) — AI Alignment Forum
https://www.alignmentforum.org/posts/fqryrxnvpSr5w2dDJ/touch-reality-as-soon-as-...
1 user
Last: Jan 07, 2026
100% confidence
Different perspectives on concept extrapolation — AI Alignment Forum
https://www.alignmentforum.org/posts/j9vCEjRFDwmH8FTKH/different-perspectives-on...
1 user
Last: Jan 07, 2026
100% confidence
The blue-minimising robot and model splintering — AI Alignment Forum
https://www.alignmentforum.org/posts/BeeirdrMXCPYZwgfj/the-blue-minimising-robot...
1 user
Last: Jan 07, 2026
100% confidence
Catching AIs red-handed — AI Alignment Forum
https://www.alignmentforum.org/posts/i2nmBfCXnadeGmhzW/catching-ais-red-handed
1 user
Last: Jan 07, 2026
100% confidence
Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations — AI Alignment Forum
https://www.alignmentforum.org/posts/E3daBewppAiECN3Ao/claude-sonnet-3-7-often-k...
1 user
Last: Jan 07, 2026
100% confidence
Investigating the learning coefficient of modular addition: hackathon project — AI Alignment Forum
https://www.alignmentforum.org/posts/4v3hMuKfsGatLXPgt/investigating-the-learnin...
1 user
Last: Jan 07, 2026
100% confidence
Safe Predictive Agents with Joint Scoring Rules — AI Alignment Forum
https://www.alignmentforum.org/posts/FFCDWx6qBdBds6jvL/safe-predictive-agents-wi...
1 user
Last: Jan 07, 2026
100% confidence
Alignment remains a hard, unsolved problem — AI Alignment Forum
https://www.alignmentforum.org/posts/epjuxGnSPof3GnMSL/alignment-remains-a-hard-...
1 user
Last: Jan 07, 2026
100% confidence
A Semitechnical Introductory Dialogue on Solomonoff Induction — AI Alignment Forum
https://www.alignmentforum.org/posts/EL4HNa92Z95FKL9R2/a-semitechnical-introduct...
1 user
Last: Jan 07, 2026
100% confidence
Tips and Code for Empirical Research Workflows — AI Alignment Forum
https://www.alignmentforum.org/posts/6P8GYb4AjtPXx6LLB/tips-and-code-for-empiric...
1 user
Last: Jan 07, 2026
100% confidence
Synthesizing Standalone World-Models (+ Bounties, Seeking Funding) — AI Alignment Forum
https://www.alignmentforum.org/posts/LngR93YwiEpJ3kiWh/research-agenda-synthesiz...
1 user
Last: Jan 07, 2026
100% confidence
Nearcast-based "deployment problem" analysis — AI Alignment Forum
https://www.alignmentforum.org/posts/vZzg8NS7wBtqcwhoJ/nearcast-based-deployment...
1 user
Last: Jan 07, 2026
100% confidence
Simulators — AI Alignment Forum
https://www.alignmentforum.org/posts/vJFdjigzmcXMhNTsx/simulators
1 user
Last: Jan 07, 2026
100% confidence
Multi-Component Learning and S-Curves — AI Alignment Forum
https://www.alignmentforum.org/posts/RKDQCB6smLWgs2Mhr/multi-component-learning-...
1 user
Last: Jan 07, 2026
100% confidence
Bounded Solomonoff Induction: Three Difficulty Settings — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037504a/bounded-solomonoff...
1 user
Last: Jan 07, 2026
100% confidence
A starter guide for evals — AI Alignment Forum
https://www.alignmentforum.org/posts/2PiawPFJeyCQGcwXG/a-starter-guide-for-evals
3 users
Last: Jan 07, 2026
100% confidence
Smoking Lesion Steelman — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf0670375452/smoking-lesion-ste...
1 user
Last: Jan 07, 2026
100% confidence
Two Major Obstacles for Logical Inductor Decision Theory — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf06703753d4/two-major-obstacle...
1 user
Last: Jan 07, 2026
100% confidence
Automation collapse — AI Alignment Forum
https://www.alignmentforum.org/posts/2Gy9tfjmKwkYbF9BY/automation-collapse
1 user
Last: Jan 07, 2026
100% confidence
Open Source Replication of Anthropic’s Crosscoder paper for model-diffing — AI Alignment Forum
https://www.alignmentforum.org/posts/srt6JXsRMtmqAJavD/open-source-replication-o...
1 user
Last: Jan 07, 2026
100% confidence
Video lectures on the learning-theoretic agenda — AI Alignment Forum
https://www.alignmentforum.org/posts/NWKk2eQwfuGzRXusJ/video-lectures-on-the-lea...
1 user
Last: Jan 07, 2026
100% confidence
Logical Inductor Tiling and Why it's Hard — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037556d/logical-inductor-t...
1 user
Last: Jan 07, 2026
100% confidence
The Logistics of Distribution of Meaning — AI Alignment Forum
https://www.alignmentforum.org/posts/MhBRGfTRJKtjc44eJ/the-logistics-of-distribu...
1 user
Last: Jan 07, 2026
100% confidence
The 2023 LessWrong Review: The Basic Ask — AI Alignment Forum
https://www.alignmentforum.org/posts/pudQtkre7f9GLmb2b/the-2023-lesswrong-review...
1 user
Last: Jan 07, 2026
100% confidence
The Telephone Theorem: Information At A Distance Is Mediated By Deterministic Constraints — AI Alignment Forum
https://www.alignmentforum.org/posts/jJf4FrfiQdDGg7uco/the-telephone-theorem-inf...
1 user
Last: Jan 07, 2026
100% confidence
Principles for Alignment/Agency Projects — AI Alignment Forum
https://www.alignmentforum.org/posts/A7GeRNLzuFnhvGGgb/principles-for-alignment-...
1 user
Last: Jan 07, 2026
100% confidence
Two proposed projects on abstract analogies for scheming — AI Alignment Forum
https://www.alignmentforum.org/posts/5zsLpcTMtesgF7c8p/two-proposed-projects-on-...
1 user
Last: Jan 07, 2026
100% confidence
Goodhart Taxonomy — AI Alignment Forum
https://www.alignmentforum.org/posts/EbFABnst8LsidYs5Y/goodhart-taxonomy
1 user
Last: Jan 07, 2026
100% confidence
VC Theory Overview — AI Alignment Forum
https://www.alignmentforum.org/posts/uEwECj53prjKLcBC5/vc-theory-overview
1 user
Last: Jan 07, 2026
100% confidence
Alignment Research Field Guide — AI Alignment Forum
https://www.alignmentforum.org/posts/PqMT9zGrNsGJNfiFR/alignment-research-field-...
1 user
Last: Jan 07, 2026
100% confidence
UDT shows that decision theory is more puzzling than ever — AI Alignment Forum
https://www.alignmentforum.org/posts/wXbSAKu2AcohaK2Gt/udt-shows-that-decision-t...
1 user
Last: Jan 07, 2026
100% confidence
The Inner Alignment Problem — AI Alignment Forum
https://www.alignmentforum.org/s/r9tYkB2a8Fp4DN8yB/p/pL56xPoniLvtMDQ4J
2 users
Last: Jan 07, 2026
100% confidence
Cooperative Oracles: Nonexploited Bargaining — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037541a/cooperative-oracle...
1 user
Last: Jan 07, 2026
100% confidence
Cooperative Oracles: Stratified Pareto Optima and Almost Stratified Pareto Optima — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf0670375441/cooperative-oracle...
1 user
Last: Jan 07, 2026
100% confidence
Confusions re: Higher-Level Game Theory — AI Alignment Forum
https://www.alignmentforum.org/posts/FPML8k4QtjJxk3Y4M/confusions-re-higher-leve...
1 user
Last: Jan 07, 2026
100% confidence
Universal Inductors — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037520a/universal-inductor...
1 user
Last: Jan 07, 2026
100% confidence
Reflective oracles as a solution to the converse Lawvere problem — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037550d/reflective-oracles...
1 user
Last: Jan 07, 2026
100% confidence
Formal Open Problem in Decision Theory — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf06703753a9/formal-open-proble...
1 user
Last: Jan 07, 2026
100% confidence
Fixed Points — AI Alignment Forum
https://www.alignmentforum.org/s/5WF3wmwvxX9TEbFXf
1 user
Last: Jan 07, 2026
100% confidence
The Ubiquitous Converse Lawvere Problem — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf06703753b9/the-ubiquitous-con...
2 users
Last: Jan 07, 2026
100% confidence
Modal Fixpoint Cooperation without Löb's Theorem — AI Alignment Forum
https://www.alignmentforum.org/posts/2WpPRrqrFQa6n2x3W/modal-fixpoint-cooperatio...
1 user
Last: Jan 07, 2026
100% confidence
Reflexive Oracles and superrationality: Pareto — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf0670375068/reflexive-oracles-...
1 user
Last: Jan 07, 2026
100% confidence
Partial Agency — AI Alignment Forum
https://www.alignmentforum.org/posts/4hdHto3uHejhY2F3Q/partial-agency
1 user
Last: Jan 07, 2026
100% confidence
Stitching SAEs of different sizes — AI Alignment Forum
https://www.alignmentforum.org/posts/baJyjpktzmcmRfosq/stitching-saes-of-differe...
1 user
Last: Jan 07, 2026
100% confidence
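If this listing needs to be consumed programmatically, the sketch below shows one way to parse it, assuming every entry keeps the five-line layout used above (title, URL, user count, "Last:" date, confidence). The PageRecord class, field names, and the parse_sample_pages helper are illustrative choices, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageRecord:
    title: str
    url: str
    users: int
    last_seen: str
    confidence: str


def parse_sample_pages(text: str) -> List[PageRecord]:
    """Parse the five-line records that follow the 'Sample Pages' header."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Drop the header line if present.
    if lines and lines[0].startswith("Sample Pages"):
        lines = lines[1:]
    records = []
    for i in range(0, len(lines) - 4, 5):
        title, url, users, last_seen, confidence = lines[i:i + 5]
        records.append(PageRecord(
            title=title,
            url=url,
            users=int(users.split()[0]),              # "1 user" -> 1
            last_seen=last_seen.removeprefix("Last: "),
            confidence=confidence,                    # e.g. "100% confidence"
        ))
    return records
```

For example, parse_sample_pages(report_text)[0].url would return the (truncated) URL of the first entry; truncated URLs are kept as-is rather than reconstructed.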