Sample Pages (Top 50 by confidence)
Open Problems in AIXI Agent Foundations — AI Alignment Forum
https://www.alignmentforum.org/posts/MvfD4tmzyuCYFqB2f/open-problems-in-aixi-age...
1 user
Last: Jan 07, 2026
100% confidence
We need a Science of Evals — AI Alignment Forum
https://www.alignmentforum.org/posts/fnc6Sgt3CGCdFmmgX/we-need-a-science-of-eval...
1 user
Last: Jan 07, 2026
100% confidence
Redwood Research’s current project — AI Alignment Forum
https://www.alignmentforum.org/posts/k7oxdbNaGATZbtEg3/redwood-research-s-curren...
1 user
Last: Jan 07, 2026
100% confidence
Learning-theoretic agenda reading list — AI Alignment Forum
https://www.alignmentforum.org/posts/fsGEyCYhqs7AWwdCe/learning-theoretic-agenda...
1 user
Last: Jan 07, 2026
100% confidence
Toward A Mathematical Framework for Computation in Superposition — AI Alignment Forum
https://www.alignmentforum.org/posts/2roZtSr5TGmLjXMnT/toward-a-mathematical-fra...
1 user
Last: Jan 07, 2026
100% confidence
Introducing Alignment Stress-Testing at Anthropic — AI Alignment Forum
https://www.alignmentforum.org/posts/EPDSdXr8YbsDkgsDG/introducing-alignment-str...
1 user
Last: Jan 07, 2026
100% confidence
Touch reality as soon as possible (when doing machine learning research) — AI Alignment Forum
https://www.alignmentforum.org/posts/fqryrxnvpSr5w2dDJ/touch-reality-as-soon-as-...
1 user
Last: Jan 07, 2026
100% confidence
Different perspectives on concept extrapolation — AI Alignment Forum
https://www.alignmentforum.org/posts/j9vCEjRFDwmH8FTKH/different-perspectives-on...
1 user
Last: Jan 07, 2026
100% confidence
The blue-minimising robot and model splintering — AI Alignment Forum
https://www.alignmentforum.org/posts/BeeirdrMXCPYZwgfj/the-blue-minimising-robot...
1 user
Last: Jan 07, 2026
100% confidence
Catching AIs red-handed — AI Alignment Forum
https://www.alignmentforum.org/posts/i2nmBfCXnadeGmhzW/catching-ais-red-handed
1 user
Last: Jan 07, 2026
100% confidence
Claude Sonnet 3.7 (often) knows when it’s in alignment evaluations — AI Alignment Forum
https://www.alignmentforum.org/posts/E3daBewppAiECN3Ao/claude-sonnet-3-7-often-k...
1 user
Last: Jan 07, 2026
100% confidence
Investigating the learning coefficient of modular addition: hackathon project — AI Alignment Forum
https://www.alignmentforum.org/posts/4v3hMuKfsGatLXPgt/investigating-the-learnin...
1 user
Last: Jan 07, 2026
100% confidence
Safe Predictive Agents with Joint Scoring Rules — AI Alignment Forum
https://www.alignmentforum.org/posts/FFCDWx6qBdBds6jvL/safe-predictive-agents-wi...
1 user
Last: Jan 07, 2026
100% confidence
Alignment remains a hard, unsolved problem — AI Alignment Forum
https://www.alignmentforum.org/posts/epjuxGnSPof3GnMSL/alignment-remains-a-hard-...
1 user
Last: Jan 07, 2026
100% confidence
A Semitechnical Introductory Dialogue on Solomonoff Induction — AI Alignment Forum
https://www.alignmentforum.org/posts/EL4HNa92Z95FKL9R2/a-semitechnical-introduct...
1 user
Last: Jan 07, 2026
100% confidence
Tips and Code for Empirical Research Workflows — AI Alignment Forum
https://www.alignmentforum.org/posts/6P8GYb4AjtPXx6LLB/tips-and-code-for-empiric...
1 user
Last: Jan 07, 2026
100% confidence
Synthesizing Standalone World-Models (+ Bounties, Seeking Funding) — AI Alignment Forum
https://www.alignmentforum.org/posts/LngR93YwiEpJ3kiWh/research-agenda-synthesiz...
1 user
Last: Jan 07, 2026
100% confidence
Nearcast-based "deployment problem" analysis — AI Alignment Forum
https://www.alignmentforum.org/posts/vZzg8NS7wBtqcwhoJ/nearcast-based-deployment...
1 user
Last: Jan 07, 2026
100% confidence
Simulators — AI Alignment Forum
https://www.alignmentforum.org/posts/vJFdjigzmcXMhNTsx/simulators
1 user
Last: Jan 07, 2026
100% confidence
Multi-Component Learning and S-Curves — AI Alignment Forum
https://www.alignmentforum.org/posts/RKDQCB6smLWgs2Mhr/multi-component-learning-...
1 user
Last: Jan 07, 2026
100% confidence
Bounded Solomonoff Induction: Three Difficulty Settings — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037504a/bounded-solomonoff...
1 user
Last: Jan 07, 2026
100% confidence
A starter guide for evals — AI Alignment Forum
https://www.alignmentforum.org/posts/2PiawPFJeyCQGcwXG/a-starter-guide-for-evals
3 users
Last: Jan 07, 2026
100% confidence
Smoking Lesion Steelman — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf0670375452/smoking-lesion-ste...
1 user
Last: Jan 07, 2026
100% confidence
Two Major Obstacles for Logical Inductor Decision Theory — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf06703753d4/two-major-obstacle...
1 user
Last: Jan 07, 2026
100% confidence
Automation collapse — AI Alignment Forum
https://www.alignmentforum.org/posts/2Gy9tfjmKwkYbF9BY/automation-collapse
1 user
Last: Jan 07, 2026
100% confidence
Open Source Replication of Anthropic’s Crosscoder paper for model-diffing — AI Alignment Forum
https://www.alignmentforum.org/posts/srt6JXsRMtmqAJavD/open-source-replication-o...
1 user
Last: Jan 07, 2026
100% confidence
Video lectures on the learning-theoretic agenda — AI Alignment Forum
https://www.alignmentforum.org/posts/NWKk2eQwfuGzRXusJ/video-lectures-on-the-lea...
1 user
Last: Jan 07, 2026
100% confidence
Logical Inductor Tiling and Why it's Hard — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037556d/logical-inductor-t...
1 user
Last: Jan 07, 2026
100% confidence
The Logistics of Distribution of Meaning — AI Alignment Forum
https://www.alignmentforum.org/posts/MhBRGfTRJKtjc44eJ/the-logistics-of-distribu...
1 user
Last: Jan 07, 2026
100% confidence
The 2023 LessWrong Review: The Basic Ask — AI Alignment Forum
https://www.alignmentforum.org/posts/pudQtkre7f9GLmb2b/the-2023-lesswrong-review...
1 user
Last: Jan 07, 2026
100% confidence
The Telephone Theorem: Information At A Distance Is Mediated By Deterministic Constraints — AI Alignment Forum
https://www.alignmentforum.org/posts/jJf4FrfiQdDGg7uco/the-telephone-theorem-inf...
1 user
Last: Jan 07, 2026
100% confidence
Principles for Alignment/Agency Projects — AI Alignment Forum
https://www.alignmentforum.org/posts/A7GeRNLzuFnhvGGgb/principles-for-alignment-...
1 user
Last: Jan 07, 2026
100% confidence
Two proposed projects on abstract analogies for scheming — AI Alignment Forum
https://www.alignmentforum.org/posts/5zsLpcTMtesgF7c8p/two-proposed-projects-on-...
1 user
Last: Jan 07, 2026
100% confidence
Goodhart Taxonomy — AI Alignment Forum
https://www.alignmentforum.org/posts/EbFABnst8LsidYs5Y/goodhart-taxonomy
1 user
Last: Jan 07, 2026
100% confidence
VC Theory Overview — AI Alignment Forum
https://www.alignmentforum.org/posts/uEwECj53prjKLcBC5/vc-theory-overview
1 user
Last: Jan 07, 2026
100% confidence
Alignment Research Field Guide — AI Alignment Forum
https://www.alignmentforum.org/posts/PqMT9zGrNsGJNfiFR/alignment-research-field-...
1 user
Last: Jan 07, 2026
100% confidence
UDT shows that decision theory is more puzzling than ever — AI Alignment Forum
https://www.alignmentforum.org/posts/wXbSAKu2AcohaK2Gt/udt-shows-that-decision-t...
1 user
Last: Jan 07, 2026
100% confidence
The Inner Alignment Problem — AI Alignment Forum
https://www.alignmentforum.org/s/r9tYkB2a8Fp4DN8yB/p/pL56xPoniLvtMDQ4J
2 users
Last: Jan 07, 2026
100% confidence
Cooperative Oracles: Nonexploited Bargaining — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037541a/cooperative-oracle...
1 user
Last: Jan 07, 2026
100% confidence
Cooperative Oracles: Stratified Pareto Optima and Almost Stratified Pareto Optima — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf0670375441/cooperative-oracle...
1 user
Last: Jan 07, 2026
100% confidence
Confusions re: Higher-Level Game Theory — AI Alignment Forum
https://www.alignmentforum.org/posts/FPML8k4QtjJxk3Y4M/confusions-re-higher-leve...
1 user
Last: Jan 07, 2026
100% confidence
Universal Inductors — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037520a/universal-inductor...
1 user
Last: Jan 07, 2026
100% confidence
Reflective oracles as a solution to the converse Lawvere problem — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf067037550d/reflective-oracles...
1 user
Last: Jan 07, 2026
100% confidence
Formal Open Problem in Decision Theory — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf06703753a9/formal-open-proble...
1 user
Last: Jan 07, 2026
100% confidence
Fixed Points — AI Alignment Forum
https://www.alignmentforum.org/s/5WF3wmwvxX9TEbFXf
1 user
Last: Jan 07, 2026
100% confidence
The Ubiquitous Converse Lawvere Problem — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf06703753b9/the-ubiquitous-con...
2 users
Last: Jan 07, 2026
100% confidence
Modal Fixpoint Cooperation without Löb's Theorem — AI Alignment Forum
https://www.alignmentforum.org/posts/2WpPRrqrFQa6n2x3W/modal-fixpoint-cooperatio...
1 user
Last: Jan 07, 2026
100% confidence
Reflexive Oracles and superrationality: Pareto — AI Alignment Forum
https://www.alignmentforum.org/posts/5bd75cc58225bf0670375068/reflexive-oracles-...
1 user
Last: Jan 07, 2026
100% confidence
Partial Agency — AI Alignment Forum
https://www.alignmentforum.org/posts/4hdHto3uHejhY2F3Q/partial-agency
1 user
Last: Jan 07, 2026
100% confidence
Stitching SAEs of different sizes — AI Alignment Forum
https://www.alignmentforum.org/posts/baJyjpktzmcmRfosq/stitching-saes-of-differe...
1 user
Last: Jan 07, 2026
100% confidence
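If this listing needs to be consumed programmatically, the sketch below shows one way to parse it, assuming every entry keeps the five-line layout used above (title, URL, user count, "Last:" date, confidence). The PageRecord class, field names, and the parse_sample_pages helper are illustrative choices, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PageRecord:
    title: str
    url: str
    users: int
    last_seen: str
    confidence: str


def parse_sample_pages(text: str) -> List[PageRecord]:
    """Parse the five-line records that follow the 'Sample Pages' header."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Drop the header line if present.
    if lines and lines[0].startswith("Sample Pages"):
        lines = lines[1:]
    records = []
    for i in range(0, len(lines) - 4, 5):
        title, url, users, last_seen, confidence = lines[i:i + 5]
        records.append(PageRecord(
            title=title,
            url=url,
            users=int(users.split()[0]),              # "1 user" -> 1
            last_seen=last_seen.removeprefix("Last: "),
            confidence=confidence,                    # e.g. "100% confidence"
        ))
    return records
```

For example, parse_sample_pages(report_text)[0].url would return the (truncated) URL of the first entry; truncated URLs are kept as-is rather than reconstructed.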