curius graph
showing 6451-6500 of 168121 pages (sorted by popularity)
« prev
1
...
128
129
130
131
132
...
3363
next »
Discovering Language Model Behaviors with Model-Written Evaluations — LessWrong
3 users ▼
Technical debt
3 users ▼
[2106.09685] LoRA: Low-Rank Adaptation of Large Language Models
3 users ▼
[2006.08381] DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
3 users ▼
Chris Olah’s views on AGI safety - LessWrong
3 users ▼
AI Creation and the Cosmic Host
3 users ▼
Pluck and hard work, or luck of birth? Two stories, one man | Aeon Essays
3 users ▼
Petri: An open-source auditing tool to accelerate AI safety research \ Anthropic
3 users ▼
smoothbrains.net — Home
3 users ▼
Inductive bias - Wikipedia
3 users ▼
ARC-AGI Without Pretraining | iliao2345
3 users ▼
Richard S. Sutton - Wikipedia
3 users ▼
Geometric Rationality is Not VNM Rational — LessWrong
3 users ▼
Towards a scale-free theory of intelligent agency
3 users ▼
Anthropic/values-in-the-wild · Datasets at Hugging Face
3 users ▼
Likert scale
3 users ▼
[2401.05566] Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
3 users ▼
[1503.02531] Distilling the Knowledge in a Neural Network
3 users ▼
Emergent introspective awareness in large language models \ Anthropic
3 users ▼
Meta-rationality: An introduction | Meta-rationality
3 users ▼
Categorical distribution
3 users ▼
Our Mission, Technology, and Approach
3 users ▼
Stop Climbing!
3 users ▼
Spectral density
3 users ▼
Mel scale - Wikipedia
3 users ▼
What I use
3 users ▼
My 40-liter backpack travel guide
3 users ▼
Representation Engineering Mistral-7B an Acid Trip
3 users ▼
The Boring Part of Bell Labs – Aceso Under Glass
3 users ▼
Gibbs sampling - Wikipedia
3 users ▼
This page is a quine.
3 users ▼
Is Success the Enemy of Freedom? (Full) - LessWrong
3 users ▼
Goodhart Taxonomy — LessWrong
3 users ▼
Functional near-infrared spectroscopy
3 users ▼
Thinking through how pretraining vs RL learn
3 users ▼
Ilya Sutskever – We're moving from the age of scaling to the age of research
3 users ▼
[1609.09106] HyperNetworks
3 users ▼
⚓️ Thought Anchors
3 users ▼
Advent of Code 2021
3 users ▼
Paper Trails
3 users ▼
Potemkin village
3 users ▼
Links
3 users ▼
when you break your own heart - by vincent huang
3 users ▼
Everything Is Correlated · Gwern.net
3 users ▼
Narrow Misalignment is Hard, Emergent Misalignment is Easy — LessWrong
3 users ▼
[2406.14546] Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data
3 users ▼
Herbert A. Simon
3 users ▼
The Hour I First Believed | Slate Star Codex
3 users ▼
Functional ultrasound through the skull
3 users ▼
Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers
3 users ▼