curius graph

all pages

showing 35701-35750 of 160880 pages (sorted by popularity)

Interactive Scalable Interfaces for Machine Learning Interpretability — Fred Hohman

Simple Long Convolutions for Sequence Modeling · Hazy Research

[2303.03846] Larger language models do in-context learning differently

MosaicBERT: Pretraining BERT from Scratch for $20

[2303.08112] Eliciting Latent Predictions from Transformers with the Tuned Lens

Observability | Practical Observability

Decision Transformer Interpretability - AI Alignment Forum

What Is Bfloat16 Arithmetic? – Nick Higham

[2303.05119] Entropic Wasserstein Component Analysis

TRAK

Why didn't we get GPT-2 in 2005?

Announcing OpenFlamingo: An open-source framework for training vision-language models with in-context learning | LAION

[2303.11249] What Makes Data Suitable for a Locally Connected Neural Network? A Necessary and Sufficient Condition Based on Quantum Entanglement

Actually, Othello-GPT Has A Linear Emergent World Representation — Neel Nanda

Reverse engineering the NTK

On AI Deployment: AI supply chains (and why they matter)

rnoti-p1034.pdf

The Independent Compositional Subspace Hypothesis for the Structure of CLIP's Last Layer | OpenReview

PureJaxRL

When are Neural Networks more powerful than Neural Tangent Kernels? – Off the convex path

AI for General Science - Large language models for scientific hypothesis/research ideas generation | Xinming Tu

Parfit: A Philosopher and His Mission to Save Morality by David Edmonds - review by Jane O’Grady

[2304.03843] Why think step-by-step? Reasoning emerges from the locality of experience

Scaling, emergence, and reasoning (Jason Wei, NYU) - Google Slides

Revisiting the classics: Jensen’s inequality – Machine Learning Research Blog

[2103.00564] An Introduction to Johnson-Lindenstrauss Transforms

[2303.14177] Scaling Expert Language Models with Unsupervised Domain Discovery

PsyArXiv Preprints | Surprisal does not explain syntactic disambiguation difficulty: evidence from a large-scale benchmark

[2303.17951] FP8 versus INT8 for efficient deep learning inference

Seurat CCA? It's just a simple extension of PCA! | Xinming Tu

Niels Bohr's Memorandum to President Roosevelt | The Manhattan Project | Historical Documents | atomicarchive.com

Public Policy for Realists - by Pradyumna Prasad

What Is Iterative Refinement? – Nick Higham

[2305.08809] Interpretability at Scale: Identifying Causal Mechanisms in Alpaca

PsyArXiv Preprints | How hard is cognitive science?

Direct Approach Interactive Model

Playing Doc’s Games—I | The New Yorker

Six Experiments in Action Minimization

Aligning Faithful Interpretations with their Social Attribution - ACL-TACL-2021_2021.tacl-1.18

The longest training run

Learning explanations that are hard to vary | OpenReview

What We Get Wrong About AI & China—Asterisk

What is the curl of a vector field, really? – theHigherGeometer

Lens

What does it mean to understand how a scientific literature is put together? - Marginal REVOLUTION

Inside Argentina's currency exchange black markets | devonzuegel.com

[2307.05599] AlephZero and Mathematical Experience

Can we develop theoretical explanations for today’s AI systems? - generally intelligent

Anthropic \ Studying Large Language Model Generalization with…

Embroid: Correcting and Improving LLM Predictions Without Labels · Hazy Research
