curius graph

all pages

showing 133351-133400 of 160880 pages (sorted by popularity)

[2212.03827] Discovering Latent Knowledge in Language Models Without Supervision

Discover 1X | A Gearless Future

Choosing the right GPU for deep learning on AWS | by Shashank Prasanna | Towards Data Science

Why Nvidia’s AI Supremacy is Only Temporary « Pete Warden's blog

[2202.11233] Retrieval Augmented Classification for Long-Tail Visual Recognition

HQQ quantization

Quantization

Quantization for Neural Networks - Lei Mao's Log Book

[2212.08045] CLIPPO: Image-and-Language Understanding from Pixels Only

[2403.03163] Design2Code: How Far Are We From Automating Front-End Engineering?

How I Learned to Concentrate | The New Yorker

How to Build a GPT-3 for Science | Future

Command-R: Retrieval Augmented Generation at Production Scale

How to ML Paper - A brief Guide - Google Docs

Timo Schick | Toolformer: Language Models Can Teach Themselves to Use Tools - YouTube

Steering Llama-2 with contrastive activation additions — LessWrong

[2402.02622] DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging

[2402.15449] Repetition Improves Language Model Embeddings

2403.17919.pdf

2308.14711.pdf

arxiv.org/pdf/2301.03598.pdf

TrueSkill Through Time: Revisiting the History of Chess - Microsoft Research

Rachel Thomas, PhD - Making Peace with Personal Branding

Erik Demaine

[2305.19452] Bigger, Better, Faster: Human-level Atari with human-level efficiency

2310.15421.pdf

More Agents Is All You Need

2311.04930.pdf

Min P style sampling - an alternative to Top P/TopK · Issue #27670 · huggingface/transformers

Can tech save us from worst of climate change effects? Doesn’t look good — Harvard Gazette

Answer.AI - SB-1047 will stifle open-source AI and decrease safety

What is memory safety and why does it matter? - Prossimo

Critics Say Harvard’s Endowment Is Underperforming and Overly Secretive. Is It? | News | The Harvard Crimson

Let's reproduce GPT-2 (124M) - YouTube

dlc-handout-13-2-attention-mechanisms.pdf

uncertain_ground_truth/monte_carlo.py at main · google-deepmind/uncertain_ground_truth

Contributions QX

[2410.06508] Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

[2501.06252] $\text{Transformer}^2$: Self-adaptive LLMs

Introduction to Poker Theory - YouTube

[2104.03113] Scaling Scaling Laws with Board Games

Model Merging, Mixtures of Experts, and Towards Smaller LLMs

Exclusive | The Secrets and Misdirection Behind Sam Altman’s Firing From OpenAI - WSJ

Advanced Web Scraping With Python: Extract Data From Any Site

Simple Alpha Zero

Rich's Top 10 Readings - Google Docs

Random Network Distillation: a new take on Curiosity-Driven Learning | by Thomas Simonini | data from the trenches | Medium

The Lean FRO Year 2 Roadmap — Lean FRO

synthetic-data-kit/use-cases/awesome-synthetic-data-papers at main · meta-llama/synthetic-data-kit

werner-duvaud/muzero-general: MuZero