curius graph
all pages
showing 133351-133400 of 160880 pages (sorted by popularity)
[2212.03827] Discovering Latent Knowledge in Language Models Without Supervision
1 user ▼
Discover 1X | A Gearless Future
1 user ▼
Choosing the right GPU for deep learning on AWS | by Shashank Prasanna | Towards Data Science
1 user ▼
Why Nvidia’s AI Supremacy is Only Temporary « Pete Warden's blog
1 user ▼
[2202.11233] Retrieval Augmented Classification for Long-Tail Visual Recognition
1 user ▼
HQQ quantization
1 user ▼
Quantization
1 user ▼
Quantization for Neural Networks - Lei Mao's Log Book
1 user ▼
[2212.08045] CLIPPO: Image-and-Language Understanding from Pixels Only
1 user ▼
[2403.03163] Design2Code: How Far Are We From Automating Front-End Engineering?
1 user ▼
How I Learned to Concentrate | The New Yorker
1 user ▼
How to Build a GPT-3 for Science | Future
1 user ▼
Command-R: Retrieval Augmented Generation at Production Scale
1 user ▼
How to ML Paper - A brief Guide - Google Docs
1 user ▼
Timo Schick | Toolformer: Language Models Can Teach Themselves to Use Tools - YouTube
1 user ▼
Steering Llama-2 with contrastive activation additions — LessWrong
1 user ▼
[2402.02622] DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging
1 user ▼
[2402.15449] Repetition Improves Language Model Embeddings
1 user ▼
2403.17919.pdf
1 user ▼
2308.14711.pdf
1 user ▼
arxiv.org/pdf/2301.03598.pdf
1 user ▼
TrueSkill Through Time: Revisiting the History of Chess - Microsoft Research
1 user ▼
Rachel Thomas, PhD - Making Peace with Personal Branding
1 user ▼
Erik Demaine
1 user ▼
[2305.19452] Bigger, Better, Faster: Human-level Atari with human-level efficiency
1 user ▼
2310.15421.pdf
1 user ▼
More Agents Is All You Need
1 user ▼
2311.04930.pdf
1 user ▼
Min P style sampling - an alternative to Top P/TopK · Issue #27670 · huggingface/transformers
1 user ▼
Can tech save us from worst of climate change effects? Doesn’t look good — Harvard Gazette
1 user ▼
Answer.AI - SB-1047 will stifle open-source AI and decrease safety
1 user ▼
What is memory safety and why does it matter? - Prossimo
1 user ▼
Critics Say Harvard’s Endowment Is Underperforming and Overly Secretive. Is It? | News | The Harvard Crimson
1 user ▼
Let's reproduce GPT-2 (124M) - YouTube
1 user ▼
dlc-handout-13-2-attention-mechanisms.pdf
1 user ▼
uncertain_ground_truth/monte_carlo.py at main · google-deepmind/uncertain_ground_truth
1 user ▼
Contributions QX
1 user ▼
[2410.06508] Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning
1 user ▼
[2501.06252] $\text{Transformer}^2$: Self-adaptive LLMs
1 user ▼
Introduction to Poker Theory - YouTube
1 user ▼
[2104.03113] Scaling Scaling Laws with Board Games
1 user ▼
Model Merging, Mixtures of Experts, and Towards Smaller LLMs
1 user ▼
Exclusive | The Secrets and Misdirection Behind Sam Altman’s Firing From OpenAI - WSJ
1 user ▼
Advanced Web Scraping With Python: Extract Data From Any Site
1 user ▼
Simple Alpha Zero
1 user ▼
Rich's Top 10 Readings - Google Docs
1 user ▼
Random Network Distillation: a new take on Curiosity-Driven Learning | by Thomas Simonini | data from the trenches | Medium
1 user ▼
The Lean FRO Year 2 Roadmap — Lean FRO
1 user ▼
synthetic-data-kit/use-cases/awesome-synthetic-data-papers at main · meta-llama/synthetic-data-kit
1 user ▼
werner-duvaud/muzero-general: MuZero
1 user ▼