Sample Pages (Top 50 by confidence)
ML Mentorship: Some Q/A about RL | Eric Jang
https://evjang.com/2021/07/30/rl-qa.html
1 user
Last: Jan 07, 2026
100% confidence
The Difficulty of Passive Learning in Deep Reinforcement Learning
https://psc-g.github.io/posts/research/rl/tandem
1 user
Last: Jan 07, 2026
100% confidence
2020 RL highlights
https://psc-g.github.io/posts/research/rl/2020highlights
1 user
Last: Jan 07, 2026
100% confidence
7 Model-based RL Papers I liked from NeurIPS 2020 · RL & other stories
https://proceduralia.github.io/home/2020/12/30/neurips2020.html
1 user
Last: Jan 07, 2026
100% confidence
Solving Montezuma's Revenge with Planning and Reinforcement Learning
http://agarri.ga/publication/solving-mr-planning-rl
1 user
Last: Jan 07, 2026
100% confidence
Exploration for the Efficient Deployment of Reinforcement Learning Agents
https://openreview.net/pdf?id=E3zbXrF2Xq
1 user
Last: Jan 07, 2026
100% confidence
Understanding Actor Critic Methods and A2C
https://towardsdatascience.com/understanding-actor-critic-methods-931b97b6df3f
1 user
Last: Jan 07, 2026
100% confidence
RAGEN - RL Agent
https://ragen-ai.github.io
1 user
Last: Jan 07, 2026
100% confidence
An Introduction to Deep Reinforcement Learning
https://huggingface.co/blog/deep-rl-intro
1 user
Last: Jan 07, 2026
100% confidence
Exploration Strategies in Deep Reinforcement Learning | Lil'Log
https://lilianweng.github.io/posts/2020-06-07-exploration-drl
1 user
Last: Jan 07, 2026
100% confidence
Quasimetric RL (QRL)
https://www.tongzhouwang.info/quasimetric_rl
1 user
Last: Jan 07, 2026
100% confidence
Is Value Learning Really the Main Bottleneck in Offline RL? | HTML5
https://ar5iv.labs.arxiv.org/html/2406.09329
1 user
Last: Jan 07, 2026
100% confidence
Graph-Assisted Stitching for Offline Hierarchical Reinforcement Learning | HTML5
https://ar5iv.labs.arxiv.org/html/2506.07744
1 user
Last: Jan 07, 2026
100% confidence
When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? | HTML5
https://ar5iv.labs.arxiv.org/html/2204.05618
1 user
Last: Jan 07, 2026
100% confidence
The Challenges of Exploration for Offline Reinforcement Learning | HTML5
https://ar5iv.labs.arxiv.org/html/2201.11861
1 user
Last: Jan 07, 2026
100% confidence
Behavior From the Void: Unsupervised Active Pre-Training | HTML5
https://ar5iv.labs.arxiv.org/html/2103.04551
1 user
Last: Jan 07, 2026
100% confidence
Temporal Difference Models: Model-Free Deep RL for Model-Based Control | HTML5
https://ar5iv.labs.arxiv.org/html/1802.09081
1 user
Last: Jan 07, 2026
100% confidence
Provably Good Batch Reinforcement Learning Without Great Exploration | HTML5
https://ar5iv.labs.arxiv.org/html/2007.08202
1 user
Last: Jan 07, 2026
100% confidence
[2006.10701] Deep Reinforcement Learning amidst Lifelong Non-Stationarity
https://ar5iv.labs.arxiv.org/html/2006.10701
1 user
Last: Jan 07, 2026
100% confidence
Reinforcement learning with prediction-based rewards
https://openai.com/research/reinforcement-learning-with-prediction-based-rewards
1 user
Last: Jan 07, 2026
100% confidence
2022 - by Julian Quevedo - reinforcement learning
https://jujujulian.substack.com/p/2022?s=r
1 user
Last: Jan 07, 2026
100% confidence
WBE and DRL: a Middle Way of imitation learning from the human brain : r/reinforcementlearning
https://www.reddit.com/r/reinforcementlearning/comments/9pwy2f/wbe_and_drl_a_mid...
1 user
Last: Jan 07, 2026
100% confidence
Reinforcement Learning: An Introduction
http://incompleteideas.net/book/RLbook2020.pdf
2 users
Last: Jan 07, 2026
100% confidence
Exploration via Elliptical Episodic Bonuses
https://e3bagent.github.io
1 user
Last: Jan 07, 2026
100% confidence
🗂️ A Taxonomy of Reinforcement Learning Algorithms | Arushi Somani
https://www.amks.me/notes/taxonomy
1 user
Last: Jan 07, 2026
100% confidence
Neural Architecture Search
https://lilianweng.github.io/lil-log/2020/08/06/neural-architecture-search.html
1 user
Last: Jan 07, 2026
100% confidence
Reinforcement Learning [Part 0] | Luca Palmieri
https://www.lpalmieri.com/posts/rl-introduction-00
1 user
Last: Jan 07, 2026
100% confidence
Rapidly exploring random tree - Wikipedia
https://en.wikipedia.org/wiki/Rapidly_exploring_random_tree
1 user
Last: Jan 07, 2026
100% confidence
Reinforcement Learning: Exploring Policy vs. Value-Based Methods
https://dataheadhunters.com/academy/reinforcement-learning-exploring-policy-vs-v...
1 user
Last: Jan 07, 2026
100% confidence
Reinforcement Learning with Prediction-Based Rewards
https://openai.com/blog/reinforcement-learning-with-prediction-based-rewards
1 user
Last: Jan 07, 2026
100% confidence
A (Long) Peek into Reinforcement Learning
https://lilianweng.github.io/lil-log/2018/02/19/a-long-peek-into-reinforcement-l...
1 user
Last: Jan 07, 2026
100% confidence
Curiosity-driven Exploration by Self-supervised Prediction
https://pathak22.github.io/noreward-rl
1 user
Last: Jan 07, 2026
100% confidence
It's Time to Move On: Primacy Bias and Why It Helps to Forget | ICLR Blogposts 2024
https://iclr-blogposts.github.io/2024/blog/primacy-bias-and-why-it-helps-to-forg...
1 user
Last: Jan 07, 2026
100% confidence
Domain Randomization for Sim2Real Transfer | Lil'Log
https://lilianweng.github.io/posts/2019-05-05-domain-randomization
2 users
Last: Jan 07, 2026
100% confidence
Contrastive Learning As a Reinforcement Learning Algorithm
https://ben-eysenbach.github.io/contrastive_rl
1 user
Last: Jan 07, 2026
100% confidence
[2107.02850] Survey of Self-Play in Reinforcement Learning
https://arxiv.org/pdf/2107.02850.pdf
1 user
Last: Jan 07, 2026
100% confidence
Finding Geodesics on Graphs Using Reinforcement Learning
https://arxiv.org/pdf/2010.04820.pdf
1 user
Last: Jan 07, 2026
100% confidence
Reinforcement Learning in Newcomblike Problems
https://proceedings.neurips.cc/paper/2021/file/b9ed18a301c9f3d183938c451fa183df-...
2 users
Last: Jan 07, 2026
100% confidence
Whiteson Research Lab | University of Oxford
https://whirl.cs.ox.ac.uk/pages/research.html
1 user
Last: Jan 07, 2026
100% confidence
Evolving Curricula
https://accelagent.github.io
1 user
Last: Jan 07, 2026
100% confidence
Epsilon-Greedy Q-learning | Baeldung on Computer Science
https://www.baeldung.com/cs/epsilon-greedy-q-learning
1 user
Last: Jan 07, 2026
100% confidence
Part 1: Key Concepts in RL — Spinning Up documentation
https://spinningup.openai.com/en/latest/spinningup/rl_intro.html?highlight=advan...
1 user
Last: Jan 07, 2026
100% confidence
State–action–reward–state–action - Wikipedia
https://en.wikipedia.org/wiki/State%E2%80%93action%E2%80%93reward%E2%80%93state%...
1 user
Last: Jan 07, 2026
100% confidence