| Date | Presenter | Paper |
| --- | --- | --- |
| January 20, 2026 | Habeeb Abdulfatah | Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities |
| January 13, 2026 | Justin Dollman | RepliBench: Evaluating the Autonomous Replication Capabilities of Language Model Agents |
| December 16, 2025 | Matt Broerman | UK AISI Align Evaluation Case-Study |
| December 9, 2025 | Kanishk Gandhi (author) | Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs |
| December 2, 2025 | Hyunwoo Kim (author) | Hypothesis-Driven Theory-of-Mind Research for Large Language Models |
| November 18, 2025 | Lorenzo Pacchiardi (author) | PredictaBoard: Benchmarking LLM Score Predictability |
| November 11, 2025 | Mark Keavney | Agentic Reinforcement Learning for Search is Unsafe |
| November 4, 2025 | Preeti Ravindra | AgentMisalignment: Measuring the Propensity for Misaligned Behaviour in LLM-Based Agents |
| October 28, 2025 | Paolo Bova | TextQuests: How Good are LLMs at Text-Based Video Games? |
| October 21, 2025 | Chris Ackerman (author) | Evidence for Limited Metacognition in LLMs |
| October 14, 2025 | Wyatt Boyer | Building and Evaluating Alignment Auditing Agents |
| October 7, 2025 | Miguel Guirao | Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition |
| September 30, 2025 | Tegan Green | Secret Collusion among AI Agents: Multi-Agent Deception via Steganography |
| September 23, 2025 | Achu Menon | When Chain of Thought is Necessary, Language Models Struggle to Evade Monitors |
| September 16, 2025 | Sydney Von Arx (author) | CoT May Be Highly Informative Despite “Unfaithfulness” |
| September 9, 2025 | Iván Arcuschin (author) | Chain-of-Thought Reasoning In The Wild Is Not Always Faithful |
| September 2, 2025 | Linda Liu | AI Sandbagging: Language Models can Strategically Underperform on Evaluations |
| August 26, 2025 | Ashly Jiju | Reasoning Models Don't Always Say What They Think |
| August 19, 2025 | Miguel Guirao | Language Models Don't Always Say What They Think |
| August 12, 2025 | Ceyda Guzelsevdi | Comparing Humans, GPT-4, and GPT-4V On Abstraction and Reasoning Tasks |
| August 5, 2025 | Tegan Green | Safetywashing: Do AI Safety Benchmarks Actually Measure Safety Progress? |
| July 29, 2025 | Justin Dollman | Measuring Faithfulness in Chain-of-Thought Reasoning |
| July 22, 2025 | Matt Broerman | Audit Cards: Contextualizing AI Evaluations |
| July 15, 2025 | Morgan Sinclaire | An Example Safety Case for Safeguards Against Misuse |
| July 8, 2025 | Aditya Thomas | Alignment faking in large language models |
| July 1, 2025 | Morgan Sinclaire | Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation |
| June 24, 2025 | Paolo Bova | General Scales Unlock AI Evaluation with Explanatory and Predictive Power |
| June 17, 2025 | Justin Dollman | Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs |
| June 10, 2025 | Justin Dollman | Ctrl-Z: Controlling AI Agents via Resampling |
| May 27, 2025 | Morgan Sinclaire | Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs |
| May 13, 2025 | - | 100+ concrete projects and open problems in evals |
| April 29, 2025 | Matt Broerman | Sabotage Evaluations for Frontier Models |
| April 1, 2025 | Matt Broerman | Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations |