This page keeps track of AI safety papers I am reading, to help me remember why I want to read a paper, where I get stuck, etc.
The “Status/lessons” column tracks where I am in the paper, what I didn’t understand, what background I seem to be missing, etc.
The “Source/motivation” column tracks how I came across the paper and why I want to read it. (These two things are often connected, so I combined them into one column.)
| Paper | Status/lessons | Source/motivation |
|---|---|---|
| “Reflective Oracles: A Foundation for Game Theory in Artificial Intelligence” | At statement of Theorem 4.1. Decided I wasn’t comfortable enough with game theory (2019-01-09). | I’ve seen this paper mentioned a bunch. |
| “Logical Induction” | I read the beginning parts of this paper twice and watched Andrew Critch’s talk on YouTube. I am slowly digesting the definitions and so forth. | This seems to be one of MIRI’s big results, so I want to understand it. I think I originally decided to read it because I wanted to understand decision theory better. |
| “AI safety via debate” | Finished reading the paper (2019-01-04, 2019-01-05). I think I need to know more about computational complexity (to appreciate the debate hierarchy analogy) and about machine learning in general. | I wanted to understand the Paul/OpenAI approach better. |
| “Supervising strong learners by amplifying weak experts” | Finished reading the paper (2019-01-05). I think I need more familiarity with machine learning to appreciate the paper. | I wanted to understand the Paul/OpenAI approach better. |