Long-run Anki review load
(I like to think of the calculations done on this page as the “curse of spaced repetition”: if you keep adding a constant number of cards per day to Anki, your review load will grow (logarithmically) over time, meaning as the years go by you will have more and more work to do every day just to keep up with your existing knowledge.)
Let’s say you add $n$ cards to Anki per day. To simplify calculations for now, let’s say the ease factor is an integer, which we’ll call $r$, and you never get cards wrong, so each card is reviewed on the day it is created, then $r$ days later; after that, $r^2$ days later; after that, $r^3$ days later, and so on. What does the long-run review load look like? In other words, how many cards will be due on day $N$ for some large $N$?
On day $N$, there will be:
- $n$ cards due that were created that day,
- $n$ cards that were created $r$ days ago that are now due,
- $n$ cards that were created $r + r^2$ days ago that are now due,
- …
- $n$ cards that were created $r + r^2 + \cdots + r^k$ days ago that are now due,
where $k$ is the largest integer such that $r + r^2 + \cdots + r^k \le N$.
The sum $r + r^2 + \cdots + r^k = \frac{r^{k+1} - r}{r - 1}$ by the geometric series formula.
So we want $\frac{r^{k+1} - r}{r - 1} \le N$. For simplicity, we can just assume $N$ is chosen to be a day such that a $k$ exists that makes the two values exactly equal. Solving for $k$, we get $k = \log_r(N(r-1) + r) - 1$.
The actual number of cards due on day $N$ is $n(k+1)$, so finally we obtain $n \log_r(N(r-1) + r)$ for the number of cards due.
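As a quick sanity check on the derivation, here is a minimal Python sketch that compares the closed form against a direct count. It writes `n` for cards added per day, `r` for the (integer) ease factor, and `N` for the day number; the function names `review_load` and `direct_count` are made up for illustration. The two should agree on days where the cumulative offsets 0, r, r + r², … land exactly on `N`.

```python
from math import log


def review_load(N, n, r):
    """Closed-form estimate n * log_r(N(r-1) + r) of the cards due on day N."""
    return n * log(N * (r - 1) + r, r)


def direct_count(N, n, r):
    """Count due cards by walking the offsets 0, r, r + r^2, ... up to N.

    Assumes r is an integer, so every offset is a whole number of days.
    """
    batches, offset, interval = 0, 0, r
    while offset <= N:
        batches += 1          # one batch of n cards was created `offset` days ago
        offset += interval    # next batch is one more review interval back
        interval *= r         # intervals grow by the ease factor each time
    return n * batches


if __name__ == "__main__":
    # Days chosen so that N equals r + r^2 + ... + r^k exactly (r = 3).
    for N in (0, 3, 12, 39, 120):
        print(N, direct_count(N, n=5, r=3), round(review_load(N, n=5, r=3), 6))
```

On days between those exact offsets the direct count stays flat while the closed form keeps creeping up, so the formula is best read as a smooth approximation.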
For some reason, prior to going through the math above, I had the mistaken impression that because of spaced repetition’s exponential spacing rule, no matter how long I kept using Anki I’d have roughly the same number of reviews each day. But you can see this is wrong: the review load is logarithmic in the number of days, so the review load grows without bound! This means that (in asymptotic terms) a spaced repetition practitioner will at first have quite an easy time keeping up with cards, but if they keep adding roughly the same number of cards on average every day, then over time their daily practice will become more and more (unboundedly!) onerous.
So in the limit of indefinite lifespan, we have a few possibilities:
- We give up on learning new facts after a certain point, or learn fewer and fewer facts as time goes on.
- We “eat the cost”, agreeing to the unbounded review load as time goes on.
- We keep deleting cards as time goes on – maybe once “basic” facts are absorbed, we can delete them and instead have a smaller number of “synthesis” cards or something.
- We use a “superexponential” backoff schedule – something that makes cards go further out, enough to keep the review load constant.
- (Some sort of technological breakthrough that makes spaced repetition practice unnecessary.)
I find the first three options quite sad (I don’t want to stop learning, I don’t want burdensome reviews, and I don’t want data loss)! And I’m not sure how feasible the fourth option is. The fifth option is, well, probably what will end up happening, but it probably depends on thinking about AI, so it is outside the scope of this note.
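To get a feel for the “superexponential” backoff option, here is a rough Python sketch comparing the ordinary exponential schedule with a doubly exponential one. The doubly exponential schedule (gaps of 3, 9, 81, 6561, … days) is purely hypothetical, picked only for illustration, and the sketch says nothing about whether anyone would still remember the cards. It assumes 5 new cards per day and no failed reviews; the names `cards_due`, `exponential_gap`, and `doubly_exponential_gap` are invented here.

```python
def cards_due(N, n, gap):
    """Cards due on day N when n cards are added daily and a card's
    k-th review gap is gap(k) days (k = 1, 2, ...), with no lapses."""
    batches, offset, k = 0, 0, 1
    while offset <= N:
        batches += 1        # the batch created `offset` days ago is due today
        offset += gap(k)    # step back by the next review gap
        k += 1
    return n * batches


def exponential_gap(k):
    """Ordinary schedule with ease factor 3: gaps 3, 9, 27, 81, ..."""
    return 3 ** k


def doubly_exponential_gap(k):
    """Hypothetical superexponential schedule: gaps 3, 9, 81, 6561, ..."""
    return 3 ** (2 ** (k - 1))


if __name__ == "__main__":
    for years in (1, 10, 50):
        N = 365 * years
        print(years,
              cards_due(N, 5, exponential_gap),
              cards_due(N, 5, doubly_exponential_gap))
```

With a fixed per-card schedule like this one the count still creeps upward, only far more slowly, so it is “constant” only in a practical sense.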
Of course, the behavior in the limit might not concern us if we only expect to live a typical human lifespan, and so we can hope the constants are nice enough that the review load stays small. Taking $n = 5$ cards added per day and an ease factor of $r = 2.5$ (we started out assuming the ease factor is an integer, but there’s nothing in the final formula that requires it to be such), we get:
Year | Review load at start of year (cards due per day) |
---|---|
1 | 5 |
2 | 34 |
3 | 38 |
4 | 40 |
5 | 42 |
6 | 43 |
7 | 44 |
8 | 45 |
9 | 46 |
10 | 46 |
11 | 47 |
12 | 47 |
13 | 48 |
14 | 48 |
15 | 49 |
16 | 49 |
17 | 50 |
18 | 50 |
19 | 50 |
20 | 50 |
21 | 51 |
22 | 51 |
23 | 51 |
24 | 52 |
25 | 52 |
26 | 52 |
27 | 52 |
28 | 52 |
29 | 53 |
30 | 53 |
31 | 53 |
32 | 53 |
33 | 53 |
34 | 53 |
35 | 54 |
36 | 54 |
37 | 54 |
38 | 54 |
39 | 54 |
40 | 54 |
41 | 55 |
42 | 55 |
43 | 55 |
44 | 55 |
45 | 55 |
46 | 55 |
47 | 55 |
48 | 55 |
49 | 56 |
50 | 56 |
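
The table can be regenerated from the closed-form expression. The sketch below assumes 5 cards per day, an ease factor of 2.5, 365-day years, and rounding to the nearest whole card (which is what appears to reproduce the rows above); `load_at_start_of_year` is a made-up helper name.

```python
from math import log

CARDS_PER_DAY = 5   # n
EASE_FACTOR = 2.5   # r


def load_at_start_of_year(year, n=CARDS_PER_DAY, r=EASE_FACTOR):
    """Closed-form review load n * log_r(N(r-1) + r) on the first day of `year`.

    Year 1 starts on day N = 0, year 2 on day N = 365, and so on.
    """
    N = 365 * (year - 1)
    return n * log(N * (r - 1) + r, r)


if __name__ == "__main__":
    print("Year | Review load at start of year |")
    print("---|---|")
    for year in range(1, 51):
        print(f"{year} | {round(load_at_start_of_year(year))} |")
```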
We can compare this to other hobbies. A tennis player might not enjoy the thought that as they keep playing tennis throughout the years, they will have to spend more and more time on it each day (unboundedly!). But they would probably find it a bargain that on an average day of their 20th year playing tennis, they will only have to spend 47% ($50/34 \approx 1.47$) more effort on it than they spend on an average day near the end of one year of practice. (One imagines that at the end of the first year they are still a new hobby player, casually playing, while after 20 years they are significantly more invested in the game and play a lot more as a result.)
Appendix: simulation
Here’s some Python code that will simulate reviews:

    #!/usr/bin/env python3

    # each day will be a list of integers giving the interval lengths of the cards
    # that are reviewed on that day
    days = []
    final_load = []


    def inner_append(lst, int_idx, val):
        # grow lst with empty lists until index int_idx exists, then append val
        if len(lst) < int_idx + 1:
            for _ in range(int_idx - len(lst) + 1):
                lst.append([])
        lst[int_idx].append(val)


    if __name__ == "__main__":
        for n in range(365*50+1):
            # new cards added on this day (5 per day, first interval of 3 days)
            for _ in range(5):
                inner_append(days, int_idx=n, val=3)
            # record the review load for this day
            final_load.append(len(days[n]))
            # do the reviews: each card comes back after its current interval,
            # and its next interval is 3 times as long (integer ease factor 3)
            for card in list(days[n]):
                inner_append(days, int_idx=n + card, val=card * 3)
                days[n].remove(card)
        # print("days:", days)
        # print("final_load:", final_load)
        # year 0 here is the first day of use
        for year in range(50):
            print(f"year {year}: {final_load[365*year]}")

External links
- After I wrote all of the above, I came across this reddit post that provides some empirical demonstration