Long-run Anki review load


(I like to think of the calculations done on this page as the “curse of spaced repetition”: if you keep adding a constant number of cards per day to Anki, your review load will grow (logarithmically) over time, meaning as the years go by you will have more and more work to do every day just to keep up with your existing knowledge.)

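Let’s say you add $c$ cards to Anki per day. To simplify calculations for now, let’s say the ease factor is an integer, which we’ll call $f$, and you never get cards wrong, so each card is reviewed on the day it is created, then $f$ days later; after that, $f^2$ days later; after that, $f^3$ days later, and so on. What does the long-run review load look like? In other words, how many cards will be due on day $n$ for some large $n$?

On day $n$, there will be:

* $c$ cards created on day $n$ (the new cards)
* $c$ cards created on day $n - f$
* $c$ cards created on day $n - f - f^2$
* ...
* $c$ cards created on day $n - (f + f^2 + \cdots + f^k)$

where $k$ is the largest integer such that $f + f^2 + \cdots + f^k \le n$.

The sum $f + f^2 + \cdots + f^k = \frac{f^{k+1} - f}{f - 1}$ by the geometric series formula.

So we want $\frac{f^{k+1} - f}{f - 1} \le n$. For simplicity, we can just assume $n$ is chosen to be a day such that a $k$ exists that makes the two values exactly equal. Solving for $k$, we get $k = \log_f(n(f - 1) + f) - 1$.

The actual number of cards due on day $n$ is $c(k + 1)$, so finally we obtain $c \log_f(n(f - 1) + f)$ for the number of cards due.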
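The derivation above can be sanity-checked numerically. The following is a minimal sketch, not part of the original calculation: `c` stands for the cards added per day, `f` for the ease factor, `n` for the day number, and the function names are just labels chosen for this example. It compares the term-by-term geometric sum with its closed form, and evaluates the load formula $c \log_f(n(f - 1) + f)$ on days where the sum hits $n$ exactly.

```python
from math import log

def geometric_sum(f, k):
    """f + f^2 + ... + f^k, summed term by term."""
    return sum(f**i for i in range(1, k + 1))

def geometric_closed_form(f, k):
    """The same sum via the geometric series formula (f^(k+1) - f) / (f - 1)."""
    return (f**(k + 1) - f) / (f - 1)

def review_load(c, f, n):
    """Cards due on day n according to c * log_f(n(f - 1) + f)."""
    return c * log(n * (f - 1) + f, f)

if __name__ == "__main__":
    c, f = 5, 3  # illustrative values only
    for k in range(6):
        n = geometric_sum(f, k)  # a day on which the sum equals n exactly
        print(k, n, geometric_closed_form(f, k), review_load(c, f, n))
```

On each printed row the load should come out as (up to floating-point error) `c * (k + 1)`, which is the count of cards due obtained before substituting for $k$.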

For some reason, prior to going through the math above, I had the mistaken impression that because of spaced repetition’s exponential spacing rule, no matter how long I kept using Anki I’d have roughly the same number of reviews each day. But you can see this is wrong: the review load is logarithmic in the number of days, so the review load grows without bound! This means that (in asymptotic terms) a spaced repetition practitioner will at first have quite an easy time keeping up with cards, but if they keep adding roughly the same number of cards on average every day, then over time their daily practice will become more and more (unboundedly!) onerous.
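To get a feel for how slowly "unbounded but logarithmic" growth actually proceeds, one can invert the load formula. If the number of cards due on day $n$ is $c \log_f(n(f - 1) + f)$, with $c$ cards added per day and ease factor $f$, then a load of $L$ cards is reached on day $n = (f^{L/c} - f)/(f - 1)$. The sketch below is only an illustration under that formula; the function name and the parameter values are assumptions made for the example.

```python
def days_until_load(c, f, load):
    """Day n on which c * log_f(n(f - 1) + f) equals the given load."""
    return (f ** (load / c) - f) / (f - 1)

if __name__ == "__main__":
    c, f = 5, 2.5  # illustrative values, chosen only for this example
    for load in range(10, 61, 10):
        print(f"load {load}: day {days_until_load(c, f, load):,.0f}")
```

Each additional $c$ reviews per day takes roughly $f$ times as many days to arrive as the previous $c$ did, which is why the load never stops growing yet grows very slowly.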

So in the limit of indefinite lifespan, we have a few possibilities:

I find the first three options quite sad (I don’t want to stop learning, I don’t want burdensome reviews, and I don’t want data loss)! And I’m not sure how feasible the third option is. The fourth option is, well, probably what will end up happening, but it depends on thinking about AI, so it is outside the scope of this note.

Of course, the behavior in the limit might not concern us if we only expect to live a typical human lifespan, and so we can hope the constants are nice enough that the review load stays small. Taking $c = 5$ cards per day and an ease factor of $f = 2.5$ (we started out assuming the ease factor is an integer, but there’s nothing in the final formula that requires it to be such), and rounding to the nearest card, we get:

Year Review load at start of year
1 5
2 34
3 38
4 40
5 42
6 43
7 44
8 45
9 46
10 46
11 47
12 47
13 48
14 48
15 49
16 49
17 50
18 50
19 50
20 50
21 51
22 51
23 51
24 52
25 52
26 52
27 52
28 52
29 53
30 53
31 53
32 53
33 53
34 53
35 54
36 54
37 54
38 54
39 54
40 54
41 55
42 55
43 55
44 55
45 55
46 55
47 55
48 55
49 56
50 56
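The table can be regenerated from the closed-form load $c \log_f(n(f - 1) + f)$. The sketch below assumes 5 cards per day, an ease factor of 2.5, 365-day years, and rounding to the nearest integer; these assumptions reproduce the rows I checked by hand (years 1, 2, 10, and 50), but the function and constant names are just labels for this example.

```python
from math import log

CARDS_PER_DAY = 5  # assumption: matches the first table row
EASE = 2.5         # assumption: non-integer ease factor

def load_at_start_of_year(year, c=CARDS_PER_DAY, f=EASE):
    """Rounded review load on the first day of the given (1-indexed) year."""
    n = 365 * (year - 1)  # day number; year 1 starts on day 0
    return round(c * log(n * (f - 1) + f, f))

if __name__ == "__main__":
    print("Year  Review load at start of year")
    for year in range(1, 51):
        print(f"{year:<5} {load_at_start_of_year(year)}")
```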

We can compare this to other hobbies. A tennis player might not enjoy the thought that as they keep playing tennis throughout the years, they will have to spend more and more time on it each day (unboundedly!). But they would probably find it a bargain that on an average day of their 20th year playing tennis, they will only have to spend 47% ($50/34 \approx 1.47$) more effort on it than they spend on an average day near the end of one year of practice. (One imagines that at the end of the first year they are still a new hobby player, casually playing, while after 20 years they are significantly more invested in the game and play a lot more as a result.)

Appendix: simulation

Here’s some Python code that will simulate reviews:

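#!/usr/bin/env python3

# Simulation parameters. EASE is an integer here so that every interval is a
# whole number of days, as in the simplified model above.
CARDS_PER_DAY = 5
EASE = 3
YEARS = 50

# each day will be a list of integers giving the interval lengths of the cards
# that are reviewed on that day
days = []

# final_load[n] is the number of cards reviewed on day n
final_load = []

def inner_append(lst, int_idx, val):
    """Append val to lst[int_idx], first growing lst with empty lists if needed."""
    while len(lst) <= int_idx:
        lst.append([])
    lst[int_idx].append(val)

if __name__ == "__main__":
    for n in range(365 * YEARS + 1):
        # new cards added on this day; each is reviewed today and starts with
        # an interval of EASE days
        for _ in range(CARDS_PER_DAY):
            inner_append(days, int_idx=n, val=EASE)

        # record the review load for this day
        final_load.append(len(days[n]))

        # do the reviews: each card comes back after its current interval,
        # and its next interval is EASE times longer
        for card in days[n]:
            inner_append(days, int_idx=n + card, val=card * EASE)
        # every card has been rescheduled, so this day's list can be emptied
        days[n] = []

        # print("days:", days)
        # print("final_load:", final_load)

    # "year 0" here is the start of the first year ("Year 1" in the table)
    for year in range(YEARS):
        print(f"year {year}: {final_load[365*year]}")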
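For an integer ease factor, the simulated load can also be compared directly with the count from the derivation: on day $n$ the number of cards due is $c(k + 1)$, where $k$ is the largest integer with $f + f^2 + \cdots + f^k \le n$. The sketch below is a compact alternative to the script above, not a replacement for it. It only counts reviews per day rather than tracking individual cards, and the function names `simulate` and `exact_load` are labels invented for this example.

```python
def simulate(cards_per_day, ease, num_days):
    """Return a list whose entry n is the number of reviews on day n."""
    # Review offsets (days after creation) of one card: 0, f, f + f^2, ...
    offsets = []
    offset, interval = 0, ease
    while offset < num_days:
        offsets.append(offset)
        offset += interval
        interval *= ease

    due = [0] * num_days
    for created in range(num_days):
        for off in offsets:
            if created + off >= num_days:
                break
            due[created + off] += cards_per_day
    return due

def exact_load(cards_per_day, ease, n):
    """c * (k + 1), where k is the largest integer with f + ... + f^k <= n."""
    k, total, power = 0, 0, ease
    while total + power <= n:
        total += power
        power *= ease
        k += 1
    return cards_per_day * (k + 1)

if __name__ == "__main__":
    loads = simulate(cards_per_day=5, ease=3, num_days=365 * 5)
    mismatches = [n for n, load in enumerate(loads) if load != exact_load(5, 3, n)]
    print("days where simulation and formula disagree:", mismatches)
```

The integer loop in `exact_load` avoids taking a floating-point logarithm at exact powers of the ease factor, where rounding error could push the floor the wrong way.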
