API design for precomputation cache [closed]

In my numeric code library I have a function totient_sum that depends on an expensive one-time precomputation totsum_range = [...]; once that table exists, individual calls to totient_sum(n) are quick.
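
For concreteness, here is a minimal sketch of the kind of precomputation I mean. The sieve-plus-prefix-sum construction below is only an assumption for illustration (the real totient_sum is more involved), and build_totsum_range is a name invented for this sketch.

```python
# Hypothetical sketch: build the prefix sums of Euler's totient up to `limit`
# with a standard sieve; a query is then just a table lookup.
def build_totsum_range(limit):
    phi = list(range(limit + 1))          # phi[i] will become Euler's totient of i
    for p in range(2, limit + 1):
        if phi[p] == p:                   # p is prime: untouched by smaller primes
            for m in range(p, limit + 1, p):
                phi[m] -= phi[m] // p
    totsum_range = [0] * (limit + 1)      # totsum_range[n] = phi(1) + ... + phi(n)
    running = 0
    for i in range(1, limit + 1):
        running += phi[i]
        totsum_range[i] = running
    return totsum_range

totsum_range = build_totsum_range(10**6)  # expensive, done once
print(totsum_range[10])                   # each query is now a cheap lookup: 32
```

There are several ways I could manage this precomputation as an API (rough call-site sketches for each option follow the list):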

  1. totient_sum(n, totsum_range): The user is responsible for doing the precomputation explicitly and passing it in. This gives the most control, but it is more cumbersome to pass the precomputed table into every call. CLRS algorithms are usually written this way, and it is the most functional style (pure, no hidden state).

  2. A TotientSum class with __init__: The user creates an instance like TotientSum(10**8), whose constructor does the precomputation and stores it in self.totsum_range. The query is then a method call on that instance, TotientSum.totient_sum(n). The nice thing about this is that it lets the user control when (and how large) the initialization is. I believe sympy's Sieve class is structured this way.

  3. totient_sum(n) with a hidden module-level cache: The library maintains hidden storage for the precomputation, which starts out uninitialized. The first time totient_sum(n) is called, the precomputation is done and stored, and subsequent calls can extend the stored list if needed. This is roughly how I believe sympy's module-level sieve instance behaves. It is convenient, at the expense of computation happening by hidden magic from the user's point of view.

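To make the trade-offs concrete, here are rough call-site sketches of the three options, reusing the hypothetical build_totsum_range helper from above. These illustrate the shapes of the APIs, not the actual implementation.

```python
# Option 1: pure function, the caller owns the precomputed table.
def totient_sum(n, totsum_range):
    return totsum_range[n]

table = build_totsum_range(10**6)         # explicit, visible cost
print(totient_sum(10, table))

# Option 2: a class, so the caller controls when the expensive work happens.
class TotientSum:
    def __init__(self, limit):
        self.limit = limit
        self.totsum_range = build_totsum_range(limit)   # expensive, explicit

    def totient_sum(self, n):
        return self.totsum_range[n]

ts = TotientSum(10**6)
print(ts.totient_sum(10))

# Option 3: a bare function backed by a hidden module-level cache that is built
# lazily and regrown on demand (here by rebuilding; a real version might extend
# the existing table in place instead).
_totsum_cache = []

def totient_sum_cached(n):
    global _totsum_cache
    if n >= len(_totsum_cache):
        _totsum_cache = build_totsum_range(max(n, 2 * len(_totsum_cache)))
    return _totsum_cache[n]

print(totient_sum_cached(10))             # precomputation happens invisibly here
```
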
Did I miss anything in my consideration of the pros and cons? To me it is a balancing act between how much the user does explicitly and how much the module does implicitly. Most widely used Python libraries seem to lean heavily toward making classes for everything rather than exposing bare functions or a pure functional approach.
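
(One more data point on the class-heavy tendency: as far as I can tell, sympy actually combines options 2 and 3; it defines the Sieve class but also ships a module-level default instance, sympy.sieve, that its convenience functions lean on. A hybrid along those lines, again just a sketch reusing the hypothetical TotientSum class above, might look like this:)

```python
# Hybrid of options 2 and 3: the class stays public for users who want control,
# and a module-level default instance backs a convenience function (roughly the
# pattern of sympy's module-level `sieve` instance).
_default = None

def totient_sum_default(n):
    global _default
    if _default is None or n > _default.limit:
        _default = TotientSum(max(n, 10**6))   # lazy, hidden (re)initialization
    return _default.totient_sum(n)

print(totient_sum_default(10))                 # convenient path, hidden cache
print(TotientSum(100).totient_sum(10))         # explicit path still available
```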