
Moka counter & multiple cache integration #179

Closed
hellrezaa opened this issue Sep 12, 2022 · 3 comments
Assignees: tatsuya6502
Labels: question (Further information is requested)

Comments

@hellrezaa commented Sep 12, 2022

Dear Sir,
I'm a new Rustacean, so sorry if my questions are a little amateur! I really love the concept of Moka and what you've built; it is exactly what I was searching for!

I would like to integrate multiple async/future caches with different TTLs & capacities to run on my actix-web server.

I have two key questions I would like to ask:

  1. I would love to use moka's future cache as a counter; however, I noticed that there isn't an alternative to the classic "entry API" found in std's HashMap.
    Like this Reddit example counter (a rough sketch of the pattern I mean is shown after this list).
    The entry API in std::collections::HashMap lets you do a single lookup: if the key does not exist, insert a default value (e.g. start the counter at 1), and if it does exist, simply += 1, avoiding two lookups.
    I searched the source code and saw that moka-cht offers an "insert_or_modify" function which could reproduce this!
    From what I understand, the futures moka library only implements get, get_with and insert, not moka-cht's "insert_or_modify" - so how would I go about building a counter using moka futures?
    Ideally I would love to be able to just do a get and increment the counter if found. However, I would be willing to compromise and do an "insert_or_modify" from the moka-cht library; I'm just not sure how to integrate this into the moka futures library - is insert_or_modify synchronous? I am assuming it is lower level / upstream?

  2. I noticed that to enforce the TTL policy you use a housekeeper that runs on a "scheduled thread pool", whilst actix-web runs a web server app instance on each thread.
    Does this mean that the moka cache can be friendly and share the CPU with the actix-web server? I have read that web servers perform blocking operations, so can the moka housekeeper integrate itself into this?
    I know you have stated we are fine for actix-rt, however I just wanted some reassurance that multiple caches, let's say 4 caches for example, would be able to share a single CPU with the actix-web server app. Would this create 4 unique housekeepers that would all share my single thread with my web server?
    I am assuming the cleanup process would spread out and my cloud provider may not be very happy, but as long as it works I'm happy!
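
Roughly, this is the std HashMap entry-API counter pattern I am referring to in question 1 (a minimal, self-contained sketch):

use std::collections::HashMap;

fn main() {
    let mut counts: HashMap<&str, u64> = HashMap::new();

    // Single lookup per hit: insert 0 if the key is missing, then += 1 in place.
    *counts.entry("page_a").or_insert(0) += 1;
    *counts.entry("page_a").or_insert(0) += 1;

    assert_eq!(counts["page_a"], 2);
}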

Many thanks for your time, amazing work!!! 🙌🏽

tatsuya6502 self-assigned this Sep 13, 2022
tatsuya6502 added the "question" label Sep 13, 2022
@tatsuya6502 (Member) commented Sep 13, 2022

Hi. Thank you for trying Moka.

  1. I would love to use moka futures as a counter

You are right. The current version of Moka (v0.9.x) does not provide such a method for implementing a counter. It might be a good addition to a future version, but we do not have enough bandwidth to do it right now.

Also, you cannot use moka::cht::SegmentedHashMap::insert_or_modify directly from your application, because the cht hashmap is only one of many internal data structures that Moka's cache layer uses. For example, the cache's max capacity is managed by an "access order queue", and time to live is managed by a "write order queue". These are separate data structures from the cht hashmap. If your application called insert_or_modify directly, it would break the cache by leaving those other internal data structures unmodified.

But there is good news: you may be able to implement a counter with the current version of Moka by using a small trick. Use Arc<AtomicU64> as the counter instead of u64, so that you can increment the counter without writing to the cache. (API documents: std::sync::Arc, std::sync::atomic)

The following code snippet demonstrates how to do it.

However, you should use this trick with caution: it will not work with some features, such as the eviction listener. The details are explained below the code.

// Cargo.toml
// [dependencies]
// actix-rt = "2.7.0"
// moka = { version = "0.9.4", features = ["future"] }

use std::{
    sync::{
        atomic::{AtomicU64, Ordering},
        mpsc, Arc,
    },
    time::Duration,
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    const KEY: &str = "key";

    let _ = actix_rt::System::new();
    let arbiter = actix_rt::Arbiter::new();

    let cache = moka::future::Cache::builder()
        .max_capacity(100)
        // You should use time to idle, instead of time to live.
        .time_to_idle(Duration::from_secs(60))
        .build();

    let (tx, rx) = mpsc::channel();

    arbiter.spawn(async move {
        // First access. (Insert)
        // This will insert a new counter (with 0) to the cache and return it.
        let count = cache
            .get_with(KEY, async { Arc::new(AtomicU64::default()) })
            .await;
        assert_eq!(count.load(Ordering::Acquire), 0);
        // And then, increment the counter by 1. (count: 1)
        count.fetch_add(1, Ordering::AcqRel);

        // Second access. (Update)
        // This will return the existing counter. (count: 1)
        let count = cache
            .get_with(KEY, async { Arc::new(AtomicU64::default()) })
            .await;
        assert_eq!(count.load(Ordering::Acquire), 1);
        // And then, increment the counter by 1. (count: 2)
        count.fetch_add(1, Ordering::AcqRel);

        assert_eq!(count.load(Ordering::Acquire), 2);

        tx.send("Done").unwrap();
    });

    let _ = rx.recv()?;

    Ok(())
}

The basic idea is to use a cache read operation to get an existing counter, and increment it without using a cache write operation.
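
For example, the whole pattern could be wrapped in a small helper function (a hypothetical increment helper, not an API provided by Moka; the &'static str key type is just for illustration):

use std::sync::{
    atomic::{AtomicU64, Ordering},
    Arc,
};

use moka::future::Cache;

// Hypothetical helper: get the counter for `key`, inserting a zeroed one on the
// first access, then add 1 and return the new count. Only the very first access
// per key performs a cache write; every later call is a read plus an atomic add.
async fn increment(cache: &Cache<&'static str, Arc<AtomicU64>>, key: &'static str) -> u64 {
    let counter = cache
        .get_with(key, async { Arc::new(AtomicU64::default()) })
        .await;
    // fetch_add returns the previous value, so add 1 to get the new count.
    counter.fetch_add(1, Ordering::AcqRel) + 1
}

Calling increment(&cache, KEY).await twice would return 1 and then 2, matching the snippet above.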

Here are the limitations of this trick:

Expiration (TTL vs TTI)

  • (problem) Incrementing an existing counter does not reset its TTL expiration clock.
    • If you set time-to-live to 60 seconds, a counter will expire after 60 seconds, even if it is incremented every 1 second.
  • (workaround) If your application needs to reset the expiration clock when incrementing a counter, use time-to-idle instead (see the builder sketch after this list).
    • If you set time-to-idle to 60 seconds and the counter is incremented every second, it will not expire 60 seconds after insertion, because each read resets the idle clock.
    • If you stop incrementing the counter, it will expire 60 seconds after the last increment.
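
For example, the two policies would be set like this on the builder (a minimal sketch; the 60-second duration is only an illustration):

use std::{
    sync::{atomic::AtomicU64, Arc},
    time::Duration,
};

use moka::future::Cache;

type Counter = Arc<AtomicU64>;

fn build_counter_caches() -> (Cache<String, Counter>, Cache<String, Counter>) {
    // time-to-live: an entry expires 60 seconds after it was inserted or last
    // written. Reading (and atomically incrementing) the counter does not reset
    // this clock, so the counter would expire even while it is still in use.
    let ttl_cache = Cache::builder()
        .time_to_live(Duration::from_secs(60))
        .build();

    // time-to-idle: an entry expires 60 seconds after the last read or write.
    // Reading the counter in order to increment it keeps the entry alive.
    let tti_cache = Cache::builder()
        .time_to_idle(Duration::from_secs(60))
        .build();

    (ttl_cache, tti_cache)
}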

Eviction Listener

https://docs.rs/moka/0.9.4/moka/future/struct.Cache.html#example-eviction-listener

  • (problem) Incrementing an existing counter does not emit a cache write event, so an eviction listener set on the cache will not be called for those updates.
  • There is no workaround for this.

Size-based Eviction

https://docs.rs/moka/0.9.4/moka/future/struct.Cache.html#example-size-based-eviction

  • (problem) Incrementing an existing counter does not recalculate its weighted size.
  • There is no workaround for this.
    • You would not use size-based eviction with counters anyway, so this should not be a problem.

@tatsuya6502 (Member) commented Sep 13, 2022

Does this mean that moka cache can be friendly & share the CPU with the actix web server?

Short answer:

Yes, the moka caches are friendly and share the CPU with the actix-web server.

Long answer:

On your cloud provider, you will run a Virtual Private Server (VPS) instance with a UNIX-like operating system such as Linux or FreeBSD, right? On that VPS instance, you will run your web application built on actix-web, and optionally a web server (e.g. nginx) in front of it.

Suppose your VPS runs Linux. Your web application will be a single Linux process, containing actix-web's async runtime and the moka caches with their global housekeeper thread pool.

actix-web's async runtime runs two thread pools: one for async tasks, including the request handlers you write, and another for blocking tasks. Each thread pool has multiple Linux threads (inside the Linux process).

The moka caches run a single global housekeeper thread pool, which also has multiple Linux threads. All cache instances share that one housekeeper thread pool (inside the same Linux process).

Linux schedules and runs all of these threads by time slicing, so the housekeeper thread pool shares the CPU with actix-web's async runtime.

Note that the threads in moka's global housekeeper thread pool should not be very busy unless your application inserts a lot of new entries (counters), e.g. millions of insertions per second. They will be parked most of the time and will not consume much CPU time.
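
To illustrate, building several caches with different policies in the same process (a rough sketch; the capacities and durations are made up) still gives you only the one shared housekeeper thread pool, not one pool per cache:

use std::time::Duration;

use moka::future::Cache;

// Four caches with different max capacities and expiration policies.
// They all share Moka's single global housekeeper thread pool.
fn build_app_caches() -> Vec<Cache<String, String>> {
    vec![
        Cache::builder()
            .max_capacity(100)
            .time_to_idle(Duration::from_secs(30))
            .build(),
        Cache::builder()
            .max_capacity(500)
            .time_to_idle(Duration::from_secs(60))
            .build(),
        Cache::builder()
            .max_capacity(1_000)
            .time_to_live(Duration::from_secs(300))
            .build(),
        Cache::builder()
            .max_capacity(10_000)
            .time_to_live(Duration::from_secs(3_600))
            .build(),
    ]
}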

@tatsuya6502 (Member)

Closing this issue as all questions were answered. Please reopen if needed.

I created #227 to track the following:

The current version of Moka (v0.9.x) does not provide such a method for implementing a counter. It might be a good addition to a future version, but we do not have enough bandwidth to do it right now.
