Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlite history backend #131

Closed
wants to merge 1 commit into from
Closed

sqlite history backend #131

wants to merge 1 commit into from

Conversation

fdncred
Copy link
Collaborator

@fdncred fdncred commented Jul 30, 2021

This is a first stab at a MVP that uses a sqlite backend for history. This code compiles, however I have not run it even once. I had difficulty fitting it into the existing history traits since they seem to want a file backed history and the database trait wants to be async. Good work to the team so far on making history work so well. So, I shoe-horned this code into a pseudo filebacked/sqlbacked history just to get it to compile. It's not how I'd want it to run and I don't like it in its current state but it gets some points across about how awesome we could make the history.

Ideally, we'd use the current history trait so that when you hit up arrow/traverse the history, it does some type of select from the database and returns a HistoryItem or Vec. Right now, it's just creating a database, loading the commands into a vec and ignoring the rest of the HistoryItem.

Another thought I had is, that it would be nice to have so caching perhaps. So, ever navigation doesn't require a select. I'm not sure that's viable, but it's worth thinking about.

If you want to play around with the sqlite prototype app, feel free to clone it from here https://github.com/fdncred/hiztery. All the functions of the Database trait are implemented in this test app.

I'm hoping for people to jump in and make this work as an alternate history backend because there is a lot of potential.

Here's an attempt at some rudimentary documentation.

Each row in the database follow this struct. There are some todo comments about potentially adding other columns that may come in handy later.

pub struct HistoryItem {
// primary key, unique id
    pub history_id: Option<i64>,
// command
    pub command: String,
// current working directory, could be used to filter history
// queries to showing only commands that you ran in this directory
    pub cwd: String,
// how long it took the command to run
    pub duration: i64,
// what was the exit status/return status of the command that just ran
    pub exit_status: i64,
// this is the pid of the process running
    pub session_id: i64,
// what time did this happen, stored in nanos
    pub timestamp: chrono::DateTime<Utc>,
// other ideas
// used to keep track of how many times you run a specific command
// it could provide the ability to create our own z, zoxide type tools
// pub run_count: i64
// these two separate the commands from the parameters, not sure
// what to do with these but something cool could be created
// pub command_root: String
// pub command_params: String // stored as param, param, param
}

All the database interaction happens in the database.rs file. Here's the db trait.

pub trait Database {
    async fn save(&mut self, h: &HistoryItem) -> Result<(), sqlx::Error>;
    async fn save_bulk(&mut self, h: &[HistoryItem]) -> Result<(), sqlx::Error>;
    async fn load(&self, id: &str) -> Result<HistoryItem, sqlx::Error>;
    async fn list(&self, max: Option<usize>, unique: bool)
        -> Result<Vec<HistoryItem>, sqlx::Error>;
    async fn range(
        &self,
        from: chrono::DateTime<Utc>,
        to: chrono::DateTime<Utc>,
    ) -> Result<Vec<HistoryItem>, sqlx::Error>;
    async fn update(&self, h: &HistoryItem) -> Result<(), sqlx::Error>;
    async fn history_count(&self) -> Result<i64, sqlx::Error>;
    async fn first(&self) -> Result<HistoryItem, sqlx::Error>;
    async fn last(&self) -> Result<HistoryItem, sqlx::Error>;
    async fn before(
        &self,
        timestamp: chrono::DateTime<Utc>,
        count: i64,
    ) -> Result<Vec<HistoryItem>, sqlx::Error>;
    async fn search(
        &self,
        limit: Option<i64>,
        search_mode: SearchMode,
        query: &str,
    ) -> Result<Vec<HistoryItem>, sqlx::Error>;
    async fn query_history(&self, query: &str) -> Result<Vec<HistoryItem>, sqlx::Error>;
    async fn delete_history_item(&self, id: i64) -> Result<u64, sqlx::Error>;
}

it also supports these types of searches. see the hiztery repo for how these search modes work.

pub enum SearchMode {
    // #[serde(rename = "prefix")]
    Prefix,

    // #[serde(rename = "fulltext")]
    FullText,

    // #[serde(rename = "fuzzy")]
    Fuzzy,
}

credit goes to atuin where most of the db code came from.

Just thinking outloud here, it would be nice if the History traits used generics so we don't always have to return a String for the command. we could return a HistoryItem or whatever 3rd parties want to return. Same for some of the input parameters. On some of the History traits I'd like to pass in a HistoryItem.

Copy link
Contributor

@nixypanda nixypanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had difficulty fitting it into the existing history traits since they seem to want a file backed history and the database trait wants to be async.

Is there any particular reason for database trait being async? I mean would the performance gains be worth the mental overhead to think async?

we'd use the current history trait so that when you hit up arrow/traverse the history, it does some type of select from the database and returns a HistoryItem or Vec

Any examples of what all fields we will use in engine from this HistoryItem structure? I mean from a usage standpoint I can only speculate what all you have in mind. If you can share some thoughts on what features you had in mind around this, that would be great.
Some of the use cases that I can envision include

  • Being able to run history queries like <command> b/w -7d to -14d (where the user is like I am sure this is in the history somewhere around last week or something. Obviously, this is mere speculation so please don't really focus on the syntactic aspect - which can be improved)
  • jump/z/etc like autocompletion based on all the available history data.
  • based on previous error codes we highlight the command as red (?)
  • etc. (I would like to know what all you were planning on)

We can always modify the current implementation of FileBackHistory to use a common trait/same struct. Once we agree on something we can modify it or we can simply go with a trait that currently provides a string command and use that and then postpone this thinking.

use std::str::FromStr;

#[async_trait]
pub trait Database {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like to have queries all start with get. It just plays nicer with autocomplete etc.
load -> get_by_id
range -> get_in_range
first -> get_first
etc....

The usage will look like this

let command_repo = CommandRepo::new();
let command = command_repo.get_by_id(...)?;

or something similar. Thoughts?

Copy link
Collaborator Author

@fdncred fdncred Aug 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ya, sure. sounds good to me. i'd love to standardize things. i'm not sure if this is a @jntrnr call or not. 1/2 of me thinks JT wouldn't care, the other 1/2 of me thinks JT may be particular about function names. Either way, I'm cool with whatever.

&self,
limit: Option<i64>,
search_mode: SearchMode,
query: &str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use some help understanding what this query is. Other info is pretty clear from the type info but I am not so sure about query.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mean to be a broken record here but all of the database code is best understood by looking at the database sample app I wrote called hiztery mentioned above. Here's how it's implemented in that code base.

        Some(HizteryCmd::Search {
            search_mode,
            limit,
            query,
        }) => {
            // cargo run -- search -m "p" -q "code"
            debug!(
                "Searching with phrase: {}, limit: {:?}, mode: {}",
                &query, limit, &search_mode
            );
            let s_mode = match search_mode.as_ref() {
                "p" => SearchMode::Prefix,
                "f" => SearchMode::FullText,
                "z" => SearchMode::Fuzzy,
                _ => SearchMode::FullText,
            };

            let result = sqlite.search(limit, s_mode, &query).await;
            match result {
                Ok(r) => {
                    debug!("Found {} hits", r.len());
                    for (idx, hit) in r.iter().enumerate() {
                        debug!("Hit # [{}] History: [{}]", idx + 1, hit.command);
                    }
                }
                _ => debug!("No hits found for phrase: {}", &query),
            }
        }

Ok(Self { pool, pid })
}

async fn setup_db(pool: &SqlitePool) -> Result<(), sqlx::Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have pool as an instance variable for this struct. Right? So why do we need to inject this in the function too?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatever you think is best. Just a lot of copy-n-pasting.

Ok(())
}

fn convert_time(h: &HistoryItem) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain what this function is for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function isn't used yet but I mocked it up so that it could be used to translate from timestamp_nanos to print out human readable times. This is how save stores datetime.

    async fn save_raw(
        tx: &mut sqlx::Transaction<'_, sqlx::Sqlite>,
        h: &HistoryItem,
    ) -> Result<(), sqlx::Error> {
        // We don't need the history_id here because it's an auto number field
        // so it should be ever increasing
        sqlx::query(
            "insert or ignore into history_items(timestamp, duration, exit_status, command, cwd, session_id)
                values(?1, ?2, ?3, ?4, ?5, ?6)",
        )
        .bind(h.timestamp.timestamp_nanos())
        .bind(h.duration)
        .bind(h.exit_status)
        .bind(h.command.as_str())
        .bind(h.cwd.as_str())
        .bind(h.session_id)
        .execute(tx)
        .await?;

        Ok(())
    }

Comment on lines +350 to +351
search_mode: SearchMode,
query: &str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we merge these together? Won't enums allow for that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, sure. Feel free to rework it if you like. I'm up for anything really. I just want to keep the power of sqlite in here.

@fdncred
Copy link
Collaborator Author

fdncred commented Aug 3, 2021

Is there any particular reason for database trait being async? I mean would the performance gains be worth the mental overhead to think async?

because the database.rs file was ripped off from atuin and i just left it. :)

If you can share some thoughts on what features you had in mind around this, that would be great.

the comments above in the HistoryItem struct code have some ideas for usage. My real point about this type of stuff is if you don't capture it, you'll never do anything with it. so let's capture everything we can.

  • yup - date range searching is supported
  • yup - z-type cd'ing should be supported - i use zoxide and with the addition of run_count in the db table and HistoryItem struct, it should be do-able
  • yup - capturing error codes and doing something with them should be exposed.
  • there's also a search mode that may allow us to do some broot, skim, fzf type tech
  • i also like the perf metrics we could capture so you could do 'show me the top 5 longest running commands'

We can always modify the current implementation of FileBackHistory to use a common trait/same struct.

i think that would be good but it would be nice if it were maybe Generics so people can implement it how they want. My sticking point was the current one is so String based, which of course, is a default. But if you want to do something different, like return a HistoryItem you can't easily because you have to return strings.

@fdncred
Copy link
Collaborator Author

fdncred commented Aug 3, 2021

wow! thank for all these great questions! this is awesome! I'm really just trying to plant the seed here so reedline can do awesome things with a sqlite backed history. i encourage you to work on this, up to and including taking ownership of it and make any changes that seem appropriate. i'm down with whatever the team wants to do. i don't have the rust skillz that ya'll do.

@sophiajt
Copy link
Contributor

sophiajt commented Aug 9, 2021

Some quick thoughts -

I really like the idea of using sqlite as our history backend. There seems to be a ton of potential here not just for showing the history, but for filtering it and using it in fun ways like atuin does.

I wanted to give it a couple days on the async part. Coming back, I agree with @sherubthakur that for the full implementation I think we want to avoid async. It increases the complexity quite a bit, and we really don't use any asynchrony anywhere else in reedline. If there are other sqlite crates we could use and get similar power without taking on the additional complexity, I'd be more comfortable.

On a related note, with this new functionality, feels like we should definitely start brainstorming how we want to add UI elements in reedline. The nice thing about using crossterm is that we, in theory, have a pretty flexible terminal interface to work with. So if we wanted we should be able to draw popups and whatnot. There are also TUI libraries that work with crossterm (this one comes to mind but I think there are others, too)

@fdncred fdncred closed this Aug 10, 2021
@fdncred
Copy link
Collaborator Author

fdncred commented Aug 23, 2021

If anyone is still watching this dead PR, I created a new demo app here that uses the same database trait and sqlite struct. It should be enough to review this PR or create a new one.

@phiresky
Copy link
Contributor

phiresky commented Apr 1, 2022

Since i almost didn't find this code I just want to link this related issue in nushell: nushell/nushell#3364 as well as
this: https://github.com/m42e/zsh-histdb-skim which is semi-related (screenshot below for UI inspiration, I think skim could be used here directly), as well as the above-mentioned atuin which I didn't know about either.

image

@fdncred
Copy link
Collaborator Author

fdncred commented Apr 1, 2022

unfortunately, skim isn't cross platform due to the use of termion. but still the screenshots are cool. i really like the idea of "session or directory or host or everywhere" type of searching history. and being the nerd that i am, i love the stats on the right. thanks for adding this. i think we're getting closer to starting on this. there are people who want to use and developers who want to write it. :)

@phiresky
Copy link
Contributor

phiresky commented Apr 1, 2022

unfortunately, skim isn't cross platform due to the use of termion

Ah, I didn't know about that. The relevant issue is here, I guess: skim-rs/skim#293

think we're getting closer to starting on this

I'm taking a bit of a stab at it right now, let me know if there's anything specific I should consider. I'm doing it with rusqlite (to avoid async) and my plan is to use a db structure similar to histdb.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants