flexsearch integration? #14

Closed
testbird opened this issue Apr 1, 2021 · 1 comment
Assignees
Labels
documentation Improvements or additions to documentation question Further information is requested wontfix This will not be worked on

Comments

testbird commented Apr 1, 2021

Hi,
the readme mentions and contains a file listing referring to flexsearch.min.js

I gather it's possible to have a client (browser) based search that uses the generated index files. That looks like a really great feature to serve with a static site.

Could you include that in the plugin, or some template/doc or working example to get beginners going?

PS: This plugin looks like a great alternative to the blackhole plugin.
I only keep catching myself wanting to type a shorter, more self-explanatory php bin/plugin generate index or ...pages ;-)

@OleVik OleVik self-assigned this Apr 1, 2021
@OleVik OleVik added documentation Improvements or additions to documentation question Further information is requested wontfix This will not be worked on labels Apr 1, 2021
Owner

OleVik commented Apr 1, 2021

FlexSearch itself, like any other search engine, is out of scope for this plugin: their standards vary a lot, and some implementation work is always necessary. I'm hoping to put together concise documentation for a plugin containing several search engines; I just need to find some additional hours in the day. That plugin will rely on this one for a standardized output of searchable data.

Preamble out of the way, we can lift an example from the Scholar theme:

  1. Generate a metadata-index and a pages-index - including content - with the Static Generator, yielding user/data/persist/index.js and user/data/persist/index.full.js
    • We'll identify their data by the variable-names they use when generated, GravMetadataIndex and GravDataIndex, respectively. We won't usually load them at the same time, but it's nice to have the option
    • The two commands to do this are php bin/plugin static-generator index --wrap "/" and php bin/plugin static-generator index --wrap --content "/". The first indexes every Page below the root Page and wraps the result as a .js-file; the second does the same but also includes each Page's content
  2. Load the data into a template, so it's accessible to the browser. Because of how the indices are wrapped, they'll be accessible through window.GravMetadataIndex and window.GravDataIndex when loaded
    • A simple assets.addJs('user://data/persist/index.js') will suffice
    • Also load FlexSearch with assets.addJs('theme://node_modules/flexsearch/dist/flexsearch.min.js'), which is easily installed with npm install flexsearch, in the theme-folder
  3. Create a handler for initializing the FlexSearch-engine, or initialize it directly in some JS-file you are loading
    • The linked example includes bells and whistles for debouncing and listening to events in the DOM, but the relevant parts are:
    • FlexSearchOptions = options, which uses the options-variable we passed in with json_encode(config.theme.flexsearch.index) in the template. Two profiles are defined in scholar.yaml; we're using the simpler one
    • FlexSearchOptions.doc = {id: "url", field: fields}, where we tell FlexSearch which field to use as an index - in Grav that's always the route, which is unique - and which fields to search
    • The fields variable was passed from the template into the function, with ["title", "date", "taxonomy:categories", "taxonomy:tags", "media"]
    • var dataIndex = new FlexSearch(FlexSearchOptions);, which spawns the FlexSearch-instance with our options
    • dataIndex.add(data);, which adds the data we passed to the function into FlexSearch. These are all the documents in GravMetadataIndex, from user/data/persist/index.js
    • dataIndex.search(searchQuery, FlexSearchOptions.limit), which searches for a term passed in as searchQuery
    • The following promise-handlers do things with the results, specifically render them in the HTML
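
The calls listed in step 3 can be pulled together into one small function. This is a minimal sketch, not the Scholar theme's actual handler: createSearchIndex, its parameters, and the { limit: 10 } options are hypothetical names and values chosen for illustration. It assumes the FlexSearch 0.6.x-style API quoted above (new FlexSearch(options) with a doc key), that the browser bundle exposes a global FlexSearch, and that GravMetadataIndex is an array of objects each carrying a url. Newer FlexSearch releases changed the document API, so check the version you install.

```javascript
// Sketch only: mirrors the calls quoted in the list above.
// `createSearchIndex` is a hypothetical helper, not part of FlexSearch or Scholar.
function createSearchIndex(FlexSearchCtor, data, fields, options) {
  // Copy the options so the caller's object is left untouched.
  const flexSearchOptions = Object.assign({}, options);
  // Use the route ("url") as the unique id and search only the listed fields.
  flexSearchOptions.doc = { id: "url", field: fields };
  // Spawn the engine with our options, then feed it every document.
  const dataIndex = new FlexSearchCtor(flexSearchOptions);
  dataIndex.add(data);
  return dataIndex;
}

// Browser wiring, assuming the template loaded both flexsearch.min.js and index.js.
if (typeof window !== "undefined" && window.FlexSearch && window.GravMetadataIndex) {
  const fields = ["title", "date", "taxonomy:categories", "taxonomy:tags", "media"];
  const options = { limit: 10 }; // assumed value; Scholar reads its options from the theme config
  const dataIndex = createSearchIndex(window.FlexSearch, window.GravMetadataIndex, fields, options);
  // Promise.resolve covers both a synchronous result and a promise-returning search.
  Promise.resolve(dataIndex.search("grav", options.limit)).then((results) => {
    console.log(results);
  });
}
```

Passing the constructor in as an argument keeps the function independent of how FlexSearch was loaded, which also makes it easy to swap in another engine such as MiniSearch behind the same call site.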

There are a few things to note about this approach:

  • We do not want to concatenate the JS-files containing data to search, we want any CDN-layer to cache it separately from other JS
  • FlexSearch is far and away the architecturally best search-engine in JS I've found with regard to performance and results, but it's fallen out of maintenance. It still beats Lunr, ElasticLunr, and quite a few fuzzy-search engines. I now favor MiniSearch as an alternative, even though it's far simpler in comparison
  • Debouncing matters; for the user it typically matters more than the size of the data loaded. A slow interface is worse than a long load. Because there's minimal wrapping or overhead in that data, search metadata has a low page-load cost compared to much of the other JS that a site typically uses
  • Static Generator is quite a generic solution, which means there's some assembly required for themes to work both dynamically with PHP and offline when assembled, when creating a static copy. It'll index Grav's Page(s) very effectively, but it does not adapt to any specific theme or solution for rendering the data
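
To illustrate the debouncing point, here is a minimal sketch of a trailing-edge debounce. It is not the implementation used in Scholar; the debounce helper, the 250 ms delay, and the #search-input selector are all assumptions made for the example.

```javascript
// Hypothetical helper: runs `fn` only after `waitMs` ms have passed without a new call.
function debounce(fn, waitMs) {
  let timer = null;
  return function debounced(...args) {
    // Each new call cancels the pending one, so only the last call in a burst fires.
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), waitMs);
  };
}

// Browser wiring; "#search-input" is an assumed selector for the search field.
if (typeof document !== "undefined") {
  const input = document.querySelector("#search-input");
  if (input) {
    input.addEventListener("input", debounce((event) => {
      // Replace this with the actual search call and result rendering.
      console.log("search for:", event.target.value);
    }, 250));
  }
}
```

With this in place the search runs once the user pauses typing rather than on every keystroke, which keeps the interface responsive even when the index is large.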

@OleVik OleVik closed this as completed Aug 16, 2023