🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Updated Feb 23, 2025 - Python
NCRF++, a Neural Sequence Labeling Toolkit. Easy to apply to any sequence labeling task (e.g. NER, POS tagging, segmentation). It includes character LSTM/CNN, word LSTM/CNN, and softmax/CRF components.
Content-Addressable Data Synchronization Tool
An extensible Java framework for building event-driven applications that break up XML and non-XML data into chunks for data integration
Alternative casync implementation
A package for parsing PDFs and analyzing their content using LLMs.
A fast, lightweight and easy-to-use Python library for splitting text into semantically meaningful chunks.
A TensorFlow implementation of a Neural Sequence Labeling model that can tackle sequence labeling tasks such as POS tagging, chunking, NER, and punctuation restoration.
The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate experiments and evaluations using Azure Cognitive Search and the RAG pattern.
An LLM GUI application; enables you to interact with your files, offering dynamic parameters that can modify response behavior during runtime.
webpack 2, react hotloader 3, react router v4, code splitting and more
📑 Split Laravel jobs into multiple separate job chunks
An asynchronous event-driven HTTP client based on netty.
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Fast multi-threaded content-dependent chunking deduplication for Buffers in C++ with a reference implementation in JavaScript. Ships with extensive tests, a fuzz test and a benchmark.
Labelling Sequential Data in Natural Language Processing with R - using CRFsuite
🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows