Releases: bacalhau-project/bacalhau
v1.6.4
S3 Publisher Enhancements
- Plain Encoding Support: Introduced an option to publish job results without gzip compression, facilitating more efficient data pipelines by allowing subsequent jobs to access individual files directly.
Environment Variable Management
- Host Environment Variable Forwarding: Added the ability to securely forward host environment variables to job executions. This feature enables passing credentials and secrets from the host to jobs through a controlled allowlist mechanism.
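The allowlist idea can be illustrated with a minimal sketch. This is not Bacalhau's implementation: the helper name `forward_env` and the glob-style patterns are assumptions made for illustration, and the real matching rules may differ.

```python
from fnmatch import fnmatchcase


def forward_env(host_env, allowlist):
    """Return only the host variables whose names match an allowlist pattern.

    Hypothetical helper: glob-style matching is an assumption, not the
    documented Bacalhau behavior.
    """
    return {
        name: value
        for name, value in host_env.items()
        if any(fnmatchcase(name, pattern) for pattern in allowlist)
    }


host_env = {"AWS_SECRET_ACCESS_KEY": "s3cr3t", "API_TOKEN": "abc", "HOME": "/root"}
forwarded = forward_env(host_env, ["AWS_*", "API_TOKEN"])
print(sorted(forwarded))
```

Anything not named by the allowlist (here `HOME`) never reaches the job, which is the point of a controlled forwarding mechanism.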
Docker Image Improvements
- CA Certificates Installation: Ensured that CA certificates are installed in our Docker images, enhancing security and enabling secure HTTPS communications within containerized environments.
These updates aim to enhance the functionality, security, and flexibility of Bacalhau.
Full Changelog: v1.6.3...v1.6.4
v1.6.3
Docker Image Enhancements
- New Docker-in-Docker (DinD) Image: Introduced a new Docker image variant specifically designed for compute nodes:
  - Added `bacalhau:v1.6.3-dind` and `bacalhau:latest-dind` images
  - Full Docker execution capability for compute nodes
  - Built-in Docker daemon support for container workloads
  - Requires privileged mode for Docker-in-Docker functionality
- Image Variants Clarification:
  - Base image (`bacalhau:v1.6.3`, `bacalhau:latest`): Optimized for client usage, orchestrator nodes, and non-Docker compute workloads
  - DinD image (`bacalhau:v1.6.3-dind`, `bacalhau:latest-dind`): Designed for compute nodes requiring Docker execution capabilities
Operational Improvements
- Docker Deployment:
  - Clearer separation of concerns between different node types
  - Improved documentation for Docker image usage
Full Changelog: v1.6.2...v1.6.3
v1.6.2
Performance and Stability Improvements
- Enhanced Network Performance: Implemented batching and rate limiting of executions per scheduling cycle, ensuring all executions are processed while maintaining system stability when orchestrating jobs across hundreds of nodes
- Improved Message Handling: Extended publish and subscribe timeout windows for more graceful network communication
- Critical Bug Fixes:
  - Resolved boltdb transaction panic issues during context cancellation
  - Fixed execution re-approval logic in the scheduler
  - Enhanced transaction handling for better stability
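The batching and rate limiting described under Enhanced Network Performance can be pictured with a short sketch. It is a simplified model under stated assumptions, not the scheduler's code: `run_cycles` and `max_per_cycle` are hypothetical names, and a real scheduler would dispatch each batch rather than just collect it.

```python
from collections import deque


def run_cycles(pending, max_per_cycle):
    """Drain pending executions in bounded batches, one batch per cycle.

    Every execution is eventually processed, but no cycle handles more
    than max_per_cycle of them.
    """
    queue = deque(pending)
    cycles = []
    while queue:
        size = min(max_per_cycle, len(queue))
        cycles.append([queue.popleft() for _ in range(size)])
    return cycles


cycles = run_cycles([f"exec-{i}" for i in range(7)], max_per_cycle=3)
for number, batch in enumerate(cycles, start=1):
    print(number, batch)
```

Capping the batch size bounds the load of any single cycle, while the loop guarantees that nothing is dropped.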
Observability Enhancements
- Comprehensive Metrics Coverage: Introduced OpenTelemetry metrics across multiple core components:
  - Job store metrics for storage monitoring
  - Scheduler metrics for job distribution insights
  - Planner metrics for execution planning visibility
  - NCL (Network Communication Layer) metrics
  - Message handler metrics for network communication monitoring
Operational Improvements
- Development Environment:
  - Enhanced devstack logging format
  - Added support for node joining existing networks
  - Improved configuration passing mechanisms
Full Changelog: v1.6.1...v1.6.2
v1.6.1
Major Improvements
- Partitioned Execution Support: Added support for splitting jobs across multiple executions with automatic partition management. The feature includes:
  - Partition assignment and tracking
  - Independent execution progress monitoring
  - Granular failure handling with retry of only failed partitions
  - Each execution receives its partition details through environment variables, enabling partition-aware processing when needed
- S3 Input Partitioning: Added automatic data distribution for S3 inputs across multiple executions using configurable strategies:
  - Multiple partitioning strategies: Users can choose between object-based distribution for even splitting, regex patterns for structured data, substring matching for fixed formats, or date-based partitioning for temporal data
  - Even distribution of data without requiring custom partition code
  - Support for shared data access through non-partitioned inputs
  - Automatic data subset assignment to each execution
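The sketch below illustrates the two ideas in this release: an execution learning which partition it owns from environment variables, and object keys being split across partitions. It rests on assumptions: the variable names `BACALHAU_PARTITION_INDEX` and `BACALHAU_PARTITION_COUNT` should be checked against the Bacalhau documentation, and the CRC32 hash is a stand-in for whatever assignment Bacalhau actually performs.

```python
import zlib


def partition_of(key, count):
    """Map an object key to a partition index using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % count


def keys_for_partition(keys, index, count):
    """Select the subset of keys that belongs to one partition."""
    return [key for key in keys if partition_of(key, count) == index]


def read_partition(env):
    """Read (index, count) from the environment; variable names are assumed."""
    return int(env["BACALHAU_PARTITION_INDEX"]), int(env["BACALHAU_PARTITION_COUNT"])


keys = [f"logs/2024-01-{day:02d}.json" for day in range(1, 11)]
index, count = read_partition(
    {"BACALHAU_PARTITION_INDEX": "1", "BACALHAU_PARTITION_COUNT": "3"}
)
mine = keys_for_partition(keys, index, count)
print(f"partition {index} of {count} handles {len(mine)} objects")
```

Because the hash is deterministic, every key lands in exactly one partition, so a retry of a failed partition sees the same subset of objects.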
Full Changelog: v1.6.0...v1.6.1
v1.6.0
Bacalhau v1.6.0 Release Notes
We are excited to announce the release of Bacalhau v1.6.0, introducing a new communication architecture that significantly improves the reliability and resilience of distributed compute networks.
Key Features and Improvements
New Bacalhau Messaging Protocol (BMP)
At the heart of this release is the new messaging protocol, a complete redesign of node communication that brings significant improvements to network reliability:
Key Benefits
- Self-Healing Network: Compute nodes and orchestrators automatically reconnect and sync after network interruptions
- Offline-First Operation: Compute nodes can start and operate even when disconnected from the orchestrator
- Automatic State Recovery: When nodes reconnect, they automatically share all missed job execution information and results
- Zero Data Loss: Ensures no job execution data or results are lost during network disruptions
- Seamless Recovery: Network interruptions are handled transparently without requiring manual intervention
Technical Improvements
- Reliable Message Delivery: Ordered, at-least-once message delivery between nodes
- Automatic Recovery: Built-in failure detection and recovery mechanisms
- Connection Health Monitoring: Proactive health checks and connection management
- Event-Based Architecture: Decoupled event processing from message delivery
- Efficient Checkpointing: Maintains system state for reliable recovery
- Backward Compatibility: Maintains compatibility with v1.5 orchestrators
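Ordered, at-least-once delivery combined with checkpointing can be sketched as follows. This is a generic illustration of the technique, not the BMP implementation: the `Receiver` class and its fields are hypothetical.

```python
class Receiver:
    """Applies ordered, at-least-once messages exactly once via sequence numbers."""

    def __init__(self):
        self.checkpoint = 0  # highest sequence number applied so far
        self.applied = []

    def handle(self, seq, payload):
        """Apply the message if it is next in order; return the checkpoint to ack."""
        if seq == self.checkpoint + 1:
            self.applied.append(payload)
            self.checkpoint = seq
        # seq <= checkpoint: a redelivered duplicate, safely ignored.
        # seq > checkpoint + 1: a gap; the sender resends from the checkpoint.
        return self.checkpoint


receiver = Receiver()
deliveries = [(1, "a"), (2, "b"), (2, "b"), (4, "d"), (3, "c"), (4, "d")]
for seq, payload in deliveries:
    receiver.handle(seq, payload)
print(receiver.applied, receiver.checkpoint)
```

After a reconnect, the receiver only needs to report its checkpoint for the sender to know which messages were missed.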
Enhanced Web UI Experience
- Direct Result Downloads: Download job results directly from the interface
- Simplified Configuration: Automatic request routing eliminates manual IP configuration
- Improved Architecture: Streamlined backend setup while maintaining security
Operational Improvements
- Reverse Proxy Support: Added capability to run orchestrator behind a reverse proxy
- Agent Configuration: New `bacalhau agent config` command to inspect agent configuration
- TLS Support: Added TLS encryption support for NATS communication
- Better Logging: Implemented more human-readable logging patterns
Upgrade Notes and Backward Compatibility
Bacalhau v1.6.0 maintains backward compatibility while introducing the new BMP:
- Compute nodes maintain compatibility with v1.5 orchestrators, and vice versa
- Support for re-handshake from legacy clients
We're excited for you to experience the enhanced reliability and resilience provided by the BMP in Bacalhau v1.6.0. This release represents a significant architectural advancement in making distributed computing more robust and dependable.
v1.5.1
Major Improvements
- Enhanced Web UI Routing: Improved routing of Web UI requests without requiring backend address definition
- Faster Startup: Dramatically reduced node startup time from ~9 seconds to ~1.5 seconds by optimizing IMDS access
- Job Management: Added support for stopping jobs using short IDs
- Bug Fix: Resolved issues with default publishers functionality
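Stopping a job by short ID implies resolving a prefix to exactly one full ID. A minimal sketch of that lookup, assuming a hypothetical `resolve_job_id` helper and made-up job IDs, might look like this:

```python
def resolve_job_id(prefix, job_ids):
    """Resolve a short ID to the single full job ID that starts with it."""
    matches = [job_id for job_id in job_ids if job_id.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise LookupError(f"no job matches {prefix!r}")
    raise LookupError(f"{prefix!r} is ambiguous: {len(matches)} jobs match")


# Hypothetical job IDs for illustration only.
jobs = ["j-4faae6f4-aaaa", "j-4fbb0011-bbbb", "j-9c2d7e10-cccc"]
print(resolve_job_id("j-9c", jobs))
```

Rejecting ambiguous prefixes matters for a destructive action such as stopping a job.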
Breaking Changes
- Removed exec command and job translation functionality
Additional Changes
- Added Docker compose support
- Improved API error handling
Links
- Full Changelog: v1.5.0...v1.5.1
v1.5.1-rc1
Major Improvements
- Enhanced Web UI Routing: Improved routing of Web UI requests without requiring backend address definition
- Faster Startup: Dramatically reduced node startup time from ~9 seconds to ~1.5 seconds by optimizing IMDS access
- Job Management: Added support for stopping jobs using short IDs
- Bug Fix: Resolved issues with default publishers functionality
Breaking Changes
- Removed exec command and job translation functionality
Additional Changes
- Added Docker compose support
- Improved API error handling
Links
- Full Changelog: v1.5.0...v1.5.1-rc1
v1.5.0
Bacalhau v1.5 Release Notes
We're thrilled to announce the release of Bacalhau 1.5.0, a significant update that introduces powerful new features and enhancements. Building on the momentum from our previous releases, Bacalhau 1.5 focuses on simplifying configuration, improving visibility, and enhancing overall performance.
Key Features and Improvements
Simplified Configuration Management
- New File-Based Configuration System: We've introduced a more intuitive file-based configuration system, replacing complex CLI flags. This change makes setting up and managing Bacalhau networks more straightforward and less error-prone.
- Flexible Configuration Options: Users can now provide:
  - A single config file
  - Multiple config files that are merged
  - Key-value pairs directly via the `-c` flag (e.g., `-c key=value`)
- Decoupled Configuration: Configuration is now decoupled from the repo (now called data dir), allowing for more flexible setups.
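How several config sources could combine is sketched below. It is an illustration only: the merge order (later sources win), the dotted-key syntax, and the keys `api.port` and `compute.enabled` are assumptions, not Bacalhau's documented behavior or key names.

```python
def deep_merge(base, override):
    """Return a new dict where values from override win, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def from_key_value(assignment):
    """Turn 'a.b=value' into {'a': {'b': 'value'}}; the value stays a string."""
    dotted, value = assignment.split("=", 1)
    nested = value
    for part in reversed(dotted.split(".")):
        nested = {part: nested}
    return nested


def load_config(sources):
    """Merge config dicts and key=value strings in order; later sources win."""
    config = {}
    for source in sources:
        layer = from_key_value(source) if isinstance(source, str) else source
        config = deep_merge(config, layer)
    return config


config = load_config([
    {"api": {"host": "0.0.0.0", "port": 1234}},  # first file (hypothetical keys)
    {"api": {"port": 9999}},                     # second file overrides the port
    "compute.enabled=true",                      # key=value pair, as with -c
])
print(config)
```

A real loader would also parse value types; here the `-c` value is deliberately left as the string `"true"`.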
Enhanced Data Directory Structure
- Improved Organization: We've clearly separated compute and orchestrator related data, providing a cleaner structure.
- Consolidated Metadata: System metadata is now consolidated into a single `system_metadata.json` file for easier management.
New WebUI
- Embedded Management Interface: Introduced a comprehensive WebUI for easier management and monitoring of your Bacalhau network. This significant feature allows users to visualize and interact with their Bacalhau deployment without relying solely on the CLI.
Enhanced Job Visibility and Reporting
- Granular Event Reporting: Improved reporting on job progress, including detailed scheduling actions, failures, and retries.
- Better Error Messages: Enhanced error reporting system with meaningful messages and debugging hints.
API Enhancements
- Pagination for Job History: Implemented pagination support for job history, improving the user experience when dealing with a large number of jobs and making it easier to navigate through job and execution history events.
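Token-based pagination, a common way to page through history events, can be sketched as follows. The release notes do not describe the actual API shape, so `fetch_page`, the token format, and the `limit` parameter are all hypothetical.

```python
def fetch_page(events, token, limit):
    """Hypothetical server side: return one page plus the token for the next."""
    start = int(token) if token else 0
    page = events[start:start + limit]
    next_token = str(start + limit) if start + limit < len(events) else ""
    return page, next_token


def iter_history(events, limit):
    """Client side: follow next tokens until the server returns an empty one."""
    token = ""
    while True:
        page, token = fetch_page(events, token, limit)
        yield from page
        if not token:
            break


history = [f"event-{i}" for i in range(7)]
for event in iter_history(history, limit=3):
    print(event)
```

The client never needs to know the total number of events; it simply stops when no further token is returned.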
Upgrade Notes and Backward Compatibility
While Bacalhau 1.5.0 introduces some breaking changes, we've ensured a smooth upgrade path:
- Most CLI flags have been removed in favor of configuration files, but we gracefully handle deprecated flags for backward compatibility.
- The structure of the data directory has changed, but we automatically handle the migration when you first run the new Bacalhau version.
- Many old configuration options have been deprecated in favor of the new structure and config keys.
Please refer to our [updated documentation](https://docs.bacalhau.org/) for detailed instructions on upgrading to Bacalhau 1.5.0 and taking advantage of the new configuration system.
We're excited for you to explore the new features and enhancements in Bacalhau 1.5.0. Whether you're a seasoned Bacalhau user or just getting started, this update will empower you to build and run distributed compute networks more effectively than ever before.
v1.4.0
Announcing Bacalhau 1.4.0
We’re excited to announce the release of Bacalhau 1.4.0, a significant update that introduces powerful new features and enhancements. Building on the momentum from our previous releases this year (1.2.0, 1.3.0, 1.3.1, and 1.3.2), Bacalhau 1.4 strengthens our platform’s performance, scalability, and user experience, solidifying its position as a leading platform for building and running distributed compute networks.
In this release, we focused on three major efforts, with particular emphasis on those deploying Bacalhau at scale:
Performance and Scalability Enhancements
- Extended Job Queuing: Bacalhau 1.4.0 introduces a more robust queuing system, improving job scheduling and execution efficiency, especially in high-demand or globally distributed networks. By intelligently managing job queues, Bacalhau ensures smoother operations and increased throughput, leading to higher success rates for your distributed compute tasks.
- Migration to NATS, Deprecation of libp2p and Embedded IPFS Node: We’ve fully transitioned to NATS.io as Bacalhau’s communication backbone, moving away from libp2p and the embedded IPFS node. This change streamlines communication and reduces overhead, marking a significant step towards a more efficient and scalable network. IPFS integration remains available with external nodes for those who need it.
Improved User Experience
- Updated CLI and HTTP API: Bacalhau 1.4.0 introduces a revamped command-line interface (CLI) and HTTP API. These updates align the CLI commands with the new API structure and enhance overall usability. While most changes are seamless for existing users, some command adjustments have been made (e.g., `bacalhau create` becomes `bacalhau job run`). Our updated documentation will guide you through the transition smoothly.
- Job Spec Updates: We've introduced an updated Job Specification format while deprecating some features of the previous format. This change requires users to update their job specs but brings improved clarity and consistency.
- Enhanced Error Reporting: Bacalhau 1.4.0 improves error reporting, making it easier to diagnose and troubleshoot issues. This enhancement contributes to a more stable and reliable experience, helping users quickly resolve any problems that arise. For detailed guidance, please consult our documentation on the new Job Spec requirements.
- Introduction of Node Manager: In Bacalhau 1.4.0, we’re introducing the Node Manager. This feature simplifies node operations, providing a clear view of all compute nodes and their status. You can approve, deny, or delete nodes as needed, making management straightforward. Heartbeats from nodes keep the Node Manager updated on their connectivity, enhancing overall stability and performance.
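Heartbeat-based connectivity tracking, as mentioned for the Node Manager, can be modeled in a few lines. This is a generic sketch, not the Node Manager's code: the `NodeTracker` class and the timeout value are assumptions.

```python
class NodeTracker:
    """Tracks node connectivity from the time of each node's last heartbeat."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence before a node counts as lost
        self.last_seen = {}

    def heartbeat(self, node_id, now):
        """Record that node_id was heard from at time now."""
        self.last_seen[node_id] = now

    def connected(self, now):
        """Return the nodes whose last heartbeat is within the timeout."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.timeout
        )


tracker = NodeTracker(timeout=30)
tracker.heartbeat("node-a", now=0)
tracker.heartbeat("node-b", now=20)
print(tracker.connected(now=40))
```

Passing `now` explicitly keeps the sketch deterministic; a real tracker would read a clock instead.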
Smooth Transition for Existing Users
- Error Handling and Guidance: We understand that transitioning to a new version can be challenging. To ease this process, we’ve implemented helpful error messages and guidance for those adjusting to the changes in CLI behavior and job specifications. We’ve also created a table to show how some of the Bacalhau API endpoints have been remapped. If you’re not ready to upgrade, you can continue using version 1.3.1 while maintaining your private Bacalhau cluster.
Join Us on the Journey
We’re excited for you to explore the new features and enhancements in Bacalhau 1.4.0. Over the next five days, we’ll dive deeper into each topic in our “5 Days of Bacalhau” blog series. Whether you’re a seasoned Bacalhau user or just getting started, this update will empower you to build and run distributed compute networks more effectively than ever before.
v1.3.2
What's Changed
- Splitting to Consumer and Producer Client by @udsamani in #4027
- refactor: define config instance. config is no longer global by @frrist in #3959
- ops: update prod cluster vars to reflect current state by @frrist in #4034
- introduce job queueing when no nodes were found by @wdbaruni in #4049
- filter out nodes with high queue capacity by @wdbaruni in #4051
- Update canaries to 1.3.1 by @wdbaruni in #4039
Full Changelog: v1.3.1...v1.3.2