Releases: aeron-io/aeron

1.41.0

14 Apr 13:57
  • Allow NameResolver to be configured for the ConsensusModule in order to support custom name resolution when configuring the ingress channel.
  • Delay election state transitions if there is an active leader to avoid unnecessary reset and new election.
  • Make AeronCluster.asyncConnect work completely asynchronously. Don't report exceptions to the error handler that are used for async resources.
  • Add a system property and API to allow changing the directory where an Archive mark file (archive-mark.dat) is stored.
  • Check the state of the interface when trying to resolve the multicast interface. Only use interfaces that are up. Issue #1387
  • CnC file length validation. Issue #1410
  • Fix issue of not capturing return code when recording signal arrives after an error to the archive client.
  • Support migrating segments to the beginning or end of an existing archive recording.
  • [C] Fix issue of using transport after it had been removed.
  • [Java] Fix concurrent close of receive destination counters on multi-destination subscriptions.
  • [C] Fix remove_if methods on pointer value maps which previously could miss an item.
  • Add debug logging for clustered service acking.
  • Add a specific error for archive replication failing to create a remote connection.
  • Fix leak with Archive replay session if the async publication has a session clash.
  • Shorten duration of cluster election after a leader has closed gracefully.
  • [C] Fix image rejoin by correcting cooldown map insertion and removal. PR #1338
  • Candidate ballot for 5+ node cluster cannot be cut short on quorum, otherwise the most up-to-date member may not be elected.
  • [C] Allow for attempted recreation of an Image if initial attempt fails. PR #1435
  • Perform most replay validations before sending OK to the client so errors are synchronous when starting a replay.
  • Delete all recording segment files when a recording is truncated to its start position.
  • Close ArchiveMarkFile last when shutting down Archive to capture all errors.
  • [C++] Apply std::forward to fragment handler to avoid unnecessary copy. PR #1405
  • Fix handling of padding greater than max message length in Archive replay.
  • Add debug logging for Archive recording signals.
  • Close log subscription first when clustered service is cleanly closed to drop follower out of flow control as soon as possible.
  • Drop cluster follower as soon as possible out of flow control to allow cluster to progress when follower is cleanly closed.
  • [C] Report timeout accurately when driver keepalive is beyond timeout. PR #1429
  • Add ability to run Archive with only IPC control channels for clients.
  • Add ClusterTool.isLeader method.
  • Add Image to Subscription before calling available handler rather than after.
  • Set URI in receiver counters to match subscription channel.
  • Add cluster member node state file and migrate out state that needs to be persistent, such as candidateTermId and member list, so the mark file can be in /dev/shm.
  • [C] Fix issue with removing naming resolver neighbor that deleted adjacent memory.
  • [C] Improve socket error handling on Windows.
  • [Java] Add toString() to many Aeron classes to help debugging.
  • [C] Improve parsing of unsigned 32-bit integers.
  • [C] Set max of resource free queue length and resource free limit to INT32_MAX. This stops them being incorrectly set to 0 by aeron_config_parse_uint32 when comparing against int32 0. PR #1421
  • Deprecate cluster dynamic join feature. This is to be replaced with a more robust and user friendly premium offering.
  • [C] Fix counter leak when subscription fails.
  • [C] Fix spy channel memory leak when destination is removed for multi-destination subscription.
  • [C] Fix channel memory leak on error when creating publications or subscriptions.
  • Fix NPE on timeout exception for cluster client in some connect states.
  • [Java] Improve efficiency of URI parsing.
  • [C] Fix error messages with incorrect varargs.
  • Warnings clean up in codebase to have less noisy CodeQL analysis.
  • Support placing mark files for Archive, ConsensusModule, and ClusteredServiceContainer in an alternative directory such as /dev/shm so timeouts can be avoided when recording writes queue up on a network filesystem.
  • Add timestamp params to stripped channel for pass through to Archive operations.
  • Queue resource freeing operations in driver to avoid timeouts when unmapping operations are slow.
  • [C++] Work around compiler concurrency bug for AtomicArrayUpdater that can impact client Subscriptions causing image list to become corrupted.
  • Improve javadoc for recording signal usage.
  • Be strict on handling cluster leader liveness to the current leadership term.
  • Only try unblocking a client command after liveness timeout to avoid "lost" commands. PR #1369
  • Make archive counters unique so multiple archives can run on the same media driver.
  • Truncate files after ArchiveTool.compact is invoked to free disk space.
  • Fix basic auction cluster tutorial configuration.
  • Improve ClusterConfig sample to allow for ingress configuration.
  • Add counters for the number of active recordings or replays in an Archive.
  • Add counters for reporting on read and write operations in an Archive.
  • Support starting a ClusteredService before the ConsensusModule.
  • Improve false sharing protections for more consistent latency.
  • Simplify ReplayMerge samples to not require entity tags.
  • Add batch script for launching low-latency media driver on Windows.
  • Support message lengths greater than MTU in ping pong samples.
  • Fix options handling in cping sample.
  • Improve handling of timeouts in cluster elections for more robust state transitions when network is unstable. Effects are more pronounced in 5+ member clusters.
  • [Java] Add Aeron.asyncAddSubscription for non-blocking setup.
  • Compute source identity of images more precisely based on channel configuration.
  • Improved handling of out of disk space errors.
  • Support taking a cluster consensus module snapshot when member names are greater than MTU in length.
  • Allow a follower to veto a member being elected cluster leader if they believe the leader is not valid. This is important in 5+ node clusters.
  • Extend debugging for voting in cluster elections.
  • Increment error counter when invalid version exceptions occur.
  • Handle backpressure from commands between dedicated threads in driver with controlled polls to avoid live locks.
  • [C] Add support for controlled poll operations on SPSC and MPSC ring buffers.
  • Increase command queues to allow for more concurrent active changes in publications and images.
  • Serve cluster backup queries from followers to take load from the leader.
  • [C] Fix build when dot is used as thousands separator. PR #1372
  • Upgrade to JUnit 5.9.2.
  • Upgrade to BND 6.4.0.
  • Upgrade to ByteBuddy 1.14.3.
  • Upgrade to Mockito 4.11.0.
  • Upgrade to Versions 0.46.0.
  • Upgrade to Gradle 7.6.
  • Upgrade to SBE 1.28.1.
  • Upgrade to Agrona 1.18.0.
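
The interface check noted above for Issue #1387 (only use interfaces that are up when resolving the multicast interface) can be illustrated with plain JDK calls. This is a minimal sketch, not Aeron's resolver code: the class `UpInterfaceFilter` and both of its methods are hypothetical names introduced for illustration.

```java
import java.io.UncheckedIOException;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of Aeron: selects interfaces usable for multicast.
final class UpInterfaceFilter
{
    // Pure predicate so the rule can be checked without real network hardware.
    static boolean isEligible(final boolean isUp, final boolean supportsMulticast)
    {
        return isUp && supportsMulticast;
    }

    // Lists local interfaces that are currently up and support multicast.
    static List<NetworkInterface> eligibleInterfaces()
    {
        final List<NetworkInterface> result = new ArrayList<>();
        try
        {
            final Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
            if (null == all)
            {
                return result; // some JDK versions return null when no interfaces exist
            }

            for (final NetworkInterface ifc : Collections.list(all))
            {
                if (isEligible(ifc.isUp(), ifc.supportsMulticast()))
                {
                    result.add(ifc);
                }
            }
        }
        catch (final SocketException ex)
        {
            throw new UncheckedIOException(ex);
        }

        return result;
    }
}
```

Calling `UpInterfaceFilter.eligibleInterfaces()` returns whatever the host reports at that moment, so the result can change between calls if an interface goes down.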

1.40.0

21 Oct 20:44
  • Memory align allocated buffers in PublicationTest so it works on Apple M1 processors.
  • Check that NoOpLock is only allowed to be used when using Aeron client in invoker mode.
  • Handle case of a delayed concurrent offer to a publication in which other threads have raced terms ahead without throwing an exception.
  • Collapse term appenders into publications to reduce memory footprint and avoid data dependent loads.
  • Short circuit Image polling operation when bound limit is less than current position to prevent term overrun.
  • Add different aliases for consensus module/service container subscriptions. PR #1366.
  • Stop an active cluster log replay when ClusterBackup is closed rather than waiting for timeout.
  • Send unavailable counter events to Aeron clients when a client closes or times out.
  • Allow Consensus Module Agent to be run via an Invoker in addition to having its own thread.
  • Apply liveness checks to Archive and Cluster mark files so that multiple instances cannot be run in the same directory and corrupt files.
  • [Java] Use fixed format for timestamps in agent debug logs.
  • Allow Archive replicate to overwrite all metadata for an empty recording.
  • [C] Handle log buffer files with term_length == AERON_LOGBUFFER_TERM_MAX_LENGTH on Windows. PR #1360.
  • [C] Fix inclusion of symbols for debug builds on Windows.
  • Remove localhost defaults for Archive and Cluster to help avoid mis-configuration in production. PR #1356.
  • Await 'REPLICATE_END' when catching up as a follower across multiple leadership terms to avoid clashing session-id.
  • Allow setting of receive socket buffer and window on cluster log channel subscribers. PR #1345.
  • Fix application of send socket buffer lengths as configured when using MDC.
  • Fix ArchiveTool.dump when fragment length is set <= 0.
  • Capture closing sessions into snapshot so session close event is not lost on cluster shutdown.
  • Remove brackets from counter labels to make export to Prometheus easier.
  • Send cluster client session open acknowledgement before appending to the log to avoid race with service sending egress on open event. Issue #1351.
  • [C] Fix off-by-one error when copying the local socket address into the channel indicator counter.
  • Add protocol version support to cluster consensus protocol.
  • Add more context to error messages on Archive ReplaySession. PR #1349.
  • Apply strict validation of consensus module snapshot state when messages are offered from clustered services. A number of customers have not been strict with all cluster nodes being deterministic and doing exactly the same thing which can result in corrupted and diverged snapshots.
  • Consensus module state snapshot can be inspected with the describe-latest-cm-snapshot option to ClusterTool.
  • If a consensus module snapshot is shown to be corrupt, it may be fixed by running ConsensusModuleSnapshotPendingServiceMessagesPatch. Non-support customers who want help can contact sales@aeron.io. The patch can fix the leader; the fixed snapshot then needs to be replicated to the followers, which can be done with AeronArchive.replicate using the correct recording ids.
  • Add a tool to replicate a specific recording between archives. PR #1363.
  • [C++] Use getAsString calls for pollers for record descriptors for channel fields. Add test from PR #1348.
  • Add ClusteredService.doBackgroundWork which can be used for maintaining external connections beyond ingress and egress.
  • Increase default message timeout from 5 to 10 seconds for Archive clients.
  • Add EOS flag to status messages (SMs) once a stream is totally received so the sender can take clean up action.
  • When EOS status message is received by a sender then allow the publication linger on unicast to be cut short so resources are reclaimed sooner.
  • When EOS status message is received by a sender then remove the receiver from flow control for multicast and MDC with tagged and min FC.
  • Fix the closing of session specific subscriptions to prevent resource leak.
  • Add scripts for testing raw network performance on Windows.
  • Close egress from cluster on change of leader so clients can detect it before a new leader is elected.
  • Don't time out and close cluster client session if quorum temporarily cannot be reached.
  • Add logging support for ClusterBackup state changes.
  • Close cluster clients when complete cluster is restarted.
  • Support automatic reconnect from cluster client when the same leader is re-elected after a net split or temporarily losing quorum.
  • Add authentication for ClusterBackup to a cluster.
  • Validate Archive mark file length before reading when mapped read-only to avoid access violations.
  • Preserve iteration order for cluster client session based on session id so snapshots can have binary compatibility.
  • Capture leadership term id for cluster backup queries.
  • Account for padding when sweeping pending services messages to avoid out of bounds exception.
  • Prevent -1 leadership term ids appearing in the RecordingLog.
  • Allow Archive replication and replay request to specify session level file IO max buffer length for throttling a stream.
  • Add support for custom app version validation to clustered services with AppVersionValidator.
  • Add false sharing protection to DutyCycleTracker.
  • Update doc on ReplayMerge to indicate the AeronArchive client should not be shared. Issue #1340.
  • Upgrade to Versions 0.43.0.
  • Upgrade to Mockito 4.8.1.
  • Upgrade to Google Test 1.12.1.
  • Upgrade to JUnit 5.9.1.
  • Upgrade to ByteBuddy 1.12.18.
  • Upgrade to Gradle 7.5.1.
  • Upgrade to SBE 1.27.0.
  • Upgrade to Agrona 1.17.1.
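
The note above about preserving iteration order of cluster client sessions by session id rests on a general point: a snapshot is only byte-for-byte reproducible if its entries are written in a deterministic order. The following is a minimal sketch of that idea using a sorted map. `OrderedSessions` and its methods are hypothetical and do not reflect the ConsensusModule's actual data structures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not Aeron code: sessions keyed and iterated by session id.
final class OrderedSessions
{
    // TreeMap iterates keys in ascending order regardless of insertion order.
    private final Map<Long, String> sessionById = new TreeMap<>();

    void open(final long sessionId, final String description)
    {
        sessionById.put(sessionId, description);
    }

    void close(final long sessionId)
    {
        sessionById.remove(sessionId);
    }

    // Order in which sessions would be written to a snapshot in this sketch.
    List<Long> snapshotOrder()
    {
        return new ArrayList<>(sessionById.keySet());
    }
}
```

With a hash-based map the write order could vary with capacity and insertion history, so two nodes holding the same sessions might produce different bytes.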

Java binaries can be found here.

1.39.0

14 Jul 13:38
  • [Java] Fix IllegalStateException that could exist for an MDS subscription on the rapid recycling of ReplayMerge operations.
  • [C] Align ring buffer implementations and feature set with Java.
  • [Java] Make sure that C and Java are aligned on resend window. Re-instate the max message length being accounted in the bottom of the resend window for Java.
  • Add duty cycle duration tracking to all agents across all modules.
  • [C++] Improve efficiency by reducing the number of copy operations for fragment assembly when a stream has many fragmented messages.
  • [C] Default to CLOCK_REALTIME for send/receive timestamps.
  • [Java] Add setters for send/receive timestamp clocks to the MediaDriver.Context.
  • Fix handling of fragment assembly when reliable=false is set for a channel and loss occurs.
  • Improve handling of short sends on MDC publication to backoff from overloading a socket.
  • Add round-robin facility to MDC publication for increased fairness.
  • [Java] Publish aeron-test-support package as a JAR.
  • [Java] Downgrade "unknown replay" errors to warnings for cluster catchup.
  • [Java] Add appVersion to event logging for consensus module and check for correct app version when replaying log.
  • [Java] Prevent timeout warnings with cluster dynamic nodes and log replication.
  • [Java] Add cluster dynamic join state change logging events.
  • Add counters for the number of receivers in min and tagged flow control strategies.
  • [Java] Avoid race unmapping buffers on concurrent close of media drivers.
  • Modify flow control strategies to have new method for when elicited setups are sent and add counters manager to init methods. Modify Min and Tagged flow control to use setup snd-lmt as min position until timeout or receiver added on SM.
  • [Java] Account for possible padding in log buffer when checking for bottom resend window for retransmits.
  • [C] Flush output when printing configuration.
  • [C] Raise warning on failure to setup media timestamping.
  • [Java] Update recordingId on any signal with a valid recording id when handling signals for snapshot replication.
  • [Java] When attempting ClientSession.tryClaim, ensure that there is enough buffer space when returning a mocked offer for a follower.
  • [C] Ensure publication image is released before it is freed.
  • [C] Fix scanf that could result in buffer overflow when parsing HTTP for configuration.
  • [Java] Change default cluster session timeout from 5 to 10 seconds.
  • Prevent receiver joining min/tagged flow control if they are more than a window behind.
  • [C] Add sample for working with large messages.
  • [Java] Add logging event for appending a cluster session close.
  • Upgrade to BND 6.3.1.
  • Upgrade to Mockito 4.6.1.
  • Upgrade to ByteBuddy 1.12.10.
  • Upgrade to SBE 1.26.0.
  • Upgrade to Agrona 1.16.0.
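
The round-robin facility for MDC publications mentioned above is about fairness: if every send pass starts at the same destination, that destination is always served first. A generic sketch of rotating the starting point follows. It is an illustration of the technique only, with the hypothetical class `RoundRobinSender`, and says nothing about how the driver implements it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Aeron code: rotate which destination is visited first.
final class RoundRobinSender
{
    private final List<String> destinations;
    private int startIndex = 0;

    RoundRobinSender(final List<String> destinations)
    {
        this.destinations = new ArrayList<>(destinations);
    }

    // Returns the visiting order for this pass, then advances the start for the next pass.
    List<String> nextPassOrder()
    {
        final int count = destinations.size();
        final List<String> order = new ArrayList<>(count);

        for (int i = 0; i < count; i++)
        {
            order.add(destinations.get((startIndex + i) % count));
        }

        if (count > 0)
        {
            startIndex = (startIndex + 1) % count;
        }

        return order;
    }
}
```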

Java binaries can be found here.

1.38.2

29 Apr 00:02

C Driver/Client Release Only

  • [C] Driver - Ensure the correct control address is used when adding multicast destinations with MDS.
  • [C] Driver - Allow thread affinity on CPU 0.
  • [C] API - Check handler parameter before polls. Check images for NULL before polling images.

No Java binaries for this release.

1.38.1

14 Apr 18:10
  • Upgrade to SBE 1.25.3.
  • Upgrade to Agrona 1.15.1.

Java binaries can be found here.

1.38.0

14 Apr 18:09
  • [Java/C/C++] Ensure driver is in ready state when requesting termination from client.
  • [Java] Reduce allocation when listing archive directories to find segment files.
  • [Java] Add flag to ClusterTerminationException to indicate if the termination was expected.
  • [Java] Expand agent logging for consensus module operations. Be careful if using all for cluster events as volume may now be greatly expanded.
  • [C] Use connect and send to improve latency in C driver when sending data at lower volumes.
  • [Java] Improve reliability of transferring snapshots to ClusterBackup via archive replication with improved re-try semantics.
  • [Java] Support adding an IPC ingress destination to cluster leader for ingress optimisation.
  • [Java] Create replay publication asynchronously to reduce latency pauses in Archive.
  • [Java/C++] Add new RecordingSignal.REPLICATE_END recording signal to indicate end of a replication operation.
  • [Java/C++] Make delivery of RecordingSignals to archive client sessions reliable and ordered.
  • [Java] Support specifying interface with endpoints in cluster config for multi-home members. PR #1290.
  • [C] Add thread affinity support to C media driver. PR #1298.
  • [C/C++] Update CMake build to use FetchContent instead of ExternalProject.
  • [C/C++] Fix build on ARM with clang. PR #1291.
  • [Java] Improve progress tracking and retry semantics for cluster members catching up in elections.
  • [C/C++] Enable support for parallel build on Windows.
  • [Java] Add ability to async remove/close a publication by registration id.
  • [Java] Fix publication leak in ClusterBackup when backup response times out.
  • [C] Improve agent logging in C media driver to be more consistent with Java driver.
  • [C] Allow for configurable IO vector for sendmmsg and recvmmsg in the C media driver. PR #1285.
  • [C] Support static linking of the C media driver. PR #1261.
  • [Java/C] Support ability to extend concurrent publications by setting initial values to be equivalent to exclusive publications.
  • [Java] Fixed bug in PriorityHeapTimerService.cancelTimerByCorrelationId. PR #1281.
  • [C++] Improve error reporting in Archive client when a response is not received.
  • [Java/C++] Additional user specified delegating Invoker for Archive client to be used for progressing actions when awaiting responses.
  • [Java] Rename Archive segment files before delete to avoid races with streams being extended.
  • [C++] Fixes for ChannelUriStringBuilder. PR #1268.
  • [Java] Add admin command so that cluster snapshot can be triggered remotely via an authorised session.
  • [Java] Support authorisation of service actions with a new API AuthorisationService. The hooks for this have been added to Archive requests and Cluster Snapshot requests.
  • [Java/C] Support adding spy and IPC destinations to MDS subscriptions so destinations can be all channel types.
  • [Java] Ensure Cluster will start on a consistent initial term id when racing to create first term.
  • [Java] Prevent unnecessary creation of RecordingLog files when using ClusterTool.
  • [Java] Add cluster session timeout to the set of timeouts adjusted when debugging.
  • [C] Fixes to prevent message duplication and unnecessary sending of messages in MDS.
  • Minimum CMake version was raised to 3.14.
  • Upgrade to HdrHistogram_c 1.11.4.
  • Upgrade to BND 6.2.0.
  • Upgrade to Versions 0.42.0.
  • Upgrade to Mockito 4.4.0.
  • Upgrade to ByteBuddy 1.12.9.
  • Upgrade to Shadow 7.1.2.
  • Upgrade to Gradle 7.4.2.
  • Upgrade to JUnit 5.8.2.
  • Upgrade to Checkstyle 9.3.
  • Upgrade to SBE 1.25.2.
  • Upgrade to Agrona 1.15.0.
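
The item above about renaming Archive segment files before deleting them describes a common pattern: an atomic rename takes the file out of its well-known name immediately, so anything that recreates or extends a file under the original name cannot collide with the slower delete. Below is a minimal sketch using `java.nio.file`. The class `RenameThenDelete` and the `.del` suffix are assumptions made for illustration, not a statement of what the Archive does internally.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration, not Aeron code.
final class RenameThenDelete
{
    // Assumed suffix for files that are pending deletion.
    static final String PENDING_DELETE_SUFFIX = ".del";

    // Step 1: atomically move the file out of its original name.
    static Path markForDeletion(final Path segment)
    {
        final Path renamed = segment.resolveSibling(segment.getFileName() + PENDING_DELETE_SUFFIX);
        try
        {
            return Files.move(segment, renamed, StandardCopyOption.ATOMIC_MOVE);
        }
        catch (final IOException ex)
        {
            throw new UncheckedIOException(ex);
        }
    }

    // Step 2: delete the renamed file whenever convenient; the original name is already free.
    static boolean deleteMarked(final Path renamed)
    {
        try
        {
            return Files.deleteIfExists(renamed);
        }
        catch (final IOException ex)
        {
            throw new UncheckedIOException(ex);
        }
    }
}
```

`ATOMIC_MOVE` throws `AtomicMoveNotSupportedException` if the file system cannot rename atomically, which a real implementation would need to handle.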

Java binaries can be found here.

1.37.0

26 Nov 18:10
  • [Java] Improve error messages on channel conflicts.
  • [C] Remove replicated command prefix in debug agent logging.
  • [Java] Use async publication add for async connect to an Archive to minimise the impact of name resolution pauses.
  • [Java] Make ClusterConfig.calculatePort public.
  • [C] Correct channel length on metadata for stream counters.
  • [Java] Extract channel value from counter label when longer than what will fit in metadata for StreamStat.
  • [Java] Relocate HdrHistogram and ByteBuddy in aeron-all JAR.
  • Upgrade to BND 6.1.0.
  • Upgrade to ByteBuddy 1.12.2.
  • Upgrade to Mockito 4.1.0.
  • Upgrade to SBE 1.25.1.
  • Upgrade to Agrona 1.14.0.

Java binaries can be found here.

1.36.0

19 Nov 22:25
  • [C/C++] Handle SIGINT in code samples.
  • [Java] Retry adding cluster member publication in election canvass to address late name registration in containers such as Kubernetes.
  • [Java] Log resolution failures in Cluster as warning event rather than exception.
  • [Java] Fix timestamp when publishing new leadership terms. PR #1254.
  • [C] Use separate transport bindings for the conductor doing name resolution. PR #1253.
  • [Java/C++] Allow the setting of a RecordingSignalConsumer in the archive client context which is delegated to when processing control channel responses.
  • [C] Improve error handling and logging on Windows when dealing with network system calls.
  • [Java] Verify cluster log is always contiguous when joining a new image in a service.
  • [Java] Fix race condition when sending RecordingSignal.SYNC during archive replication. PR #1252.
  • [Java/C] Improve choice of subscription for choosing channel URI when labelling receiver counters.
  • [Java] Sort counters displayed with StreamStat so they are logically grouped.
  • [Java] Improve error messages so they are more contextual.
  • [Java] Extend debugging logging for archive and cluster operations.
  • [Java] Check for errors when cluster snapshots are replayed.
  • [Java] Improve tracking of cluster commit position when replicating during an election.
  • [Java] Allow replication to skip over empty leadership terms due to failed elections when initially starting cluster.
  • [C] Better handling of finding user for default aeron.dir when USER is not set in environment.
  • [Java/C++] Reduce cache invalidations when using pollers for archive and cluster response streams.
  • [Java] Add support for changing cluster log params by truncating to the latest snapshot and resetting configuration. PR #1233.
  • [Java] Don't catch subclasses of Throwable and instead catch Exception so that the JVM can handle subclasses of Error.
  • [Java/C] Improve validation of ports used in channel URIs.
  • [C] Support building on Apple ARM.
  • [Java] Add priority heap backing implementation for cluster timers as an alternative to the default timer wheel implementation.
  • Upgrade to Mockito 4.0.0.
  • Upgrade to Shadow 7.1.0.
  • Upgrade to BND 6.0.0.
  • Upgrade to Gradle 7.2.
  • Upgrade to ByteBuddy 1.12.1.
  • Upgrade to Checkstyle 9.1.
  • Upgrade to SBE 1.25.0.
  • Upgrade to Agrona 1.13.0.
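
For readers unfamiliar with the priority heap approach to timers mentioned above, the general idea is to keep timers ordered by deadline so the next one to expire is always at the head. The sketch below uses `java.util.PriorityQueue`. `HeapTimers` is a hypothetical class for illustration and is not the cluster's timer service; in particular, its linear-time cancel is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration, not Aeron code: timers ordered by deadline in a binary heap.
final class HeapTimers
{
    private static final class Timer
    {
        final long correlationId;
        final long deadline;

        Timer(final long correlationId, final long deadline)
        {
            this.correlationId = correlationId;
            this.deadline = deadline;
        }
    }

    private final PriorityQueue<Timer> heap =
        new PriorityQueue<>((a, b) -> Long.compare(a.deadline, b.deadline));

    void schedule(final long correlationId, final long deadline)
    {
        heap.add(new Timer(correlationId, deadline));
    }

    // Linear scan for simplicity; returns true if a timer was removed.
    boolean cancel(final long correlationId)
    {
        return heap.removeIf((timer) -> timer.correlationId == correlationId);
    }

    // Removes and returns the correlation ids of all timers with deadline <= now, earliest first.
    List<Long> poll(final long now)
    {
        final List<Long> expired = new ArrayList<>();
        while (!heap.isEmpty() && heap.peek().deadline <= now)
        {
            expired.add(heap.poll().correlationId);
        }

        return expired;
    }
}
```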

Java binaries can be found here.

1.35.1

06 Sep 17:30
  • [Java] Fix selection of channel based on add publication registration id rather than original registration id. Issue #1218.

Java binaries can be found here.

1.35.0

09 Aug 16:21
  • [Java] Fix truncation of linger timeout in ChannelUriStringBuilder which led to a short linger of Archive replays.
  • [Java] Remove incorrect publication linger validation.
  • [C] Add sanitize build for MSVC and fix issues found.
  • [C] Add missing free of counters associated with Cubic congestion control.
  • [C++] Fix missing use of FragmentAssembler in Archive response and clean up type warnings.
  • [Java] Fix packaging declaration in POM file.
  • [Java] Separate thread factories for replay and recording agents in Archive for when setting thread affinity is required.
  • [Java] Javadoc improvements.
  • [C] Agent logging fixes. PR #1198.
  • [Java/C] Support a list of bootstrap neighbours for fault tolerance in gossip protocol for driver naming.
  • [C] Handle connection reset without error when polling a socket on Windows.
  • [C++] Don't progress with archive connect until response subscription is available. PR #1196.
  • [Java] Use async publication adding for response channels from the Archive and response channels for egress and backup queries from the Cluster to reduce latency pauses for existing operations.
  • [Java] Ability to add publications asynchronously to Aeron client.
  • [C/Java] Support timestamping of packets for channel send and receive plus media/hardware receive timestamping if supported. PR #1195.
  • [Java] Ensure termination hook is run on unexpected interrupt during cluster election.
  • [Java] Reset cluster election state if in election and an exception happens outside the election work cycle.
  • [Java] Finish deleting pending archive recording for deletion on shutdown.
  • [Java] Ensure cluster log recording has stopped before restarting the election process to avoid spurious election failure from past recording stopping.
  • Upgrade to Google Test 1.11.0.
  • Upgrade to Mockito 3.11.2.
  • Upgrade to ByteBuddy 1.11.9.
  • Upgrade to Gradle 7.1.1.
  • Upgrade to SBE 1.24.0.
  • Upgrade to Agrona 1.12.0.

Java binaries can be found here.