Skip to content

General Concerns #140

Closed
Closed
@benwilson512

Description

@benwilson512

Over the course of working through the PR to fix #128 and #129 I've run into a variety of different problems that I'd like to discuss. These problems are unfortunate, because Exq serves a very important role in the community. Many people are coming from a Ruby / Rails background and Exq helps make integrating Elixir with existing Ruby systems much easier. Nonetheless, the problems I present here individually and in aggregate are serious enough to warrant great caution on the part of any who might consider using this library.

I bring these problems up not only because I want Exq to succeed, but because I only have the available time to fix a handful of the most serious ones. A lot of valuable work has gone into this library, particularly around managing the redis association in such a way as to be compatible with Sidekiq. I'm hopeful that this post will bring about awareness of the situation so that others can address what I cannot.

NOTE: Two of these I've already created separate issues for, but I include them here for the sake of completeness.

* indicates that it will be at least partially addressed in benwilson512#1

Severe Issues

The ambiguities around supervision in general and job supervision in particular mean that I have no idea how this library behaves when I call ``./bin/myapp stop` via exrm for example. While it shuts down is it possible for a job to still get started? What happens to running jobs? Clarity here is paramount as I need to be sure that deploying new code to production will not orphan jobs.

General Issues

These are a hodgepodge of anti-patterns, code smells, and barriers to contribution.

  • *No real top level Supervisor
  • *Every GenServer has its own one_for_one supervisor for no discernable reason.
  • *Hardcoded configuration defaults (not in themselves bad) exist in many disparate locations, often redundantly.
  • Max restarts in supervisors are higher than normal, and in some cases VERY high.
  • Configuration not actually passed through to all children
    • The {:ok, sup} = Exq.Enqueuer.start_link([port: 6379]) example from the readme would seem to indicate that individual components of the system can be started with custom configuration. However, the various hardcoded Code.get calls mean that some parts will still fallback to Exq defaults whereas others will actually use the provided configuration.
  • Queue state is largely duplicated between the manager's process state and an :ets table that is only accessed by that same process.
  • handle_call({:retries}, _from, state) why is :retries wrapped in a tuple? This pattern is ubiquitous.
  • GenServer.start_link(__MODULE__, [opts], []) and then init([opts])
    • Also ubiquitous. The needless wrapping and unwrapping of values adds noise and confusion.
  • handle_info catch all clauses with Logger.error("INVALID MESSAGE #{info}")
    • Unlike handle_call and handle_cast, handle_info is not constrained to only receive pre-defined messages. It is not necessary to go out of the way to Logger.error them.
  • handle_info / terminate / stop clauses included in modules that don't do anything. use GenServer provides these basic callbacks already
  • Every test has Code.require_file "test_helper.exs", __DIR__ at the top. mix test runs test_helper by default
  • api.ex and enqueuer.ex contain a bunch of identical functions that do a GenServer.call or cast to an unspecified pid.
  • Tests are very brittle
    • The seed is hard coded
    • Global state via mix config and process name registration abounds
    • :timer.sleep calls abound.
  • Documentation: none
  • TypeSpec: none

Going Forward

These are the problems I've run into while trying to fix the supervision architecture. I have not yet had time to review in detail job lifecycle management, anything related to redis usage, nor the messaging patterns of the application.

I hope to have time during the next week or so to finish the changes I've started to improve the supervision tree. Notably, this may not include supervising the workers, as that should probably be addressed as part of a general look at job life cycle management. We will see.

Afterward, I believe the most useful next step would be documentation. What does each GenServer do? How does it interact with the other GenServers? What is the lifecycle of a job? Some clarity to these questions would go a long way towards facilitating further contributions that can improve upon the current state of affairs.

Once again I really do appreciate the effort that's been put into Exq, it is a valuable part of the Elixir community.

Thanks!

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions