Why Distributed Task Queues?
- Offload long-running jobs to background processes (example: video conversion)
- Offload large numbers of small jobs to background processes (example: a commenting system)
- Keep track of jobs: monitor them and restart them automatically on failure
- Schedule jobs
- Replace cron jobs: cron can run a job at most once a minute and doesn't have the best interface, while a task-queue scheduler has no such floor (see the sketch below)
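A minimal sketch of that last point, assuming Celery's periodic-task scheduler (the beat_schedule setting in recent Celery; older releases spell it CELERYBEAT_SCHEDULE). The broker URL, module name, and task body are illustrative assumptions:

from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')

# Run poll_feed every 15 seconds, below cron's one-minute floor.
app.conf.beat_schedule = {
    'poll-every-15-seconds': {
        'task': 'tasks.poll_feed',   # assumes this module is tasks.py
        'schedule': 15.0,            # interval in seconds
    },
}

@app.task
def poll_feed():
    # Placeholder body; the real work would go here.
    print('checking for new items')

A beat process (e.g. celery -A tasks beat) then dispatches the task to the workers on that interval.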
Message Queue vs. Task Queue
- A message queue provides the basic functionality of passing, holding, and delivering messages (examples: Redis, RabbitMQ)
- A task queue manages work to be done and is built on top of a message queue (example: Celery); see the sketch below for the difference
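A minimal sketch of the difference, assuming a local Redis and the redis-py client: a message queue only moves opaque payloads, while a task queue such as Celery or RQ adds the "call this function with these arguments" layer on top.

import json
from redis import Redis

r = Redis()

# Producer: push a raw message; Redis neither knows nor cares what it means.
r.rpush('my-queue', json.dumps({'func': 'add', 'args': [4, 4]}))

# Consumer: pop the message and interpret it ourselves.
_, raw = r.blpop('my-queue')
message = json.loads(raw)
print(message['func'], message['args'])   # => add [4, 4]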
Distributed Task Queues in Python
Based on Popularity
Solution | GitHub stars | Downloads/month
---|---|---
Celery | 4,600 | 400,000
RQ (Redis Queue) | 2,600 | 40,000
⤿ Django-RQ | 428 | 13,000
Huey | 824 | 3,000
MRQ (Mr. Queue) | 340 | 5,000
Taskmaster | 346 | 1,000
Who uses them? Everyone.
Solution | Notable users
---|---
Celery | Instagram, Mozilla, TrueCar
RQ (Redis Queue) | ?
⤿ Django-RQ | ?
Huey | ?
MRQ (Mr. Queue) | Pricing Assistant (creator)
Taskmaster | Disqus (creator)
Celery Architecture
[Image omitted: Celery architecture diagram, from the Parallel Programming with Python book]
RQ Architecture
Client ⇄ Redis ⇄ Worker
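A minimal sketch of the worker side of that diagram, assuming a local Redis: the client and the worker never talk to each other directly, they only share the Redis instance. In practice most deployments run RQ's worker script from the command line (often under supervisord, as the comparison table later notes) rather than embedding it like this.

from redis import Redis
from rq import Queue, Worker

redis_conn = Redis()
queue = Queue(connection=redis_conn)

# Blocks, pulling jobs from Redis and running each one in a forked child process.
worker = Worker([queue], connection=redis_conn)
worker.work()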
Celery Task Example
from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')

@app.task
def add(x, y):
    # Simulate a long-running job (5 minutes) before returning.
    import time
    time.sleep(5 * 60)
    return x + y
------------------
>>> from tasks import add
>>> result = add.delay(4, 4)
>>> result.ready()
False
5 minutes later:
>>> result.ready()
True
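If the caller actually needs the value rather than polling ready(), it can block on the result; get() is part of Celery's AsyncResult API, and the timeout here is just an illustrative number:

>>> result.get(timeout=600)   # blocks until the worker finishes (or raises on timeout)
8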
RQ Task Example
import time

from redis import Redis
from rq import Queue

from somewhere import count_words_at_url

# Tell RQ what Redis connection to use
redis_conn = Redis()

# No args implies the default queue
q = Queue(connection=redis_conn)

# Delay execution of count_words_at_url('http://nvie.com')
job = q.enqueue(count_words_at_url, 'http://nvie.com')
print(job.result)   # => None

# Now, wait a while, until the worker is finished
time.sleep(2)
print(job.result)   # => 889
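For completeness, a plausible body for the imported count_words_at_url; this is a sketch assuming the requests library, since the example above only imports it "from somewhere":

import requests

def count_words_at_url(url):
    # Fetch the page and count whitespace-separated words.
    resp = requests.get(url)
    return len(resp.text.split())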
Monitoring
- Celery: Flower
- RQ: RQ Dashboard
Celery Monitoring: Flower
[Screenshots omitted: Flower's views of workers, tasks, and CPU usage]
RQ Monitoring: RQ Dashboard
MRQ Monitoring: MRQ Dashboard
Celery vs. RQ - overview
Feature | Celery | RQ
---|---|---
Complexity of code | Very complicated | Easy to understand
Documentation | Takes a while to read | Simple
Monitoring | Flower | RQ Dashboard
Message brokers | RabbitMQ, Redis, MongoDB, … | Redis only
Result backends | RabbitMQ, Redis, Memcached, MongoDB, Cassandra, … | Redis only
Concurrency | Master process + forked workers (prefork) | supervisord + fork
Scheduler | Celery beat (built-in) | Third party
Language | Can send tasks from one language to another | Python only
Subtasks | Can create tasks within tasks (see example below) | No
Django support | Built-in | django-rq
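One concrete example of that extra flexibility, for the Subtasks row above: Celery's canvas primitives can compose tasks, e.g. chaining the add task from earlier. The numbers are arbitrary; chain and the .s() signature syntax are standard Celery:

from celery import chain
from tasks import add   # the add task from the earlier example

# Run add(2, 2) on a worker, then feed its result into add(<result>, 16): (2 + 2) + 16
result = chain(add.s(2, 2), add.s(16))()
print(result.get())   # => 20

(With the five-minute sleep in add, that get() would of course take about ten minutes to return.)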
Why use RQ?
It all comes down to simplicity… and to memory leaks.
Memory Leaks
- Celery has known memory-leak issues.
- Some leaks come from older broker libraries, e.g. librabbitmq.
- The Celery monitor, Flower, has a huge memory leak.
- RQ offers less, but its memory leaks should be much smaller than Celery's (I have not verified this myself).
Why use Celery?
Use Celery when:
- You need a message broker or result backend other than Redis; RQ limits you to Redis for both (see the configuration sketch after this list).
- Redis can drop messages (it will pick them up again later); if that bothers you, you can't use RQ.
- You need the extra features: Celery is far more feature-rich and flexible than RQ.
- You don't ever really need to understand the magic behind the scenes in Celery.
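A minimal sketch of the first point, assuming RabbitMQ as the broker and Redis as the result backend, a split that RQ cannot express; both URLs are placeholders:

from celery import Celery

app = Celery(
    'tasks',
    broker='amqp://guest@localhost//',       # RabbitMQ carries the task messages
    backend='redis://localhost:6379/0',      # Redis stores the results
)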
Complexity
How complex is RQ? [image omitted]
How complex is Celery? [image omitted]
And in case you were wondering…
How complex is Django? [image omitted]
So if you can understand Django source code, you should be able to understand Celery’s too.