We raise our own exception which hides the real error (often an ImportError),
making it difficult to see what happened. Instead, save the original exception
too.
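A minimal sketch of the idea (the `UnpickleError` name is RQ's own; the
exact attributes are illustrative):

    import pickle

    class UnpickleError(Exception):
        """Raised when a job's pickled data can't be loaded.

        Holds on to the raw payload and the original exception, so the
        real cause (often an ImportError) isn't lost."""
        def __init__(self, message, raw_data, inner_exception=None):
            super(UnpickleError, self).__init__(message)
            self.raw_data = raw_data
            self.inner_exception = inner_exception

    def unpickle(pickled_string):
        try:
            return pickle.loads(pickled_string)
        except Exception as e:  # pickle raises a variety of error types
            raise UnpickleError('Could not unpickle.', pickled_string, e)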
When a pickled job string can't be unpickled because some required
module isn't loadable, this leads to an `UnpickleError` in the worker
(not in the horse).
Currently we just assume "garbage" in the job's data field, and silently
ignore it.
This is bad.
Really bad, because it bypasses the normal exception handling mechanism
that RQ has.
Historically, this "feature" was introduced to ignore any invalid pickle
data ("bad strings") on queues and carry on. However, we must be able to
assume that the data inside `job.data` is valid pickle data.
While an invalid _format_ of pickle data (e.g. the string "blablah"
isn't valid) leads to unpickle errors, unpickling errors will also occur
when the job can't be validly constructed in memory for other reasons,
like being unable to load a specific class.
Django is a good example of this: try submitting jobs that use
`django.conf.settings` while the `DJANGO_SETTINGS_MODULE` env var isn't
set. Currently, RQ workers will silently drop these jobs, dismissing
them like any other invalid pickle data. You won't be notified.
This patch changes RQ's behaviour to never ignore invalid string data on
any queue and _always_ handle these errors explicitly (but without
bringing the main loop down, of course).
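Roughly, the dequeue step in the worker's main loop could now look like
this (a sketch only: `failed_queue.push()`, `job.perform()` and the
logger setup are illustrative, not RQ's actual API):

    import logging

    log = logging.getLogger(__name__)

    def work_loop(queue, failed_queue):
        while True:
            try:
                job = queue.dequeue()  # may raise UnpickleError
            except UnpickleError as e:
                # Don't bring the main loop down, but don't silently
                # drop the payload either: move it to the failed queue.
                failed_queue.push(e.raw_data)
                log.warning('Moved unreadable job data to %s (%s)',
                            failed_queue.name, e.inner_exception)
                continue
            if job is None:
                break
            job.perform()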
This is to support users who rely on the current custom property
implementation. A warning will be displayed on the console, stating
that this support will be removed from RQ version 0.4.
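The warning itself could be as simple as this (a sketch; where exactly
it is triggered depends on the legacy code path):

    import warnings

    def warn_about_custom_properties():
        # A plain UserWarning (the default category) so the message
        # actually reaches the console; DeprecationWarning is silenced
        # by default on modern Pythons.
        warnings.warn('Custom property support is deprecated and will '
                      'be removed in RQ version 0.4.')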
It could probably use a bit of caching to prevent too many fetches per
time slot (for example, returning the locally cached value while it is
no more than a second or so old).
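A sketch of what such freshness-based caching might look like (the
one-second max age and the `fetch` callable are illustrative):

    import time

    class CachedValue(object):
        """Serve a locally cached value while it is still fresh,
        instead of hitting Redis on every access."""
        def __init__(self, fetch, max_age=1.0):
            self.fetch = fetch      # callable doing the real fetch
            self.max_age = max_age  # freshness window, in seconds
            self._value = None
            self._fetched_at = 0.0

        def get(self):
            now = time.time()
            if now - self._fetched_at > self.max_age:
                self._value = self.fetch()
                self._fetched_at = now
            return self._value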
This fixes #120.
Connections can now be set explicitly on Queues, Workers, and Jobs.
Jobs that are implicitly created by Queue or Worker API calls now
inherit their creator's connection.
For every RQ object instance that is created, the "current" connection
is used if none is passed in explicitly. The "current" connection is
thus captured at creation time and won't change for the lifetime of the
object.
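A simplified sketch of the stack-plus-capture mechanism (ignoring
thread-locality; RQ's real implementation lives in `rq.connections`):

    _connection_stack = []

    def push_connection(conn):
        _connection_stack.append(conn)

    def pop_connection():
        return _connection_stack.pop()

    def resolve_connection(connection=None):
        """Return the given connection, or the topmost "current" one."""
        if connection is not None:
            return connection
        if not _connection_stack:
            raise Exception('Could not resolve a Redis connection.')
        return _connection_stack[-1]

    class Queue(object):
        def __init__(self, name='default', connection=None):
            # Captured once, at creation time; never changes afterwards.
            self.connection = resolve_connection(connection)
            self.name = name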
Effectively, this means: given a default Redis connection, say you
create queue Q1, then push another Redis connection onto the connection
stack and create Q2. In that case, Q1 refers to a queue on the first
connection and Q2 to one on the second.
This is much clearer than it used to be.
Also, I've removed the `use_redis()` call, which had an ugly name.
Instead, some new alternatives for connection management now exist.
You can push/pop connections now:
>>> my_conn = Redis()
>>> push_connection(my_conn)
>>> q = Queue()
>>> q.connection == my_conn
True
>>> pop_connection() == my_conn
True
Also, you can stack them syntactically:
>>> conn1 = Redis()
>>> conn2 = Redis('example.org', 1234)
>>> with Connection(conn1):
...     q = Queue()
...     with Connection(conn2):
...         q2 = Queue()
...     q3 = Queue()
>>> q.connection == conn1
True
>>> q2.connection == conn2
True
>>> q3.connection == conn1
True
Or, if you only require a single connection to Redis (for most uses):
>>> use_connection(Redis())
This aids unpickling in the case of a function that isn't importable
from the worker's runtime. The unpickling will now (almost) always
succeed; an ImportError is raised later on, when the function is
actually accessed (and thus implicitly imported).
The end result is a job on the failed queue, with exc_info describing
the import error, which is tremendously useful.
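In sketch form, the deferral looks something like this (simplified: the
job stores only the function's dotted path and resolves it lazily):

    import importlib

    class Job(object):
        def __init__(self, func_name, args=None, kwargs=None):
            self.func_name = func_name  # e.g. 'myapp.tasks.send_mail'
            self.args = args or ()
            self.kwargs = kwargs or {}

        @property
        def func(self):
            # The import happens here, on first access -- not at
            # unpickle time -- so a missing module surfaces as a normal
            # job failure with a useful traceback.
            module_name, attribute = self.func_name.rsplit('.', 1)
            module = importlib.import_module(module_name)
            return getattr(module, attribute)

        def perform(self):
            return self.func(*self.args, **self.kwargs)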
Jobs are now stored in separate keys, and only job IDs are put on Redis
queues. Much of the code has been touched by this change, but it is for
the better.
No, really.
There is no job tuple anymore; instead, Jobs are natively picklable
themselves. Furthermore, I've added a way to annotate Jobs
with created_at and enqueued_at timestamps, to drive any future Job
performance stats. (And to enable requeueing, while keeping hold of the
queue that the Job originated from.)
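In terms of Redis operations, the new layout boils down to something
like this (the `rq:job:`/`rq:queue:` key prefixes follow RQ's naming
convention, but the storage format shown here is a simplification):

    import pickle
    import uuid

    from redis import Redis

    redis = Redis()

    def enqueue(queue_name, job):
        """Store the job under its own key; push only its ID."""
        job_id = str(uuid.uuid4())
        redis.set('rq:job:%s' % job_id, pickle.dumps(job))
        redis.rpush('rq:queue:%s' % queue_name, job_id)
        return job_id

    def dequeue(queue_name):
        """Pop an ID off the queue and fetch the job it points to."""
        job_id = redis.lpop('rq:queue:%s' % queue_name)
        if job_id is None:
            return None
        return pickle.loads(redis.get('rq:job:%s' % job_id.decode()))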
This fixes #17.