177 Commits (d0cea42fbbf616806294d7b59889f2a8a95c8e98)

Author SHA1 Message Date
Cyrille Lavigne 6fc9454675
Handle deserializing failures gracefully (#1428)
* adds unit test for a deserialization error

This tests that deserialization exceptions are properly logged, and fails in
the manner described in #1422 .

* Catch deserializing errors in Worker.handle_exception()

This fixes #1422 , and makes

tests/test_worker.py::TestWorker::test_deserializing_failure_is_handled

pass.

* made unit test less specific

This is required to get the test to pass under other serializers / other
python versions.

* Added generic DeserializationError

* switched ValueError to DeserializationError in a test

The changed test is creating an invalid job, which now raises
DeserializationError when data is accessed, as opposed to ValueError.
4 years ago
Selwin Ong 5b5cfdf9ab
Jobs that get cleaned up should also be retried (#1467) 4 years ago
Omer Lachish 76ac0afbcd
Cleanup zombie worker leftovers as part of StartedJobRegistry's cleanup() (#1372)
* cleanup jobs that are not really running due to zombie workers

* remove registry entries for zombie jobs

* return only the job ids on cleanup

* test zombie job cleanup

* format code

* rename variable to explain that second element in tuple is expiry, not score

* remove worker_key

* detect zombie jobs using old heartbeats

* reuse get_expired_job_ids

* set score using current_timestamp

* test idle jobs using stale heartbeats

* extract timeout into variable

* move heartbeats into StartedJobRegistry

* use registry.heartbeat in tests

* remove heartbeats when job removed from StartedJobRegistry

* remove idle and expired jobs from both wip and heartbeats set

* send heartbeat_ttl to registry.add

* typo

* revert everything 😶

* only keep job heartbeats as score (and get rid of job timeouts as scores

* calculate heartbeat_ttl in an overrideable function + override it in SimpleWorker + move storing StartedJobRegistry scores to job.heartbeat()

* set heartbeat to monitoring interval for infinite timeouts

* track elapsed_execution_time as part of worker

* reset current job working time when work on a job is done

* persisting the job working time as part of monitoring
4 years ago
Biel Cardona 08ef54dcf4
Workers dequeuing jobs from queues using both Round-Robin and Random strategies (#1425)
* implemented round-robin and random access to queues

* added tests for RoundRobinQueue

* reverted change in gitignore

* removed linebreak

* added tests for random queues

* added documentation for round robin and random queues

* moved round robin strategy to worker

* reverted changes to queue.py

* reverted changes to workers.md

* reverted changes to test_queue

* added tests for RoundRobinWorker and RandomWorker

* added doc for round robin and random workers

* removed f-strings for backward compatibility

* corrected a mistake

* minor changes (code style)

* now using _ordered_queues instead of queues for reordering queues
4 years ago
Adda Satya Ram 11c8631921
Add exception to catch redis connection failure to retry after wait time (#1387)
* add exception catch for redis connection failure

* Add test for connection recovery

* add exponential backoff

* limit worker max connection wait time to 60 seconds

* fix undefined class variable

* fix string formatting issue while printing error log

* cap max connection wait time:better code style

Co-authored-by: corynezin <cory.nezin@gmail.com>
4 years ago
JackBoreczky 016da14723
Fix custom serializer in job fetches (#1381)
* Ensure that the custom serializer defined is passed into the job fetch calls

* add serializer as argument to fetch_many and dequeue_any methods

* add worker test for custom serializer

* move json serializer to serializers.py
4 years ago
Selwin Ong f3e924cdd1
Added job.worker_name (#1375)
* Added job.worker_name

* Fix compatibility with Redis server 3.x

* Document job.worker_name

* Removed some Python 2 compatibility stuff.

* Remove unused codes
4 years ago
Ruslan Mullakhmetov 9adcd7e50c
feat: avoided "zombie" processes after killing work horse (#1348)
* feat: avoided "zombie" processes after killing work horse by setting work horse process group and killing this group

* fixed tests

* tests: added test to check that all workhorse subprocesses are killed

* tests: updated guthub run tests dependencies since they are not using (dev-)requirements.txt

Co-authored-by: Ruslan Mullakhmetov <ruslan@twentythree.net>
4 years ago
Selwin Ong 01d71c8984
Fixes an issue where retried jobs should not be put in FailedJobRegistry (#1336) 4 years ago
Ruslan Mullakhmetov c2931b45b6
handled unhandled exceptions in horse (#1303)
* handled unhandled exceptions in horse to prevent a job from being silently dropped without going into FailedRegistry

* changes after review

* made sure that work_horse always terminates in a proper way with tests

* minor refactoring

* fix for failing test

* fixes for the other tests

- removed exception handling (done in monitor_work_horse)
- adjusted some tests for the checks that are not relevant anymore

* review suggested changes

* cleanup

Co-authored-by: Ruslan Mullakhmetov <ruslan@twentythree.net>
4 years ago
Selwin Ong 49b156ecc7
Job retry feature. Docs WIP (#1299)
* Initial implementation of Retry class

* Fixes job.refresh() under Python 3.5

* Remove the use of text_type in job.py

* Retry can be scheduled

* monitor_work_horse() should call handle_job_failure() with queue argument.

* Flake8 fixes

* Added docs for job retries
4 years ago
wevsty 4e1eb97056
Split kill_house() fix issues #1234 (#1300)
* Split kill_house() fix issues #1234

Details View issues #1234

* Removing the catch finally

* rename wait_horse() to wait_for_horse()

* rename wait_horse() to wait_for_horse()

* update test_handle_shutdown_request()

Change test_handle_shutdown_request() exitcode assert

* Restore kill_horse() output

* optimization wait_for_horse()
5 years ago
Selwin Ong 1d8ea8e7a3
Worker key TTLs are set to be a bit longer to account for system hiccups (#1279)
* Worker key TTLs are set to be a bit longer to account for system hiccups

* Fix test_work_horse_force_death
5 years ago
Babatunde Olusola e1cbc3736c
Implement Customizable Serializer Support (#1219)
* Implement Customizable Serializer Support

* Refractor serializer instance methods

* Update tests with other serializers

* Edit function description

* Edit function description

* Raise appropriate exception

* Update tests for better code coverage

* Remove un-used imports and un-necessary code

* Refractor resolve_serializer

* Remove un-necessary alias from imports

* Add documentation

* Refractor tests, improve documentation
5 years ago
Samuel Colvin 4036471203
fixing HerokuWorkerShutdownTestCase after #1194 (#1213) 5 years ago
mr-trouble 5f949f4cef Add a hard kill from the parent process with a 10% increased timeout … (#1169)
* Add a hard kill from the parent process with a 10% increased timeout in case the forked process gets stuck and cannot stop itself.

* Added test for the force kill of the parent process.

* Changed 10% to +1 second, and other misc changes based on review comments.
5 years ago
Selwin Ong baa0cc268a
Job scheduling (#1163)
* First RQScheduler prototype

* WIP job scheduling

* Fixed Python 2.7 tests

* Added ScheduledJobRegistry.get_scheduled_time(job)

* WIP on scheduler's threading mechanism

* Fixed test errors

* Changed scheduler.acquire_locks() to instance method

* Added scheduler.prepare_registries()

* Somewhat working implementation of RQ scheduler

* Only call stop_scheduler if there's a scheduler present

* Use OSError rather than ProcessLookupError for PyPy compatibility

* Added `auto_start` argument to scheduler.acquire_locks()

* Make RQScheduler play better with timezone

* Fixed test error

* Added --with-scheduler flag to rq worker CLI

* Fix tests on Python 2.x

* More Python 2 fixes

* Only call `scheduler.start` if worker is run in non burst mode

* Fixed an issue where running worker with scheduler would fail sometimes

* Make `worker.stop_scheduler()` more resilient to errors

* worker.dequeue_job_and_maintain_ttl() should also periodically run maintenance tasks

* Scheduler can now work with worker in both burst and non burst mode

* Fixed scheduler logging message

* Always log scheduler errors when running

* Improve scheduler error logging message

* Removed testing code

* Scheduler should periodically try to acquire locks for other queues it doesn't have

* Added tests for scheduler.should_reacquire_locks

* Added queue.enqueue_in()

* Fixes queue.enqueue_in() in Python 2.7

* First stab at documenting job scheduling

* Remove unused methods

* Remove Python 2.6 logging compatibility code

* Remove more unused imports

* Added convenience methods to access job registries from queue

* Added test for worker.run_maintenance_tasks()

* Simplify worker.queue_names() and worker.queue_keys()

* Updated changelog to mention RQ's new job scheduling mechanism.
5 years ago
Vladimir Protasov 8c34e2b353 Store worker's RQ and Python versions (#1125)
* Store worker version to Redis

* Store worker's Python version to Redis

* Store worker version in __init__ body as suggested in review
5 years ago
Vladimir Protasov b62b9b0727 Fix unreliable test (#1126)
Also make error message more useful in case of future failures.
5 years ago
Selwin Ong d1813cdff9 Fixed test errors caused by _sentry_trace_headers 6 years ago
Selwin Ong f9d42e8a17
Added logging statements to handle_job_success and handle_job_failure (#1112) 6 years ago
Paul Robertson e1c135d4de add the ability to have the worker stop executing after a max amount of jobs (#1094)
* add the ability to have the worker stop executing after a max amount of jobs

* rename to max-jobs

* updated logging messages
6 years ago
Ted Summer 79a6fd7999 Fix timeout adding job to StartedJobRegistry (#1086)
* Fix timeout adding job to StartedJobRegistry

* Fix prepare_job_execution handling neg timeout

* Add test for inf job timeout in StartedJobRegistry

* refactor(worker): simplify checking neg timeout
6 years ago
Selwin Ong c4cbb3af2f
RQ v1.0! (#1059)
* Added FailedJobRegistry.

* Added job.failure_ttl.

* queue.enqueue() now supports failure_ttl

* Added registry.get_queue().

* FailedJobRegistry.add() now assigns DEFAULT_FAILURE_TTL.

* StartedJobRegistry.cleanup() now moves expired jobs to FailedJobRegistry.

* Failed jobs are now added to FailedJobRegistry.

* Added FailedJobRegistry.requeue()

* Document the new `FailedJobRegistry` and changes in custom exception handler behavior.

* Added worker.disable_default_exception_handler.

* Document --disable-default-exception-handler option.

* Deleted worker.failed_queue.

* Deleted "move_to_failed_queue" exception handler.

* StartedJobRegistry should no longer move jobs to FailedQueue.

* Deleted requeue_job

* Fixed test error.

* Make requeue cli command work with FailedJobRegistry

* Added .pytest_cache to gitignore.

* Custom exception handlers are no longer run in reverse

* Restored requeue_job function

* Removed get_failed_queue

* Deleted FailedQueue

* Updated changelog.

* Document `failure_ttl`

* Updated docs.

* Remove job.status

* Fixed typo in test_registry.py

* Replaced _pipeline() with pipeline()

* FailedJobRegistry no longer fails on redis-py>=3

* Fixes test_clean_registries

* Worker names are now randomized

* Added a note about random worker names in CHANGES.md

* Worker will now stop working when encountering an unhandled exception.

* Worker should reraise SystemExit on cold shutdowns

* Added anchor.js to docs

* Support for Sentry-SDK (#1045)

* Updated RQ to support sentry-sdk

* Document Sentry integration

* Install sentry-sdk before running tests

* Improved rq info CLI command to be more efficient when displaying lar… (#1046)

* Improved rq info CLI command to be more efficient when displaying large number of workers

* Fixed an rq info --by-queue bug

* Fixed worker.total_working_time bug (#1047)

* queue.enqueue() no longer accepts `timeout` argument (#1055)

* Clean worker registry (#1056)

* queue.enqueue() no longer accepts `timeout` argument

* Added clean_worker_registry()

* Show worker hostname and PID on cli (#1058)

* Show worker hostname and PID on cli

* Improve test coverage

* Remove Redis version check when SSL is used

* Bump version to 1.0

* Removed pytest_cache/README.md

* Changed worker logging to use exc_info=True

* Removed unused queue.dequeue()

* Fixed typo in CHANGES.md

* setup_loghandlers() should always call logger.setLevel() if specified
6 years ago
Wolfgang Langner 8fc987dc68 Make logging in worker consitent. (#1030)
Switch some messages from warn to info because it is normal requested bahavior.
6 years ago
Finnci 14db0ecd26 Update/add flag for description logging (#991)
* test workers

* indent

* add docs and add option to the cli

* rename flag for cli

* logging
6 years ago
Samuel Colvin 2f35222ddb skip test_1_sec_shutdown with pypy (#1020)
* skip test_1_sec_shutdown with pypy, fix #1019

* skip all HerokuWorkerShutdownTestCase with pypy
6 years ago
Darshan Rai ada2ad03ca modify zadd calls for redis-py 3.0 (#1016)
* modify zadd calls for redis-py 3.0

redis-py 3.0 changes the zadd interface that accepts a single
mapping argument that is expected to be a dict.
https://github.com/andymccurdy/redis-py#mset-msetnx-and-zadd

* change FailedQueue.push_job_id to always push a str

redis-py 3.0 does not attempt to cast values to str and is left
to the user.

* remove Redis connection patching

Since in redis-py 3.0, Redis == StrictRedis class, we no longer
need to patch _zadd and other methods.
Ref: https://github.com/rq/rq/pull/1016#issuecomment-441010847
6 years ago
Selwin Ong 6559b0ffd7
Replace "timeout" argument in queue.enqueue() with "job_timeout" (#1010) 6 years ago
Selwin Ong ad66d872f0 Fixed a unicode test. 6 years ago
Selwin Ong 47d291771f
SimpleWorker's ttl must always be longer than jobs. (#1002) 6 years ago
Selwin Ong 531fde8e3c worker.main_work_horse should always return 0 7 years ago
Thomas Kriechbaumer 3133d94b58 add periodic worker heartbeats (#945)
* add periodic worker heartbeats

fixes #944

* improve worker default option handling
7 years ago
Selwin Ong 7a3c85f185
Added the ability to fetch workers by queue (#911)
* job.exc_info is now compressed.

* job.data is now stored in compressed format.

* Added worker_registration.unregister.

* Added worker_registration.get_keys().

* Modified Worker.all(), Worker.all_keys() and Worker.count() to accept "connection" and "queue" arguments.
7 years ago
Samuel Colvin df571e14fd improve logging in worker.py (#902)
* improve logging in worker

* tests for log_result_lifespan
7 years ago
Selwin Ong f500186f3d
Job compression (#907)
job.exc_info and job.data is now stored in compressed format in Redis.

* job.data is now stored in compressed format.
7 years ago
vanife ff36e0656e Fixed an issue where `birth` not present in Redis (#901)
* Fixed an issue where `birth` not present in Redis

Fixed an issue where worker.refresh() may fail if `birth` is not present in Redis

* added test coverage
7 years ago
Selwin Ong 7b9c3b6b66 Fixed an issue where worker.refresh() may fail if last_heartbeat is not present in Redis. 7 years ago
Selwin Ong 1d7b5e834b
Worker statistics (#897)
* First stab at implementing worker statistics.

* Moved worker data restoration logic to worker.refresh().

* Failed and successfull job counts are now properly incremented.

* Worker now keeps track of total_working_time

* Ensure job.ended_at is set in the case of unhandled job failure.

* handle_job_failure shouldn't crash if job.started_at is not present.
7 years ago
Samuel Colvin 260fd84f51 add milliseconds into timestamps, fix #721 7 years ago
Theo 261f4ac3d5 Fixed #866 - Flak8 errors 7 years ago
Theo 096c5ad3c2 Fixed #866 - Flak8 errors 7 years ago
Samuel Colvin 423da3683c remove python 2.6 support 7 years ago
Selwin Ong dc45ab8799 Worker.find_by_key should use hmget instead of repeated hget calls. (#826) 8 years ago
Peng Liu b7d4b4ec1b Solve the UnicodeDecodeError while decode literal things. (#817)
* Solve the UnicodeDecodeError while decode literal things.

* Add test case for when worker result is a unicode or str object that other than
pure ascii content.
8 years ago
Selwin Ong f6b4c286c9 Merge pull request #757 from jaywink/fix-unicode-decode-error
Fix UnicodeDecodeError when failing jobs
8 years ago
Samuel Colvin fd9babe8ce correct heroku worker exit logic
as per @Chronial's comment on b4b99f3
8 years ago
Jason Robinson 213969742e Fix UnicodeDecodeError when failing jobs
Worker handle_exception and move_to_failed_queue couldn't handle a situation where the exception raised had non-ascii characters. This caused a UnicodeDecodeError when trying to format the exception strings.

If on Python 2, ensure strings get decoded before building the exception string.

Closes #482
8 years ago
Selwin Ong 83007b2074 Merge pull request #786 from jezdez/backend-class-overrides
Allow passing backend classes from CLI and other APIs
8 years ago
Benjamin Root 30a7ab4899 Add similar test for when the job fails 8 years ago