Workers do not seem to run in parallel #148
Closed
You are correct: very few plugins have asyncio support right now. This is a work in progress, and I am more than happy to accept PRs to help with that, along with documentation improvements.
I think that, at a minimum, the documentation should explain this for WorkerPlugins, and it should make this clear in the section on converting worker plugins from v2 to v3.
I created two PRs: exif and opswat...
mlaferrera added two commits that referenced this issue on Jun 18, 2020
Describe the bug
Workers need to await a coroutine in order to run in parallel.
To Reproduce
I created a demo to illustrate what I am talking about: https://github.com/ytreister/stoq/tree/workers_in_parallel/demo
You can run scan.py, which shows the following:
All workers await a coroutine: (This is what I want it to do)
No workers await a coroutine: (This is not how I wanted it to work)
Notice how in the first case the workers run asynchronously, and in the second they run serially.
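The difference can be reproduced without stoQ at all. The following is a minimal sketch using plain asyncio; the worker names, delays, and the run_round helper are hypothetical stand-ins for illustration, not the code from the linked demo or stoQ's plugin API.

```python
import asyncio
import time


async def awaiting_worker(name, delay):
    # Awaiting asyncio.sleep yields control to the event loop,
    # so other workers can make progress in the meantime.
    await asyncio.sleep(delay)
    return name


async def blocking_worker(name, delay):
    # Declared with "async def", but time.sleep never yields,
    # so the event loop is stalled until this call returns.
    time.sleep(delay)
    return name


async def run_round(worker, count=3, delay=0.2):
    # Schedule all workers together and measure the wall-clock time.
    start = time.perf_counter()
    results = await asyncio.gather(
        *(worker(f"worker{i}", delay) for i in range(count))
    )
    return results, time.perf_counter() - start


def compare():
    # Returns (elapsed with awaiting workers, elapsed with blocking workers).
    _, t_awaiting = asyncio.run(run_round(awaiting_worker))
    _, t_blocking = asyncio.run(run_round(blocking_worker))
    return t_awaiting, t_blocking


if __name__ == "__main__":
    t_awaiting, t_blocking = compare()
    print(f"awaiting workers: {t_awaiting:.2f}s")
    print(f"blocking workers: {t_blocking:.2f}s")
```

With three workers, the awaiting round should take roughly one delay, while the blocking round should take roughly three, since each blocking call has to finish before the next worker starts.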
Expected behavior
I think the stoQ framework works as expected, but the documentation should explain this in more detail so that users can take full advantage of the async feature. I do not think any of the public stoq plugins are written so that they can run asynchronously.
Client:
Explanation
Once I converted and started running my plugins in stoQ 3.x, I inspected the logs and noticed that the plugins did not seem to run in parallel during a given round. My OPSWAT metadefender plugin was the dead giveaway because it would take up to 1 minute per file.
I created a demo to illustrate what happens when the plugins await a coroutine that takes some time to execute versus calling a regular function that takes some time to execute. As I expected, when awaiting a coroutine (in my demo this happens when I pass b'asyncio' as the payload), the worker plugins execute asynchronously. When my worker plugins do not await a coroutine, they basically run serially.
None of the stoq-plugins-public plugins seem to await a coroutine. For example, the public opswat.py plugin should maybe use one of the techniques described here:
https://stackoverflow.com/questions/22190403/how-could-i-use-requests-in-asyncio
so that the framework does not have to wait for the scan to complete before other worker plugins begin their scan.
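One technique for this is to hand the blocking call to a thread pool with loop.run_in_executor. Below is a rough sketch of that idea, assuming a hypothetical blocking_scan function that stands in for a blocking HTTP request; it is not the actual opswat.py code, and the function names are made up for illustration.

```python
import asyncio
import time


def blocking_scan(payload: bytes) -> dict:
    # Hypothetical stand-in for a blocking HTTP call (for example, one made
    # with the requests library). It sleeps to simulate a slow remote scan.
    time.sleep(0.2)
    return {"size": len(payload)}


async def scan(payload: bytes) -> dict:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor, so the
    # blocking call runs in a worker thread while the event loop stays free.
    return await loop.run_in_executor(None, blocking_scan, payload)


async def scan_many(payloads):
    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(scan(p) for p in payloads))


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(scan_many([b"a", b"bb", b"ccc"]))
    print(results, f"{time.perf_counter() - start:.2f}s")
```

The trade-off is that each in-flight scan occupies a thread, so this suits a modest number of slow calls. An async-native HTTP client would avoid the threads but requires rewriting the request code.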
Another example might be a plugin that runs some local command, such as exif.py or trid.py. These both call subprocess, so perhaps something from here would help: https://docs.python.org/3/library/asyncio-subprocess.html
It might not take very long for these functions to execute, but I thought one of the main reasons to use asyncio is so that we do not have to wait for other unrelated workers to run.
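For the subprocess case, a sketch along these lines might work. It uses asyncio.create_subprocess_exec from the standard library; the run_command helper is hypothetical, and the command is a stand-in that invokes the Python interpreter rather than the real tools those plugins call.

```python
import asyncio
import sys


async def run_command(*args: str) -> str:
    # Start the child process without blocking the event loop.
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    # communicate() waits for the process to exit and collects its output,
    # yielding to other tasks while it waits.
    stdout, stderr = await proc.communicate()
    if proc.returncode != 0:
        message = stderr.decode(errors="replace")
        raise RuntimeError(f"{args[0]} exited with {proc.returncode}: {message}")
    return stdout.decode(errors="replace")


async def main():
    # Stand-in command: a real plugin would pass its own executable and
    # arguments here instead of the Python interpreter.
    cmd = (sys.executable, "-c", "print('ok')")
    return await asyncio.gather(run_command(*cmd), run_command(*cmd))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Both commands are started before either one is waited on, so a slow command does not hold up the other.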
For me, my plugins either
I think there should at least be guidance on how to set up each of the above types of workers so that they can run asynchronously.