Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: initial async generator support #1560

Merged
merged 8 commits into from Aug 5, 2018
Merged

feat: initial async generator support #1560

merged 8 commits into from Aug 5, 2018

Conversation

@aearly
Copy link
Collaborator

@aearly aearly commented Jul 10, 2018

Closes #1551

We have to create a special eachOfLimit implementation for this, but once fully tested, all collections methods should fall into place.

@aearly aearly requested review from beaugunderson and megawac Jul 10, 2018
@aearly aearly added this to the 3.0 milestone Jul 10, 2018
@aearly aearly added the feature label Jul 10, 2018
running -= 1;
if (err) return handleError(err)

if (err === false) {
Copy link
Collaborator

@megawac megawac Jul 10, 2018

Can we avoid calling replenish int his case and go straight to calling the complete callback?

Copy link
Collaborator Author

@aearly aearly Jul 10, 2018

You're right, I should return in this block.


function replenish() {
//console.log('replenish')
if (running >= limit || awaiting) return
Copy link
Collaborator

@megawac megawac Jul 10, 2018

How do you end up with more than one generator running asynchronously with this code? If I understand the code correctly, replenish only gets called

  1. to spawn the generator when asyncEachOfLimit is called
  2. whenever a generator completes
  3. after iterateeCallback is called

Do we not need to loop limit - running inside of replenish to call several generators at once?

Copy link
Collaborator Author

@aearly aearly Jul 10, 2018

You'd only end up with more than one iteratee running if your iteratee takes longer to process than your generator takes to generate. For example, in the tests, I have the generator ticking every 1 ms, but have the iteratee take 5 ms.

Async generators are like streams -- you get a linear sequence of items. You can't await multiple items at the same time (and if you did, you'd get the same item multiple times, as you do when you call .then() on the same promise multiple times). This is why we can't loop in replenish -- we'd end up calling the iteratee with the same item multiple times.

Copy link
Collaborator

@megawac megawac Jul 10, 2018

I see thanks for explaining, I haven't used generators before.

So for my understanding, the replenish inside of the generator's then function is to spawn the asynchronous processes up until limit are running and the replenish inside of the iterateeCallback is to trigger additional items to run if the limit was reached?

Copy link
Collaborator

@megawac megawac Jul 10, 2018

You can't await multiple items at the same time (and if you did, you'd get the same item multiple times...)

Are you sure of this, as is right now we have the possibility of calling next multiple times before the promise resolves if iteratee is synchronous. In such a case will things break?

Copy link
Collaborator

@hargasinski hargasinski Jul 10, 2018

This is why we can't loop in replenish -- we'd end up calling the iteratee with the same item multiple times.

Sorry, I may be misunderstanding something, but I thought calls to next are queued and resolved separately. I'm looking at the Async iterators and async iterables section of the proposal.

That being said, I think we should still probably avoid looping in replenish as we still have to wait for earlier promises to resolve.

Copy link
Collaborator Author

@aearly aearly Jul 11, 2018

Ah, you're right, I was confusing .next() with .then(). We can call .next() as many times as we need to, perhaps up to the concurrency limit.

if (running >= limit || awaiting) return
//console.log('replenish awaiting')
awaiting = true
generator.next().then(({value, done: iterDone}) => {
Copy link
Collaborator

@megawac megawac Jul 10, 2018

Can you have a generator that has no items?

Copy link
Collaborator

@megawac megawac Jul 10, 2018

Nevermind dumb question, I see how this works

//console.log('done')
return callback(null);
}
replenish()
Copy link
Collaborator

@megawac megawac Jul 10, 2018

I think you have to avoid calling replenish when done === true or have replenish check if done === true.

Lets say hypothetically we have a limit of 2 and the final 2 iteratees are running. When one of these iteratees resolves, running = 1 and done = true but we'll still replenish and call the generator.next().then(). This has the possibility of entering a race condition, if the other iteratee resolves in the time between the then callback is called, you can actually end up calling callback twice

if (running >= limit || awaiting) return
//console.log('replenish awaiting')
awaiting = true
generator.next().then(({value, done: iterDone}) => {
Copy link
Collaborator

@hargasinski hargasinski Jul 10, 2018

Similar to what @megawac was saying, I think we need a if (done) return; line at the very top of this callback. If one of the previous iteratees cancelled or errored, we shouldn't be invoking the iteratee with successive items.

For example, if the first iteratee starts and we start waiting for the second item. If the first iteratee cancels before the then callback is called with the second item, when it is called, we would still invoke the iteratee with the second item.

@@ -15,6 +17,9 @@ export default (limit) => {
if (!obj) {
return callback(null);
}
if (isAsyncGenerator(obj)) {
Copy link
Collaborator

@hargasinski hargasinski Jul 11, 2018

This is a minor thing, but should this be expanded to support async iterators in general? Something similar to how we currently have getIterator except with Symbol.asyncIterator.

Copy link
Collaborator Author

@aearly aearly Jul 11, 2018

Ah, good point, didnt know about that detail of the spec.

replenish()
}

function handleError(err) {

This comment has been hidden.


function replenish() {
//console.log('replenish')
if (running >= limit || awaiting || done) return
Copy link
Collaborator Author

@aearly aearly Jul 11, 2018

BTW, I experimented with calling replenish() multiple times, awaiting mutiple .next()s at the same time, but there's some grey area in how it could work. Since you don't know the length of the iterator up front, you have to set some arbitrary limits in the pure parallel case, e.g. 10. Whenever you call .next() N times synchronously, you end up calling next() and receiving {done: true} N times spuriously at the end. We don't need to await multiple .next()s at the same time, because as far as I know, there will be a strict linear sequence of yields in the generator, meaning it's pointless to await more than one. It leaves the job of backpressure to the generator implementation too.

@megawac
Copy link
Collaborator

@megawac megawac commented Jul 22, 2018

@aearly can we update our eslint config to be consistent on semi colons please

@aearly
Copy link
Collaborator Author

@aearly aearly commented Jul 30, 2018

Okay, I've added support for async iterables. Implementation was pretty simple, hard part is getting all the various babel stuff to be happy.

@aearly
Copy link
Collaborator Author

@aearly aearly commented Jul 30, 2018

Also, I'd want to do a linter change in a separate PR. Going to be a big ugly diff if we switch.

@aearly aearly changed the title [wip] initial async generator support feat: initial async generator support Aug 5, 2018
@aearly aearly merged commit 61268c5 into master Aug 5, 2018
3 checks passed
@aearly aearly deleted the async-generator-support branch Aug 5, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants