Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Store SnapShot method does not buffer read the database which could be an issue if database size is large #273

Open
JeanGolang opened this issue Feb 7, 2017 · 6 comments

Comments

@JeanGolang
Copy link

@JeanGolang JeanGolang commented Feb 7, 2017

Hello Philip!
I think there is an issue with this part of the code of rqlite (store/store.go).

func (s *Store) Database(leader bool) ([]byte, error) {
	if leader && s.raft.State() != raft.Leader {
		return nil, ErrNotLeader
	}
	// Ensure only one snapshot can take place at once, and block all queries.
	s.mu.Lock()
	defer s.mu.Unlock()

	f, err := ioutil.TempFile("", "rqlilte-snap-")
	if err != nil {
		return nil, err
	}
	f.Close()
	defer os.Remove(f.Name())

	if err := s.db.Backup(f.Name()); err != nil {
		return nil, err
	}

	return ioutil.ReadFile(f.Name())
}

As far as I understood from your code it's a support function for creating a snapshot of the database. However, this implementation reads the whole database into memory, in your particular case in the variable fsm.database, err = s.Database(false), so if the size of your database is larger than RAM available, I guess the program would crash.

I guess it's better to read the database in buffers or send chunks of the database via a channel when writing it to the so called SnapshotSink.

Thanks!

@JeanGolang JeanGolang changed the title Store SnapShot method does not buffer read the database Store SnapShot method does not buffer read the database which could be an issue if database size is large Feb 7, 2017
@otoolep
Copy link
Member

@otoolep otoolep commented Feb 7, 2017

Thanks -- it's a weakness, I can look into addressing.

@JeanGolang
Copy link
Author

@JeanGolang JeanGolang commented Feb 14, 2017

Hi again, Philip!
Do you happen to know how to fix this problem? I guess it's a "bug" in hashicorp/raft implementation.
Because as far as I understood, hashicorp/raft doesn't call Snapshot() and Persist() concurrently, so I don't know of a way to communicate chunks of the database between Snapshot() and Persist() - do you have an idea how to do this?

@otoolep
Copy link
Member

@otoolep otoolep commented Feb 14, 2017

@otoolep
Copy link
Member

@otoolep otoolep commented Feb 14, 2017

@JeanGolang
Copy link
Author

@JeanGolang JeanGolang commented Feb 28, 2017

otoolep, thanks a lot! I've sorted it out with the raft library authors - here
I guess this could be closed, as if it's more of the raft issue than an issue with your package.

@otoolep
Copy link
Member

@otoolep otoolep commented Feb 28, 2017

Thank you @JeanGolang -- I will look into using the new interface exposed by the Raft package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
2 participants