Repository Liberation #58
Comments
Agreed. If you're running against an 11.10.34x release, the [...] Let me think about how best to document this under https://github.com/github/backup-utils#backup-snapshot-file-structure. I do think it's worth pointing out and is something we plan to retain moving forward. |
Neato! |
@rtomayko How does this change with the new repository format? Given that the repositories are now rather abstractly named and share files among forks. |
@xeago Good catch. This is definitely more complicated with the changes in GHE 2.2 but still very possible. I'd like to provide some new tools with backup-utils to make this easier. My sense is there are two cases worth thinking about.
1. Grabbing a copy of a single repository
Useful when you need to pull a deleted repository from a historical snapshot or when some kind of corruption has ruined an active copy. This can be accomplished via a simple
Using
The hard part in this scenario is obtaining the path of the repository in the snapshot, since repositories are no longer named on disk in simple "user/repo.git" format. There is an "info/nwo" file in each repository directory with the "user/repo" name, which could be used for this purpose. We've also been considering creating a hierarchy of symlinks on the instance to map "user/repo" names to their numeric repository directories. If we included this in the backup, locating repositories by name would be quite a bit easier.
2. Exporting all repositories in user/repo format
This is closer in spirit to the issue's original request. Given a hierarchy of repositories stored in the new filesystem layout, produce a "liberated" copy: something you could throw into pretty much any git server environment. This could be accomplished by applying the solution for 1) to all repositories present in the backup, but I think it'd be worth considering some optimizations here because cloning each repository would both take a long time and require a large amount of disk space.
With these optimizations, we should be able to reconstruct the GHE <= v2.1 repository backup structure on demand fairly quickly and without requiring 2x or more disk space, assuming the export is written to the same volume. |
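As a rough illustration of the "info/nwo" lookup described in the comment above, here is a minimal sketch that walks a snapshot directory and builds a "user/repo" to path index. The function name `index_repositories_by_name` and the idea of scanning an arbitrary root are assumptions made for illustration; the only detail taken from the thread is that each repository directory holds an "info/nwo" file containing its "user/repo" name. It is not part of backup-utils.

```python
import os


def index_repositories_by_name(snapshot_root):
    """Map "user/repo" names to repository directories found under snapshot_root.

    Hypothetical helper: a directory counts as a repository if it contains
    an info/nwo file, as described in the thread.
    """
    index = {}
    for dirpath, dirnames, _filenames in os.walk(snapshot_root):
        nwo_path = os.path.join(dirpath, "info", "nwo")
        if os.path.isfile(nwo_path):
            with open(nwo_path, encoding="utf-8") as handle:
                name = handle.read().strip()
            if name:
                index[name] = dirpath
            # A repository directory was found; do not descend into it.
            dirnames[:] = []
    return index
```

Looking up a single repository is then a dictionary access, for example `index_repositories_by_name(root).get("user/repo")`, where `root` is whatever directory of the snapshot holds the repositories.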
Any news on this? |
@azzlack I'm sorry I missed this. It's now on my radar. There are no plans to work on this in the near future but I'll discuss that with the team. |
* Use uuid hostnames for dgit, alambic, pages and gists replicas if uuids are configured
* Move cluster hostname detection into ghe-cluster-hostnames
It'd be nice if this tool supported a liberation mode wherein the entirety of the repos stored in the backups could be made into a collection of bare repositories organized by owner.
I realize that this would mean that the user may no longer be a customer, but I think it'd be a remarkable good-will gesture.
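One way the requested liberation mode could look is sketched below, under stated assumptions: the caller already has a mapping of "user/repo" names to repository directories in a backup, and a plain `git clone --bare` of each directory yields a usable standalone copy. Neither assumption is confirmed by the thread, and the function names `plan_liberation` and `liberate` are hypothetical. Cloning every repository is the simple but slow and disk-hungry approach, not an optimized one.

```python
import os
import subprocess


def plan_liberation(index, export_root):
    """Return (source, destination) pairs laying repositories out as owner/repo.git."""
    plan = []
    for name, source in sorted(index.items()):
        parts = name.split("/")
        # Skip anything that is not exactly "owner/repo", so nothing is
        # written outside export_root.
        if len(parts) != 2 or any(p in ("", ".", "..") for p in parts):
            continue
        owner, repo = parts
        plan.append((source, os.path.join(export_root, owner, repo + ".git")))
    return plan


def liberate(index, export_root):
    """Clone each planned repository as a bare repository (requires git on PATH)."""
    for source, destination in plan_liberation(index, export_root):
        os.makedirs(os.path.dirname(destination), exist_ok=True)
        subprocess.run(["git", "clone", "--bare", source, destination], check=True)
```

Keeping the planning step separate from the cloning step makes it possible to review the resulting owner/repo.git layout before any data is copied.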