[ Sorry, this is a long one! Read it only if you're interested. ;) ]

On Tuesday, 09.12.2008, at 20:56 +0100, David Paleino wrote:
> On Tue, 09 Dec 2008 14:48:12 +0100, Manuel Prinz wrote:
> >  * Excellent branch/merge support. That's fast and cheap and allows
> >    one to have several independent branches to work on. This is
> >    especially handy if you need to touch upstream's code, as it
> >    makes changes more visible than a huge quilt-patch.
>
> So, consider a package I'm team-maintaining in Debian, john. I'm currently
> applying several patches, something like 17 or so. Should I keep 17 separate
> branches for this? And how to handle patches touching the same files but in
> different points? (i.e. what the "series", "00list" and kinda files do,
> establishing patch applying order)

It depends on the workflow you choose. Indeed, having 17 branches sounds
scary, but it has some advantages. One is that the branches are distinct,
so the commits on a branch belong to the one bugfix or feature you develop
there. It also lets you get rid of a branch as soon as its changes are
integrated upstream. And if the changes are split into several commits
(for a reason), it is easy to send all of them upstream without having to
pick the right commits out of the commit log. (The alternative is to send
one large chunk, which in my experience is not always welcomed by
upstream.)

The obvious drawback is the integration work you need to do for every new
upstream release. You would use an integration branch into which you first
merge upstream, followed by all feature branches. Currently, this can only
be done semi-automatically. Of course, merge conflicts may happen, as you
mentioned, when branches touch the same file. These just need to be
resolved. But this is pretty much the same thing you do with a patch
series: You flatten the hierarchy, so that each patch depends on the ones
before it.
The drawback of this is that you can't just take a patch out of the
series and apply it to the source if it modifies a file that was already
modified earlier or will be modified later. Resolving merge conflicts from
branch merges is no different from flattening patch sets with quilt; but
a branch has the advantage that it does not depend on other branches,
which a patch set can't provide.

Nevertheless, you can of course use a model in which you have one branch
in which you do all your changes to the upstream code. This is pretty much
like the SVN workflow. It's doable with Git but IMHO has drawbacks.

To mention pkg-vcs again: The aim is to experiment with ways to ease
cross-distro development, and DVCS are a big part of that. Think of having
a fix for a really nasty bug in your real-nasty-bug-fix branch: If you keep
it distinct, other distros (like Gentoo or Fedora or whatever) can
cherry-pick the changes in that branch and apply them to their
distribution. (Or the other way round: Debian can profit from them.) Of
course this is quite hypothetical at the moment, since not everyone uses
Git and there is no infrastructure that supports the exchange, but the
situation is not too bad. If you have a flattened patch in a quilt series,
grabbing it and integrating it into a different distribution may be hard
or even impossible. So maintaining branches might benefit the whole FS
community. We already waste too much time fixing the same stuff in every
distribution. Making exchange easier is IMHO worth achieving.

> >  * Support for cherry-picking. Importing changes from other
> >    developers easily can be very handy at times. (I sometimes even
> >    cherry-pick from my own branches.)
>
> What do you mean by "importing changes from other developers"?
> I know that this might be related to the concept of DVCS, but isn't this prone
> to errors on merge/push?

No. Git uses the repository content to define its state. SVN uses a
revision number.
Let's assume you apply commits in a different order than upstream or
another developer: everyone still has their repo in the same state, since
the content is identical. This is usually no problem with Git at all; but
it sometimes is with SVN, since r1234 in your repo might not be equal to
r1234 in the repo of a different developer.

> >  * With TopGit: Automatic patch generation from branches. No real
> >    need to update a series of quilt patches, one can auto-create them.
> >    (This also has the nice side effect that it does not clutter the
> >    commit diff.)
>
> Read my consideration at point 1 :)
> However, I must admit I don't know what TopGit is, /me looks info on it.

It's still under development but rocks already. I do not use it in
production yet, though. There's a lot of experimenting with it going on
on pkg-vcs.

> >  * "All in one" repo: Upstream code + Debian packaging as separate
> >    branch. With pristine-tar, I can recreate the upstream tarball
> >    when not available. svn-bp just does not work if you're on the
> >    road and don't have the tarball. (Keeping all tarballs for all
> >    versions costs disk space as well, which is limited on my EEE.)
>
> I've had problems in the past with pristine-tar, but that was probably due to
> my lack of "experience" with git. (and no, don't ask me what, I don't really
> remember which problems I had :( )

When using "git-import-orig --pristine-tar" one should be fine in most
cases. It might be confusing that the content is stored in a branch that
is decoupled from the rest of the repo. Git allows that, SVN doesn't, and
this was quite confusing for me as well when I switched.

> About "not having the tarball", that's what "get-orig-source" targets are for.

... if a) this optional target exists, b) you have an internet connection,
and c) you just want to get the latest version. I have an upstream that
removes old versions when providing new ones. No chance to ever get them
again.
(And mailing upstream about old versions is not very comfortable,
really.)

> Also, if you keep the upstream code in the same repository, how can you
> "checkout" different versions?

Yes, of course you can. And pristine-tar enables you to recreate, from
the upstream sources, a tarball that is bit-identical to the one provided
by upstream.

> Using different tags?

Yes, tags are the way to mark certain points in history, like the import
of new upstream sources or a Debian upload.

> I admit that, if everything is done via tags and diffs between
> revisions, you could save lot of space.

That's how it works. (On most setups, disk space is rarely an issue; but
on others, it really is. And then one is quite thankful to be able to
work nevertheless.)

> But I can't really see why one should work on different versions, at
> least more than two -- so the "disk space cost" argument isn't really a point,
> IMHO)

Well, consider that a year after the release of Lenny, you have to fix a
security issue in your package, but upstream has released several new
versions which are already in unstable. No big deal though, because you
can branch from the version in stable and merge the fix into that branch,
build and upload. In case you lost the upstream tarball, you can simply
recreate it. (Yes, well, you can also fetch it from the archive. You got
me there. Anyway, you still need to have your mirror around or an internet
connection.)

When packaging diverges, as in backports or security fixes, I think Git
can deal with these situations better. I have not been in this situation
myself, so this is just speculation. Dealing with diverging branches is
what DVCS were developed for.

> >  * Offline commits. This is my personal favorite since I'm quite
> >    often lacking an internet connection. Committing regularly is
> >    IMHO important, especially if a revert is needed. (I used SVK
> >    for years but was never happy with it; merge conflicts were just
> >    a pain back then with SVN.
> >    Heard the new SVN version fixes it
> >    somewhat. But SVK was an improvement in that point.)
>
> This is a big plus for git -- I hate having to connect to commit my changes.
> But, once again, how are merges handled? And if two developers change the same
> file in more-or-less the same point? Is git smart enough to handle this?

No. And that's the beauty of it. ;)

> >  * Git is stupid. It means that it does not try to guess anything
> >    if it can't decide what to do. The user has to be explicit in what
> >    he wants to do; you (almost) never get any "smart" behavior from
> >    Git. There is no unexpected behavior since you request Git to do
> >    what it should do. (I do not like unexpected behavior due to
> >    "smart" software. That's probably more a personal point.)
>
> Err... I can't get it :)
> I once read that "git is like the factory giving you the pieces and the
> instructions sheet, and letting you build your own airplane", while "svn is a
> full featured airline that is more-or-less good for all pilots".

I would not put it this way. To me it's more like a full-featured airline
that allows pilots to add their own favorite aircraft and flight plan. Or
something like that.

What I wanted to say in my last mail was: If Git can't resolve a merge
conflict, it does not even try to do so. It just tells you about it and
leaves you with the pieces. You then have a look at it with your favorite
merge tool (or whatever you prefer to use); Git expects you to resolve the
conflict and to tell it that it can happily go on merging. It may sound
like work, but it really is the best way to handle it, since in most cases
brains are superior to algorithms. Git leaves conflicts to the brains,
since developers are expected to know best how to handle them.

> >  * Import and export of patches from and to email. This is just
> >    great: Take a diff and export it to an email, add a comment and
> >    send it. Other Git users can simply apply it to their repo from
> >    their mail client.
> >    (Or save the mail to a file and apply that
> >    with the Git tools.) Once used to it, it's very handy and makes
> >    exchanging patches really easy.
>
> This is fine, but it seems like we're having this functionality with "svn
> diff". Just a matter of attaching/including that output to a mail ;)

Well, sure. But it's more complicated. I do not see it as *the* feature of
Git, but it is useful. If you can save 2 minutes sending such a patch and
you have 10 patches, just do the maths. ;)

> > As an example: In maintaining Open MPI, we have currently several issues
> > to solve. With different branches, I can address all of them separately
> > when I do find the time. Hacking on all of them in the trunk clutters
> > history a lot and causes confusion.
>
> Fine. And (repeating my previous question) what if two different bugs involve
> the same code at, let's say, the same function, and you solve them in two
> different incompatible ways? The various branches will work separately, but
> once merged, you'd get unexpected behaviour

No, you won't, because it will lead to a merge conflict then. You need to
resolve this, so it's totally expected behavior, since you were the one
dealing with it. This is like flattening quilt patches: You'd get a
conflict, you do some editing, and refresh the patch. Same thing here.
Everything that does some auto-merging or auto-conflict-resolution in this
case is IMHO broken by design. As long as I have full control over how to
handle problems, I'd call this "expected behavior". Merge resolutions are
also commits in Git, so they are documented in the history log.

One other point I forgot, because it's too much of a habit now: Every Git
repo is a full repository. It can easily be backed up. In fact, if you
have your repo online, you do have a backup, since you can just clone it.
Or clone the repository of a fellow developer. They're all alike.
(Ignoring local changes.)
This is surely a really nice thing: If you destroy your repo, just get it
back from somewhere.

I hope I could shed some light on why I like Git. If you have questions,
please ask. And I do not want to get everyone to use Git. Everyone should
use what they feel comfortable with. For me, switching saved time and
headaches. I can't guarantee that for everyone, though. ;)

Best regards
Manuel
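
P.S. A minimal sketch of the point above that Git defines state by content
rather than by a revision number. Only blob_id follows Git's actual object
hashing (SHA-1 over a "blob <size>\0" header plus the file bytes).
snapshot_id is a simplified stand-in I made up for illustration and is not
Git's real tree format; the file names and the developers "alice" and
"bob" are hypothetical as well.

```python
import hashlib


def blob_id(data):
    """ID Git gives to file content: SHA-1 of 'blob <size>\\0' + bytes."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def snapshot_id(files):
    """Simplified stand-in for a tree ID (NOT Git's real tree format).

    Hashes the sorted (path, blob ID) pairs, so the result depends only
    on what the files contain, not on how or when they got there.
    """
    h = hashlib.sha1()
    for path in sorted(files):
        h.update(path.encode("utf-8") + b"\0")
        h.update(blob_id(files[path]).encode("ascii") + b"\n")
    return h.hexdigest()


# Hypothetical example data: a base snapshot and two independent changes.
base = {"README": b"john\n", "src/main.c": b"int main(void) { return 0; }\n"}
fix_a = {"README": b"john the ripper\n"}
fix_b = {"src/util.c": b"/* helper */\n"}

# Two developers apply the same changes in opposite order.
alice = {**base, **fix_a, **fix_b}
bob = {**base, **fix_b, **fix_a}

print(snapshot_id(alice) == snapshot_id(bob))   # same content, same ID
print(snapshot_id(alice) == snapshot_id(base))  # different content
```

Note that in real Git a commit ID also covers the parent commits and the
commit metadata, so commit IDs do differ when the order differs. What
matches is the ID of the content itself.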