Unless there's something new in the packager that I haven't seen yet, using d.o pulls in production bypasses the packager. That is, you're then missing:
- The full version information in the .info file, which is used by update manager.
- The LICENSE.txt file that every module is supposed to have.
Is that no longer the case? I'm pretty sure both of those still only happen with a tarball, so if you want those (and I do) then you need to use a tarball.
Also, if you want to manage both core and contrib modules that way, it means you are now using git submodules, which AFAIK are generally agreed to suck, or complex sub-tree merging that is out of reach of 99% of developers. Hell, I've done it and I don't want to do it. :-)
Shallow clones are fine for reducing disk usage, certainly. But there are workflow considerations there that I don't believe Git solves (at least not yet). If I'm not doing site-specific or company-specific branches of core or modules (that is, hacking core or hacking modules, which is a no-no in 95% of cases), then the extra patch-level control that the more complex all-Git approach would allow is useless because I'm not even using it.
I'm not saying there are no use cases for an all-git-all-the-time site building process, just that it has implications that you're glossing over in return for a benefit that the majority of use cases don't even need.
--Larry Garfield
On 3/1/11 12:38 PM, Sam Boyer wrote:
On 3/1/11 8:13 AM, larry@garfieldtech.com wrote:
I think the question is more about non-custom dev history; there's little need for a client site to have the complete development history of Drupal 4.3 in its repo, for instance.
So you do a shallow clone that skips irrelevant branches and only grabs recent history on the ones you want, that's fine.
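For illustration, here is a minimal sketch of the kind of shallow, single-branch clone described above. It only assembles the command line and does not run it. The repository URL and branch name are hypothetical placeholders, and the `--single-branch` flag assumes a reasonably current Git (older releases may not have it).

```python
from typing import List


def shallow_clone_command(url: str, branch: str, depth: int = 1) -> List[str]:
    """Build a `git clone` command that fetches one branch with truncated history."""
    if depth < 1:
        raise ValueError("depth must be at least 1")
    return [
        "git", "clone",
        "--depth", str(depth),   # keep only the most recent `depth` commits
        "--branch", branch,      # check out this branch instead of the default
        "--single-branch",       # skip history for every other branch
        url,
    ]


if __name__ == "__main__":
    # Hypothetical repository URL and branch name, for illustration only.
    cmd = shallow_clone_command("https://example.com/project/drupal.git", "7.x", depth=50)
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run` as-is. A larger `depth` keeps more history available for merging at the cost of a bigger download.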
Lately, what I've been doing/advocating is using Drush and real releases to download stuff from Drupal.org (core, contrib modules, etc.) and then checking the whole site into Git. If I update a module, I use Drush for that and then update the code in my Git repo. Then deploy to production using *my* git repo (which has my full dev history but not every commit in every one of my projects ever) and tags.
...which is *exactly* what I'm saying is pointless. Why stick a stupider intermediary - tarballs - into a system that's already highly capable of doing patch & vendor management? The only thing you've accomplished is diluting the capabilities of your version control system to manage upstream changes.
That keeps me on real releases, avoids unnecessary repository bloat, but still gives me a full history of all work on that project specifically.
"Unnecessary repository bloat?" Two great words there, let's address each one:
"Unnecessary": well, the full branch history is a requirement if you want to use git's smart merging algorithms. So the only way it's "unnecessary" is if you prefer manually hauling chunks out of patch-generated .rej and .orig files.
"Bloat": Really, step back and think about this. Are you solving a real, compelling problem faced by most modern servers? How much does it matter that your Drupal tree is, say, 700MB instead of 70MB? It really doesn't. Not even on shared hosting. And, let's not forget - judicious use of shallow clones & compression whittles that number way, WAY down. IMO, ripping out the vendor history is something a lot of us got in the habit of doing because we were used to having CVS vendor data that earned us nothing but headaches, and it was an easy "optimization" that made our Drupal trees feel more svelte.
Well, now it does get you something. It gets you a _ton_. Now, all you need for company-specific or site-specific customizations that can easily coexist with rich vendor data is some branch naming conventions and practice with reading git logs. Yeah, that takes some learning too, but it's worth it.
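As a rough sketch of what "reading git logs" against a naming convention could look like: the helper below builds a `git log` command that lists only the commits on a site-specific branch that are not in the upstream (vendor) ref. The ref names used here are hypothetical examples of a convention, not names used on drupal.org.

```python
from typing import List


def site_commits_command(vendor_ref: str, site_branch: str) -> List[str]:
    """Build a `git log` command listing commits on site_branch that vendor_ref lacks."""
    # The two-dot range A..B selects commits reachable from B but not from A.
    return ["git", "log", "--oneline", f"{vendor_ref}..{site_branch}"]


if __name__ == "__main__":
    # Hypothetical upstream ref and site branch, for illustration only.
    print(" ".join(site_commits_command("7.x", "client/acme")))
```

Run inside a clone that has both refs, the command shows the local customizations sitting on top of upstream, one commit per line.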
cheers s
--Larry Garfield
On 3/1/11 1:56 AM, Sam Boyer wrote:
I tend to advocate full clone. You're talking about a task that version control is designed for. Now that we've made the switch, IMO native code : Git :: bytecode : another VCS, or worse, patch stacks, etc. I don't know what drush did before to "make this easy" - maybe pop off patch stacks, update the module, then re-apply the patches? Fact is, though, nothing Drush could have done under CVS can compare to patching with native Git commits: your patches can speak the same language as upstream changes, and you have all of Git's merge & rebase behavior at your fingertips to reconcile them.
There are some occasional exceptions to this, but I really do think it's a bit daft not to keep the full history. Keeping that history means peace of mind that your patches (now commits) can be intelligently merged with all changes ever made to that module for all time, across new versions, across Drupal major versions...blah blah blah. Trading a few hundred MB of disk space for that is MORE than worth it.
cheers s
On 2/28/11 10:56 AM, Marco Carbone wrote:
Since a Git clone downloads the entire Drupal repository, the Drupal codebase is no longer so lightweight (~50MB) if you are using Git, especially if you clone contrib module repositories as well.
With CVS, our usual practice with clients was to check out core and contrib using CVS, so that we could easily monitor any patches that had been applied and they wouldn't be lost when updating to newer releases. (Drush makes this particularly easy.) This is doable with Git as well, but now there seems to be the added cost of having to download the full repository. This is great when doing core/contrib development, but not really necessary for client work. This is unavoidable as far as I can tell, but I don't think I'm satisfied with the "just use a tarball and don't hack core/contrib" solution, especially when patches come into play.
Is there something I'm missing/not understanding here, or does one just have to accept the price of a bigger codebase when using Git to manage core/contrib code? Or is managing core/contrib code this way passé now that updates can be done through the UI?
-marco