[development] Git best practices for client codebases

Sam Boyer drupal at samboyer.org
Tue Mar 1 18:38:13 UTC 2011

On 3/1/11 8:13 AM, larry at garfieldtech.com wrote:
> I think the question is more about non-custom dev history; there's
> little need for a client site to have the complete development history
> of Drupal 4.3 in its repo, for instance.

So you do a shallow clone that skips irrelevant branches and only grabs
recent history on the ones you want. That's fine.
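
For anyone who hasn't done one, here is a rough sketch of that kind of
shallow clone, written as a small Python helper that only builds the
command line. The URL, destination directory, branch name and depth are
placeholders I made up for illustration; the flags themselves
(--depth, --branch, --single-branch) are standard `git clone` options,
though --single-branch needs a reasonably recent Git.

```python
import shlex
from typing import List


def shallow_clone_argv(url: str, dest: str, branch: str, depth: int = 50) -> List[str]:
    """Build a `git clone` command that fetches one branch with truncated history."""
    if depth < 1:
        raise ValueError("depth must be a positive integer")
    return [
        "git", "clone",
        "--depth", str(depth),   # keep only the most recent `depth` commits
        "--branch", branch,      # check out this branch instead of the remote HEAD
        "--single-branch",       # skip the branches you don't care about
        url, dest,
    ]


if __name__ == "__main__":
    # Hypothetical repository URL and branch name, for illustration only.
    argv = shallow_clone_argv("https://example.com/drupal.git", "drupal", "7.x")
    # Print the command; hand argv to subprocess.run(argv, check=True) to run it.
    print(shlex.join(argv))
```

The helper returns an argument list rather than a string so it can be
passed to subprocess without any shell quoting worries.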

> Lately, what I've been doing/advocating is using Drush and real releases
> to download stuff from Drupal.org (core, contrib modules, etc.) and then
> checking the whole site into Git.  If I update a module, I use Drush for
> that and then update the code in my Git repo. Then deploy to production
> using *my* git repo (which has my full dev history but not every commit
> in every one of my projects ever) and tags.

...which is *exactly* what I'm saying is pointless. Why stick a stupider
intermediary - tarballs - into a system that's already highly capable of
doing patch & vendor management? The only thing you've accomplished is
diluting the capabilities of your version control system to manage
upstream changes.

> That keeps me on real releases, avoids unnecessary repository bloat, but
> still gives me a full history of all work on that project specifically.

"Unnecessary repository bloat?" Two great words there, let's address
each one:

"Unnecessary": well, the full branch history is a requirement if you
want to use git's smart merging algorithms. So the only way it's
"unnecessary" is if you prefer manually hauling chunks out of
patch-generated .rej and .orig files.

"Bloat": Really, step back and think about this. Are you solving a real,
compelling problem faced by most modern servers? How much does it matter
that your Drupal tree is, say, 70MB instead of 700MB? It really doesn't.
Not even on shared hosting. And, let's not forget - judicious use of
shallow clones & compression whittles that number way, WAY down. IMO,
ripping out the vendor history is something a lot of us got in the habit
of doing because we were used to having CVS vendor data that earned us
nothing but headaches, and it was an easy "optimization" that made our
Drupal trees feel more svelte.

Well, now that history does get you something. It gets you a _ton_. Now, all you
need for company-specific or site-specific customizations that can
easily coexist with rich vendor data is some branch naming conventions
and practice with reading git logs. Yeah, that takes some learning too,
but it's worth it.
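
To make that a bit more concrete, here is one possible convention,
sketched as Python helpers that just build the git commands. Everything
named here is an assumption on my part, not an established standard:
the "client/upstream-branch" naming scheme, the "acme" client, the
"origin" remote and the release tags are all hypothetical. The idea is
that customizations live on their own branch, upstream releases get
merged into it, and a two-dot range in `git log` shows exactly which
commits are yours.

```python
from typing import List


def site_branch(client: str, upstream_branch: str) -> str:
    """Name a customization branch after the client and the upstream branch it follows."""
    return f"{client}/{upstream_branch}"


def local_commits_argv(upstream_ref: str, branch: str) -> List[str]:
    """Show commits reachable from `branch` but not from `upstream_ref` (the local changes)."""
    return ["git", "log", "--oneline", f"{upstream_ref}..{branch}"]


def update_plan(upstream_ref: str, branch: str, remote: str = "origin") -> List[List[str]]:
    """Commands to bring a customization branch up to a newer upstream release."""
    return [
        ["git", "fetch", remote],         # pick up new upstream commits and tags
        ["git", "checkout", branch],      # switch to the branch holding local commits
        ["git", "merge", upstream_ref],   # let Git reconcile upstream with local changes
    ]


if __name__ == "__main__":
    branch = site_branch("acme", "7.x")   # hypothetical client name
    print(" ".join(local_commits_argv("7.0", branch)))
    for argv in update_plan("7.2", branch):
        print(" ".join(argv))
```

Swapping the merge step for a rebase would be the other obvious choice;
which one fits depends on whether the customization branch is shared.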


> --Larry Garfield
> On 3/1/11 1:56 AM, Sam Boyer wrote:
>> I tend to advocate full clone. You're talking about a task that version
>> control is designed for. Now that we've made the switch, IMO native
>> code:Git::bytecode:another VCS, or worse, patch stacks, etc. I don't
>> know what drush did before to "make this easy" - maybe pop off patch
>> stacks, update the module, then re-apply the patches? Fact is, though,
>> nothing Drush could have done under CVS can compare to patching with
>> native Git commits: your patches can speak the same language as upstream
>> changes, and you have all of Git's merge & rebase behavior at your
>> fingertips to reconcile them.
>> There are some occasional exceptions to this, but I really do think it's
>> a bit daft not to keep the full history. Keeping that history means
>> peace of mind that your patches (now commits) can be intelligently
>> merged with all changes ever made to that module for all time, across
>> new versions, across Drupal major versions...blah blah blah. Trading a
>> few hundred MB of disk space for that is MORE than worth it.
>> cheers
>> s
>> On 2/28/11 10:56 AM, Marco Carbone wrote:
>>> Since a Git clone downloads the entire Drupal repository, the Drupal
>>> codebase is no longer so lightweight (~50MB) if you are using Git,
>>> especially if you clone contrib module repositories as well.
>>> With CVS, our usual practice with clients was to checkout core and
>>> contrib using CVS, so that we can easily monitor any patches that have
>>> been applied, so that they wouldn't be lost when updating to newer
>>> releases.  (Drush makes this particularly easy.) This is doable with Git
>>> as well, but now there seems to be the added cost of having to download
>>> the full repository. This is great when doing core/contrib development,
>>> but not really necessary for client work. This is unavoidable as far as
>>> I can tell, but I don't think I'm satisfied with the "just use a tarball
>>> and don't hack core/contrib" solution, especially when patches come into
>>> play.
>>> Is there something I'm missing/not understanding here, or does one just
>>> have to accept the price of a bigger codebase when using Git to manage
>>> core/contrib code? Or is managing core/contrib code this way passe now
>>> that updates can be done through the UI?
>>> -marco
