I'm new to drupal, drush and git, not had much sql exposure for 20 years but experienced in unix systems and all the usual software technologies. Please tell me nicely if I posted in the wrong place where I should have posted it.
I'm setting a drupal site up so I can post offline to a virtual machine on my laptop, track my development in git (fabulous tool, well done Linus). This I have working fine on several vm instances and my remote server. However, as I'd like the database backed in a git repo I was playing with ways to do that in combination with drush sql-sync and drush sql-dump when I noticed that the database that drush sql-sync produces (located with %dump in a drush alias) has differences to the one produced by drush sql-dump --result-file=..). That is, if I sql-sync from another machine and I then sql-dump the database into a file its not the same as the one that was dumped during the sync.
Why ?
Further to this I'm trying to figure out a scheme where I know there is a copy of the current database with a known name inside the git scope but its a little tricky if the commands to dump produce different results to the ones produced by sql-sync. I'm beginning to feel these sql database results are non-deterministic :-). Maybe they are ;-) (note on each machine there is only *one* database for drupal)
andy
Cheers
andy
Following on from my post one twig up in this thread ..
So now I have three ways to save the database:
1. syncing from another machine with %dump=wherever... in the alias 2. drush sql-dump --result-file=wherever... 3. drush sql-sync @mysite @mysite
They all produce different sql - which is best?
From a processing perspective admittedly 3. is terrible because I presume it exports the db then re-imports it.
I seek a reliable practice to adopt then if I always do it the same way I can happily shove the current db in a git repo (hey its only a small database).
andy
Um... you do understand that this is data, right? It can't be thought of like source code. You will never get exactly the same data twice, even if you haven't edited a single thing on your site.
Every DB export will be different. There are even session variables in the DB, so it's not like you can use it to track changes. What difference does it make which backup methodology you choose as long as you've tested the restore? I assume that's what you're using it for, since it really doesn't make sense from the version control perspective.
I'd use the dump method since you're really not trying to synchronize between two dbs. 3. seems dangerouse since it will try and backup and restore over the top of the same site in a single execution which is not what the script is intended for. Seems like asking for trouble.
-----Original Message----- From: support-bounces@drupal.org [mailto:support-bounces@drupal.org] On Behalf Of Andy Heath Sent: Wednesday, July 13, 2011 10:58 AM To: support@drupal.org Subject: Re: [support] WAS Why .. now 3 ways to save the db which is best
Following on from my post one twig up in this thread ..
So now I have three ways to save the database:
1. syncing from another machine with %dump=wherever... in the alias 2. drush sql-dump --result-file=wherever... 3. drush sql-sync @mysite @mysite
They all produce different sql - which is best?
From a processing perspective admittedly 3. is terrible because I presume it exports the db then re-imports it.
I seek a reliable practice to adopt then if I always do it the same way I can happily shove the current db in a git repo (hey its only a small database).
andy
On 13/07/2011 19:25, Metzler, David wrote:
Um... you do understand that this is data, right? It can't be thought of like source code. You will never get exactly the same data twice, even if you haven't edited a single thing on your site.
Every DB export will be different. There are even session variables in the DB, so it's not like you can use it to track changes. What difference does it make which backup methodology you choose as long as you've tested the restore? I assume that's what you're using it for, since it really doesn't make sense from the version control perspective.
This is very useful, thankyou David - no I hadn't realised this - the point about session variables puts it all in perspective for me.
So if I *did* want to use it from a version control perspective I'd need to be able to extract the content in some non-permeable form (i.e. so the same content had always the same representation - as it does with files in a file system for example) - in fact that gives me an idea - with a bit of nifty footwork (e.g. knocking out mysqld for a short period) it should be possible to do something here directly with the database files in the filesystem. I did do this with a running database once (tarred it up and shifted it across machines) and it worked fine. :-). Never did like sql - its only an API after all, it all ends up on some filesystem or other. Hmmmm - wonder if it uses any of the timestamp info on the files. Hmmm2 - wonder what the variability across the files is. It doesn't make sense to me to put an exported database in a version control system because of this point that no two exports will be the same even without change in the content but it might make sense to put the raw files beneath it in one (with careful control of the processes that use the files).
andy
On Wed, 13 Jul 2011 21:10:14 +0100 Andy Heath andyheathoss@axelrod.plus.com wrote:
On 13/07/2011 19:25, Metzler, David wrote:
Every DB export will be different. There are even session variables in the DB, so it's not like you can use it to track changes. What difference does it make which backup methodology you choose as long as you've tested the restore? I assume that's what you're using it for, since it really doesn't make sense from the version control perspective.
This is very useful, thankyou David - no I hadn't realised this - the point about session variables puts it all in perspective for me.
So if I *did* want to use it from a version control perspective I'd need to be able to extract the content in some non-permeable form …
It is still possible to use database dumps to do some sort of version control for the database. The "ever changing" data is located only in some tables. Most of them are caches or session tables that can be truncated without further consequences. So you can make a dump of the database that excludes the data (=rows) from certain tables. This dump will be (nearly) constant as long as the configuration doesn't change and no content is added.
"drush sql-dump --ordered-dump" together with a good list of structure-tables would do the trick - with one exception that is mentioned here: http://drupal.org/node/1132238
roman
This is the best strategy by far; either a positive list or list of excluded tables in a using a dump option. Since you're just ramping though, I'd wait to you got a little more experience about what's data and what's not in drupal before you get to worried beyond a backup of the site.
What we typically do is export from production back to test/dev on a periodic basis to make sure that CODE we write works against production database. Version controlling drupal configuration data is pretty heady stuff, requiring intimate knowledge of the data structures used in your site.
I beg to disagree about SQL... it's much more than an API, but a programming language. It isn't well used in drupal, but is quite powerful as a language in its own right. It's set based, so there's a learning curve to use it well. There are things that take tons of work to do in PHP that can be done easily in SQL.
That being said, SQL is easily versioned and most of that in drupal is in .module or .install files already. It's only when the configuration lines get blurred that you have to start worrying about what's code and what's data. That's no different for images and media files on the file system than it is for the data that's in the dbms.
-----Original Message----- From: support-bounces@drupal.org [mailto:support-bounces@drupal.org] On Behalf Of Roman Zimmermann Sent: Wednesday, July 13, 2011 1:29 PM To: support@drupal.org Subject: Re: [support] WAS Why .. now 3 ways to save the db which is best
It is still possible to use database dumps to do some sort of version control for the database. The "ever changing" data is located only in some tables. Most of them are caches or session tables that can be truncated without further consequences. So you can make a dump of the database that excludes the data (=rows) from certain tables. This dump will be (nearly) constant as long as the configuration doesn't change and no content is added.
"drush sql-dump --ordered-dump" together with a good list of structure-tables would do the trick - with one exception that is mentioned here: http://drupal.org/node/1132238
roman
David and Roman - thankyou *both* for your very helpful suggestions. I don't disagree on sql David - particularly if you are working at the analysis/design level. I *have* done sql work but most of it was 30 years ago when it was just starting to compete with codasyl databases.
I like Roman's suggestion very much because it forces me to grapple with precisely the area you describe David - the relationship between the code and the data in drupal. I'm aiming to do some serious work with drupal particularly in accessibility (I've worked the last ten years in international accessibility technology standards - I edit a couple of them so I have to know my technology onions - drupal is a good vehicle to re-learn the php/sql ropes in a serious system again).
The test/dev/prod idea goes a long way yes, but in exploring code ideas, which is where I'm at/going, its incredibly useful to be able to version the whole thing - code+data - so as to be able to revert to some particular version reliably. So that's my next step - to explore precisely what *is* static and what isn't so I can drag out for some instance of the database some serialised form that is non-variant for that instance and can enable the database to be re-constructed. You have both helped me along that path, thankyou.
If it was easy it wouldn't be fun.
Roman - external configuration changing isn't an issue as I need to version those files too. Hmmmm - could be tricky with some of this in the db tho. - we'll see. I've read the link you posted Roman and the patch and this is the kind of thing I was thinking towards. Again, thankyou to you both, I'll probably come back to you when I got a little further on the code-versus data thing in terms of what's where in the tables.
andy
This is the best strategy by far; either a positive list or list of excluded tables in a using a dump option. Since you're just ramping though, I'd wait to you got a little more experience about what's data and what's not in drupal before you get to worried beyond a backup of the site.
What we typically do is export from production back to test/dev on a periodic basis to make sure that CODE we write works against production database. Version controlling drupal configuration data is pretty heady stuff, requiring intimate knowledge of the data structures used in your site.
I beg to disagree about SQL... it's much more than an API, but a programming language. It isn't well used in drupal, but is quite powerful as a language in its own right. It's set based, so there's a learning curve to use it well. There are things that take tons of work to do in PHP that can be done easily in SQL.
That being said, SQL is easily versioned and most of that in drupal is in .module or .install files already. It's only when the configuration lines get blurred that you have to start worrying about what's code and what's data. That's no different for images and media files on the file system than it is for the data that's in the dbms.
-----Original Message----- From: support-bounces@drupal.org [mailto:support-bounces@drupal.org] On Behalf Of Roman Zimmermann Sent: Wednesday, July 13, 2011 1:29 PM To: support@drupal.org Subject: Re: [support] WAS Why .. now 3 ways to save the db which is best
It is still possible to use database dumps to do some sort of version control for the database. The "ever changing" data is located only in some tables. Most of them are caches or session tables that can be truncated without further consequences. So you can make a dump of the database that excludes the data (=rows) from certain tables. This dump will be (nearly) constant as long as the configuration doesn't change and no content is added.
"drush sql-dump --ordered-dump" together with a good list of structure-tables would do the trick - with one exception that is mentioned here: http://drupal.org/node/1132238
roman
Sorry - should have made the environment clear ..
Drupal 7.4, drush 4.4, mysql 14.14, PHP PHP 5.3.5.
andy
Following on from my post one twig up in this thread ..
So now I have three ways to save the database:
- syncing from another machine with %dump=wherever... in the alias
- drush sql-dump --result-file=wherever...
- drush sql-sync @mysite @mysite
They all produce different sql - which is best?
From a processing perspective admittedly 3. is terrible because I presume it exports the db then re-imports it.
I seek a reliable practice to adopt then if I always do it the same way I can happily shove the current db in a git repo (hey its only a small database).
andy