[development] Call to developers, Yet another migration module / how to choose a module name?
Gerhard Killesreiter
gerhard at killesreiter.de
Sat Aug 1 15:33:52 UTC 2009
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Pierre Rineau wrote:
> Hello all,
>
> While working on a custom project for my company, I developed a module
> to do massive migrations between sites.
>
> This module uses a full OO layer.
>
> Its internal mechanism is based on abstracting the objects to migrate
> from a master site to clients. This abstraction defines how to construct
> the object dependency tree and how to serialize objects.
>
> Object implementations (node, user, taxonomy, whatever) are really
> simple to write: each is a class with only 3 methods (register
> dependencies, save, and update), using a custom registry that lets the
> developer save and get back data before and after serialization.
>
> All error handling is exception oriented, and lower software layers
> won't fail on unrecoverable errors in higher layers.
>
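As a rough illustration of the three-method object abstraction described in the quoted text, here is a minimal Python sketch. The names `SyncEntity`, `TermEntity`, `sync` and the dict-based registry are hypothetical and invented for this example; they are not taken from the module, which is written for Drupal.

```python
from abc import ABC, abstractmethod


class SyncEntity(ABC):
    """Hypothetical base class: one subclass per object type (node, user, ...)."""

    def __init__(self, identifier):
        self.identifier = identifier
        # Assumed stand-in for the "custom registry": plain key/value data
        # that is kept with the object across serialization.
        self.registry = {}

    @abstractmethod
    def register_dependencies(self):
        """Return the entities that must exist on the client first."""

    @abstractmethod
    def save(self, store):
        """Create the object on the client."""

    @abstractmethod
    def update(self, store):
        """Refresh an object that already exists on the client."""


class TermEntity(SyncEntity):
    """Illustrative taxonomy term with an optional parent term."""

    def __init__(self, identifier, name, parent=None):
        super().__init__(identifier)
        self.registry["name"] = name
        self.parent = parent

    def register_dependencies(self):
        return [self.parent] if self.parent is not None else []

    def save(self, store):
        store[self.identifier] = dict(self.registry)

    def update(self, store):
        store[self.identifier].update(self.registry)


def sync(entity, store):
    """Synchronize dependencies first, then save or update the entity."""
    for dependency in entity.register_dependencies():
        sync(dependency, store)
    if entity.identifier in store:
        entity.update(store)
    else:
        entity.save(store)


client_store = {}
fruit = TermEntity("term:1", "Fruit")
apple = TermEntity("term:2", "Apple", parent=fruit)
sync(apple, client_store)
```

Here `client_store` is just a dict standing in for the client site's database. This `sync` has no protection against circular dependencies.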
> Object fetching is based on a push/pull mechanism. The server pushes
> the sync order, and the client responds OK or not. If OK, it creates a
> job using the DataSync module, which allows it to run as a CLI thread
> (which won't hurt the web server, and allows us a larger memory limit
> at run time).
I am generally not happy with DataSync's approach of running shell
scripts as the web server user. Have you considered using Drush instead?
> During the DataSync job execution, the client pulls an initial set
> of content and, while browsing it, fetches dependencies
> incrementally (by pulling from the server again) over xmlrpc (the
> fetching component is also abstracted, and could use any
> communication method other than xmlrpc).
Wouldn't the server be better qualified to decide which data the
client needs?
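To make the client-driven pull concrete, here is a small Python sketch of incremental dependency fetching behind an abstract transport. The `Fetcher` interface, the `InMemoryFetcher` stand-in and the record layout are all assumptions made for illustration; a real transport such as xmlrpc would replace the in-memory one.

```python
from abc import ABC, abstractmethod
from collections import deque


class Fetcher(ABC):
    """Hypothetical transport abstraction (xmlrpc or anything else)."""

    @abstractmethod
    def fetch(self, object_id):
        """Return (payload, dependency_ids) for one object."""


class InMemoryFetcher(Fetcher):
    """Stand-in for the master site, used here instead of a network call."""

    def __init__(self, server_data):
        self.server_data = server_data
        self.calls = 0  # number of round trips to the "server"

    def fetch(self, object_id):
        self.calls += 1
        record = self.server_data[object_id]
        return record["payload"], record["deps"]


def pull(initial_ids, fetcher):
    """Pull the initial set, then keep pulling newly discovered dependencies."""
    fetched = {}
    queue = deque(initial_ids)
    while queue:
        object_id = queue.popleft()
        if object_id in fetched:
            continue  # already pulled, so cycles cannot loop forever
        payload, deps = fetcher.fetch(object_id)
        fetched[object_id] = payload
        queue.extend(dep for dep in deps if dep not in fetched)
    return fetched


server_data = {
    "node:1": {"payload": "article", "deps": ["user:1", "term:1"]},
    "user:1": {"payload": "author", "deps": []},
    "term:1": {"payload": "fruit", "deps": ["term:2"]},
    "term:2": {"payload": "apple", "deps": ["term:1"]},
}
fetcher = InMemoryFetcher(server_data)
result = pull(["node:1"], fetcher)
```

In this sketch every object costs one round trip. A server-driven variant could compute the same closure on the master site and send it in a single response, which is the trade-off the question above raises.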
> To be unobtrusive on the system, a smart unset() is done after
> building a dependency subtree, and there is a recursion breaker in
> case of circular dependencies.
Have you tried it with PHP 5.3?
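A recursion breaker for circular dependencies can be sketched as a depth-first walk that remembers which items are currently being resolved. This is only an illustration in Python with made-up identifiers, not the module's code, and it does not model the unset() memory handling mentioned in the quote.

```python
def resolve_order(root, dependencies):
    """Return ids so that dependencies come before their dependents.

    `dependencies` maps an id to the ids it depends on (hypothetical data).
    """
    order = []
    done = set()
    in_progress = set()

    def visit(item):
        if item in done:
            return
        if item in in_progress:
            # Recursion breaker: this item is already being resolved
            # further up the call stack, so we have found a cycle.
            return
        in_progress.add(item)
        for dependency in dependencies.get(item, ()):
            visit(dependency)
        in_progress.discard(item)
        done.add(item)
        order.append(item)

    visit(root)
    return order


dependencies = {
    "node:1": ["user:1", "term:1"],
    "user:1": ["node:1"],  # circular: the user points back at the node
    "term:1": [],
}
order = resolve_order("node:1", dependencies)
```

With a cycle, one of the objects has to be saved before the thing it points to exists. A real implementation would need a second pass (or an update step) to fix up such references.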
> This module was created because the deploy module seems so unstable
> that I did not want to risk running clients' production sites with
> it. I started implementing some sort of "deploy plan" using profiles
> based on views: you construct a set of views and save them in a
> profile, then all objects that these views reference will be
> synchronized.
>
> Right now, the module fully synchronizes taxonomy and content types,
> partially synchronizes users (including core account information and
> passwords), and I have a small bug left to handle with nodes (a
> revision problem, I think).
>
> There might be a performance or overhead problem with this design:
> with a very large amount of data, it could break easily.
How large is your "very large"? If I wanted to sync 10k nodes to 100
client sites, how successful would I be?
> The only way to be sure it won't break is, I think, to migrate stuff
> in numerous small sets of data. But the problem with doing this is
> that it will be really hard to keep the transactional context of
> the DataSync module.
Yeah, one reason to let the server handle this, no?
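For a sense of scale, the 10k-nodes-to-100-sites case can be worked through with a small batching sketch in Python. The batch size of 50 is an arbitrary assumption chosen for the arithmetic, not a figure from the module.

```python
def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


node_ids = list(range(1, 10_001))  # the 10k nodes from the question above
site_count = 100                   # the 100 client sites
batch_size = 50                    # assumption, purely illustrative

per_site_batches = list(batches(node_ids, batch_size))
total_transfers = len(node_ids) * site_count    # node copies to deliver
total_jobs = len(per_site_batches) * site_count # batch jobs across all sites
```

Under these assumptions there are 200 batches per site, 1,000,000 node transfers and 20,000 batch jobs in total. Each batch then succeeds or fails on its own, which is why a single transactional context across the whole run becomes hard to keep.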
> There are a lot of other custom goodies coming.
>
> First thing is, what do you think about such a module? Should I commit
> it on drupal.org? Are there people interested?
I am certainly interested, especially if my concerns from above can be
addressed. ;)
> And, now that I have described the module, what name should I give it,
> considering the fact I'll probably commit it on drupal.org, if
> people are interested?
>
> I thought about "YAMM" (Yet Another Migration Module), or "YADM" (Yet
> Another Deployment Module).
>
> The fact is there are *a lot* of modules which want to do the same
> thing as this one; I just want a simple and expressive name.
Data migration is an important and diverse task. IMO it doesn't hurt
to have several approaches.
Cheers,
Gerhard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
iEYEARECAAYFAkp0YGAACgkQfg6TFvELooSOzACfUr5q/9Eu5b8YETgXu6CNYLZN
JugAn1j8/8nlbVV55RmsP9ZLc9px35/A
=rk5A
-----END PGP SIGNATURE-----