[development] Call to developers, Yet another migration module / how to choose a module name?

Pierre Rineau pierre.rineau at makina-corpus.com
Sat Aug 1 13:50:06 UTC 2009


Hello all,

While working on a custom project for my company, I developed a module
to do massive migrations between sites.

This module uses a full OO layer.

Its internal mechanism is based on abstracting the objects to migrate
from a master site to clients. This abstraction defines how to construct
the object dependency tree and how to serialize objects.

An object implementation (node, user, taxonomy, whatever) is really
simple to write: it is a class with only 3 methods (register
dependencies, save, and update), using a custom registry that lets the
developer save and get back data before and after serialization.
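
To make the three-method idea concrete, here is a minimal sketch. It is
written in Python purely for illustration (the module itself is a Drupal
module), and every name in it (SyncEntity, Registry, Term) is
hypothetical, not the module's real API.

```python
from abc import ABC, abstractmethod


class Registry(dict):
    """Hypothetical key/value store kept alongside the serialized object."""


class SyncEntity(ABC):
    """Hypothetical base class; names are illustrative, not the module's API."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.registry = Registry()

    @abstractmethod
    def register_dependencies(self):
        """Return the entities this one needs before it can be saved."""

    @abstractmethod
    def save(self):
        """Create the object on the client site."""

    @abstractmethod
    def update(self):
        """Update an object that already exists on the client site."""


class Term(SyncEntity):
    """Example implementation: a taxonomy term with an optional parent."""

    def __init__(self, identifier, parent=None):
        super().__init__(identifier)
        self.parent = parent

    def register_dependencies(self):
        # A term only depends on its parent term, if it has one.
        return [self.parent] if self.parent is not None else []

    def save(self):
        self.registry["saved"] = True
        return "saved term %s" % self.identifier

    def update(self):
        self.registry["updated"] = True
        return "updated term %s" % self.identifier
```

With this shape, supporting a new object type only means writing one
small class like Term.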

All error handling is exception oriented, and lower software layers
won't fail on higher layers' unrecoverable errors.

Object fetching is based on a push/pull mechanism. The server pushes
the sync order, and the client responds OK or not. If OK, the client
creates a job using the DataSync module, which allows it to run as a CLI
thread (which won't hurt the web server, and allows us a larger memory
limit at run time). The DataSync module uses MySQL transactions (a shame
it's only MySQL compliant, but I hope it will evolve; I'm thinking about
PostgreSQL).

During the DataSync job execution, the client pulls an original set of
content and, while browsing it, does incremental dependency fetching (by
pulling from the server again) over xmlrpc (the fetching component is
also abstracted, and could use any communication method other than
xmlrpc).
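
A rough sketch of the incremental pull, again in Python for illustration
only. The Fetcher interface, the in-memory stand-in for the master site,
and the "depends_on" field are all assumptions of this sketch; a real
transport such as xmlrpc would be another Fetcher implementation.

```python
from abc import ABC, abstractmethod


class Fetcher(ABC):
    """Hypothetical transport interface; xmlrpc would be one implementation."""

    @abstractmethod
    def pull(self, object_ids):
        """Return {id: {"data": ..., "depends_on": [ids]}} from the server."""


class InMemoryFetcher(Fetcher):
    """Stand-in for the master site, used here instead of a real xmlrpc call."""

    def __init__(self, server_content):
        self.server_content = server_content
        self.calls = 0  # number of round trips to the "server"

    def pull(self, object_ids):
        self.calls += 1
        return {oid: self.server_content[oid] for oid in object_ids}


def pull_with_dependencies(fetcher, initial_ids):
    """Pull the initial set, then keep pulling whatever is still missing."""
    fetched = {}
    pending = list(initial_ids)
    while pending:
        batch = fetcher.pull(pending)
        fetched.update(batch)
        # Only ask the server again for dependencies not fetched yet.
        pending = sorted({
            dep
            for item in batch.values()
            for dep in item["depends_on"]
            if dep not in fetched
        })
    return fetched
```

Because the loop only talks to the Fetcher interface, swapping the
transport does not touch the pulling logic.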

To be unobtrusive on the system, a smart unset() is done after building
a dependency subtree, and there is a recursion breaker in case of
circular dependencies.
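
One common way to write such a recursion breaker is sketched below, in
Python for illustration. This is not the module's code: the function
name and the shape of the dependency map are assumptions.

```python
def build_order(root_id, dependencies):
    """Return ids in the order they should be saved (dependencies first).

    `dependencies` maps an id to the list of ids it depends on. The
    `visiting` set is the recursion breaker: an id that is already being
    walked is skipped instead of being followed again.
    """
    order = []
    done = set()
    visiting = set()

    def walk(node_id):
        if node_id in done or node_id in visiting:
            return  # already handled, or a circular dependency
        visiting.add(node_id)
        for dep_id in dependencies.get(node_id, []):
            walk(dep_id)
        visiting.discard(node_id)
        done.add(node_id)
        order.append(node_id)

    walk(root_id)
    return order
```

For example, if a node depends on a user and that user depends back on
the node, the walk stops at the second visit of the node instead of
looping forever.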

This module was created because the deploy module seems so unstable
that I did not want to risk running clients' production sites on it. I
started implementing a sort of "deploy plan", using profiles based on
views: you construct a set of views and save them in a profile, then all
objects that these views reference will be synchronized.

Right now, the module fully synchronizes taxonomy and content types,
partially synchronizes users (including core account information and
passwords), and I have a small bug left to handle with nodes (a revision
problem, I think).

There might be a performance or overhead problem with this design: with
a very large amount of data, it could break easily. The only way to be
sure it won't break is, I think, to migrate things in numerous small
sets of data. But the problem with doing this is that it will be really
hard to keep the transactional context of the DataSync module.

There are a lot of other custom goodies coming.

First thing is, what do you think about such a module? Should I commit
it on drupal.org? Are there people interested?

And, now that I have described the module, what name should I give it,
considering that I'll probably commit it on drupal.org if people are
interested?

I thought about "YAMM" (Yet Another Migration Module), or "YADM" (Yet
Another Deployment Module).

The fact is there are *a lot* of modules which want to do the same thing
as this one; I just want a simple and expressive name.

Pierre.


