[development] Call to developers, Yet another migration module / how to choose a module name?

Pierre Rineau pierre.rineau at makina-corpus.com
Sat Aug 1 15:47:33 UTC 2009

Oh, and I should add that it may not be released until we deliver our
project to the client (which, if I remember correctly, may be more than a
month away). Because their needs can evolve, some architectural choices
may change too.

But don't worry, any changes to what I described shouldn't amount to more
than a higher level of abstraction and some new features.

As for documentation, I'll try to write some and also provide some
diagrams, but as DataSync relies on the system, it may not be that easy
to install. I'll know more in a few days, when I start deploying it to a
preprod environment.


On Sat, 2009-08-01 at 15:50 +0200, Pierre Rineau wrote:
> Hello all,
> While working on a custom project for my company, I developed a module
> to do massive migrations between sites.
> This module uses a full OO layer.
> Its internal mechanism is based on abstracting the objects to migrate
> from a master site to clients. This abstraction defines how to construct
> the object dependency tree and how to serialize objects.
> Implementing an object type (node, user, taxonomy, whatever) is really
> simple: each one is just a class with 3 methods (register dependencies,
> save, and update), using a kind of custom registry that lets the
> developer save and get back data before and after serialization.
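
To make the "3 methods" idea concrete, here is a minimal sketch of what such a class could look like. It is written in Python purely for illustration, and every name in it (Registry, SyncObject, TermObject, register_dependencies) is hypothetical, not the module's actual API:

```python
from abc import ABC, abstractmethod


class Registry:
    """Hypothetical key/value store kept with an object across serialization."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


class SyncObject(ABC):
    """Hypothetical base class: one subclass per object type to migrate."""

    def __init__(self, identifier):
        self.identifier = identifier
        self.registry = Registry()

    @abstractmethod
    def register_dependencies(self):
        """Return the (type, identifier) pairs this object depends on."""

    @abstractmethod
    def save(self, store):
        """Create the object on the client site."""

    @abstractmethod
    def update(self, store):
        """Update an object that already exists on the client site."""


class TermObject(SyncObject):
    """Illustrative taxonomy term that depends on its vocabulary."""

    def register_dependencies(self):
        return [("vocabulary", self.registry.get("vid"))]

    def save(self, store):
        store[self.identifier] = {"name": self.registry.get("name")}

    def update(self, store):
        store[self.identifier]["name"] = self.registry.get("name")


term = TermObject(12)
term.registry.set("vid", 3)
term.registry.set("name", "News")
site = {}  # stands in for the client site's storage
term.save(site)
```

The point of the sketch is only that each object type has three small methods to write, while the registry carries whatever data must survive the trip.
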
> All error handling is exception-based, and the lower software layers
> won't fail on unrecoverable errors from the higher layers.
> Object fetching is based on a push/pull mechanism. The server pushes the
> sync order, and the client responds OK or not. If OK, it creates a job
> using the DataSync module, which allows it to run as a CLI thread (which
> won't hurt the web server, and allows us a larger memory limit at run
> time). The DataSync module uses MySQL transactions (a shame it's only
> MySQL compliant, but I hope it will evolve; I'm thinking about
> PostgreSQL).
> During the DataSync job execution, the client pulls an initial set of
> content and, while browsing it, fetches dependencies incrementally (by
> pulling from the server again) over xmlrpc (the fetching component is
> also abstracted, and could use any communication method other than
> xmlrpc).
> To stay unobtrusive on the system, a smart unset() is done after
> building a dependency subtree, and there is a recursion breaker in case
> of circular dependencies.
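
The incremental fetching and the recursion breaker can be sketched roughly as follows. This is an illustration in Python under my own assumptions, not the module's code: build_dependency_order, the fetch callable (standing in for the abstracted fetching component) and the SERVER dictionary are all hypothetical names.

```python
def build_dependency_order(root_ids, fetch):
    """Return identifiers ordered so that dependencies come before dependents.

    `fetch(identifier)` is a hypothetical callable standing in for the
    abstracted fetching component; it returns the identifiers that
    `identifier` depends on.
    """
    done = set()         # identifiers already fully processed
    in_progress = set()  # identifiers on the current recursion path
    order = []

    def visit(identifier):
        if identifier in done:
            return
        if identifier in in_progress:
            # Recursion breaker: a circular dependency leads back here.
            return
        in_progress.add(identifier)
        for dependency in fetch(identifier):
            visit(dependency)
        in_progress.discard(identifier)
        done.add(identifier)
        order.append(identifier)

    for identifier in root_ids:
        visit(identifier)
    return order


# Fake "server": node:1 and user:7 reference each other (a cycle).
SERVER = {
    "node:1": ["user:7", "term:3"],
    "user:7": ["node:1"],
    "term:3": [],
}
order = build_dependency_order(["node:1"], lambda i: SERVER.get(i, []))
```

Because fetch is just a callable, swapping xmlrpc for another transport would not change the traversal itself.
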
> This module was created because the deploy module seems so unstable
> that I did not want to risk running the client's production sites with
> it. I started implementing some sort of "deploy plan", using profiles
> based on views: you construct a set of views and save them in a profile,
> then all objects that these views reference will be synchronized.
> Right now, the module fully synchronizes taxonomy and content types,
> partially synchronizes users (including core account information and
> passwords), and I have a small bug left to handle with nodes (a revision
> problem, I think).
> There might be a performance or overhead problem with this design: with
> a very large amount of data, it could break easily. The only way to be
> sure it won't break is, I think, to migrate the data as numerous small
> sets. But the problem with doing this is that it will be really hard to
> keep the transactional context of the DataSync module.
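
To illustrate that trade-off, here is a rough sketch using SQLite from Python's standard library (not MySQL, and not DataSync itself). The item table and the migrate_in_batches function are hypothetical. Each small batch gets its own transaction, so a failure rolls back only the current batch and leaves the earlier ones committed:

```python
import sqlite3


def migrate_in_batches(conn, rows, batch_size):
    """Insert rows in small batches, one transaction per batch.

    Returns the number of rows committed before the first failure.
    """
    committed = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        try:
            with conn:  # commits on success, rolls back on exception
                conn.executemany(
                    "INSERT INTO item (id, title) VALUES (?, ?)", batch
                )
        except sqlite3.IntegrityError:
            break  # this batch is rolled back, earlier ones stay committed
        committed += len(batch)
    return committed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT)")
conn.commit()

# The fourth row reuses id 3, so the second batch fails as a whole.
rows = [(1, "a"), (2, "b"), (3, "c"), (3, "duplicate"), (5, "e")]
committed = migrate_in_batches(conn, rows, batch_size=2)
remaining = conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]
```

Memory use stays bounded by the batch size, but the migration as a whole is no longer all-or-nothing, which is exactly the transactional context that gets lost.
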
> There are a lot of other custom goodies coming.
> First thing is, what do you think about such a module? Should I commit
> it to drupal.org? Are people interested?
> And, now that I have described the module, what name should I give it,
> considering that I'll probably commit it to drupal.org if people are
> interested?
> I thought about "YAMM" (Yet Another Migration Module), or "YADM" (Yet
> Another Deployment Module).
> The fact is there are *a lot* of modules which want to do the same thing
> as this one; I just want a simple and expressive name.
> Pierre.
