I agree with Greg&#39;s and others&#39; remarks about documentation and showing the differences from other modules... and you can see this is already a fascinating discussion because it addresses so many needs.<br><br>But the bottom line is you should not feel you have to jump through hoops to post an early version! Go ahead and refactor later.<br>


<br>Victor<br><br><div class="gmail_quote">On Sat, Aug 1, 2009 at 12:47 PM, Pierre Rineau <span dir="ltr">&lt;<a href="mailto:pierre.rineau@makina-corpus.com">pierre.rineau@makina-corpus.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">


Oh, and I should clarify: it may not be released until we deliver our<br>

project to the client (which, if I remember correctly, may be in over a month).<br>

Because their needs can evolve, some architectural choices may too.<br>

<br>

But don&#39;t worry, any changes to what I explained shouldn&#39;t be more than<br>

a higher abstraction level and some new features.<br>

<br>

As for documentation, I&#39;ll try to write some and also provide some<br>

diagrams, but as DataSync relies on the system, it may not be that easy<br>

to install. I&#39;ll know more in a few days, when I start deploying it on a<br>

preprod environment.<br>

<font color="#888888"><br>

Pierre.<br>

</font><div class="im"><br>

On Sat, 2009-08-01 at 15:50 +0200, Pierre Rineau wrote:<br>

</div><div><div></div><div class="h5">&gt; Hello all,<br>

&gt;<br>

&gt; Working on some custom project for my company, I developed a module to<br>

&gt; do massive migration between sites.<br>

&gt;<br>

&gt; This module uses a full OO layer.<br>

&gt;<br>

&gt; Its internal mechanism is based on abstracting objects to migrate from a<br>

&gt; master site to clients. This abstraction defines how to construct the object<br>

&gt; dependency tree and how to serialize objects.<br>

&gt;<br>

&gt; Object implementations (node, user, taxonomy, whatever) are really simple<br>

&gt; to use: each is a class with only 3 methods (register dependencies, save, and<br>

&gt; update), using a custom registry that lets the developer save and get<br>

&gt; back data before and after serialization.<br>

&gt;<br>

&gt; All error handling is exception-oriented, and lower software layers<br>

&gt; won&#39;t fail on higher layers&#39; unrecoverable errors.<br>

&gt;<br>

&gt; Object fetching is based on a push/pull mechanism. The server pushes the sync<br>

&gt; order, and the client responds OK or not. If OK, it creates a job using the DataSync<br>

&gt; module, which allows it to run as a CLI thread (which won&#39;t hurt the web<br>

&gt; server, and allows us a larger memory limit at run time). The DataSync module<br>

&gt; uses MySQL transactions (a shame it&#39;s only MySQL-compatible, but I hope it<br>

&gt; will evolve; I&#39;m thinking about PostgreSQL).<br>

&gt;<br>

&gt; During the DataSync job execution, the client pulls an initial set of<br>

&gt; content and, while browsing it, fetches dependencies incrementally (by<br>

&gt; pulling from the server again) over xmlrpc (the fetching component is also<br>

&gt; abstracted, and could use any communication method other than xmlrpc).<br>

&gt;<br>

&gt; To stay unobtrusive on the system, a smart unset() is done after building a<br>

&gt; dependency subtree, and there is a recursion breaker in case of<br>

&gt; circular dependencies.<br>

&gt;<br>

&gt; This module was created because the deploy module seems so unstable<br>

&gt; that I did not want to risk running the client&#39;s production sites with<br>

&gt; it. I started implementing a sort of &quot;deploy plan&quot; using<br>

&gt; profiles based on views: you construct a set of views and save them in a<br>

&gt; profile, then all objects that these views reference will be<br>

&gt; synchronized.<br>

&gt;<br>

&gt; Right now, the module fully synchronizes taxonomy and content types,<br>

&gt; partially synchronizes users (including core account information and<br>

&gt; passwords), and I have a small bug left to handle with nodes (a revision<br>

&gt; problem, I think).<br>

&gt;<br>

&gt; There might be a performance or overhead problem with this design:<br>

&gt; with a very large amount of data, it could break easily. The only way to<br>

&gt; be sure it won&#39;t break is, I think, to migrate the data in numerous small<br>

&gt; sets. But the problem with doing this is that it will be really hard<br>

&gt; to keep the transactional context of the DataSync module.<br>

&gt;<br>

&gt; There are a lot of other custom goodies coming.<br>

&gt;<br>

&gt; First, what do you think about such a module? Should I commit it<br>

&gt; on <a href="http://drupal.org" target="_blank">drupal.org</a>? Are people interested?<br>

&gt;<br>

&gt; And, now that I have described the module, what name should I give it,<br>

&gt; considering that I&#39;ll probably commit it on <a href="http://drupal.org" target="_blank">drupal.org</a> if people<br>

&gt; are interested?<br>

&gt;<br>

&gt; I thought about &quot;YAMM&quot; (Yet Another Migration Module), or &quot;YADM&quot; (Yet<br>

&gt; Another Deployment Module).<br>

&gt;<br>

&gt; The fact is there are *a lot* of modules which want to do the same thing<br>

&gt; as this one; I just want a simple and expressive name.<br>

&gt;<br>

&gt; Pierre.<br>

&gt;<br>

<br>

</div></div></blockquote></div><br>