[development] SNA - status report

Novák Áron aaron at szentimre.hu
Sat Jun 10 18:02:45 UTC 2006


Hi,

I would like to share what I have done so far in the SoC.
First I tested the performance of various data storage methods (SQL,
gdb, etc.). The next step was to create a graph from the SQL data of a
Drupal-based site. The graph is built from the data of the node and
comment modules. The scripts count the activity between users (i.e. a
reply to an article or a reply to a comment) and then create a directed,
weighted graph from this information. For now the graph is just
serialized to a file (this might change).
Here is the data structure:
[uidA] => 
           [uid1] => [weight]
[uidB] => 
           [uid1] => [weight]
           [uid2] => [weight]
           [uidN] => [weight]
This is an adjacency list: for each user I store the adjacent users. I
also wrote a function that creates Graphviz input from this array. The
Graphviz-generated site maps are interesting to me, but I know this is
not the main purpose of my project. I had very big performance problems
with Graphviz when I tried to visualize a graph with thousands of edges,
but I realized that an image with that many edges is rather useless
anyway. So I simply throw out the low-scored edges. I haven't yet
written the function that detects unimportant edges; I have just set a
limit by hand in the script.
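
To make the steps above concrete, here is a minimal sketch in Python
(not the project's actual scripts). It assumes the reply activity is
already available as (replier uid, author uid) pairs; the names
build_graph, to_dot and min_weight are hypothetical and chosen only for
illustration. It builds the weighted adjacency list and writes Graphviz
DOT input, dropping edges below a hand-set limit.

```python
from collections import defaultdict


def build_graph(replies):
    """Count replies between users.

    `replies` is an iterable of (replier_uid, author_uid) pairs
    (an assumed input format). Returns {uidA: {uid1: weight, ...}, ...}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for replier, author in replies:
        graph[replier][author] += 1
    # Convert to plain dicts so the result is easy to serialize.
    return {uid: dict(neighbours) for uid, neighbours in graph.items()}


def to_dot(graph, min_weight=1):
    """Return Graphviz DOT text, skipping edges lighter than min_weight."""
    lines = ["digraph sna {"]
    for src in sorted(graph):
        for dst in sorted(graph[src]):
            weight = graph[src][dst]
            if weight >= min_weight:
                lines.append(f'  "{src}" -> "{dst}" [label="{weight}"];')
    lines.append("}")
    return "\n".join(lines)


# Tiny made-up example: user 1 replied to user 2 three times, and so on.
replies = [(1, 2), (1, 2), (1, 3), (2, 1), (3, 1), (1, 2)]
graph = build_graph(replies)
dot = to_dot(graph, min_weight=2)  # limit set by hand, as in the script
print(dot)
```

With the limit set to 2, only the edge from user 1 to user 2 survives in
this example, which is the same idea as throwing out the low-scored
edges before handing the graph to Graphviz.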
I generated the social map of a very popular Hungarian portal (~8000
users and ~250000 comments).

As far as I can see, the next step is to implement Dijkstra's algorithm
on this data structure and to test how efficient the data structure and
the storage method are.
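
As a rough sketch of what Dijkstra's algorithm could look like on the
adjacency list (Python, not the planned implementation): the weights
here count activity, so a higher weight means a closer relation. This
sketch therefore assumes an edge cost of 1/weight; that conversion is
only an assumption for illustration, and the function name dijkstra and
the sample graph are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Shortest distances from `source` in {uid: {uid: weight}} form.

    Assumption: edge cost = 1 / weight, so more activity = shorter edge.
    Weights must be positive.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, weight in graph.get(u, {}).items():
            nd = d + 1.0 / weight
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Made-up example graph in the same shape as the data structure above.
graph = {"A": {"B": 4, "C": 1}, "B": {"C": 2}, "C": {}}
distances = dijkstra(graph, "A")
print(distances)
```

In this example the route from A to C through B (cost 0.25 + 0.5) is
shorter than the direct edge (cost 1.0), because A and B interact much
more often than A and C do.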

All the results above (images, source code and test results) can be
viewed at http://sna.drupaler.net

Aron Novak


