[development] SNA - status report

Robert Wohleb rob at techsanctuary.com
Sat Jun 10 20:44:20 UTC 2006


Excuse me if we are not on the same page when it comes to your goals.
I'm going to be making a few assumptions.

The graphs you generated are pretty interesting. However, aren't we more
interested in clustering? Switching to an undirected graph would
significantly shrink your data set (one edge per pair of users instead of
up to two) without really losing much. Also, your graphs didn't seem to
reflect taxonomy. I think it would be very valuable to visualize how user
posts cluster around subjects.
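
To make the undirected idea concrete, here is a minimal Python sketch. It
is not taken from your scripts. It assumes the adjacency list looks like
the uid => uid => weight structure quoted below; the function name
to_undirected is made up for illustration, and summing the two directions
and dropping self-replies are my own assumptions, not requirements.

```python
def to_undirected(directed):
    """Collapse a directed, weighted adjacency map into undirected edges.

    `directed` maps a source uid to {target uid: weight}.
    Returns {(low_uid, high_uid): weight}, one entry per pair of users.
    """
    edges = {}
    for src, targets in directed.items():
        for dst, weight in targets.items():
            if src == dst:
                # Assumption: a user replying to themselves is not
                # interesting for clustering, so skip self-loops.
                continue
            # Order the pair so A->B and B->A share one key.
            key = (min(src, dst), max(src, dst))
            # Assumption: the undirected weight is the sum of both directions.
            edges[key] = edges.get(key, 0) + weight
    return edges


if __name__ == "__main__":
    sample = {1: {2: 3}, 2: {1: 4, 3: 1}}
    print(to_undirected(sample))  # {(1, 2): 7, (2, 3): 1}
```

Taking the max or the mean of the two directions instead of the sum would
work just as well; which one is right depends on what the weights are
supposed to mean.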

Also, check out Dan Kaminsky's network graph experiments. Besides his
unique DNS hacks, he has generated some LARGE, and beautiful, network
traffic graphs. He also uses Drupal :)

http://www.doxpara.com/
http://www.doxpara.com/?q=node/1133
http://www.opte.org/

~Rob

Novák Áron wrote:
> Hi,
>
> I would like to share with you what I've done so far in the SoC.
> First I tested the performance of various data storage methods (SQL,
> gdb, etc.). The next step was to create a graph from a Drupal-based
> site's SQL data. The sources of the graph are the node and comment
> modules. The scripts count activity between users (i.e. replies to an
> article or to a comment) and then create a directed, weighted graph
> from this information. For now the graph is just serialized out to a
> file (this might change).
> Here is the data structure:
> [uidA] => 
>            [uid1] => [weight]
> [uidB] => 
>            [uid1] => [weight]
>            [uid2] => [weight]
>            [uidN] => [weight]
> This is an adjacency list: for each user I store the adjacent users. I
> also wrote a function that creates Graphviz input from this array. The
> Graphviz-generated site maps are interesting to me, but I know this is
> not the main purpose of my project. I had very big performance problems
> with Graphviz when I tried to visualize a graph with thousands of edges,
> but I realized that an image with that many edges is rather useless
> anyway, so I simply threw out the low-weight edges. I haven't yet
> written the function that detects unimportant edges; I've just set a
> limit by hand in the script.
> I generated the social map of a very popular Hungarian portal (~8000
> users and ~250000 comments).
>
> As far as I can see, the next step is to implement Dijkstra's algorithm
> on this data structure and to test how efficient the data structure and
> the storage method are.
>
> All the results above (images, source codes and test results) can be
> viewed at http://sna.drupaler.net
>
> Aron Novak

-- 
----------------------------------------------------------
It is by Caffeine alone that I set my mind in motion
It is by the beans of Java, that my thoughts acquire speed
The hands acquire shakes; the shakes become a warning 
It is by Caffeine alone that I set my mind in motion


