Greg Knaddison wrote:
On 1/16/06, Nick Lewis <nick@smartcampaigns.com> wrote:
John Handelaar wrote:
IP-based *anything* - Just Say No. It's *spectacularly* inaccurate and, frankly, an amateur's mistake.
<snip>
I disagree with the above assumption (though I acknowledge that it is correct -- in *some* ways, and in *some* situations) on the basis of personal experience.
It does use IP, so it could present problems, but for most situations it is a good generalization of what is going on.
Again, not if your tracked users are behind balanced proxies. There are entire countries which fit that description, and other surprisingly large places like the UK which are heavily affected.

So if (for example) you're in the UK, it's just BROKEN for the 30% of *everybody* who's on AOL, and another 20%-ish on ja.net, and (let's be generous) no more than one in twenty others. 55% isn't *some*, ffs. And by NO definition would the remaining 45% count as "most situations". I'm taking a maximal estimate there of ja.net usage, but those numbers don't get any prettier if you reduce that number to zero.

Honestly, I'm a little surprised one can be *in* the analytics business and not know this stuff.
John's solution of using the sessionID confuses me (can you expand a bit more?), but my understanding of it is that it either presents a privacy problem or would be confusing to the user, or both.
"Solution" is pushing it. "Wild suggestion out of left field" is closer :)

Certainly it's not confusing to end users, since it's transparent, and there are no privacy issues connected to values derived from session IDs [1] which don't already exist in the fact that Drupal uses sessions all over the place in the method prescribed by the authors of the PHP language.

It goes like this:

1) Module alters the link element in the RSS feed on a per-user basis. Links are amended to force clickthroughs (and referred links) through that module's handler. The new link contains an ID [1] and the original destination. [2]

2) When someone clicks on one of those links, the module logs the click and "passes through" to the original destination.

3) If you want to collect IPs as well, you can use the relational database we all have access to to group them by SID:

    SELECT DISTINCT remote_ip FROM linktracker WHERE...

I mean, if you're going to log IPs, you need context. Otherwise you end up with either i) too many IPs per session and no trail, or ii) a metric assload of people hiding behind only one IP who look like one person if you ignore the context of the session.

You avoid this by basing your primary ID for tracking on the session which generated the feed. IP info is secondary, and you may even get the bonus of it being useful sometimes.

jh

[1] You can't use actual session IDs for security reasons, but you can use something derived from them, like an MD5 hash.
[2] This has caching implications which would need to be addressed.
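
For anyone who wants to see the three steps end to end, here is a rough sketch in Python rather than as a Drupal module. It is only an illustration under stated assumptions: the handler URL, the function names, the `sid` and `dest` columns and query parameters, and the `WHERE sid = ?` clause are all made up for the example. Only the `linktracker` table, the `remote_ip` column, and the idea of an MD5-derived ID come from the description above. It ignores the caching issue noted in [2].

```python
import hashlib
import sqlite3
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical URL of the module's click handler (an assumption for this sketch).
TRACK_URL = "https://example.com/linktracker"


def derive_tracking_id(session_id: str) -> str:
    """Derive an ID from the session ID so the real one never appears in a link."""
    return hashlib.md5(session_id.encode("utf-8")).hexdigest()


def rewrite_link(original_url: str, session_id: str) -> str:
    """Step 1: point a feed link at the handler, carrying the ID and destination."""
    query = urlencode({"sid": derive_tracking_id(session_id), "dest": original_url})
    return f"{TRACK_URL}?{query}"


def open_log() -> sqlite3.Connection:
    """Create an in-memory click log; the column names are assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE linktracker (sid TEXT, remote_ip TEXT, dest TEXT)")
    return conn


def handle_click(conn: sqlite3.Connection, tracked_url: str, remote_ip: str) -> str:
    """Step 2: log the click, then return the destination to redirect to."""
    params = parse_qs(urlparse(tracked_url).query)
    sid = params["sid"][0]
    dest = params["dest"][0]
    conn.execute(
        "INSERT INTO linktracker (sid, remote_ip, dest) VALUES (?, ?, ?)",
        (sid, remote_ip, dest),
    )
    return dest


def ips_for_session(conn: sqlite3.Connection, sid: str) -> list[str]:
    """Step 3: list the distinct IPs seen for one derived session ID."""
    rows = conn.execute(
        "SELECT DISTINCT remote_ip FROM linktracker WHERE sid = ? ORDER BY remote_ip",
        (sid,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = open_log()
    link = rewrite_link("https://example.org/story/1", "raw-session-id")
    # The same reader clicks from two different proxy addresses.
    handle_click(conn, link, "192.0.2.10")
    handle_click(conn, link, "192.0.2.11")
    print(ips_for_session(conn, derive_tracking_id("raw-session-id")))
```

The point of keying on the derived ID is that two clicks from different proxy addresses still group under one reader, while two readers sharing one proxy address stay separate because their IDs differ.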