[development] User Status [WAS: User last access]

Ivan Sergio Borgonovo mail at webthatworks.it
Sun May 4 17:04:20 UTC 2008


On Sun, 4 May 2008 10:29:23 -0500 (CDT)
David Timothy Strauss <david at fourkitchens.com> wrote:

> I couldn't care less about the SQL92 standard (or any other SQL
> standard).
> 
> What I care about is:
> * What we do is space efficient on MySQL
> * What we do is fast on MySQL
> * What we do works on other databases
> 
> SQL92 doesn't answer any of those questions for me.
> 
> Consequently, I have a few goals if we implement status bits for
> user accounts:
> * Not having one byte per bit
> * Not having O(n) searches for common bit-status user queries, if
> we have such queries
> * Ability to run on other databases, without significant regard to
> performance or space efficiency

char(0) seems to work just on sqlite and MySQL.
It is advertised as using just 1 bit. I didn't see anything that
explain its implementation, advantages or drawbacks.

MySQL support boolean and they "waste" the same/less space as a
char(1) in C encoding + they don't suffer of ordering problems
related to collation/locales.

Hopefully MySQL is going to fix this kind of quirks and come up with
a decent boolean implementation and you may end up in a not so
efficient, hardly portable mess.

If the "status" is going to store a boolean... I'd use a boolean.
Especially for "Logged in", "Used", "Never Used" status, boolean may
offer the additional advantage over char(0) of storing true, false,
null.

If it is going to store a multivalued column, the smallest things
that doesn't suffer from collation/locales/utf8 and works across
*several* DB is smallint, but it is 2 bytes large.
MySQL, MS SQL support tinyint (1byte), PostgreSQL, Oracle, DB2 don't,
just to name some.

char(1) could be a reasonable compromise. Problems with ordering
shouldn't be a real concern unless you really go hunting them with
"strange" choices.
It has to be seen if it really offer advantages over smallint in
terms of overhead once you're using utf8.
Not being a fixed length data type, it is going to incur in some
extra overhead, surely it has an easily to quantify space advantage
over smallint.

MySQL and several other DB support ENUM... but I don't think D5 and
D6 need another data type to deal with.

> "Make the common case fast."
> <http://en.wikipedia.org/wiki/Amdahl%27s_law>

It can't be read as you're advertising it.
You can't optimize the most used DB and make the overall application
run faster on all DB.

What about:
"We should forget about small efficiencies, say about 97% of the time:
premature optimization is the root of all evil." D. Knuth


-- 
Ivan Sergio Borgonovo
http://www.webthatworks.it



More information about the development mailing list