Mark, Good points. However, I think there are 2 types of Drupal users: Enterprise, and non-enterprise. (read: large/small). Meaning, not everyone has a company large-enough where they can have the dedicated DB staff to maintain, create, and update the sp's, triggers, etc. However, on the flip side, companies that do have this type of setup and resources balk at Drupal because they can't leverage their DB staff to truly separate the db from the app-layer. It's a catch 22 either way you think of it. By and large though, most Drupal installs do not have the dedicated DB staff to truly separate the layers; and I do mean dedicated. Maintaining and optimizing triggers, sp's, etc. in a production environment is a full time job and not really something to be looked at as a 'hobby'. The other flip side to stored procs, is PHP's arch. And what sp's were made for. Each request is it's own thread. (this is knowledge from SQL Server, I'll have to read about MySQL's sp implementation to see if it's the same). But, depending on the DB, the sp is compiled upon first use with each connection. If the same connection calls the sp again, the cached query plan is used. However, if the query plan is dynamic (e.g. WHERE clause changes each time), performance actually goes down greatly. If the WITH RECOMPILE option is used (meaning re-do the query plan with each execution), performance only goes down slightly. More over, sp's are made for 2 things: 1. Reduce network traffic. 2. Place the processing on the database server and not the app. So, let's look at sp's; they send a small RPC call to the DB, and the db sends back the results. Where currently we send SQL statements to the DB, and get back results. The catch is though, that Drupal sends relatively small statements, and statements that are not really processor intensive, compared to say a manager's analytical application that sends a 40 line SQL statement that goes across 3 databases and takes about 15 seconds to process. I guess what I'm saying is that I don't think sp's would give Drupal much of a performance boost. However, this is a null point because I have no benchmarks to back this up. Regarding ref. integrity, I probably generalized and mis-spoke; when I said it, I meant the ID generation, meaning module x can generate an id for table y and work with it before actual insertion...hmm....which upon thinking about it should all be handled in node_save function either way. However, I have many modules that do not create a node, thus creating a node and doing a node_save wouldn't give me an nid, if say I needed the eid (an arbitrary id I made up) because the item is not a node/content-type. All in all, there are tradeoffs either way...its just which one are we willing to accept? -----Original Message----- From: development-bounces@drupal.org [mailto:development-bounces@drupal.org] On Behalf Of Mark Fredrickson Sent: Wednesday, April 25, 2007 12:06 PM To: development@drupal.org Subject: Re: [development] Why do we avoid auto-incrementing db columns? Hi Earnest and others, On 4/25/07, Earnest Berry III <earnest.berry@gmail.com> wrote:
Hmm...I'm still more of a fan of controlling the IDs application level Than using a db generated serial number, especially when dealing with referential integrity.
I have the opposite standpoint. The more referential integrity that is in the app layer, the harder it it is to integrate and share data from the database. Use case: I have a another (non-PHP) system that needs to insert nodes. If it can't generate a valid ID for that node, this process is halted in it's tracks. I would like to see more data modeling logic moved to the db layer (stored procedures, triggers, views, etc), and leave PHP to do the theming and IO on top of that. E.g. CCK would be a great candidate for going to the db layer.
However, say an NID, of the txid (from e-commerce) gets passed around drupal and used by other modules, sometimes in loops, and sometimes for other modules to ref. the node; this I'd rather have some control of.
I do not see why these behaviors indicate that referential integrity should be handled in the app. If you are saying that within a loop, one must generate N IDs, each with a database call, there is some merit to argument that app created IDs would be better. But there are going to be other tradeoffs.
Also, the implementations of "next_val" per-database get a little hairy also. You also then have the issue of if you are in a transaction, the transaction level, etc. that all starts to play in to the db's implementation of "next_val" or "last_insert_id".
I'll defer to others' experience on this subject, but I'd be surprised if the problems were intractable. Considering that Drupal currently supports all of 2 databases (I'm crossing my fingers for SQLite support - that'd be cool) unifying the implementations shouldn't be that hard. -M