Re: [development] Drupal could be 10x times faster with correct indexing
When it comes to database capabilities, MySQL is quite a bit more than a glorified card catalog. When it comes to database architecture, Drupal developers are quite a bit more than rejected Microsoft Access programmers. ----- "Jean-Michel Pouré" <jm@poure.com> wrote:
Here is the issue with solutions: http://drupal.org/node/276742
From the issue, it seems like your indexes are missing or misconfigured.
When using MySQL, you don't really know how SQL queries are treated, except execution times that may be monitored (?).
Wrong. Use EXPLAIN.
This way, the database server is always faster than any kind of PHP cache.
Wrong. Databases have higher latency for even simple queries compared to PHP static caching and memcache.
There is an extensive query debugging needed on Drupal: [...] 6) Add indexes everywhere.
Wrong. Indexes add multiple forms of overhead. They should not be added gratuitously. Indexes should be added when they enhance performance more than they cost it or if they enhance data integrity (UNIQUE/PRIMARY keys).
[...] I consider this as a bug, [...]
Wrong. Bugs are issues of correctness. Performance improvements are not issues of correctness.
My reply may have been overly blunt, but it's insulting to have someone post to our development list promising a 10x performance boost from methods we've clearly considered and implemented. Development work and discussion surrounding database indexes is easy to find with a quick search of Drupal.org. ----- "David Timothy Strauss" <david@fourkitchens.com> wrote:
When it comes to database capabilities, MySQL is quite a bit more than a glorified card catalog. When it comes to database architecture, Drupal developers are quite a bit more than rejected Microsoft Access programmers.
----- "Jean-Michel Pouré" <jm@poure.com> wrote:
Here is the issue with solutions: http://drupal.org/node/276742
From the issue, it seems like your indexes are missing or misconfigured.
When using MySQL, you don't really know how SQL queries are treated, except execution times that may be monitored (?).
Wrong. Use EXPLAIN.
This way, the database server is always faster than any kind of PHP cache.
Wrong. Databases have higher latency for even simple queries compared to PHP static caching and memcache.
There is an extensive query debugging needed on Drupal: [...] 6) Add indexes everywhere.
Wrong. Indexes add multiple forms of overhead. They should not be added gratuitously. Indexes should be added when they enhance performance more than they cost it or if they enhance data integrity (UNIQUE/PRIMARY keys).
[...] I consider this as a bug, [...]
Wrong. Bugs are issues of correctness. Performance improvements are not issues of correctness.
Le lundi 30 juin 2008 à 15:35 -0500, David Timothy Strauss a écrit :
My reply may have been overly blunt, but it's insulting to have someone post to our development list promising a 10x performance boost from methods we've clearly considered and implemented. Development work and discussion surrounding database indexes is easy to find with a quick search of Drupal.org.
I don't want to start a flame war. First, I have to apoligize because it seems that some indexes are missing in my PostgreSQL schema, whereas indexes are set in the MySQL schema. There are some missing indexes, but very few. As proposed, I will try to submit patches. This will not change the overall performance by 10x. Secondly, my Apache installation lacked a maximum PostgreSQL pconnect number of connections. My fault, because I migrated recently and forgot to set a maximum number of pconnect. When using Drupal without cache, attackers were able to create a large number of database sockets, because Drupal runs multiple SQL commands. After 10 minutes my dedicated 3Gb RAM server was down and I was a little bit surprised. It seems that Drupal should always be installed with normal cache option. Otherwize, too many SQL commands are sent. Sometimes, I could notice that Drupal would send 5 times the same SELECT on user command. Third, I looked at the shouts module and discovered there was no indexing at all. The attackers only needed to call this module repetedly. A lot of modules use a SELECT to query a table and then explore each subsequent node sending a SELECT on each node. This is a clear overhead and can be used to attack a Drupal site. Looking at the Feeded, it seems to work this way too, except that it is correctly indexed. But this is the Drupal way to explore nodes I believe. I don't know if theses SELECTs can be replaced with LEFT JOINS. Fourth, sorry I didn't mean to attack Drupal developers. Drupal is far too important for me to annoy you. I will do my best to submit patches about the few SQL commands that need indexing. Forgot this silly 10x times faster title. Drupal could be much faster if using LEF JOINs and indexes when appropriate, but this would not be 10 times. Kind regards, Jean-Michel
On Tue, Jul 1, 2008 at 2:49 AM, Jean-Michel Pouré <jm@poure.com> wrote:
Le lundi 30 juin 2008 à 15:35 -0500, David Timothy Strauss a écrit :
My reply may have been overly blunt, but it's insulting to have someone post to our development list promising a 10x performance boost from methods we've clearly considered and implemented. Development work and discussion surrounding database indexes is easy to find with a quick search of Drupal.org.
I don't want to start a flame war.
First, I have to apoligize because it seems that some indexes are missing in my PostgreSQL schema, whereas indexes are set in the MySQL schema. There are some missing indexes, but very few. As proposed, I will try to submit patches.
Jean Michel PostgreSQL has a chronic problem in Drupal: lack of testers and maintainers who use it. We occasionally get someone who is willing to maintain the PostgreSQL for core and does so for a while, then disappears then the cycle starts again. In contrib modules, very few provide PostgreSQL support, for many reasons. So, if you want proper PostgreSQL support, then start testing and submitting patches, AND continue to do so in the long term, not just for a month or two. -- Khalid M. Baheyeldin 2bits.com, Inc. http://2bits.com Drupal optimization, development, customization and consulting.
Le mardi 01 juillet 2008 à 03:11 -0400, Khalid Baheyeldin a écrit :
So, if you want proper PostgreSQL support, then start testing and submitting patches, AND continue to do so in the long term, not just for a month or two.
I will try, as Drupal is a crutial issue for our community. Kind regards, JMP
On Tue, Jul 1, 2008 at 10:07 AM, Jean-Michel Pouré wrote:
Le mardi 01 juillet 2008 à 03:11 -0400, Khalid Baheyeldin a écrit :
So, if you want proper PostgreSQL support, then start testing and submitting patches, AND continue to do so in the long term, not just for a month or two.
I will try, as Drupal is a crutial issue for our community. Kind regards, JMP
If you're looking for somewhere to get stuck in, we maintain a list of issues needing review on PostgreSQL here: http://groups.drupal.org/node/6980
It's by no means complete, but a good place to start. Nat
On Tue, 1 Jul 2008 10:16:04 +0100, "Nathaniel Catchpole" <catch56@googlemail.com> wrote:
On Tue, Jul 1, 2008 at 10:07 AM, Jean-Michel Pouré wrote:
Le mardi 01 juillet 2008 à 03:11 -0400, Khalid Baheyeldin a écrit :
So, if you want proper PostgreSQL support, then start testing and submitting patches, AND continue to do so in the long term, not just for a month or two.
I will try, as Drupal is a crutial issue for our community. Kind regards, JMP
If you're looking for somewhere to get stuck in, we maintain a list of issues needing review on PostgreSQL here: http://groups.drupal.org/node/6980
It's by no means complete, but a good place to start.
Nat
And please note the very first item on that list. The new Database API layer is designed to make supporting multiple databases easier, but right now it's stalled on lack of people who can help test the Postgres driver. If you know Postgres and PDO, we need your help to carry this thing home! --Larry Garfield
Le mardi 01 juillet 2008 à 09:57 -0500, Larry Garfield a écrit :
And please note the very first item on that list. The new Database API layer is designed to make supporting multiple databases easier, but right now it's stalled on lack of people who can help test the Postgres driver. If you know Postgres and PDO, we need your help to carry this thing home!
PDO are good news. I will have some time this summer and can monitor/review queries in logs, then submit patches. If you are moving to PDO, which is good news, there may be some interest to use PostgreSQL, as it writes fairly compliant SQL 92 code. I will get back shortly. Kind regards, JM
logs, then submit patches. If you are moving to PDO, which is good news, there may be some interest to use PostgreSQL, as it writes fairly compliant SQL 92 code.
there may be some confusion here. PDO writes no queries. It's a relatively simple driver nothing else.
----- "Karoly Negyesi" <karoly@negyesi.net> wrote:
logs, then submit patches. If you are moving to PDO, which is good news, there may be some interest to use PostgreSQL, as it writes fairly compliant SQL 92 code.
there may be some confusion here. PDO writes no queries. It's a relatively simple driver nothing else.
I think he means "PostgreSQL is fairly compliant with SQL 92."
----- "Jean-Michel Pouré" <jm@poure.com> wrote:
Le mardi 01 juillet 2008 à 03:11 -0400, Khalid Baheyeldin a écrit :
So, if you want proper PostgreSQL support, then start testing and submitting patches, AND continue to do so in the long term, not just for a month or two.
I will try, as Drupal is a crutial issue for our community. Kind regards, JMP
I'm glad to hear that you're interested in improving PostgreSQL support in Drupal. Help from people like you is necessary to keep PostgreSQL support healthy.
On Tue, Jul 1, 2008 at 4:49 PM, Jean-Michel Pouré <jm@poure.com> wrote:
Third, I looked at the shouts module and discovered there was no indexing at all. The attackers only needed to call this module repetedly. A lot of modules use a SELECT to query a table and then explore each subsequent node sending a SELECT on each node. This is a clear overhead and can be used to attack a Drupal site. Looking at the Feeded, it seems to work this way too, except that it is correctly indexed. But this is the Drupal way to explore nodes I believe. I don't know if theses SELECTs can be replaced with LEFT JOINS.
Are you referring to the 'shout' module? <http://docs.projectopus.com/releases/shout> The quality of code in contributed modules can vary. As a general rule, the code in contrib modules is of a significantly lower standard than the code in Drupal core. Before deploying a contrib module, or before using contrib code as the foundation for your own custom code, you should review the module's issue queue, its CVS commit history, and its release status. The 'shout' module isn't even on drupal.org, and it appears that it hasn't been maintained for over 18 months. You are using such module at your own risk. You have absolutely no reason to expect to find well-constructed SQL statements in such a module. This is not the place to complain about crusty, unmaintained, random old modules. 'Feeded'? Can't find it. Perhaps you mean <http://drupal.org/project/feed>? If so, this doesn't seem to be actively maintained, either. Modules like this are hardly ideal candidates to use when whinging about Drupal. Please whinge about core, or not at all. Cheers, Jaza.
participants (7)
-
David Timothy Strauss -
Jean-Michel Pouré -
Jeremy Epstein -
Karoly Negyesi -
Khalid Baheyeldin -
Larry Garfield -
Nathaniel Catchpole