<:avglsdata id=avglsdata> Please, this should not be in development; it belongs in support.
Actually, Ivan's query won't work because each skill is on a separate row. Your query, AFAICT, is the only way the way to actually find multiple skill combinations. It also has the advantage of scaling to handle even more skills. But let's make it truly Drupalized:
SELECT e.employee_id FROM {employees} e WHERE EXISTS(SELECT 1 FROM {skills} s WHERE s.employee_id = e.employee_id AND s.skill_id = 'java') AND EXISTS(select 1 FROM {skills} s WHERE s.employee_id = e.employee_id AND s.skill_id = 'C++')
Of course, to use it in a db_query(), you will need to further parameterize it.
$skills_desired = array('java', 'c++', 'php'); // Values can be filled however you prefer. $wheres = array_fill(0, count($skills_desired), "EXISTS(SELECT 1 FROM {skills} s WHERE s.employee_id = e.employee_id AND s.skill_id = '%s')"); $query = "SELECT e.employee_id FROM {employees} e WHERE " . implode(' AND ', $wheres); if ($result = db_query($query, $skills_desired)) { while ($result = db_result($result)) { // ... do something with the employee. } } else { return t('No matches found.'); } This is, of course, Drupal 6 code. DBTNG can do this more efficiently.
Nancy Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
________________________________
From: David Metzler Select * form employees e where exists (select 1 from skills s where s.employee_id=e.employee_id and s.skill_id = 'java') AND exists(select 1 from skills s where s.employee_id=e.employee_id and s.skill_id='C++')
I have heard that mysql will perform better with Ivan's query, but I haven't verified that this is the case.
On Sat, 16 Apr 2011 13:21:24 -0700 (PDT) nan wich nan_wich@bellsouth.net wrote:
<:avglsdata id=avglsdata> Please, this should not be in development; it belongs in support.
Actually, Ivan's query won't work because each skill is on a
It happens it worked with real data on a real DB (no pun on MySQL) The exists method works as well but there were some typo in David's example. I'd guess the join method should be faster on most DB engines but I didn't take the time to actually test it on more than one DB (postgres) on more than a tens of rows. On a slower box remotely connected the exists, 2 condition version was 10% slower (order of magnitude for complete execution 200ms). On a faster box on my lan the join and exists versions with 3 conditions were indistinguishable (12ms).
The test was so ridiculous I could just say both version works. I did test without the index (auto created by the pk). With such a small data set it changes the query plan but it doesn't change the timing.
separate row. Your query, AFAICT, is the only way the way to actually find multiple skill combinations. It also has the advantage of scaling to handle even more skills. But let's make it truly Drupalized:
Automatically generating a query for more than 2 skills with the JOIN method may look harder, but not that much (you've to name differently the tables aliases, and add some additional logic to write the correct join). If he has to work with a very large data set it would be advisable to really test which method runs really faster on *his* DB. I'd suspect that unless he is working on a very slow connection PHP overhead and difference in verbosity of the 2 queries won't make the difference.
create table test.employee (eid int, skill varchar (15), primary key (eid, skill));
insert into test.employee values(1, 'PHP'); insert into test.employee values(1, 'SQL'); insert into test.employee values(1, 'Java'); insert into test.employee values(1, 'VoIP'); insert into test.employee values(2, 'Shell Scripting'); insert into test.employee values(2, 'PHP'); insert into test.employee values(2, 'Java'); insert into test.employee values(2, 'SQL'); insert into test.employee values(3, 'C++'); insert into test.employee values(3, 'PHP'); insert into test.employee values(3, 'Java'); insert into test.employee values(3, 'Pearl');
select t1.eid from test.employee t1 join test.employee t2 on t1.eid=t2.eid join test.employee t3 on t2.eid=t3.eid where t1.skill='Java' and t2.skill='PHP' and t3.skill='SQL' group by t1.eid ;
select t1.eid from test.employee t1 join test.employee t2 on t1.eid=t2.eid and t2.skill='PHP' join test.employee t3 on t2.eid=t3.eid and t3.skill='Java' where t1.skill='SQL' group by t1.eid ;
select eid from test.employee e where exists (select 1 from test.employee s where s.eid=e.eid and s.skill = 'Java') and exists(select 1 from test.employee s where s.eid=e.eid and s.skill='PHP') and exists(select 1 from test.employee s where s.eid=e.eid and s.skill='SQL') group by eid ;
BTW the group by in this case is a synonym of distinct and anticipating conditions in the join has no effect on the query plan of a reasonably smart DB.
<:avglsdata id=avglsdata> I apologize, Ivan, in a quick read, I apparently confused your method with an earlier method. Indeed the multiple-join-same-table method not only works, but is the one often found in core (e.g. Taxonomy.module). I have not benchmarked either one, but I am going to assume that some core developer has at some time and that whoever that was discovered that either it is faster or that it is the best cross-database implementation. I do, however, feel that the EXISTS method is a more elegant and straight-forward solution, at least in MySql. Nancy Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
________________________________
From: Ivan Sergio Borgonovo
Actually, Ivan's query won't work because each skill is on a
It happens it worked with real data on a real DB (no pun on MySQL)
Hi Nancy, David, Ivan, Robbie, Rob, Jason Thanks for wonderful ideas in this regard.
Probably three different kind of query, which will work in this case as discussed above.
1. GROUP BY .. HAVING COUNT(*) 2. EXISTS 3. And the one suggested by Ivan.
I adopted the below query, though not sure which one is faster in MySQL.
SELECT EmployeeID FROM `table` WHERE `skill` IN ('PHP','Java') GROUP BY EmployeeID HAVING COUNT(*) = 2
As Nancy suggested, parametrized it as required. If the query would have been "search the employees having mandatory skills PHP, Java and *Optional* skills Pearl, VoIP". I want to put it in single SQL query. Please suggest me what should be query.
And the result should be shown in such a way that, people having more number of optional skills should be shown at start. How can we acheive it?
Best Regards Austin
On Sun, Apr 17, 2011 at 5:57 AM, nan wich nan_wich@bellsouth.net wrote:
I apologize, Ivan, in a quick read, I apparently confused your method with an earlier method. Indeed the multiple-join-same-table method not only works, but is the one often found in core (e.g. Taxonomy.module). I have not benchmarked either one, but I am going to assume that some core developer has at some time and that whoever that was discovered that either it is faster or that it is the best cross-database implementation. I do, however, feel that the EXISTS method is a more elegant and straight-forward solution, at least in MySql.
*Nancy*
Injustice anywhere is a threat to justice everywhere. -- Dr. Martin L. King, Jr.
*From:* Ivan Sergio Borgonovo
Actually, Ivan's query won't work because each skill is on a
It happens it worked with real data on a real DB (no pun on MySQL)
-- [ Drupal support list | http://lists.drupal.org/ ]