[drupal-devel] Request for input on Quiz module data structure
[long]
Angie Byron
drupal-devel at webchick.net
Wed Oct 12 23:35:41 UTC 2005
Hi, folks. Hope that this is the right place for this. Apologies in advance for
the length, but I wanted to make sure everyone's on the same page before asking
for feedback.
== BACKGROUND ==
As most know, I worked over the summer on the Quiz module. Thanks to a generous
injection of funding from Ejovi Nuwere at SecurityLab.net, I've been able to
both complete most of the work that was not done by the other student assigned
to this project during SoC, and am also able to now focus on adding additional
features to the module.
The quiz module was developed around Robert Douglass's specification (which I've
temporarily mirrored here: http://drupaldev.snarkles.net/node/2). In a nutshell,
this means rather than creating a simple module that could do multiple choice
questions, instead creating a quiz *framework* which could potentially support
almost limitless question types, from traditional multiple choice/fill in the
blank, to really innovative ideas such as "reading" a question aloud to gage
reading comprehension and so on.
== HOW IT CURRENTLY WORKS ==
The current schema can be viewed here:
http://cvs.drupal.org/viewcvs/drupal/contributions/modules/quiz/quiz.mysql?rev=1.10
For the more visually inclined, I've uploaded an E-R Diagram here:
http://www.webchick.net/soc/quiz.module/quiz-schema-old.png
This is based off the original proposal, where serialized arrays (/me ducks in
advance) are used to store answer, feedback and user response data.
NOTE: Skip to "THE PROBLEM" if you don't care about how this stuff is stored
currently :P
A typical result array (stored in quiz_results.results) might look like:
Array
(
[10] => Array
(
[input] => Array
(
[0] => 0
[1] => 2
[2] => 3
)
[feedback] => Array
(
[0] => Correct!
[1] => Correct!
[2] => Correct!
)
[score] => 1
)
[6] => Array
(
[input] => Array
(
[0] => 2
)
[feedback] => Array
(
[0] => Nope, fooled you!
)
[score] => 0
)
...
)
The array keys (10 and 6) are question node IDs... in other words, this is how
the user responded to both question 10 and question 6. Over the course of the
quiz, this array keeps growing, with one array of results per question.
"input" is an array of the answer(s) that the user selected on that question.
The first question was a multiple choice question, so the user plugged in three
answers (answers 1, 3, and 4 -- remember that arrays start at 0 index). The
second question was only a single-choice question, and the user selected answer 3.
"feedback" is an array of the feedback to the answer(s) the user selected for
input. As you can see, the user answered correctly on all three answers on the
first question, but did not answer correctly on the second question.
Finally, "score" is either a 1 (indicating the answer was correct) or a 0
(indicating the answer was not correct).
== THE PROBLEM ==
The question was raised to me:
"If there's a million questions, and a
million students, and we want to find out for a given
question, how many student got it right how do we do that?"
Right now, the answer is something along the lines of this:
-----------------------------------------------------------------------
function get_question_results($quiz_nid, $question_nid) {
// Array to store question results
$question_results = array(
'right' => 0,
'wrong' => 0
);
// Grab a list of the results array for each quiz attempt
$quiz_results = array();
$results = db_query("SELECT r.results FROM {quiz_results} r WHERE r.quiz_nid =
%d", $quiz_nid);
while ($result = db_fetch_object($results)) {
$quiz_results[] = unserialize($result->results);
}
// Now, loop through each result and look for the given question
foreach ($quiz_results as $key => $value) {
if ($value[$question_nid]) {
// If question is found check if it was answered correctly or not
if ($value[$question_nid]['score'] == 1) {
$question_results['right']++;
} else {
$question_results['wrong']++;
}
}
}
return $question_results;
}
// Call the function and output results
$results = get_question_results(12, 10);
echo $results['right'] ." students answered this question right, " .
$results['wrong'] ." answered it wrong.";
-----------------------------------------------------------------------
This is quite processor and RAM-intensive, since it has to load ALL of the
results for each attempt on a given quiz first, then parse through those to find
individual question attempts, and then finally retrieve the answers from each
and evaluate whether or not they were correct.
And this same basic logic has to be completed for each potential query someone
might want to throw at it.. Such as, "What is the average score of this user
among all exams?"
In short, it doesn't look like this solution is going to be very scalable in the
long-term for the kind of ad-hoc querying that educators are likely going to
really want to be able to do.
== THE SOLUTION? ==
I've created an alternate, normalized data schema and wanted to get your
feedback on it, if you get a chance.
http://webchick.net/soc/quiz.module/newschema.sql
E-R Diagram: http://webchick.net/soc/quiz.module/quiz-schema-new.sql
Changes:
- quiz/quiz_questions tables stay the same
- multichoice turns into 'quiz_question' containing only a primary key and a
"properties" field to take the place of "multiple_answers." I envision this
field as being more flexible than multiple_answers, as it could potentially
store a plethora of options for very diverse array question types, perhaps in
the form of key/value pairs or something (multichoice=1;email_results=1; etc.)
- answers are now broken out into a 'quiz_question_answer' table, which links
back to quiz_question with separate fields for answer, feedback, and points. The
answer field itself could store a wide variety of options, such as "Answer" for
a multichoice question, "St. Paul|Saint Paul" for a fill-in-the-blank question,
and so on. Since answer display is handled in each question type module in the
_form and _render_question hooks, we can still have the advantage of storing
data independently from the actual quiz module. Feedback, too, can be stored in
whatever way makes sense for the answer, and points can indicate whether a
question is incorrect (0 points) or can also be used for 'weighting' questions
(answer 1 is 20 points, answer 2 is 50 points, etc.)
- Results are now stored in two different tables: quiz_result for the overall
properties (start time/end time, etc. -- essentially the same as the current
table) and quiz_question_results which provides the textual representation of
what that user put for his/her answer on each question attempt. This could be
stored in a similar manner to the answers table... for multiple answer
questions, it could be 23,63,3 to indicate the answer ids selected, for
fill-in-the-blank it could have 'Minneapolis', etc. Each question type's
_evaluate_question hook would take the data from the answer field and check it
against the given question's answers.
- Fields representing times and dates are switched to DATETIME instead of
INTEGER (for UNIX timestamp) in order to help with date range calculation
And if you made it through that, a hearty congratulations. :P
Any thoughts?
More information about the drupal-devel
mailing list