[support] Doubt on Index and Primary key

Austin Einter austin.einter at gmail.com
Tue Mar 29 00:46:42 UTC 2011


Hi  David
Thanks a lot. I am going ahead with the solution Pat proposed. I had asked
this question in MySql forum, they also liked the solution.
Best Regards
Austin
On Tue, Mar 29, 2011 at 12:47 AM, Metzler, David <metzlerd at evergreen.edu>wrote:

> Pat describes roughly what I had in mind in much greater detail.   You
> should not be worried about the row counts, here.  This solution scales much
> better than the one you have purposed, especially for large row counts.
>
> I have 3 million row tables containing more data than this sitting on my
> completely unoptimized laptop that I query regularly query.  When the
> queries are properly indexed, they return in milliseconds.
>
> I will say that your obsession with scale is a bit misplaced in your first
> database design project.  Don't sacrifice normalization for scale until you
> really know what you're doing around database design.... probably a minimum
> of 5 years experience under your belt before you start making those kinds of
> decisions.
>
> -----Original Message-----
> From: support-bounces at drupal.org on behalf of Pat Johnston
> Sent: Fri 3/25/2011 20:00
> To: support at drupal.org
> Subject: Re: [support] Doubt on Index and Primary key
>
>  Austin,
>
> I haven't dealt with high row counts with MySQL - I don't know for sure.
> But the rows are small so you're not using too much space. I would think
> it should be OK with some performance tuning. I'd suggest indexing the
> Skillset table by UID and sID.That and maybe some decent hardware
> configuration should be able to handle it.
>
> Any comments from the MySQL people?
>
> Regards,
>
> Pat
>
> On 3/25/2011 7:24 PM, Austin Einter wrote:
> > Hi Pat
> > Thanks for detail help. I really appreciate it. I did put the tables
> > in excel sheet. It looks fine.
> > But the only worry is skillset table row count.
> > Assuming on average, one person has 10 different skills, then skillset
> > table will have 10 entries per user.
> > Assuming total 50,000,00 users (May be a higher number , but I want to
> > design keeping this number high), then total number of rows in
> > skillset table will be 50,000,00 x 10 = 50,000,000 which is really
> > high. Is this going to be a bottle neck from MySql perspective, search
> > time perspective??
> > I am attaching the excel sheet (please where I have put the tables as
> > per suggestion), please have a look.
> > Best Regards
> > Austin
> >
> > On Fri, Mar 25, 2011 at 2:27 PM, Pat Johnston <Pat at melrosecenter.com
>  > <mailto:Pat at melrosecenter.com>> wrote:
> >
> >     Austin,
> >
> >     I think you're off to a good start. I might suggest that you do
> >     something about all of those skill-set tables. You could replace
> >     them with three tables and still have the information in easy reach:
> >
> >     - A skillGroup table with
> >     -- sgID - an integer primary key
> >     -- name - name of the group, like 'Web Skills', 'Database'
> >
> >     - A skills table with
> >     -- sID - an integer primary key
> >     -- sgID - ID of the skillGroup that the skill belongs to
> >     -- skill - the name of the skill, like 'PHP' or 'PostgreSql'
> >
> >     - A skillset table with
> >     -- UID - your user ID
> >     -- sID - ID of the skill being rated
> >     -- skillRating - 0 or 1 or whatever
> >
> >     The skillset entry links to the skill via the sID and to the user
> >     table (Table 1) with the UID. The skill table is then linked to
> >     the skillGroup via the sgID. This way you can add skill groups and
> >     skills as need without needing to create more tables.
> >
> >     Good Luck,
> >
> >     Pat
> >
> >
> >     On 3/24/2011 6:58 PM, Austin Einter wrote:
> >>     Hi Pierre, David, Ursula
> >>     Thanks for excellent piece of information. I just went through
> >>     basic database concepts like indexing, join, normalisation and
> >>     tried to analyse how can I apply these for my job registration
> >>     site implementation.
> >>     After understanding a bit on normalisation and join, I have
> >>     comeup with below approach for this specific case.
> >>     Instead of having a single table, and comma separated values in
> >>     table cells , I am going to split it multiple tables.
> >>     I am attaching a table.xls file , please have a look.
> >>     In that excel sheet, I have the main table, and I have broken the
> >>     main table into 9 different tables. But I hope it need to be
> >>     broken into more number of tables, depends on how many  different
> >>     kind of work domains are there. It may go to 50+ tables.
> >>     Examples for work domains are - Web, PSTN, VoIP,
> >>     NetworkManagement, GSM, Datbase, BoardDesign etc.
> >>     So how many work domains are there, those many tables will be
> >>     there. In those tables, a coulmn will represent a particular
> >>     skill set.
> >>     Say under Web Tables, coulmns can be HTML, PHP, Web2.0, Drupal,
> >>     ASP, etc
> >>     And under NetworkManagement table, NMS and SNMP can be coulmns.
> >>     In table cells, I will keep either 1 or 0, depending on the
> >>     person has that skill or not.
> >>      Example: Lets say User 2 knows NMS and SNMP, User 3 knows only
> >>     SNMP, User4 knows only SNMP, then the table will look as below.
> >>     *UID*    NMS     SNMP
> >>     2        1       1
> >>     3        0       1
> >>     4        0       1
> >>
> >>     As first time I am  doing this, I might be wrong. If so, kindly
> >>     let me know.
> >>     Best Regards
> >>     Austin.
> >>
> >>
> >>     On Fri, Mar 18, 2011 at 8:54 PM, Pierre Rineau
> >>     <pierre.rineau at makina-corpus.com
>  >>     <mailto:pierre.rineau at makina-corpus.com>> wrote:
> >>
> >>         Le vendredi 18 mars 2011 à 08:15 -0700, Metzler, David a écrit :
> >>         > Pierre is spot on here.
> >>         >
> >>         > That is why most dbas would advise against storing this
> >>         data in a comma separated list in a single field.  An index
> >>         cannot really be used to search within the text cause you are
> >>         forcing to examine every row anyway.  I can't programtically
> >>         say lets start with the N's, now is there a nokia in there
> >>         (that's an oversimplification intentionally to make a point).
> >>         Rather I would make a single skill table that housed the
> >>         values. If UID is the primary key for the resume, then you'd
> >>         make a table with
> >>         >
> >>         > On a separate note, you do understand that the site that
> >>         you're talking about building could be done without you
> >>         writing ANY code? Basically the site you've described can be
> >>         implemented with content_profile, cck and views modules,
> >>         allowing you to build custom content types that are tied (one
> >>         per user).  You could then use taxonomys for skill sets an
> >>         all this would be written for you?
> >>         >
> >>         > Dave
> >>
> >>         Dave is right about the fact this simple business stuff could
> >>         be done in
> >>         many ways using D6 existing modules (even only with core and
> >>         taxonomy)
> >>         or D7 fields.
> >>
> >>         But, if you really want to learn technical aspects of SQL
> >>         and/or Drupal
> >>         development, this is a good thing to start with this kind of
> >>         simple
> >>         business stuff.
> >>
> >>         Pierre.
> >>
> >>
> >>         --
> >>         [ Drupal support list | http://lists.drupal.org/ ]
> >>
> >>
> >
> >
> >     --
> >     [ Drupal support list | http://lists.drupal.org/ ]
> >
> >
>
>
>
> --
> [ Drupal support list | http://lists.drupal.org/ ]
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.drupal.org/pipermail/support/attachments/20110329/124a19c9/attachment.html 


More information about the support mailing list