10.17.07

The Occasional Tech Comment: MySQL GROUP BY Oddities

Posted in technology at 10:00 am by kevin

Debunking GROUP BY Myths

Summary (with some editorializing by me):

Actually, SQL-99 and beyond only require functionally dependent columns (i.e. keys) to be in the GROUP BY, so we’re not as wrong as you think we are. We just settled on the other extreme - letting you do whatever you want and maybe getting completely arbitrary values once in a while; what’s the big deal? Hey, MySQL isn’t your mommy, so it’s on you to make sure your queries make sense. Besides, if you’re really such a baby, you can change the sql_mode server setting to include ONLY_FULL_GROUP_BY, which will emulate the behavior you expect (well, sort of - the bug tracker suggests some issues may be still outstanding).

Okay, maybe that was a bit harsh.

3 Comments »

  1. Adam Batkin said,

    October 17, 2007 at 11:54 am

    If anything in your SELECT clause isn’t in the GROUP BY clause (other than things that would already be grouped by primary key, something which plenty of RDBMS’s get wrong) then you are *guaranteed* to get incorrect results.

    The database should stop you because, there is no way it can possibly answer that query correctly! The database would have to ARBITRARILY pick some random values and spit them out.

    The only other possibility is that your data isn’t properly normalized and at the application level you “know” that certain values will be already grouped, but still, the database doesn’t know that (and in that case, adding those values to the GROUP BY won’t change anything)

  2. kevin said,

    October 17, 2007 at 11:59 am

    Right… MySQL’s current behavior - where it will, in fact, pick an arbitrary value and not throw an error - struck me as slightly insane. (Though apparently my mocking tone in the summary didn’t communicate my feelings as clearly as I would have liked!)

  3. Adam Batkin said,

    October 20, 2007 at 7:18 pm

    Heh, sorry. I just never liked mysql much since it always seemed like the whole thing was a hack. And there’s PostgreSQL available for the same price (free), the same ease-of-use and administerability (easy) but it’s an actual database with real transactions and much better standards compliance.

    Then I go back to work and install some piece of software and it inevitably requires mysql with no possibility of using any other database.

Leave a Comment