Why does SQL force me to repeat all non-aggregated fields from my SELECT clause in my GROUP BY clause?

I tend to agree with you - this is one of many cases where SQL should have slightly smarter defaults to save us all some typing. For example, imagine if this were legal.

I tend to agree with you - this is one of many cases where SQL should have slightly smarter defaults to save us all some typing. For example, imagine if this were legal: Select ClientName, InvoiceAmount, Sum(PaymentAmount) Group By * where "*" meant "all the non-aggregate fields". If everybody knew that's how it worked, then there would be no confusion.

You could sub in a specific list of fields if you wanted to do something tricky, but the splat means "all of 'em" (which in this context means, all the possible ones). Granted, "*" means something different here than in the SELECT clause, so maybe a different character would work better: Select ClientName, InvoiceAmount, Sum(PaymentAmount) Group By! There are a few other areas like that where SQL just isn't as eloquent as it could be.

But at this point, it's probably too entrenched to make many big changes like that.

Yes. Thank you. That sort of thing sounds sensible.

I guess the answer is "there isn't really an answer". ;) – AJ. Jan 6 '09 at 14:50 Difficult though to select a meaningful character or keyword.

The * or "ALL" keyword isn't semantically correct, as you're not grouping by ALL (you're grouping by all EXCEPT . . .).

It's not a bad idea, can't see how it would work without being a semantic kludge though. – Binary Worrier Jan 12 '09 at 16:06 Why not just GROUP instead of GROUP BY? – Martijn Jun 15 '10 at 8:54 Sure, that would work too, but I suspect that might be less than intuitive for current SQL programmers.

For that matter, the syntax could also be to leave it off entirely, and the parser would know that you're supposed to automatically group by any non-aggregate fields. That's probably how it should have been done in the first place, IMO, but maybe that's getting into "trying to be so helpful that you end up being confusing" territory. – Ian Varley Jun 23 '10 at 16:02.

I personally prefer (and think it's safer) to have this code Select ClientName, InvAmt, Sum(PayAmt) as PayTot Group By ClientName throw an error, prompting me to change the code to Select ClientName, Sum(InvAmt) as InvTot, Sum(PayAmt) as PayTot Group By ClientName.

I think it may just be a clarity issue. – Mike Woodhouse Jan 6 '09 at 14:11 1 Implied operations are the bane of debugging and testing. – Robert C.

Barth Jan 6 '09 at 20:30.

I hope/expect we'll see something more comprehensive soon; a SQL history lesson on the subject would be useful and informative. Anyone? Anyone?

Bueller? In the meantime, I can observe the following: SQL predates the DRY principle, at least as far as it it was documented in The Pragmatic Programmer. Not all DBs require the full list: Sybase, for example, will happily execute queries like SELECT a, b, COUNT(*) FROM some_table GROUP BY a ... which (at least every time I accidentally ran such a monster) often leads to such enormous inadvertent recordsets that panic-stricken requests quickly ensue, begging the DBAs to bounce the server.

The result is a sort of partial Cartesian product, but I think it may mostly be a failure on Sybase's part to implement the SQL standard properly.

Perhaps we need a shorthand form - call it GroupSelect GroupSelect Field1, Field2, sum(Field3) From SomeTable Where (X = "3") This way, the parser need only throw an error if you leave out an aggregate function.

Hmmm. Nice idea. – AJ.

Jan 7 '09 at 9:27.

I don't have an answer though.It certainly does seem like a awkward moment for the language.

MySQL allows this... – gms8994 Jan 6 '09 at 14:11.

I share the op's view that repeating is a bit annoying, especially if the non-aggregate fields contain elaborate statements like ifs and funtions and a whole lot of other things. It would be nice if there could be some shorthand in the group by clause - at least a column alias. Referring to the columns by number may be another option, albeit one that probably has their own problems.

Nice point about long expressions. It does add to the pain. – AJ.

Jan 7 '09 at 9:26.

The short answer is SQL does have a shortcut for grouping by all the select columns. It is SELECT DISTINCT ... FROM ...

The good reason for it is that you would get incorrect results more often than not if you did not specify all columns. Suppose you have three columns, col1, col2 and col3. Suppose your data looks like this: Col1 Col2 Col3 a be 1 a c 1 be be 2 a be 3 select col1, col2, sum(col3) from mytable group by col1, col2 would give the following results: Col1 Col2 Col3 a be 4 a c 1 be be 2 How would it interpret select col1, col2, sum(col3) from mytable group by col1 My guess would be Col1 Col2 Col3 a be 5 a c 5 be be 2 These are clearly bad results.

Of course the more complex the query and the more joins the less likely it would be that the query would return correct results or that the programmer would even know if they were incorrect. Personally I'm glad that group by requires the fields.

I cant really gove you an answer,but what I can give you is a way to a solution, that is you have to find the anglde that you relate to or peaks your interest. A good paper is one that people get drawn into because it reaches them ln some way.As for me WW11 to me, I think of the holocaust and the effect it had on the survivors, their families and those who stood by and did nothing until it was too late.

Related Questions