No, the DISTINCT will in general be much worse: the optimizer recognizes top-n queries with row_number().

When I see GROUP BY at the outer level of a complicated query, especially when it's across half a dozen or more columns, it is frequently associated with poor performance.

So while DISTINCT and GROUP BY are identical in a lot of scenarios, here is one case where the GROUP BY approach definitely leads to better performance (at the cost of less clear declarative intent in the query itself). … WHERE OrderID = o.OrderID

IMHO, anyway. But hey, repetition is a good thing… I hope?

Oh, this takes me back to one of the rule-of-thumb (ROT) myths I remember hearing from crusty DBAs when I started working with Oracle DBMS late last century.

I ran exactly the same test in 10.2 just to confirm that nothing about the HASH GROUP BY changed this, and noticed that the DISTINCT query used HASH UNIQUE, which made me initially believe that both operations were still internally the same.

Hey David Aldridge, the test you ran is not the same; you have to create the index that Tom created.

This is correct. While in SQL Server v.Next you will be able to use STRING_AGG (see posts here and here), the rest of us have to carry on with FOR XML PATH (and before you tell me about how amazing recursive CTEs are for this, please read this post, too).

The optimizer is smart …

select unique vs. select distinct
Can you please settle an argument we are having re: 'select unique' vs. 'select distinct'?

The results are sorted in ascending order by city.

The DISTINCT variation took 4X as long, used 4X the CPU, and almost 6X the reads when compared to the GROUP BY variation.

A DISTINCT and GROUP BY usually generate the same query plan, so performance should be the same across both query constructs.
Last updated: May 30, 2013 - 2:50 pm UTC

Compare query plans, and use Profiler and SET to capture IO, CPU, Duration etc.

Re: DISTINCT operator performance issue (635471, Aug 1, 2008 4:40 AM, in response to g.myers): As a general rule, if you are not selecting any data from a table, it should be in the WHERE clause as a …

I’ve written about this before in my guide to joins in Oracle, and there are a few reasons for this:

I highly recommend taking the time to …

We can also compare the execution plans when we change the costs from CPU + I/O combined to I/O only, a feature exclusive to Plan Explorer.

He discusses how GROUP BY will, under certain circumstances, produce a faster query plan.

You can certainly spot it when casually scanning the output: for every order, we see the pipe-delimited list, but we see a row for each item in each order.
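The "one row for each item in each order" symptom described above can be reproduced in miniature. What follows is only a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module, it uses SQLite's group_concat in place of FOR XML PATH, and the order_lines table with its five rows is invented for illustration. It shows the duplicated output and the two ways of collapsing it; it says nothing about how the SQL Server or Oracle optimizers cost either form.

```python
import sqlite3

# Hypothetical data: order_lines and its rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (order_id INTEGER, description TEXT)")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?)",
    [(1, "Mug"), (1, "Pen"), (1, "Tape"), (2, "Box"), (2, "Pen")],
)

# Correlated subquery that builds the pipe-delimited list for one order.
# group_concat is SQLite's aggregate, used here in place of FOR XML PATH.
ITEMS = """(SELECT group_concat(ol.description, '|')
              FROM order_lines AS ol
             WHERE ol.order_id = o.order_id)"""

# No deduplication: the list is produced once per order line.
PER_LINE = f"SELECT o.order_id, {ITEMS} AS items FROM order_lines AS o"
# Fix 1: build every row, then discard the duplicates.
WITH_DISTINCT = f"SELECT DISTINCT o.order_id, {ITEMS} AS items FROM order_lines AS o"
# Fix 2: collapse to one row per order with GROUP BY.
WITH_GROUP_BY = f"{PER_LINE} GROUP BY o.order_id"


def normalize(rows):
    """Sort rows and the items inside each list; group_concat order is not guaranteed."""
    return sorted((order_id, tuple(sorted(items.split("|")))) for order_id, items in rows)


per_line = conn.execute(PER_LINE).fetchall()
deduped_distinct = conn.execute(WITH_DISTINCT).fetchall()
deduped_group_by = conn.execute(WITH_GROUP_BY).fetchall()

print(len(per_line), "rows without deduplication")
print(normalize(deduped_distinct))
print(normalize(deduped_group_by))
```

Both fixes return one row per order here. Any performance difference between them depends on the engine and has to be measured on the real system.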
Does SQL filter the duplicates on the fly?

The following statement uses the GROUP BY clause to return distinct cities together with state and zip code from the sales.customers table:

    SELECT city, state, zip_code
    FROM sales.customers
    GROUP BY city, state, zip_code
    ORDER BY city, state, zip_code

Sure, if that is clearer to you.

We might have a query like this, which attempts to return all of the Orders from the Sales.OrderLines table, along with item descriptions as a pipe-delimited list: This is a typical query for solving this kind of problem, with the following execution plan (the warning in all of the plans is just for the implicit conversion coming out of the XPath filter): However, it has a problem that you might notice in the number of output rows.

The 2 recipes that do have ING1 & ING2 are receipe1 & receipe3. Is it correct? Regards, ik

Is there any disadvantage of using "group … Thus performance could vary. Did you cost both out?

This could happen in the past, so back then we had the rule of thumb: always use GROUP BY.

;) good one, I should have thought of that - as "select unique" is the same as "select distinct", I don't know who you are or what you are talking about "reader".

How to Improve the Performance of Group By with Having
I have a table t containing three fields accountno, ... Oracle Database can use this automagically.

Well, in this simple case, it's a coin flip.

Note that DISTINCT is a synonym of UNIQUE, which is not part of the SQL standard. It is good practice to always use DISTINCT instead of UNIQUE. Oracle SELECT DISTINCT …

Sometimes I use DISTINCT in a subquery to force it to be "materialized", when I know that this would greatly reduce the number of results but the compiler does not "believe" this and groups too late.

Its definition is: GROUP BY gives the same result as DISTINCT when no aggregate function is present. However, in more complex cases, DISTINCT can end up doing more work.
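The claim that GROUP BY and DISTINCT give the same result when no aggregate is present can be checked on a toy copy of the sales.customers statement quoted above. This is a minimal sketch with assumptions: it runs on SQLite via Python's sqlite3 module (an in-memory database attached under the name sales so the table reference resolves), and the four sample rows are made up. It compares results only, not query plans or timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second in-memory database named "sales" so that the
# schema-qualified name sales.customers is valid in SQLite.
conn.execute("ATTACH DATABASE ':memory:' AS sales")
conn.execute("CREATE TABLE sales.customers (city TEXT, state TEXT, zip_code TEXT)")

# Hypothetical sample rows; the first one appears twice on purpose.
conn.executemany(
    "INSERT INTO sales.customers VALUES (?, ?, ?)",
    [
        ("Albany", "NY", "12203"),
        ("Buffalo", "NY", "14201"),
        ("Albany", "NY", "12203"),
        ("Albany", "NY", "12208"),
    ],
)

GROUP_BY_SQL = """
    SELECT city, state, zip_code
    FROM sales.customers
    GROUP BY city, state, zip_code
    ORDER BY city, state, zip_code
"""

DISTINCT_SQL = """
    SELECT DISTINCT city, state, zip_code
    FROM sales.customers
    ORDER BY city, state, zip_code
"""

grouped = conn.execute(GROUP_BY_SQL).fetchall()
distinct = conn.execute(DISTINCT_SQL).fetchall()

print(grouped)
print(grouped == distinct)
```

With no aggregate in the select list, the two statements describe the same set of rows, which is why many optimizers can produce the same plan for both in simple cases like this one.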
I'm getting poor performance from DISTINCT.

COUNTDISTINCT can only be used for single-assign attributes, and not for multi-assigned attributes.

In Oracle Database 12.1.0.2, we added a new transformation called Group-by and Aggregation Elimination and it slipped through any mention in our collateral.

We just have to remember to take the time to do it as part of SQL query optimization…

For Oracle, we will have to say more or less the same: the TOP 1 from MS SQL Server cannot be implemented simply like this:

    -- oracle => incorrect code
    select t.*
    from tbl_Employee_WorkRecords t
    where t.pk = ( select i.pk
                   from tbl_Employee_WorkRecords i
                   where i.employee_pk = t.employee_pk
                   and rownum = …

Yes, true, because analytics are done after the where clause/aggregation takes place... if you have an index on col_name, we can index fast full scan that instead of the table - but DISTINCT is going to be what you use.

A) COUNT(*) vs. COUNT(DISTINCT expr) vs…

These two queries produce the same result: And in fact derive their results using the exact same execution plan: Same operators, same number of reads, negligible differences in CPU and total duration (they take turns "winning").

Essentially, DISTINCT collects all of the rows, including any expressions that need to be evaluated, and then tosses out duplicates. Let's talk about string aggregation, for example.

If I remember correctly, there was a second 'trick' for it, using a UNION with a SELECT NULL, NULL, NULL … I'll bookmark this article and come back when I find a current statement that benefits from this behavior.