I have read a book whose title is "Oracle PL SQL Programming" (2nd ed.) by Steven Feuerstein & Bill Pribyl. On page 99, there is a point suggested that
Do not "SELECT COUNT(*)" from a table unless you really need to know the total number of "hits." If you only need to know whether there is more than one match, simply fetch twice with an explicit cursor.
Could you anyone explain this point more to me by providing example? Thank you.
-
He means open a cursor and fetch not only the first record but the second, and then you will know there is more than one.
Since I never seem to need to know that
SELECT COUNT(*)is>= 2, I have no idea why this is a useful idiom in any SQL variant. Either no records or at least one, sure, but not two or more. And anyway, there's alwaysEXISTS.That, and the fact that Oracle's optimizer seems to be pretty poor... - I would question the relevance of the technique.
To address TheSoftwareJedi's comments:
WITH CustomersWith2OrMoreOrders AS ( SELECT CustomerID FROM Orders GROUP BY CustomerID HAVING COUNT(*) >= 2 ) SELECT Customer.* FROM Customer INNER JOIN CustomersWith2OrMoreOrders ON Customer.CustomerID = CustomersWith2OrMoreOrders.CustomerIDAppropriately indexed, I've never had performance problems even with whole universe queries like this in SQL Server. However, I have consistently run into comments about Oracle optimizer problems here and on other sites.
My own experience with Oracle has not been good.
The comment from the OP appears to be saying that full
COUNT(*)from tables are not well handled by the optimizer. i.e.:IF EXISTS (SELECT COUNT(*) FROM table_name HAVING COUNT(*) >= 2) BEGIN END(which, when a primary key exists, can be reduced to a simple index scan - in a case of extreme optimization, one can simply query the index metadata in sysindexes.rowcnt - to find the number of entries - all without a cursor) is to be generally avoided in favor of:
DECLARE CURSOR c IS SELECT something FROM table_name; BEGIN OPEN c FETCH c INTO etc. x 2 and count rows and handle exceptions END; IF rc >= 2 THEN BEGIN ENDThat, to me would result in less readable, less portable, and less maintainable code.
TheSoftwareJedi : "...the fact that Oracle's optimizer seems to be pretty poor..." - needless claim. at least back it up with facts or explain plans!TheSoftwareJedi : How would you write a query for "confirm this customer has 2 or more orders" in your database of choice, and how would it be optimized?Cade Roux : Actually, I had a high opinion of Oracle until I used it and had to troubleshoot problems - it was the opposite with SQL Server. -
This comes from programmers writing code similar to the following (this is psuedo code!).
You want to check to see if the customer has more than one order:
if ((select count(*) from orders where customerid = :customerid) > 1) { .... }That is a terribly inefficient way to do things. As Mark Brady would say, if you want to know if a jar contains pennies, would you count all the pennies in the jar, or just make sure there is 1 (or 2 in your example)?
This could be better written as:
if ((select 1 from (select 1 from orders where customerid = :customerid) where rownum = 2) == 1) { .... }This prevents the "counting all of the coins" dilemma since Oracle will fetch 2 rows, then finish. The previous example would cause oracle to scan (an index or table) for ALL rows, then finish.
DanSingerman : That's pretty smart!Dave Costa : I think you mean "rownum <= 2", right? -
Depending on the DB, there may be a sys table which stores an approximate count and can be queried in constant time. Useful if you want to know whether the table has 20 rows or 20,000 or 20,000,000.
-
If two is all you are interested in, try
SELECT 'THERE ARE AT LEAST TWO ROWS IN THE TABLE' FROM DUAL WHERE 2 = ( SELECT COUNT(*) FROM TABLE WHERE ROWNUM < 3 )It will take less code than doing the manual cursor method, and it is likely to be faster.
The rownum trick means to stop fetching rows once it has two of them.
If you don't put some sort of limit on the count(*), it could take a long while to finish, depending on the number of rows you have. In that case, using a cursor loop, to read 2 rows from the table manually, would be faster.
-
SQL Server:
if 2 = ( select count(*) from ( select top 2 * from ( select T = 1 union select T = 2 union select T = 3 ) t) t) print 'At least two'Also, don't ever use cursors. If you think you really really need them, beat yourself with a shovel until you change your mind. Let relics from an ancient past remain relics from an ancient past.
TheSoftwareJedi : Cursors have a place. Just not in this question.tuinstoel : @Justice, Oracle PL/SQL cursors aren't slow. You can't directly translate SQL Server knowledge to Oracle knowledge. Have you ever done some benchmarking with Oracle cursors?Justice : I haven't. That's true, it was an assumption. -
There are a number of reasons why developers might perform select COUNT(*) from a table in a PL/SQL program:
1) They genuinely need to know how many rows there are in the table.
In this case there is no choice: select COUNT(*) and wait for the result. This will be pretty fast on many tables, but could take a while on a big table.
2) They just need to know whether a row exists or not.
This doesn't warrant counting all the rows in the table. A number of techniques are possible:
a) Explicit cursor method:
DECLARE CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...; v VARCHAR2(1); BEGIN OPEN c; FETCH c INTO v; IF c%FOUND THEN -- A row exists ... ELSE -- No row exists ... END IF; END;b) SELECT INTO method
DECLARE v VARCHAR2(1); BEGIN SELECT '1' INTO v FROM mytable WHERE ... AND ROWNUM=1; -- Stop fetching if 1 found -- At least one row exists EXCEPTION WHEN NO_DATA_FOUND THEN -- No row exists END;c) SELECT COUNT(*) with ROWNUM method
DECLARE cnt INTEGER; BEGIN SELECT COUNT(*) INTO cnt FROM mytable WHERE ... AND ROWNUM=1; -- Stop counting if 1 found IF cnt = 0 THEN -- No row found ELSE -- Row found END IF; END;3) They need to know whether more than 1 row exists.
Variations on the techniques for (2) work:
a) Explicit cursor method:
DECLARE CURSOR c IS SELECT '1' dummy FROM mytable WHERE ...; v VARCHAR2(1); BEGIN OPEN c; FETCH c INTO v; FETCH c INTO v; IF c%FOUND THEN -- 2 or more rows exists ... ELSE -- 1 or 0 rows exist ... END IF; END;b) SELECT INTO method
DECLARE v VARCHAR2(1); BEGIN SELECT '1' INTO v FROM mytable WHERE ... ; -- Exactly 1 row exists EXCEPTION WHEN NO_DATA_FOUND THEN -- No row exists WHEN TOO_MANY_ROWS THEN -- More than 1 row exists END;c) SELECT COUNT(*) with ROWNUM method
DECLARE cnt INTEGER; BEGIN SELECT COUNT(*) INTO cnt FROM mytable WHERE ... AND ROWNUM=2; -- Stop counting if 2 found IF cnt = 0 THEN -- No row found IF cnt = 1 THEN -- 1 row found ELSE -- More than 1 row found END IF; END;Which method you use is largely a matter of preference (and some religious zealotry!) Steven Feuerstein has always favoured explicit cursors over implicit (SELECT INTO and cursor FOR loops); Tom Kyte favours implicit cursors (and I agree with him).
The important point is that to select COUNT(*) without restricting the ROWCOUNT is expensive and should therefore only be done when a count is trully needed.
As for your supplementary question about how to re-write this with an explicit cursor:
CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE ) IS v_rows INTEGER; BEGIN ... SELECT COUNT(*) INTO v_rows FROM emp WHERE emp_id = emp_id_in; IF v_rows > 0 THEN /* do sth */ END; /* more statements */ ... END do_sth;That would be:
CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE ) IS CURSOR c IS SELECT 1 FROM emp WHERE emp_id = emp_id_in; v_dummy INTEGER; BEGIN ... OPEN c; FETCH c INTO v_dummy; IF c%FOUND > 0 THEN /* do sth */ END; CLOSE c; /* more statements */ ... END do_sth;But really, in your example it is no better or worse, since you are selecting the primary key and Oracle is clever enough to know that it only needs to fetch once.
-
As Steven Feuerstein & Bill Pribyl recommends us not to use SELECT COUNT() to check whether records in a table exist or not, could anyone help me edit the code below in order to avoid using SELECT COUNT(*) by using explicit cursor instead? This code is written in the Oracle stored procedure.
I have a table emp(emp_id, emp_name, ...), so to check the provided employee ID corret or not:
CREATE OR REPLACE PROCEDURE do_sth ( emp_id_in IN emp.emp_id%TYPE ) IS v_rows INTEGER; BEGIN ... SELECT COUNT(*) INTO v_rows FROM emp WHERE emp_id = emp_id_in; IF v_rows > 0 THEN /* do sth */ END; /* more statements */ ... END do_sth;RussellH : The above code is potentially dangerous. What happens if another session deletes the row(s) that cause a non-zero count after the count but before the /* do sth */? What specific statement will run if v_rows > 0? -
If you want to get number of rows in a table, please don't used count(*), I would suggest count(0) that 0 is the column index of your primary key column.
tuinstoel : count(*) is fine, count(0) or count(1) isn't faster. -
Before you take Steven Feuerstein's suggestions too serious, just do a little benchmark. Is count(*) noticeably slower than the explicit cursor in your case? No? Then better use the construct that allows for simple, readable code. Which, in most cases, would be "select count(*) into v_cnt ... if v_cnt>0 then ..."
PL/SQL allows for very readable programs. Don't waste that just to nano-optimize.
-
"Before you take Steven Feuerstein's suggestions too serious, just do a little benchmark... PL/SQL allows for very readable programs. Don't waste that just to nano-optimize."
I would have to disagree with you. The time spent proving why I should not use fetch over count(*), could have easily been spent copy/paste from a standard model and replacing the SQL and variables as needed. DONE.
If you can answer the question with a NO, "Do I need to know the full true count?", then skip all the extra effort and go right with building a boolean function that returns TRUE or FALSE when you meet the conditions. Now your core logic is easier to read...
e.g. CASE WHEN MULTIPLE_MATCHES_FOUND_f THEN ELSE END;
Once you have your basic model built, you can easily use that same strategy whenever needed. One needs to be careful about SQL that appears to run fast. Especially when executed multiple times. it will bite you.
0 comments:
Post a Comment