Skip navigation.
Home

When Newbies Attack!

DB Hoopla

I think inexperienced web programmers all make common DB WTFs when starting out. Jim Grill sent in a prime example from a project that he inherited. I'm sure we've all seen similar code before and we've all said, "wtf?!", if not "ytf?!"

<?php
$query
= 'SELECT * FROM sometable';
$result = mysql_query($query,$connection);
$count = mysql_num_rows($result);
?>

It should be obvious what's wrong in the example. To count the number of rows all data is needlessly requested and the rows counted in PHP. As the table grows these three lines will get slower and slower. Most people take data transfer from the DB server for granted. However we always should be as efficient as possible, since little things can quickly multiply into big problems.

The fix is relatively simple: <?php
$query
= 'SELECT COUNT(*) FROM sometable';
$result = mysql_query($query,$connection);
list(
$count) = mysql_fetch_array($result);
?>

Using count(*) will return just the number of rows. Much more efficient.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

A stupid n00b question:
is a SELECT COUNT(Primary_Key-field) faster then SELECT COUNT(The whole table)?

I usually retrieve COUNT(*) this way:



Just two lines of code.

http://www.php.net/manual/en/function.mysql-result.php

Please note: The code got removed in my previous post because it was enclosed in the PHP opening and closing tags, WTF...

-----------

I usually retrieve COUNT(*) this way:

$sql = 'SELECT COUNT(*) FROM my_table';
$count = mysql_result(mysql_query($sql), 0);

Just two lines of code.

http://www.php.net/manual/en/function.mysql-result.php

xeniac: COUNT(*) is considerably faster than other counts in MySQL -- MySQL catches this particular construct and optimizes it. If you count a particular column, it counts all non-NULL values, so it has to pay more attention. Also, I believe MySQL keeps track of intermediate values when you count on a particular column, where it only keeps a single column with COUNT(*) (though that last part may have changed since I last looked into this).

While premature optimization is usually a bad idea, this is the kind of optimization that's important because it will only bite you when your datasets get large. You can usually expect your datasets to get large eventually, but few people test large amounts of data while developing.

hmm...but the original code would actually be the faster solution in cases where you know you want all the rows, but you also want to know how many rows there are. lacking any context, this could be perfectly valid.

yes , i perfectly agree with scott. if you want both the count AND the actual data , the first method is better than running two queries.... so everything depends upon what you're using it for and what the situation demands....there are no hard and fast rules.

COUNT(pk) is better then COUNT(*). Why? When counting all fields some field types can cause it not to count the current row (probally a bug in the version of MySQL im using), however if your developing apps always count the pk (since it is unique & a integer).

I have serveral database tables where if i did a COUNT(*) and COUNT(pk) i would get 2 different results.

Another [probably more effecient] way to get the count is to use code similar to this:

$result = mysql_query('select SQL_CALC_FOUND_ROWS * from sometable');
$result_count = mysql_query("select FOUND_ROWS() as FoundRows");
$row_count = mysql_fetch_assoc($result_count);
$numRecords = $row_count['FoundRows'];
// continue on with the rest of the code to fetch the $result data

Of course, my comment above is only valid for when you want the data as well as the count. Forgot to mention that.

Important thing to consider....

Although the above code is not technically the most efficient, it does lend itself to flexibility....

eg, say, you decide you want the data out too, as mentioned above....

Also, what if the condition changes, and it's not _all_ of the rows, but count the rows where sometable.x = 5 that you want?

Also, for most coders, code which you can copy for other applications can speed up development significantly, yes, eventually this method will lead toward slower processing in larger datasets

This however, would be a _lot_ faster than repeating the query to retrieve the data, and considering that PHP functions operate a lot faster than MySQL queries, it really would outperform the second code block.

I think this is less of a wtf and more of a piece of code taken out of context...

Cheers
Paul

I dunno, but if you're talking about efficiency, do a list() on mysql_fetch_row instead of mysql_fetch_array, which is overkill as well.

yes, there's no need for mysql_fetch_array, but why even do a list? all that is being returned is one row...

$count = mysql_result(mysql_query($sql), 0);

mysql FOUND_ROWS() isnt very reliable, espically when mixed with groupby's and limit's

however itsa more reliable with SQL_CALC_FOUND_ROWS :)
http://dev.mysql.com/doc/mysql/en/Information_functions.html
mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
-> WHERE id > 100 LIMIT 10;
mysql> SELECT FOUND_ROWS();

Post new comment




*

  • You may post code using <code>...</code> (generic) or <?php ... ?> (highlighted PHP) tags.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd> <pre> <p> <br>
  • Web and e-mail addresses are automatically converted into links.