COUNT(*) vs COUNT(pk_col)

By Adrian Smith19 Nov 2007400 words2 mins to read

A while back I was doing some performance tuning on MySQL 5 for a customer. A SELECT was counting the number of rows in the table. I always use COUNT(*) for that but I know a lot of people, including the customer, use COUNT(pk_col). The query was taking a long time (a few minutes). I analyzed that the problem came from the usage of COUNT(pk_col) instead of COUNT(*). With COUNT(*) it was instantaneous.

I didn't know that there was a difference between the two. There is no difference in their semantics, therefore it didn't occur to me that there might be a difference in the way they were executed.

Just to recap the SQL syntax:

It follows that for columns which don't have any "null" values in them, COUNT(col) must be the same as COUNT(*). For any column marked "not null", such as a primary key, this is always the case. And thus it follows that COUNT(pk_col) must always deliver an identical result to COUNT(*).

However, the database in question was executing a COUNT(pk_col) query and a COUNT(*) query differently.

Alas I didn't write down the output from the EXPLAIN statement and I've just tried it on some other databases (using MySQL 4.1 and 5.0) and was unable to reproduce this behaviour. So this is a bit of a useless blog entry! Nevertheless – you must believe me – it did happen!

So the conclusion is, one should always use COUNT(*) and never COUNT(pk_col).

This article was written by Adrian Smith on 19 Nov 2007

Follow me: Facebook | Twitter | Email

More on: Databases