Using functions on columns instead on literals
I always hear from developers that DBAs are against the use of ORMs (like Hibernate) without any reason. Before being a DBA, I was a developer and I recognize that I like to use ORMs because it makes it easier but you should always have in mind what’s beyond the ORM’s classes and what queries are executed by the ORM.
In other words: to know how the ORM works. Of course it can be some differences when running the application on SQL Server, PostgreSQL or Oracle but usually not a lot.
The following example is on a slow query I found through pg_stat_statements, in PostgreSQL, caused by the use of the upper function because PostgreSQL is case sensitive. In SQL Server this specific problem could be avoided if the database is created using a case insensitive collation. But other examples could be found that use functions on columns.
The query is against a table called BinaryFile where are stored the path (column named ruta) in the file system where files are placed. The table has 181.438 rows. There’s a normal index on ruta:
CREATE UNIQUE INDEX ix_binaryfile_ruta
ON binaryfile
USING btree (ruta);
The query generated by Hibernate:
SELECT COUNT(*) AS y0_ FROM public.BinaryFile this_
WHERE UPPER(this_.ruta) = 'C:\MYPATH\MYFILE.TXT' AND this_.id<>432
The index is not used because the UPPER function, so we’ve two solutions:
- Create an index over the UPPER function.
- Use the UPPER() in the literal part and have the ruta’s column values normalized (all in upper case).
I like the 2nd option because it’s simpler, but the improvement would be the same. In Windows the path are case insensitive, so we can do:
UPDATE BinaryFile SET ruta = UPPER(ruta)
The query would be:
SELECT COUNT(*) AS y0_ FROM public.BinaryFile this_
WHERE this_.ruta = UPPER('C:\MyPath\MyFile.TxT') AND this_.id<>432
Now, the query plan:
Now the query uses the index and the performance improvement is impressive: 7000 times faster.
I know, this is an easy example, but I think it’s the best way to explain it.