Search Smith

ColdFusion, SQL queries, and, of course, searching

Posts Tagged ‘loop’

ColdFusion: Queries inside query loops

Posted by David Faber on January 3, 2012

One of the first places I typically look to improve the performance of ColdFusion templates is the query loop. Sometimes developers, in their haste to get things done, don’t take the time to think about how they might optimize the performance of their templates. In some instances this can be justified. Maybe there aren’t enough rows returned by the database query to worry about the time it takes to process each record. But even in these cases there could be issues with scalability in the future.

How many times have we seen something like this:

<cfquery name="get_employees" datasource="#the_dsn#">
    SELECT emp_id, fname, lname, dept_id
      FROM employees
</cfquery>
<cfloop query="get_employees">
    <cfquery name="get_dept_info" datasource="#the_dsn#">
        SELECT dname
          FROM dept
         WHERE dept_id = <cfqueryparam cfsqltype="CF_SQL_INTEGER" value="#dept_id#" />
    </cfquery>
    ...
</cfloop>

?

Now admittedly this is an elementary example, which could be solved by using a simple SQL join in the first query:

SELECT e.emp_id, e.fname, e.lname, e.dept_id, d.dname
  FROM emp e, dept d
 WHERE e.dept_id = d.dept_id(+)

(Yes, I did use Oracle’s ancient syntax for an outer join above. I can’t get used to the ANSI standard syntax.)

What if we’ve already run the department query for some reason, however? Maybe we’d rather keep our employee query trim. In that case, we can do the following:

<cfquery name="get_depts" datasource="#the_dsn#">
    SELECT dept_id, dname
      FROM dept
</cfquery>
<cfset dept_names = {} /> <!--- Use structNew() for CF8 or less --->
<cfloop query="get_depts">
    <cfset dept_names[dept_id] = dname />
</cfloop>

Now when I loop over my employee query I can do the following:

<cfloop query="get_employees">
    <cfset dname = "" />
    <cfif structKeyExists(dept_names, dept_id)>
        <cfset dname = dept_names[dept_id] />
    </cfif>
    ...
</cfloop>

One area in which this can be very helpful is when one is using aggregate functions. Suppose we want to get some aggregate data (number of employees, total of salaries, etc.) for each of our departments. However, we also want to get information about each department. We could use a single query with multiple columns in the GROUP BY clause:

SELECT d.dept_id, d.dname, d.budget, d.location, d.supervisor_name
     , COUNT(e.emp_id) AS employee_cnt, SUM(e.salary) AS total_salary
  FROM dept d, emp e
 WHERE d.dept_id = e.dept_id(+)
 GROUP BY d.dept_id, d.dname, d.budget, d.location, d.supervisor_name

Now the performance of a query with a GROUP BY clause can be improved by adding index(es) on the column(s) mentioned in the GROUP BY clause (at least in Oracle). I would rather not add indexes where I don’t absolutely have to, however – I’d rather not have to think about the possible impact they may have. (Depending on the company for which you work, you might not be permitted to create indexes on your own.) What I can do in this case is break the query into two parts, one:

<cfquery name="get_dept_info" datasource="#the_dsn#">
    SELECT dept_id, dname, budget, location, supervisor_name FROM dept
</cfquery>

and two:

<cfquery name="get_dept_data" datasource="#the_dsn#">
    SELECT dept_id, COUNT(emp_id) AS employee_cnt, SUM(salary) AS total_salary
      FROM emp
     GROUP BY dept_id
</cfquery>
<cfset dept_counts = {} />
<cfset dept_salary = {} />
<cfloop query="get_dept_data">
    <cfset dept_counts[dept_id] = employee_cnt />
    <cfset dept_salary[dept_id] = total_salary />
</cfloop>

Then when I loop over my “get_dept_data” query I have the data in those structures available, without needing to query the database further.

Posted in ColdFusion, Oracle, SQL | Tagged: , , , , | 2 Comments »