-
Bug
-
Resolution: Unresolved
-
None
-
None
-
Critical
-
Loading users for tasks may lead to excessive full scans and per-row subquery execution, slowing down the DB. This is optimised.
-
Emptyshow more show less
Problem
The current SQL query that retrieves users with their assigned tasks runs extremely slowly (up to 500 seconds in production).
explain SELECT Zf_users.id, Zf_users.userGuid, Zf_users.firstName, Zf_users.surName, Zf_users.gender, Zf_users.login, Zf_users.email, Zf_users.roles, Zf_users.editable, Zf_users.locale, Zf_users.customers,filter.taskGuid from Zf_users, (SELECT LEK_task.* FROM LEK_task
-> LEFT JOIN LEK_taskUserAssoc AS LEK_taskUserAssoc_1 ON LEK_taskUserAssoc_1.taskGuid = LEK_task.taskGuid WHERE (LEK_taskUserAssoc_1.userGuid = '{XXXXXX-837d-48aa-bbe3-XXXXXX}' OR LEK_task.pmGuid = '{XXXXXX-837d-48aa-bbe3-XXXXXX}') ORDER BY LEK_task.orderdate desc) as filter INNER JOIN LEK_taskUserAssoc ON LEK_taskUserAssoc.taskGuid=filter.taskGuid WHERE Zf_users.userGuid = LEK_taskUserAssoc.userGuid AND filter.taskGuid NOT IN(SELECT IF((SELECT IF(t.value IS NOT NULL,t.value, if(c.value IS NOT NULL,c.value,z.value = 1)) FROM Zf_configuration z
-> LEFT JOIN LEK_customer_config c on z.name = c.name
-> LEFT JOIN LEK_task_config t on t.name = z.name
-> WHERE ((t.taskGuid = LEK_task.taskGuid) OR (c.customerId = LEK_task.customerId AND t.taskGuid IS NULL))
-> AND z.name = "runtimeOptions.customers.anonymizeUsers") = 1,LEK_task.taskGuid,NULL) AS s
-> FROM LEK_task
-> GROUP BY LEK_task.taskGuid
-> HAVING s IS NOT NULL) GROUP BY Zf_users.id ORDER BY Zf_users.surName;
+------+--------------------+---------------------+--------+-----------------------------+--------------------+---------+-----------------------------------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+---------------------+--------+-----------------------------+--------------------+---------+-----------------------------------+-------+----------------------------------------------+
| 1 | PRIMARY | LEK_taskUserAssoc | index | task_user_job_type,userGuid | task_user_job_type | 825 | NULL | 18791 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | Zf_users | eq_ref | userGuid_2,userGuid | userGuid_2 | 154 | DBXXXX.LEK_taskUserAssoc.userGuid | 1 | |
| 1 | PRIMARY | LEK_task | eq_ref | taskGuid | taskGuid | 154 | DBXXXX.LEK_taskUserAssoc.taskGuid | 1 | Using where |
| 1 | PRIMARY | LEK_taskUserAssoc_1 | ref | task_user_job_type | task_user_job_type | 154 | DBXXXX.LEK_taskUserAssoc.taskGuid | 2 | Using where; Using index |
| 3 | MATERIALIZED | LEK_task | ALL | NULL | NULL | NULL | NULL | 21370 | |
| 4 | DEPENDENT SUBQUERY | z | const | name | name | 1022 | const | 1 | |
| 4 | DEPENDENT SUBQUERY | c | ref | configNameIdx | configNameIdx | 1023 | const | 1 | |
| 4 | DEPENDENT SUBQUERY | t | ref | configNameIdx | configNameIdx | 1023 | const | 20276 | Using where |
+------+--------------------+---------------------+--------+-----------------------------+--------------------+---------+-----------------------------------+-------+----------------------------------------------+
8 rows in set (0.001 sec)
The execution plan shows:
- Full table scans and materialization of LEK_task (~20k rows).
- NOT IN (SELECT … GROUP BY … HAVING …) subquery causing dependent subquery execution per row.
- ORDER BY inside a derived table forcing sorting of the full dataset.
- Optimizer unable to efficiently use existing indexes.
Impact:
- Long response times for task-related views.
- High database load during query execution.
Notes:
- The query likely needs a structural rewrite to avoid materialization and complex subqueries.
- NOT EXISTS and direct joins could be a better fit than NOT IN with GROUP BY.
- Missing index on LEK_task(orderdate) may further worsen sorting performance.
solution
Next Steps:
- Review the query design and consider refactoring. Extract the "fixed" parts - so everything related to the given user.
- Evaluate additional indexes (e.g. LEK_task(orderdate)).
- Optimize or rewrite the anonymization subquery.
Possible improvement
EXPLAIN SELECT u.id, u.userGuid, u.firstName, u.surName, u.gender, u.login, u.email, u.roles, u.editable, u.locale, u.customers, t.taskGuid FROM Zf_users u JOIN LEK_taskUserAssoc tua ON u.userGuid = tua.userGuid JOIN LEK_task t ON t.taskGuid = tua.taskGuid AND (tua.userGuid = '{XXXXX-837d-48aa-bbe3-XXXXX}' OR t.pmGuid = '{XXXX-837d-48aa-bbe3-XXXXX}') WHERE NOT EXISTS ( SELECT 1 FROM Zf_configuration z LEFT JOIN LEK_customer_config c ON z.name = c.name LEFT JOIN LEK_task_config tc ON tc.name = z.name WHERE ((tc.taskGuid = t.taskGuid) OR (c.customerId = t.customerId AND tc.taskGuid IS NULL)) AND z.name = 'runtimeOptions.customers.anonymizeUsers' AND (tc.value IS NOT NULL OR c.value IS NOT NULL OR z.value = 1) ) GROUP BY u.id ORDER BY u.surName;
Additional Findings (from EXPLAIN):
- The main query now uses indexes efficiently on users, taskUserAssoc, and tasks.
- The biggest bottleneck is the dependent subquery (NOT EXISTS) on Zf_configuration / customer_config / task_config.
- This subquery is executed once per row in the main query and scans ~20k rows in task_config each time.
- GROUP BY u.id ORDER BY u.surName forces Using temporary; Using filesort, but this is a secondary issue.
Suggested Direction:
- Add composite indexes on task_config (taskGuid, name, value) and customer_config (customerId, name, value).
- Consider rewriting or precomputing the anonymization condition (e.g. materialized view, cache table).
- Keep NOT EXISTS, but avoid the per-row dependent evaluation.