-
Bug
-
Resolution: Fixed
-
None
-
Critical
-
The query that removes unused image records has been optimized. It now uses better indexing and avoids unnecessary full table scans, resulting in a faster cleanup of old data.
-
Emptyshow more show less
Problem:
The current DELETE query on terms_images is performing poorly in MariaDB.
explain DELETE i FROM `terms_images` i -> LEFT JOIN ( -> SELECT id, target FROM `terms_attributes` WHERE collectionId = '118' -> ) a ON i.targetId = a.target -> WHERE a.id IS NULL AND i.collectionId = '118'; +------+-------------+------------------+------+-----------------------------------+------+---------+-------------------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------------+------+-----------------------------------+------+---------+-------------------+---------+-------------+ | 1 | PRIMARY | i | ALL | fk_terms_images_languageresources | NULL | NULL | NULL | 1641 | Using where | | 1 | PRIMARY | <derived2> | ref | key0 | key0 | 403 | DBXXXX.i.targetId | 1314 | Using where | | 2 | DERIVED | terms_attributes | ALL | collectionId_idx | NULL | NULL | NULL | 3512342 | Using where | +------+-------------+------------------+------+-----------------------------------+------+---------+-------------------+---------+-------------+
- The derived table (SELECT ... FROM terms_attributes) forces MariaDB to scan all 3.5M rows from terms_attributes.
- The index on collectionId (collectionId_idx) is not used in this context.
- As a result, the optimizer performs a full table scan instead of leveraging the index.
solution
- Avoid derived table by rewriting the query:
- Add a composite index on (collectionId, target) in terms_attributes:
DELETE i FROM terms_images i LEFT JOIN terms_attributes a ON i.targetId = a.target AND a.collectionId = 118 WHERE a.id IS NULL AND i.collectionId = 118; CREATE INDEX idx_terms_attributes_collection_target ON terms_attributes (collectionId, target);
Expected Result:
Optimizer should switch from ALL table scan to a range or ref scan. Query execution time should improve significantly on large datasets.