What is Lock Escalation?
When LOCKLIST and MAXLOCKS are not set to AUTOMATIC, or total system memory is constrained, lock escalation can occur. When there is not enough room in the lock list to store additional locks that an application needs to continue processing, lock escalation occurs. DB2 will attempt to satisfy queries and perform updates, inserts, and deletes by locking at the row level. Each row level lock requires a certain amount of memory that varies by the version of DB2. The IBM DB2 Knowledge Center page for the LOCKLIST parameter spells out how much memory each row lock takes. When there is no more memory available in the lock list for additional locks, or when a single application reaches the percentage of the lock list defined by MAXLOCKS, DB2 performs what is called lock escalation. This means that applications with row-level locks instead try to acquire table-level locks.
The table-level locks acquired with lock escalation are much worse for concurrency. Now instead of only individual rows being unavailable for reads, updates, and deletes (depending on the isolation levels of the applications involved and the work being performed), the entire table may be unavailable. This can lead to other negative locking phenomena, including longer lock-wait times, lock timeouts, and even deadlocks.
Analyzing the Problem
Lock escalation is one of the more negative things that can happen with DB2’s concurrency control. It is something DB2 databases should be monitored for and that should be addressed if it occurs on an ongoing basis. Lock escalation is documented in the DB2 diagnostic log, and this is one of the better places to look for it. Once my diagnostic log parser alerts me that there is lock escalation occurring, I spend some time analyzing to see which databases (if more than one on the instance) and which times it is occurring at. The db2diag tool is a powerful tool in this analysis. The following syntax will list out occurrences of lock escalation, including the database name and time stamp:
$ db2diag -g message:=scalation -fmt '%ts %db %errname' 2016-03-28-01.13.34.680662 PRODM 2016-03-28-01.13.34.681123 PRODM 2016-03-28-01.14.41.746583 PRODM 2016-03-28-01.14.41.747016 PRODM 2016-03-28-01.16.28.127806 PRODM 2016-03-28-01.16.28.128327 PRODM 2016-03-28-01.17.20.249458 PRODM 2016-03-28-01.17.20.250037 PRODM 2016-03-28-02.45.10.337993 PRODM 2016-03-28-02.45.10.338500 PRODM 2016-03-28-02.45.46.461853 PRODM 2016-03-28-02.45.46.462300 PRODM ...
This is a bit messier than I would like it to be, but when using db2diag, for some reason, the errono field is not populated for lock escalations. You can get the same info from SYSIBMADM.PDLOGMSGS_LAST24HOURS or the table function PD_GET_LOG_MSGS, where oddly enough the msgnum field IS populated:
select timestamp , substr(dbname,1,12) as dbname from sysibmadm.PDLOGMSGS_LAST24HOURS where msgnum=5502 with ur TIMESTAMP DBNAME -------------------------- ------------ 2016-03-28-22.214.171.1249646 PRODM 2016-03-28-126.96.36.1994685 PRODM 2016-03-28-188.8.131.525929 PRODM 2016-03-28-12.20.02.290882 PRODM ...
Analyzing the timing of lock escalation events can be quite useful to determine if perhaps there is an application that is using a higher isolation level and also if there may be missing indexes for the workload. There is also a lot more detailed information in the MSG field of SYSIBMADM.PDLOGMSGS_LAST24HOURS or PD_GET_LOG_MSGS – which may include the application name, the specific SQL being executed, and other details.
Resolving the Problem
The most obvious solution here is to increase the size of LOCKLIST in the db cfg using syntax like this:
db2 update db cfg for PRODM using LOCKLIST 30000
It is also possible that the MAXLOCKS parameter may need to be adjusted. Both of these parameters can be set to AUTOMATIC and tuned by STMM(Self Tuning Memory Manager). In fact, these are the two parameters I’m most likely to include in STMM tuning because the impact of having them too small can be so high, and because from what I’ve seen, DB2 seems to do a good job of tuning them.