I can’t remember the last time a ulimit bit me, but the time has come again. Everyone is used to removing all the limits for the instance ID at this point, I hope, but have you ever considered the ulimits for your fenced user id?
In the DB2 diagnostic log, I saw error messages like this for 3 out of 4 databases on a fairly new server, occurring multiple times a day:
2016-01-12-22.214.171.1248169-300 E302959A3363 LEVEL: Error (OS)
PID : 36372558 TID : 4627 PROC : db2fmp (C) 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPID : 192.0.2.0.42856.151231085622
HOSTNAME: server1
EDUID : 4627 EDUNAME: db2fmp (C) 0
FUNCTION: DB2 UDB, SQO Memory Management, sqloLogMemoryCondition, probe:100
CALLED : OS, -, malloc
OSERR : ENOMEM (12) "There is not enough memory available now."
MESSAGE : Private memory and/or virtual address space exhausted, or data ulimit exceeded
DATA #1 : Soft data resource limit, PD_TYPE_RLIM_DATA_CUR, 8 bytes
Three things about this message matter for the problem I'm describing: the OS error is ENOMEM, the process reporting it is db2fmp, and the server itself is not experiencing memory pressure or memory misconfiguration problems. If the error is coming from a different process, the issue may be different.
A little research led me to conclude I was seeing scenario number 12 from this technote: http://www-01.ibm.com/support/docview.wss?uid=swg21470035
That scenario is that the fenced user id has a ulimit for data.
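To see how often this exact pattern shows up, a small filter over the diagnostic log can help. This is only a minimal sketch: it assumes records in db2diag.log are separated by blank lines and that the PROC and OSERR fields look like the record above. The function name count_fmp_enomem and the example path are made up for illustration.

```shell
# Count db2diag.log records that report ENOMEM from a db2fmp process.
# Assumption: records are separated by blank lines.
count_fmp_enomem() {
    # RS="" puts awk in paragraph mode, so each diag record is one awk record.
    awk 'BEGIN { RS = "" }
         /PROC *: *db2fmp/ && /OSERR *: *ENOMEM/ { n++ }
         END { print n + 0 }' "$1"
}

# Example (hypothetical path; use your instance's diagnostic directory):
# count_fmp_enomem /db2home/db2inst1/sqllib/db2dump/db2diag.log
```

A count of zero does not rule out other memory problems; it only says this specific db2fmp/ENOMEM combination was not logged.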
Resolving the Issue
Finding Fenced User ID
If you do not already know what your fenced user id is, you can determine it using any of these methods:
==> cat /db2home/db2inst1/sqllib/ctrl/.fencedID
db2fenc1
In the above, ‘/db2home/db2inst1/’ would be replaced with the home directory of the DB2 instance owner.
==> ps -ef | grep -i [db2]fmp
db2fenc1 10617056 65863810 0 Jan 10 - 0:00 db2fmp
cogadmf 13631718 11599922 0 Jan 02 - 0:01 db2fmp
...
In this method, there may be many processes, and you can see that I have two DB2 instances on this server, so I get two fenced ids. The parent process id is the process id of db2sysc for the instance, so I could use that to map back which fenced id goes with which instance.
==> db2pd -fmp
Database Member 0 -- Active -- Up 3 days 22:27:04 -- Date 2016-01-13-126.96.36.1991410
FMP:
Pool Size: 11
Max Pool Size: 200 ( Automatic )
Keep FMP: YES
Initialized: YES
Trusted Path: /db2home/db2inst1/sqllib/function/unfenced
Fenced User: db2fenc1
...
This outputs information about all of the fenced processes, so it may be a long list; the fenced user is listed near the top.
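Since the db2pd output can be long, it may be handy to pull out just the fenced user. The sketch below assumes the header contains a line of the form "Fenced User: <name>", as in the output above. The helper name fenced_user_from_db2pd is hypothetical.

```shell
# Print the fenced user from `db2pd -fmp` output supplied on stdin.
# Assumption: the header has a line like "Fenced User: db2fenc1".
fenced_user_from_db2pd() {
    awk -F':' '/Fenced User/ {
        gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim whitespace around the name
        print $2
        exit                              # the first match is enough
    }'
}

# Example, run as the instance owner:
# db2pd -fmp | fenced_user_from_db2pd
```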
Looking at ulimits
Once you know the fenced user, log in as that user or su to it. This command lists the limits for the user:
$ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) 131072
stack(kbytes) 32768
memory(kbytes) 32768
coredump(blocks) 2097151
nofiles(descriptors) 2000
threads(per process) unlimited
processes(per user) unlimited
In this case, the data limit is what is causing the problem.
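To put that number in perspective, 131072 KB is only 128 MB of data segment per process. A small helper can make the value easier to read. This is a sketch that assumes the limit is reported in kilobytes or as the word "unlimited", as in the listing above; describe_data_limit is a made-up name.

```shell
# Describe a value as reported by `ulimit -d` (kilobytes, or "unlimited").
describe_data_limit() {
    case "$1" in
        unlimited)   echo "data limit: unlimited" ;;
        *[!0-9]*|"") echo "data limit: unrecognized value '$1'" ;;
        *)           echo "data limit: $1 KB ($(( $1 / 1024 )) MB)" ;;
    esac
}

# Example, run as the fenced user:
# describe_data_limit "$(ulimit -d)"
```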
Changing the ulimit
Depending on the division of responsibilities, you may only need to request that your System Administrator change the data limit for that user to unlimited. If you have root access and are expected to make the change yourself, you can run this as root:
chuser data=-1 data_hard=-1 db2fenc1
After making this change, you will have to log in as the user again to see the changes. Always verify the changes took effect as expected.
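Besides checking ulimit -a in a fresh login, the configured values can be read back as root with lsuser. The sketch below checks one line of that output; it assumes lsuser prints the attributes as "user data=... data_hard=..." and that -1 means unlimited, matching the chuser command above. The function name data_limits_unlimited is hypothetical.

```shell
# Succeed only if both data and data_hard are -1 (unlimited) in a line of
# `lsuser -a data data_hard <user>` output read from stdin.
# Assumption: output looks like "db2fenc1 data=-1 data_hard=-1".
data_limits_unlimited() {
    awk '{
        soft = ""; hard = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^data=/)      soft = substr($i, 6)    # text after "data="
            if ($i ~ /^data_hard=/) hard = substr($i, 11)   # text after "data_hard="
        }
        ok = (soft == "-1" && hard == "-1")
    }
    END { exit (ok ? 0 : 1) }'
}

# Example, run as root:
# lsuser -a data data_hard db2fenc1 | data_limits_unlimited && echo "data limits are unlimited"
```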
If you have the DBM CFG parameter KEEPFENCED set to YES (which is the default), you will need to stop and start the DB2 instance before the changes will take effect.
Note that all instructions here are for AIX because that is the OS where I ran into this issue.