Why my rails production server reaches to 100% CPU usage and making app to be unresponsive?

Srinivasa Varma D October 21, 2016 1:39pm

Hi

My rails app production server CPU usage sometimes reaches to 100% and making app to be unresponsive. We are using Ubuntu 16.04.1 LTS server and configured Rails app along with Nginx and passenger as suggested in gorails deploy ruby on rails tutorial. I don't know why this is happening and it leads to website down for so many times. If anyone has any solution to fix this please tell me.

Jacob Montgomery October 21, 2016 7:12pm

Chances are this isn't something anyone here is really going to be able to help with without a lot more information. Issues like this you need to scour all of your logs in /var/log and your rails logs + passenger logs. Are you sure it's even your rails app that's causing it and not something else on your system? What process is really causing your CPU to max out and how are you determining that?

Check these links to get you going in the right direction for diagnosing your problem:

http://unix.stackexchange.com/questions/130955/how-to-troubleshoot-high-load-when-there-are-no-obvious-processes
https://www.tummy.com/articles/isolating-heavy-load/
http://www.linuxjournal.com/magazine/hack-and-linux-troubleshooting-part-i-high-load
https://bobcares.com/blog/how-to-troubleshoot-high-load-in-linux-web-hosting-servers/

Srinivasa Varma D October 25, 2016 5:01am

Hi Jacob Montgomery,

I ran top command and it showing following

top - 10:48:14 up 23:51, 2 users, load average: 7.47, 7.58, 7.13
Tasks: 167 total, 1 running, 166 sleeping, 0 stopped, 0 zombie
%Cpu(s): 35.9 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 64.1 st
KiB Mem : 4045040 total, 169484 free, 2827428 used, 1048128 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 1108364 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

3637 deploy 20 0 499564 165644 10436 S 35.8 4.1 305:37.51 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
6669 deploy 20 0 494112 178280 7352 S 34.8 4.4 291:43.09 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
3650 deploy 20 0 494204 177568 7392 S 33.1 4.4 298:59.78 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
8735 deploy 20 0 563904 180032 10540 S 31.8 4.5 97:30.69 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
8743 deploy 20 0 494216 174424 7388 S 30.5 4.3 94:15.07 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
10114 deploy 20 0 428576 171136 7364 S 29.1 4.2 91:26.35 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)

My Nginx error log says that some passenger ruby processes are running long time and taking too much CPU usage. Still problem exist please have look at my app nginx error log.

[ 2016-10-24 23:02:18.7328 21683/7fec0566a700 age/Cor/CoreMain.cpp:532 ]: Signal received. Gracefully shutting down... (send signal 2 more time(s) to force shutdown)
[ 2016-10-24 23:02:18.7328 21683/7fec0c265780 age/Cor/CoreMain.cpp:901 ]: Received command to shutdown gracefully. Waiting until all clients have disconnected...
[ 2016-10-24 23:02:18.7328 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 21780, application /home/deploy/testprep_rails/current/public (production)
[ 2016-10-24 23:02:18.7329 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 21788, application /home/deploy/testprep_rails/current/public (production)
[ 2016-10-24 23:02:18.7329 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 26979, application /home/deploy/testprep_rails/current/public (production)
[ 2016-10-24 23:02:18.7329 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 29348, application /home/deploy/testprep_rails/current/public (production)
[ 2016-10-24 23:02:18.7329 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 624, application /home/deploy/testprep_rails/current/public (production)
[ 2016-10-24 23:02:18.7329 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 3289, application /home/deploy/testprep_rails/current/public (production)
[ 2016-10-24 23:02:18.7329 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 22519, application /home/deploy/testprep_rails_staging/current/public (staging)
[ 2016-10-24 23:02:18.7330 21683/7febe7fff700 Ser/Server.h:817 ]: [ApiServer] Freed 0 spare client objects
[ 2016-10-24 23:02:18.7330 21683/7febe7fff700 Ser/Server.h:464 ]: [ApiServer] Shutdown finished

[ 2016-10-24 23:02:18.7331 21688/7f5c3c815700 age/Ust/UstRouterMain.cpp:422 ]: Signal received. Gracefully shutting down... (send signal 2 more time(s) to force shutdown)
[ 2016-10-24 23:02:18.7331 21688/7f5c4344f780 age/Ust/UstRouterMain.cpp:492 ]: Received command to shutdown gracefully. Waiting until all clients have disconnected...
[ 2016-10-24 23:02:18.7348 21688/7f5c2ffff700 Ser/Server.h:817 ]: [UstRouterApiServer] Freed 0 spare client objects
[ 2016-10-24 23:02:18.7355 21688/7f5c2ffff700 Ser/Server.h:464 ]: [UstRouterApiServer] Shutdown finished
[ 2016-10-24 23:02:18.7357 21688/7f5c3c815700 Ser/Server.h:464 ]: [UstRouter] Shutdown finished
[ 2016-10-24 23:02:18.7359 21688/7f5c4344f780 age/Ust/UstRouterMain.cpp:523 ]: Passenger UstRouter shutdown finished
2016/10/24 23:02:19 [info] 3530#3530: Using 32768KiB of shared memory for push module in /etc/nginx/nginx.conf:71
[ 2016-10-24 23:02:19.8095 3538/7fe734526780 age/Wat/WatchdogMain.cpp:1291 ]: Starting Passenger watchdog...
[ 2016-10-24 23:02:19.8169 3541/7fc6d76d2780 age/Cor/CoreMain.cpp:982 ]: Starting Passenger core...
[ 2016-10-24 23:02:19.8170 3541/7fc6d76d2780 age/Cor/CoreMain.cpp:235 ]: Passenger core running in multi-application mode.
[ 2016-10-24 23:02:19.8185 3541/7fc6d76d2780 age/Cor/CoreMain.cpp:732 ]: Passenger core online, PID 3541
[ 2016-10-24 23:02:19.8266 3546/7fd1a4535780 age/Ust/UstRouterMain.cpp:529 ]: Starting Passenger UstRouter...
[ 2016-10-24 23:02:19.8290 3546/7fd1a4535780 age/Ust/UstRouterMain.cpp:342 ]: Passenger UstRouter online, PID 3546

Jacob Montgomery October 25, 2016 5:09am

If you're referring to:

[ 2016-10-24 23:02:18.7328 21683/7fec0c265780 age/Cor/CoreMain.cpp:819 ]: Checking whether to disconnect long-running connections for process 21780, application /home/deploy/testprep_rails/current/public (production)

Then according to https://github.com/phusion/passenger/issues/1467#issuecomment-89545674 it doesn't look like that's the cause of your problem and just normal behavior. Is that all that you're basing your high usage off of or have you monitored something like TOP and noticed long periods of 100% usage?

https://linux.die.net/man/1/top

http://www.tecmint.com/12-top-command-examples-in-linux

Srinivasa Varma D October 25, 2016 5:20am

I used 'top -ic' command then i get following result

top - 10:48:14 up 23:51, 2 users, load average: 7.47, 7.58, 7.13
Tasks: 167 total, 1 running, 166 sleeping, 0 stopped, 0 zombie
%Cpu(s): 35.9 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 64.1 st
KiB Mem : 4045040 total, 169484 free, 2827428 used, 1048128 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 1108364 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

3637 deploy 20 0 499564 165644 10436 S 35.8 4.1 305:37.51 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
6669 deploy 20 0 494112 178280 7352 S 34.8 4.4 291:43.09 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
3650 deploy 20 0 494204 177568 7392 S 33.1 4.4 298:59.78 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
8735 deploy 20 0 563904 180032 10540 S 31.8 4.5 97:30.69 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
8743 deploy 20 0 494216 174424 7388 S 30.5 4.3 94:15.07 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)
10114 deploy 20 0 428576 171136 7364 S 29.1 4.2 91:26.35 Passenger RubyApp: /home/deploy/testprep_rails/current/public (production)

Jacob Montgomery October 25, 2016 5:26am

I don't see anything wrong there, all well below max.

Is there any other indication that you're using too much CPU? Maybe a specific page that takes a long time to load?

shakycode October 27, 2016 12:27pm

A load average of 7.47, 7.58, 7.13 is not normal. What's interesting is it shows that your CPU core (initially) is only 35%. Can you please take a screenshot and upload it here using htop or top sorting my processor/CPU usage?

Jacob Montgomery October 27, 2016 2:26pm

Ack, I'm sorry Srinivasa Varma, James Jelinek is right - 1.0 = 100%, so you want to be below 1.

If it's not always maxed out then you're probably going to have to sit and monitor top for awhile and pay attention to the cpu column for when the process starts to spike then snag the screen shot. To sort by the CPU field inside top, press the 'P' key (capitalized).

Also note that the CPU column can be a little funky if you're on a multi-core/processor system so check out http://unix.stackexchange.com/a/145249

Srinivasa Varma D October 28, 2016 7:05am

Hi

I changed my server also for this reason. Even then the problem still exist.
I just deployed app using capistrano and during deploy process my cpu reaches to 100%.

Jacob Montgomery October 28, 2016 4:32pm

You could possibly be running into insufficient RAM for your deploy. I recently had to upgrade one of my droplets because during the asset pre-compilation it would max out the ram, jet the CPU to 100% then Ubuntu would shut it all down. However this was when I was attempting to use only a 512mb droplet and if I'm not mistaken you're running 4gb of ram so might be a slim chance that this is the cause unless you have some really big assets...?

Here's a couple of more stuff I found digging around the interwebz...

http://stackoverflow.com/a/24993595 <= this maybe useful to determine which task is overloading during deploy
https://github.com/capistrano/rails/issues/55

Unfortunately I'm not going to be much more help here since I use git hooks and not capistrano. Which, if you have the ability to easily spin up a new VM you could try to deploy using a different technique and see if it really is isolated to capistrano... process of elimination!

shakycode October 28, 2016 4:46pm

The only way I can think of that this sort of CPU/memory utilization spike would happen would be

Low memory VM
Gigantic set of assets that need to be precompiled
Passenger/Unicorn/Puma already maxing out the CPU/Memory

As a test I've spun up a 512mb droplet with Passenger/Postgres/Nginx and deployed an asset heavy application while Passenger was spun up.

Coming from an ops background I would definitely lean heavily on htop or top and sort the processes by niceness or CPU utilization. One (or both) of these tools should tell you what the offending process is.

By chance are you running any other sort of services, programs, scripts which may be proc intensive?

Would definitely like to help you get back on track. :-)

I deployed (which included asset compliation) 20x in a row and I could not spike the CPU to 100% (topped out at 44%) although memory did start to page (swap) during the app being spun up and used while capistrano was deploying.

Komal Agrahari January 10, 2018 5:53am

Hello everyone,

I am also facing Ruby High CPU Usage, in my server, due to this issue my user not able to even signup or login in the app, I found the high cpu usage reason by top command.

The result of top command is given below -
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND

13514 ec2-user 20 0 1175m 904m 7008 R 101 12.5 10:50.75 ruby

14229 root 20 0 987m 3580 2108 S 2.3 0.0 0:06.05 PassengerHelper

15909 ec2-user 20 0 1291m 1.0g 6976 S 0.7 14.0 27:49.27 ruby

15904 ec2-user 20 0 1111m 901m 6512 S 0.7 12.4 26:19 PassengerHelper

What should be done to overcome this problem, please help me out asap.
Thanks!

Rails for Beginners

Advanced Ruby: Behind the Magic

Payments with Rails Master Class

Refactoring Rails

Learn Hotwire

Install and Deploy Rails Guides

Hatchbox.io

Jumpstart Rails SaaS Template

Remote Ruby Podcast

GoRails Open Source

Rails Hackathon

Beginner Bounties

Ruby on Rails Job Board

Notifications

Why my rails production server reaches to 100% CPU usage and making app to be unresponsive?

Want to stay up-to-date with Ruby on Rails?