Unable to fully utilize CPU for parallel calculations on AWS EC2 instances

Asked 2 years ago, Updated 2 years ago, 82 views

I am using an AWS EC2 c5.9xlarge instance.
(36 vCPU, 72 GB RAM)
The AMI is Ubuntu 18.04.

I want to run a parallel computation in Python on 36 cores, but either not all CPUs are recognized or some CPU cores are left unused.
The Python script explicitly specifies 36 CPUs.
(Without that setting it should parallelize automatically up to the maximum number of cores, but only one core was used.
With 36 cores specified, 25 cores are used.)

The CPU utilization shown by top looks like this:

top - 01:52:47 up 34 min, 1 user, load average: 36.03, 34.15, 22.04
Tasks: 446 total, 37 running, 236 sleeping, 0 stopped, 0 zombie
%Cpu0: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5: 0.3 us, 0.3 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu16: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu18: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu20: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu22: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu23: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu24: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu25: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu26: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu27: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu28: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu29: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu30: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu31: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu32: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu33: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu34: 0.0 us, 0.0 sy, 0.0 ni, 100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu35: 100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 72028912 total, 63163192 free, 7853444 used, 1012272 buff/cache
KiB Swap: 10239996 total, 10239996 free, 0 used. 63486432 avail Mem
 PID USER   PR NI   VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
3302 ubuntu 20  0 488808 235796 48848 R 100.0  0.3 14:49.20 python
3304 ubuntu 20  0 528140 275044 48816 R 100.0  0.4 14:49.12 python
3305 ubuntu 20  0 490244 236872 48652 R 100.0  0.3 14:49.16 python
3306 ubuntu 20  0 493836 240756 48940 R 100.0  0.3 14:49.13 python
3307 ubuntu 20  0 487884 234388 48496 R 100.0  0.3 14:49.19 python
3309 ubuntu 20  0 475068 219924 46628 R 100.0  0.3 14:49.13 python
3312 ubuntu 20  0 528812 275764 48848 R 100.0  0.4 14:49.13 python
3316 ubuntu 20  0 491356 237732 48512 R 100.0  0.3 14:49.13 python
3318 ubuntu 20  0 507600 254616 48820 R 100.0  0.4 14:49.14 python
3322 ubuntu 20  0 529840 276740 48920 R 100.0  0.4 14:49.07 python
3325 ubuntu 20  0 522588 268868 48348 R 100.0  0.4 14:49.11 python
3328 ubuntu 20  0 508744 255844 48936 R 100.0  0.4 14:49.11 python
3331 ubuntu 20  0 528416 275416 48876 R 100.0  0.4 14:49.11 python
3333 ubuntu 20  0 497844 244440 48616 R 100.0  0.3 14:49.10 python
3335 ubuntu 20  0 522516 268860 48436 R 100.0  0.4 14:49.06 python
3338 ubuntu 20  0 493688 240256 48404 R 100.0  0.3 14:49.08 python
3340 ubuntu 20  0 533152 279548 48484 R 100.0  0.4 14:49.07 python
3370 ubuntu 20  0 509480 255976 48420 R 100.0  0.4 14:39.26 python
3373 ubuntu 20  0 492532 239368 48748 R 100.0  0.3 14:32.92 python
3375 ubuntu 20  0 493040 239400 48380 R 100.0  0.3 14:32.55 python
3379 ubuntu 20  0 492884 239576 48688 R 100.0  0.3 14:27.08 python
3382 ubuntu 20  0 531604 278160 48396 R 100.0  0.4 12:41.57 python
3342 ubuntu 20  0 473212 219792 48540 R  99.7  0.3 14:49.06 python
3377 ubuntu 20  0 479024 225260 48356 R  99.7  0.3 14:28.90 python
3298 ubuntu 20  0 449504 194924 49564 R   8.6  0.3  1:15.02 python
3291 ubuntu 20  0 522544 268260 49784 R   8.3  0.4  1:15.00 python
3292 ubuntu 20  0 484124 229284 49460 R   8.3  0.3  1:15.02 python
3293 ubuntu 20  0 524380 269692 49360 R   8.3  0.4  1:15.00 python
3294 ubuntu 20  0 524916 270384 49528 R   8.3  0.4  1:15.00 python
3295 ubuntu 20  0 485528 230932 49356 R   8.3  0.3  1:15.00 python
3296 ubuntu 20  0 483012 228416 49536 R   8.3  0.3  1:15.00 python
3297 ubuntu 20  0 447140 192612 49580 R   8.3  0.3  1:15.01 python
3299 ubuntu 20  0 448776 194252 49688 R   8.3  0.3  1:15.00 python
3300 ubuntu 20  0 442656 187712 49300 R   8.3  0.3  1:14.99 python
3301 ubuntu 20  0 445364 190980 49580 R   8.3  0.3  1:14.99 python
3303 ubuntu 20  0 447464 193128 49688 R   8.3  0.3  1:14.99 python
3398 ubuntu 20  0  44236   4156  3228 R   0.3  0.0  0:00.80 top

Multiple Python processes are being packed onto a single CPU core, which results in poor performance.
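As a rough check of that reading, the %CPU column of the process list can be tallied. The sketch below only restates figures from the top output (22 processes at 100.0, 2 at 99.7, 1 at 8.6, 11 at 8.3); the function name `tally` is made up for illustration.

```python
def tally():
    """Summarize the %CPU column copied from the top output."""
    busy = [100.0] * 22 + [99.7] * 2   # processes that each have a core to themselves
    starved = [8.6] + [8.3] * 11       # processes that each get only a sliver of CPU
    # If the starved processes share one core, their %CPU should add up to about 100.
    return len(busy), len(starved), round(sum(starved), 1)

if __name__ == "__main__":
    n_busy, n_starved, shared = tally()
    print(f"{n_busy} processes near 100%, {n_starved} processes sharing {shared}% in total")
    print(f"cores in use: {n_busy + 1}")
```

The 12 slow processes add up to roughly one core's worth of CPU time, so 24 + 1 = 25 busy cores, which matches the 25 cores seen in the per-CPU lines.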

This behavior does not occur in my local environment (MacBook Pro).
Does anyone know the cause and a solution?

python aws amazon-ec2

2022-09-30 14:28

1 Answer

This does not seem to be caused by AWS's virtual machine environment, but rather by Linux, or by Python and its libraries.

Since the top command lists all the CPUs, the OS evidently recognizes the correct number of CPUs. You're not using a parallel computing library, just starting as many processes as there are CPUs, right?
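One way to check this from inside Python is to compare how many CPUs the OS reports with the set of CPUs the process is actually allowed to run on. This is only a diagnostic sketch: `cpu_visibility` is a hypothetical helper name, and `os.sched_getaffinity` exists on Linux but not on macOS, so the code falls back to the full CPU list there.

```python
import os

def cpu_visibility():
    """Return (number of CPUs the OS reports, CPU IDs this process may run on)."""
    total = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):       # Linux only
        allowed = sorted(os.sched_getaffinity(0))
    else:                                      # e.g. macOS: no affinity API in the os module
        allowed = list(range(total))
    return total, allowed

if __name__ == "__main__":
    total, allowed = cpu_visibility()
    print(f"OS reports {total} CPUs; this process may run on {len(allowed)}: {allowed}")
```

If the second number is smaller than the first, the process's CPU affinity mask has been narrowed somewhere. Running the same check inside each worker process would show whether several workers are restricted to the same core.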

I don't know why Linux schedules the processes this way, but why not use a command such as taskset to pin each process to a specific CPU ID?
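The same pinning can be done from within Python using `os.sched_setaffinity`, which is roughly what `taskset -cp <cpu> <pid>` does. The following is a minimal sketch, not the asker's actual script: `worker` and `run_pinned` are hypothetical names, the sum of squares is a placeholder for the real computation, and the "fork" start method is assumed (the default on most Linux setups).

```python
import os
import multiprocessing as mp

def worker(cpu_id, n, results):
    # Pin this worker to a single CPU (Linux only).
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {cpu_id})
    total = sum(i * i for i in range(n))       # placeholder for the real CPU-bound job
    results.put((cpu_id, total))

def run_pinned(n=100_000):
    """Start one worker per allowed CPU, each pinned to its own CPU ID."""
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))
    else:
        cpus = list(range(os.cpu_count() or 1))
    ctx = mp.get_context("fork")               # assumption: fork is available
    results = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(cpu, n, results)) for cpu in cpus]
    for p in procs:
        p.start()
    out = [results.get() for _ in procs]       # collect before join to avoid blocking
    for p in procs:
        p.join()
    return sorted(out)

if __name__ == "__main__":
    for cpu_id, total in run_pinned():
        print(f"worker pinned to CPU {cpu_id} finished with {total}")
```

Note that the sketch only uses the CPUs the parent process is already allowed to run on. If that set turns out to be a single CPU, the affinity mask of the parent would have to be widened first, which this example does not attempt.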


2022-09-30 14:28
