Efficient parallel computation with a GPU

Asked 1 year ago, Updated 1 year ago, 129 views

Let me briefly explain my current situation before asking my question.

For the past month or so I have been running Keras (TensorFlow backend) in Python on a GPU (GeForce GTX 1080), on Ubuntu 16.04.

When computing a single file, simply switching from the CPU to the GPU made the calculation complete 50-100 times faster, which I was very happy with.

Recently, however, the number of files to be calculated has increased, and as a result the work is taking a long time again.

So I tried running the calculations in parallel, using Go and Python.
After dealing with the GPU memory constraints, the GPU can now run several calculations in parallel, but one question remains.
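
For reference, one common way to let several processes share one GPU with the TF 1.x Keras backend is to cap each process's GPU memory fraction before the model is built. A minimal sketch, not necessarily my exact script (the 0.13 fraction is illustrative):

```python
# Sketch: cap this process's share of GPU memory so several Keras
# processes can run on one GPU (TF 1.x / Keras 2, TensorFlow backend).
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
# Roughly 0.13 * 8114 MiB ~= 1 GiB per process; choose the fraction so
# that (number of processes) * fraction stays below 1.
config.gpu_options.per_process_gpu_memory_fraction = 0.13
K.set_session(tf.Session(config=config))
```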

As I understand it, when you have Go or Python run calculations in parallel, each child process should not exceed 100% CPU. (My current setup uses a Go program to prepare the files to be calculated, then launches the Keras scripts written in Python.) In practice, however, each process's load is about twice that (average %CPU 150-180). Below is the output of the top command:

  PID USER  PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
24786 user  20   0 19.428g 1.886g 311096 S 162.5  3.0  22:51.55 python3.5
27392 user  20   0 18.720g 1.194g 307708 S 162.5  1.9   7:56.84 python3.5
16550 user  20   0 22.414g 4.879g 318864 S 156.2  7.8  67:30.80 python3.5
27755 user  20   0 18.635g 1.098g 306248 S 150.0  1.8   6:10.02 python3.5
22933 user  20   0 20.062g 2.527g 309140 S 143.8  4.0  33:48.74 python3.5
17685 user  20   0 27.359g 9.743g 317500 R 100.0 15.5  70:30.59 python3.5

Also, the output of nvidia-smi is as follows:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.69                 Driver Version: 384.69                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
| 43%   64C    P2    72W / 180W |   7519MiB /  8114MiB |     99%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     17685    C   python3.5                                    995MiB   |
|    0     20005    C   python3.5                                   1085MiB   |
|    0     22933    C   python3.5                                   1085MiB   |
|    0     24786    C   python3.5                                   1087MiB   |
|    0     27392    C   python3.5                                   1085MiB   |
|    0     27755    C   python3.5                                   1085MiB   |
|    0     29106    C   python3.5                                   1085MiB   |
+-----------------------------------------------------------------------------+

Since each process unexpectedly uses about twice as much CPU as I expected, the CPU has become the constraint and I cannot add any more parallel jobs.
My plan was to use all 16 cores of the server I am running on, one per job (16 parallel calculations). The GPU seems to have a little headroom left, so I want to use it fully.

So I have three questions:
(1) Is it possible to make each process use as few cores as possible (up to 100% per core)?
(2) If that is possible, could the GPU be made to do all of the computation?
(3) Alternatively, is there a more efficient way to run parallel calculations on a GPU?

It would be ideal if you could answer all three, but I would appreciate an answer to any one of them.
Thank you in advance.

python go keras gpu

2022-09-30 21:26

2 Answers

If each process did not parallelize internally, it would not exceed 100%, but you are probably misreading the scale: in top, 100% means one CPU core. A process that runs in parallel internally can therefore exceed 100%, and all cores are exhausted only when the total load approaches 16 cores' worth (1600%). Here the six processes shown add up to roughly 875%, only a little over half the machine.

That said, the bottleneck is probably not CPU load either, given that the other processes do not even reach the 162% that seems to be the per-process ceiling of their parallelization. It is more reasonable to think the GPU, I/O, or perhaps main memory is saturated; and indeed, with GPU-Util at 99% in your nvidia-smi output, the GPU is the one that is maxed out.
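
One way to check this from the TensorFlow side: TF 1.x lets each process cap its own CPU thread pools, so you can see whether limiting CPU use actually hurts throughput. A minimal sketch, assuming Keras with the TensorFlow backend (the thread counts are illustrative):

```python
# Sketch: limit TensorFlow's per-process CPU thread pools (TF 1.x).
# If %CPU drops toward 100% and throughput barely changes, the CPU
# was not the bottleneck, consistent with the GPU explanation above.
import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto(
    intra_op_parallelism_threads=1,  # threads used inside a single op
    inter_op_parallelism_threads=1,  # ops executed concurrently
)
K.set_session(tf.Session(config=config))
```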


2022-09-30 21:26

(1) Is it possible to make each process use as few cores as possible (up to 100% per core)?

You may be able to do this with the cpulimit command:

http://news.mynavi.jp/articles/2010/09/27/cpulimit-on-ubuntu/
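
For example, cpulimit -p <PID> -l 100 throttles an existing process to roughly one core's worth of CPU. An alternative that needs no external tool is to pin each worker to a single core from inside Python before TensorFlow starts; a minimal sketch, assuming Linux (the core id is illustrative):

```python
# Sketch: pin the current process (and any threads it later spawns)
# to a single core. Linux-only; call this before importing
# TensorFlow/Keras so its thread pools inherit the affinity mask.
import os

core_id = 0  # illustrative; e.g. pass a worker index from the Go launcher
os.sched_setaffinity(0, {core_id})  # pid 0 means "this process"
```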


2022-09-30 21:26


