Computational cost of chaining select, map, etc. in Ruby

Asked 2 years ago, Updated 2 years ago, 41 views

How many loops run in Ruby when methods such as select and map are chained on an Array or Hash?

For example, in the code below,

(1..100)
  .select { |val| val % 2 == 0 }
  .map { |val| val.to_s }
  .join

Does this run three separate loops, or are the three operations performed in a single loop?

In some languages, lazy evaluation can reduce the number of loops, so I ran a benchmark, but the results left me confused.

I ran the benchmark below. For comparison, I also included a version using each_with_object.

require 'benchmark/ips'

Benchmark.ips do |x|
  x.report("repeat map") do
    (1..10000)
      .map { |val| val.to_s }
      .map { |val| val.to_i }
      .map { |val| val.to_s }
      .map { |val| val.to_i }
  end

  x.report("each_with_object") do
    (1..10000).each_with_object([]) do |val, res|
      val = val.to_s
      val = val.to_i
      val = val.to_s
      val = val.to_i
    end
  end

  x.report("repeat each") do
    range = (1..10000)
    range.each do |val|
      val = val.to_s
    end
    range.each do |val|
      val = val.to_i
    end
    range.each do |val|
      val = val.to_s
    end
    range.each do |val|
      val = val.to_i
    end
  end
  x.compare!
end

The results are as follows (higher is faster).

      repeat map    140.435  (± 8.5%) i/s -    700.000  in   5.023912s
each_with_object    176.636  (± 5.1%) i/s -    884.000  in   5.017814s
     repeat each    186.010  (± 5.4%) i/s -    936.000  in   5.048520s

I am not sure whether repeat map really loops once per map call, but I could not understand why repeat each is almost as fast as each_with_object. Perhaps there is something wrong with my test code...

I am asking because I wondered whether it is fine to chain select, reject, map, etc. on Arrays and Hashes.

Thank you for your cooperation.

ruby

2022-09-30 16:38

3 Answers

With obj.map{...}.map{...}.map{...}, each map processes all the elements and then passes the resulting array to the next map.

Enumerable#lazy was introduced in Ruby 2.0. With obj.lazy.map {...}.map {...}.map {...}, each element is passed through all the maps in turn instead.

http://magazine.rubyist.net/?0041-200Special-lazy should be helpful.
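The difference in evaluation order can be observed directly. The following is a minimal sketch; the helper name trace_chain and the log labels are made up for illustration and are not from the linked article.

```ruby
# Records the order in which the two blocks run, for an eager or a lazy chain.
# (trace_chain is a hypothetical helper name used only for this illustration.)
def trace_chain(source)
  log = []
  result = source
    .map { |n| log << "a#{n}"; n + 1 }
    .map { |n| log << "b#{n}"; n * 2 }
    .to_a
  [result, log]
end

eager_result, eager_log = trace_chain([1, 2, 3])
lazy_result,  lazy_log  = trace_chain([1, 2, 3].lazy)

p eager_result  # => [4, 6, 8]
p lazy_result   # => [4, 6, 8]
p eager_log     # => ["a1", "a2", "a3", "b2", "b3", "b4"]
p lazy_log      # => ["a1", "b2", "a2", "b3", "a3", "b4"]
```

In the eager chain the first block finishes for every element before the second block starts; in the lazy chain the two blocks alternate per element.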


2022-09-30 16:38

Ruby does not automatically optimize this to reduce the number of loops.

(1..100)
  .select { |val| val % 2 == 0 }
  .map { |val| val.to_s }
  .join

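is processed as three separate passes: select loops over the range and builds an array, map loops over that array and builds another array, and join loops over that array to build the string.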
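As a rough illustration of those three passes, the chain can be expanded by hand. This is only a sketch of the behavior, not Ruby's actual C implementation; it assumes join is called without a separator, and the method names chained and expanded are hypothetical.

```ruby
# The chained form from the question (assuming join takes no separator).
def chained
  (1..100).select { |val| val % 2 == 0 }.map { |val| val.to_s }.join
end

# A hand-expanded equivalent: one pass per method, with an intermediate
# array between passes. Illustrative only.
def expanded
  evens = []
  (1..100).each { |val| evens << val if val % 2 == 0 }  # pass 1: select

  strings = []
  evens.each { |val| strings << val.to_s }              # pass 2: map

  joined = +''
  strings.each { |str| joined << str }                  # pass 3: join
  joined
end

p chained == expanded  # => true
```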

If map were lazily evaluated, you would expect repeat map ≈ each_with_object > repeat each; if map loops every time, you would expect each_with_object > repeat each ≈ repeat map. Either way, it is hard to see why repeat map would be clearly slower than repeat each.

In this benchmark, whether or not map is lazily evaluated, the basic amount of computation is 10000 x 4. There is essentially no difference between the cases.

However, because the computation itself is so cheap, many other factors show up in the results.

Unlike the other cases, repeat map has to build a 10,000-element array four times. (The slowness is probably due to that and to GC.)

As for repeat each and each_with_object, repeat each calls Range#each four times and so pays a higher loop start-up cost, while each_with_object has to pass a second block argument. The work being done also differs. (Is the receiver of to_i an Integer or a String?)

In the end, it is unclear what this benchmark is measuring, because the conditions differ between the cases.
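The array allocation can be checked by comparing object identities. The sketch below uses small Arrays rather than the Range from the question, and shows map! only for contrast; both method names are made up for this example.

```ruby
# Counts how many distinct Array objects are involved in a two-step chain.
# (Hypothetical helper names; small inputs are used for illustration.)
def distinct_arrays_with_map(list)
  step1 = list.map { |n| n + 1 }
  step2 = step1.map { |n| n * 2 }
  # The input plus one new array per map call.
  [list, step1, step2].map(&:object_id).uniq.size
end

def distinct_arrays_with_map_bang(list)
  step1 = list.map! { |n| n + 1 }
  step2 = step1.map! { |n| n * 2 }
  # map! returns its receiver, so all three names refer to one object.
  [list, step1, step2].map(&:object_id).uniq.size
end

p distinct_arrays_with_map([1, 2, 3])       # => 3
p distinct_arrays_with_map_bang([1, 2, 3])  # => 1
```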


2022-09-30 16:38

The processing inside each block runs the same number of times whether or not evaluation is lazy. If you end up processing every element anyway, laziness does not reduce the amount of work done in the blocks themselves. If you want to see the effect of everything else, meaningful differences are hard to observe unless you, for example, chain more maps. Also, if the final results are not the same, comparing performance is meaningless.

Based on the above, I rewrote the benchmark.
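The claim about block call counts can be verified with a counter. This is a minimal sketch; count_block_calls is a hypothetical helper name.

```ruby
# Counts how many times the blocks run for a chain of three maps
# when every element is consumed.
def count_block_calls(source)
  calls = 0
  source
    .map { |n| calls += 1; n + 1 }
    .map { |n| calls += 1; n * 2 }
    .map { |n| calls += 1; n - 1 }
    .to_a
  calls
end

p count_block_calls((1..5).to_a)  # => 15
p count_block_calls((1..5).lazy)  # => 15
```

Both versions call a block 5 x 3 = 15 times; only the order of the calls differs.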

# frozen_string_literal: true

require 'benchmark/ips'

# Refinement that adds one step of the Collatz function to Integer.
module Convertize
  refine Integer do
    def collatz
      if self.even?
        self / 2
      else
        self * 3 + 1
      end
    end
  end
end
using Convertize

MAX_NUM = 10_000
REPEAT_COUNT = 10
METHOD = 'collatz'

# Each entry is Ruby source text that transforms the local variable `list`.
SCRIPTS = {
  'repeat map' =>
    'list = list' + ".map { |n| n.#{METHOD} }" * REPEAT_COUNT,
  'repeat map!' =>
    'list' + ".map! { |n| n.#{METHOD} }" * REPEAT_COUNT,
  'repeat map lazy' =>
    'list = list.lazy' + ".map { |n| n.#{METHOD} }" * REPEAT_COUNT + '.to_a',
  'repeat map+assign' =>
    "list = list.map { |n| n.#{METHOD} }\n" * REPEAT_COUNT,
  'repeat each + push' =>
    "temp = []; list.each { |n| temp << n.#{METHOD} }; list = temp\n" * REPEAT_COUNT,
  'repeat each_index+replace' =>
    "list.each_index { |i| list[i] = list[i].#{METHOD} }\n" * REPEAT_COUNT,
  'one map' =>
    'list = list.map { |n| n' + ".#{METHOD}" * REPEAT_COUNT + ' }',
  'one each + push' =>
    'temp = []; list.each { |n| temp << n' + ".#{METHOD}" * REPEAT_COUNT +
    ' }; list = temp',
  'one each_index+replace' =>
    'list.each_index { |i| list[i] = list[i]' + ".#{METHOD}" * REPEAT_COUNT + ' }',
}.freeze

p SCRIPTS

# Compute the expected result once so every script can be checked against it.
list = (1..MAX_NUM).to_a
eval(SCRIPTS['repeat map'])
EXPECTED_LIST = list.freeze

Benchmark.ips do |x|
  SCRIPTS.each do |name, script|
    x.report(name) do
      list = (1..MAX_NUM).to_a
      eval(script)
      raise unless list == EXPECTED_LIST
    end
  end
  x.compare!
end

The results on my Ruby 2.4.2 are as follows.

Comparison:
                  one map:      108.6 i/s
          one each + push:      108.6 i/s - same-ish: difference falls within error
   one each_index+replace:      105.9 i/s - same-ish: difference falls within error
              repeat map!:       59.4 i/s - 1.83x  slower
               repeat map:       57.2 i/s - 1.90x  slower
       repeat each + push:       54.0 i/s - 2.01x  slower
repeat each_index+replace:       49.7 i/s - 2.18x  slower
        repeat map+assign:       47.9 i/s - 2.27x  slower
          repeat map lazy:       38.9 i/s - 2.79x  slower

You can change the processing inside the block, the size of the array, and the number of chained maps by changing METHOD, MAX_NUM, and REPEAT_COUNT respectively.

Now, some speculation based on the results.

The fastest is doing everything in a single map (one map). map creates an array and fills in its elements on every call, which is itself fairly costly, so repeat map can be expected to be several times slower.

That is what I would like to say, but if that were the whole story, the version that uses map! to replace elements in place without creating a new array (repeat map!) should be somewhat faster than it is. Looking at the other repeat variants, the cost of calling the block itself also seems to matter. The "one" variants call the block MAX_NUM times, while the "repeat" variants call it MAX_NUM * REPEAT_COUNT times, and that large difference probably accounts for the gap.

Finally, lazy is the slowest. This is probably because the final to_a pulls each element back through the whole chain one at a time. Presumably lazy evaluation is simply not designed to be fast at producing a whole array. However, if you only need the first element rather than the whole array, it can be more than 10 times faster than one map.

Some say that Ruby's map is slow because it creates an array every time, but it is fast enough in practice, and in most cases it does not matter. Using map! to eliminate the allocation cost makes little difference. Merging the processing into a single block can give some speedup, but in ordinary code the gain is negligible. If anything, lazy is the one with limited uses: when you need the whole result, it is often slower. I think it is better not to use it unless you only want the first part rather than everything.
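The case where lazy pays off can be illustrated by counting block calls when only the first result is needed. This is a sketch with a hypothetical helper, calls_for_first; it counts calls and does not measure time.

```ruby
# Counts block calls when only the first `take` results are needed.
def calls_for_first(source, take)
  calls = 0
  result = source
    .map { |n| calls += 1; n * 2 }
    .map { |n| calls += 1; n + 1 }
    .first(take)
  [result, calls]
end

p calls_for_first((1..10_000).to_a, 1)  # => [[3], 20000]
p calls_for_first((1..10_000).lazy, 1)  # => [[3], 2]
```

The eager chain processes all 10,000 elements in both maps before first is applied, while the lazy chain stops after the first element has passed through both blocks.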


2022-09-30 16:38


