HOW TO SPEED UP MULTIPLE NESTED LOOPS

Asked 2 years ago, Updated 2 years ago, 32 views

The code below is the result of my own trial and error at speeding things up, but I'm afraid there is still room for improvement.
Could you tell me how to make it faster?

For reference, the real code has 17 nested loops (ab, cde, and so on), and most of the loop variables range over 3 values, so it evaluates about 40 million combinations.

for ab in range(3):
    for cde in range(2):
        for fg in range(3):
            for hi in range(3):
                Return = np.r_[Return_AB[ab],    # Return_AB contains (1, 41) np.arrays
                               Return_CDE[cde],  # Return_CDE contains (1, 41) np.arrays
                               Return_FG[fg],    # Return_FG contains (1, 41) np.arrays
                               Return_HI[hi]]    # Return_HI contains (1, 41) np.arrays
                Return_total = np.sum(Return, axis=0)
                Return_dif = Return_total - BM  # BM is a (1, 41) DataFrame
                Num0 = max(Num0_AB[ab], Num0_CDE[cde], Num0_FG[fg], Num0_HI[hi])  # a value from 4 to 8
                Win_Pro = (Return_dif.iloc[:, Num0:] > 0).sum(axis=1) / (Number_Date - Num0)
                if Win_Pro.item() < 1:
                    continue
                Cum_return = np.prod(Return_dif.iloc[:, Num0:] + 1, axis=1) - 1
                if Cum_return.item() < 0.1:
                    continue
                TE = Return_dif.iloc[:, Num0:].std(axis=1)
                Result.append([Win_Pro.item(), Cum_return.item(), TE.item(), Num0, ab, cde, fg, hi])

python python3

2022-09-30 16:36

1 Answer

For this problem, replacing the loops with vectorized NumPy or Pandas operations would require a lot of memory, so it is better to keep the loops as they are and compile them with Numba or Cython.
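To put a rough number on the memory concern, here is a back-of-the-envelope sketch. It assumes that a fully vectorized version would materialize one float64 row of 41 dates for each of the roughly 40 million combinations mentioned in the question; the variable names are illustrative only.

```python
# Assumed sizes, taken from the question: about 40 million combinations, 41 dates each.
n_combinations = 40_000_000
n_dates = 41
bytes_per_float64 = 8

# Size of a single (n_combinations, n_dates) float64 array.
total_bytes = n_combinations * n_dates * bytes_per_float64
total_gib = total_bytes / 2**30

print(f"{total_bytes:,} bytes, about {total_gib:.1f} GiB")
```

That is roughly 13 GB for one intermediate array, before counting any temporaries created by the comparisons and products.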

Numba is the easier of the two to use, so why not try it first?

import numba
import numpy as np

@numba.jit(nopython=True)
def calc(Return_AB, Return_CDE, Return_FG, Return_HI,
         Num0_AB, Num0_CDE, Num0_FG, Num0_HI, BM, Number_Date):
    # Inputs are passed as arguments (np.arrays and an int) rather than read from globals.
    NMAX = 10000000  # capacity of the result arrays; keep it larger than the number of rows stored
    Win_Pro = np.zeros(NMAX)
    Cum_return = np.zeros(NMAX)
    TE = np.zeros(NMAX)
    N = np.zeros(NMAX, dtype=np.int64)
    A = np.zeros((NMAX, 4), dtype=np.int64)  # one row of loop indices per stored result

    n = 0
    for ab in range(3):
        for cde in range(2):
            for fg in range(3):
                for hi in range(3):
                    Return_total = (Return_AB[ab]      # Return_AB[ab] is an np.array of shape (41,)
                                    + Return_CDE[cde]  # Return_CDE[cde] is an np.array of shape (41,)
                                    + Return_FG[fg]    # Return_FG[fg] is an np.array of shape (41,)
                                    + Return_HI[hi])   # Return_HI[hi] is an np.array of shape (41,)
                    # BM must be an np.array of shape (41,); a Pandas DataFrame or Series can be converted with .values
                    Return_dif = Return_total - BM
                    Num0 = max(Num0_AB[ab], Num0_CDE[cde], Num0_FG[fg], Num0_HI[hi])  # a value from 4 to 8
                    tail = Return_dif[Num0:]  # ndarray slicing replaces .iloc[:, Num0:]
                    Win_Pro[n] = (tail > 0).sum() / (Number_Date - Num0)
                    if Win_Pro[n] < 1:
                        continue
                    Cum_return[n] = np.prod(tail + 1) - 1
                    if Cum_return[n] < 0.1:
                        continue
                    TE[n] = tail.std()  # NumPy std uses ddof=0; the Pandas .std() in the question uses ddof=1
                    N[n] = Num0
                    A[n, 0] = ab
                    A[n, 1] = cde
                    A[n, 2] = fg
                    A[n, 3] = hi
                    n += 1

    return Win_Pro[:n], Cum_return[:n], TE[:n], N[:n], A[:n, :]
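One detail to watch when moving the TE calculation from Pandas to NumPy arrays: Pandas computes the sample standard deviation (ddof=1) by default, while NumPy computes the population standard deviation (ddof=0). The sketch below uses made-up values to show the difference and a rescaling that does not need the ddof argument, in case the compiled function cannot accept it.

```python
import numpy as np
import pandas as pd

# Hypothetical return differences for four dates.
values = np.array([0.01, -0.02, 0.03, 0.00])

pandas_std = pd.Series(values).std()  # Pandas default: ddof=1
numpy_std = values.std()              # NumPy default: ddof=0

# Rescale the ddof=0 result so that it matches the Pandas (ddof=1) value.
n = values.size
numpy_std_rescaled = numpy_std * np.sqrt(n / (n - 1))

print(pandas_std, numpy_std, numpy_std_rescaled)
```

If the results must match the original Pandas-based TE values exactly, apply a correction like this one; otherwise the ranking of combinations with the same Num0 is unaffected.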

In the original code, Return=np.r_[...] stacks the ndarrays before summing them, but I think it is faster to skip that step and add the arrays directly.
Also, Result was a list of lists, which is slow to process, so I changed it to preallocated ndarrays. Combining the returned arrays into a Pandas DataFrame with pd.concat makes them convenient for later use.
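As a small illustration of both points, the sketch below checks that direct addition gives the same totals as stacking with np.r_ and summing, and then assembles result arrays into one DataFrame. All data here is made up, and the column names are only a suggestion based on the variable names in the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical stand-ins for Return_AB[ab], Return_CDE[cde], ...: four (1, 41) arrays.
parts = [rng.normal(0.0, 0.01, size=(1, 41)) for _ in range(4)]

# Question's approach: np.r_ stacks the rows into a (4, 41) array, then sums over axis 0.
stacked = np.r_[parts[0], parts[1], parts[2], parts[3]]
total_via_stack = np.sum(stacked, axis=0)

# Answer's approach: add the rows directly, with no intermediate (4, 41) array.
total_direct = (parts[0] + parts[1] + parts[2] + parts[3])[0]
same_total = bool(np.allclose(total_via_stack, total_direct))

# Hypothetical outputs in the layout calc() returns: 1-D arrays plus an (n, 4) index array.
win_pro = np.array([1.0, 1.0])
cum_return = np.array([0.12, 0.25])
te = np.array([0.01, 0.02])
num0 = np.array([4, 6])
idx = np.array([[0, 1, 2, 0], [2, 0, 1, 1]])

# Join the scalar columns and the index columns side by side.
result = pd.concat(
    [
        pd.DataFrame({"Win_Pro": win_pro, "Cum_return": cum_return, "TE": te, "Num0": num0}),
        pd.DataFrame(idx, columns=["ab", "cde", "fg", "hi"]),
    ],
    axis=1,
)

print(same_total)
print(result)
```

The resulting columns follow the same order as the rows appended to Result in the question, so downstream code that expects that layout should need little change.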

If Numba is still slow, you can use Cython. The following official documentation will help you to use Cython.

·Cython Working with NumPy
·Pandas Enhancing Performance

Cython itself is not that difficult, but declaring the variable types takes time. For example, if you are using Jupyter Notebook, first load Cython's magic command:

%load_ext Cython

The following code will work for now:

%%cython
def calc(Return_AB, Return_CDE, Return_FG, Return_HI, Num0_AB, Num0_CDE, Num0_FG, Num0_HI, BM):
    # rest of the body omitted

Declaring the variable types will speed up the processing further.


2022-09-30 16:36


© 2024 OneMinuteCode. All rights reserved.