It ran smoothly with two columns, but as soon as I used 17 columns, it stopped finishing.

Asked 2 years ago, Updated 2 years ago, 100 views

Using the approach from your previous answer, I wrote a program that, for each column of data stored in a CSV file, keeps the values starting from the first one that is 0.2 or greater. However, it takes a very long time to execute and shows no sign of finishing even after a day. What is wrong?

import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import matplotlib.pyplot as plt
from itertools import dropwhile
%matplotlib inline

data='1214.csv'

data=pd.read_csv(data)
data=np.array(data)

from itertools import dropwhile, zip_longest

d0 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 0])]
d1 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 1])]
d2 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 2])]
d3 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 3])]
d4 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 4])]
d5 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 5])]
d6 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 6])]
d7 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 7])]
d8 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 8])]
d9 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 9])]
d10 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 10])]
d11 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 11])]
d12 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 12])]
d13 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 13])]
d14 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 14])]
d15 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 15])]
NewData=np.array ([d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d13, d14, d15]) 
print(NewData)
np.savetxt('1214-945.csv', NewData, fmt='%s', delimiter=',')

python python3 jupyter-notebook

2022-09-30 19:10

2 Answers

The column index ranges from 0 to 15 for 16 columns.

If the data has only 15 columns, the following two lines will fail: data[:,15] raises an IndexError, so NewData is never created, and the print and np.savetxt lines that come after it will probably fail as well.

d15 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 15])]
NewData=np.array ([d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d13, d14, d15])
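A minimal sketch of how to check the column count before indexing. The 5x15 array of zeros is a hypothetical stand-in for the real CSV contents, which are not shown in the question.

```python
import numpy as np

# Hypothetical stand-in for the CSV contents: 5 rows, 15 columns
data = np.zeros((5, 15))

n_cols = data.shape[1]  # number of columns actually present
print(n_cols)

try:
    data[:, 15]  # valid column indices are 0..14 here
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)

# Looping over range(n_cols) never goes past the last column
columns = [data[:, i] for i in range(n_cols)]
```

Printing data.shape right after loading the file shows whether index 15 is valid for the actual data.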

Addition:

For example, instead of reading the CSV file like this,

data='1214.csv'

data=pd.read_csv(data)
data=np.array(data)

you can simulate 10 to 1000 rows of 17-column data with random numbers as shown below, and even then the processing itself runs to completion.

data=[]
for _ in range(1000):
    data.append(np.random.uniform(0,1,17))

data=np.array(data)

However, the following warning appears and the resulting NewData looks strange when displayed, so I don't think it is the result you wanted.

Displayed Warning
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.

Sample Results (with 10 rows of data)
Compared with the previous Q&A, the rows and columns are transposed, and each line of the CSV file has [ and ] at its beginning and end, so the output is not valid CSV data.

 [list([0.8964406887196086, 0.1289855921390296, 0.9821250613116147, 0.7277624380256356, 0.597179945506056, 0.5883628593677697, 0.4664218180732566, 0.04616575288635895, 0.9831.80518863595, 0.9831.80288225)]
 list ([0.81459734967449, 0.07064141016207148, 0.5683665463277392, 0.5798712084908151, 0.04482624113860545, 0.14102892907620534, 0.39305425436742103, 0.36942030150742966, 0.512619106492389, 0.67641181747)
 list ([0.7591268875733456, 0.11453918216754855, 0.4647680491351407, 0.040091866778552476, 0.1429846985183626, 0.6627834295485381, 0.0903669991356253, 0.810514260026902, 0.9931642800156654)
 list ([0.8625025870468651, 0.3509634752035785, 0.039771391720511695, 0.141294678559875, 0.9785141897412436, 0.7871799464751338, 0.4392150886107685, 0.8797371371672483, 0.8847149683486721])
 list ([0.497940921882306, 0.9082257949953394, 0.5833805332031865, 0.47877232055889385, 0.08331201995212023, 0.4006076893255164, 0.47959612630114455, 0.5886402830771224, 0.9980262909358327, 0.52242107568902)
 list ([0.27197735039392024, 0.575555020643584366, 0.6911546654769439, 0.6995193500503479, 0.11448204495653014, 0.28069236013054844, 0.2778123126787495, 0.8300432017199777, 0.5017304070162142, 0.118235735192451924519245192451945.]
 list ([0.5923816654094234, 0.42428317216701694, 0.21513107003802912, 0.3246311643857014, 0.8250821738073256, 0.10344631269777493, 0.038894047272288956, 0.467108730492769, 0.94249296790788388388])
 list ([0.930484556882399, 0.0949142996713167, 0.197681927245144, 0.6532485659646121, 0.07650063989252631, 0.4641428636069759, 0.46430807358621857, 0.8777035315326964, 0.807913312569, 0.27151654275405405]
 list ([0.5686227354480835, 0.20979325360116918, 0.41795367528452854, 0.6860058585789381, 0.25969874636294243, 0.3285464978554288, 0.599596888394171, 0.4151960837685723, 0.99513226604745, 0.4685945]
 list([0.3783579341561787, 0.6599424089628657, 0.6756485785361779, 0.6579640744721366, 0.48120694336419967, 0.6699040599082838, 0.47013107179808233, 0.7897054241420943, 0.257977685834, 0.69841537598415925]
 list ([0.2119275007054927, 0.9073692839679351, 0.7542551775514874, 0.7304899190222118, 0.8934351608341778, 0.37590231018995, 0.06554942790036467, 0.3782961369793859, 0.24940028099595324])
 list ([0.4792099212751727, 0.767858701604624, 0.5162174625431372, 0.019386808204984396, 0.8516704517647707, 0.6695238721486, 0.10275441692328746, 0.7262108016346217, 0.274663309461903, 0.511411021778779]
 list ([0.4736295383577239, 0.5594249849728473, 0.13118135935938803988, 0.6614701297724721, 0.31594547319531097, 0.25538415218610466, 0.38813802091478633, 0.9779182451444076, 0.12926086118813, 0.716446993099)
 list([0.3356971055614466, 0.9960279356408579, 0.9018106146850836, 0.7056349705349919, 0.8677843649824027, 0.713199273910345, 0.4500223204462691, 0.16791339066028255, 0.808338765780848, 0.06552978285185)
 list ([0.8653470673875016, 0.7158305241304749, 0.9241323423093925, 0.1598556330050731, 0.4655566657028062, 0.3714871548975628, 0.30538053450909164, 0.14440821986341756, 0.9057775689030844, 0.555959481488)
 list ([0.29634069084155035, 0.7219891196591361, 0.4799806341569959, 0.43549102434868503, 0.675185724281284, 0.5623786004405398, 0.24797232810209235, 0.755702079954496, 0.6220137098402305, 0.318744944944]

CSV Files

[0.8964406887196086, 0.1289855921390296, 0.9821250613116147, 0.7277624380256356, 0.597179945506056, 0.5883628593677697, 0.46642181807325156, 0.046165752886358935, 0.9833551108804288,80265926]
[0.81459734967449, 0.07064141016207148, 0.5683665463277392, 0.5798712084908151, 0.04482624113860545, 0.14102892907620534, 0.39305425436742103, 0.36942030150742966, 0.512619106492389, 0.6764118175476419]
[0.7591268875733456, 0.11453918216754855, 0.4647680491351407, 0.040091866778552476, 0.1429846985183626, 0.6627834295485381, 0.09036694991356253, 0.810514260026902, 0.9931642800156654]
[0.8625025870468651, 0.3509634752035785, 0.039771391720511695, 0.141294678559875, 0.9785141897412436, 0.7871799464751338, 0.4392150886107685, 0.8797371371672483, 0.8847149683486721]
[0.497940921882306, 0.9082257949953394, 0.5833805332031865, 0.47877232055889385, 0.08331201995212023, 0.4006076893255164, 0.47959612630114945, 0.5886402830771224, 0.9980262909358327, 0.5224210756830902]
[0.27197735039392024, 0.5755502064358436, 0.6911546654769439, 0.6995193500503479, 0.11448204495653014, 0.28069236013054844, 0.27781231267874995, 0.8300432017199777, 0.5017304070162142, 0.11823577351924519]
[0.5923816654094234, 0.42428317216701694, 0.21513107003802912, 0.3246311643857014, 0.8250821738073256, 0.10344631269777493, 0.038894047272288956, 0.467108730492769, 0.9424929679078838]
[0.9304845568823439, 0.0949142996713167, 0.197681927245144, 0.6532485659646121, 0.07650063989252631, 0.4641428636069759, 0.4930807358621857, 0.8777035315326964, 0.8079133125265369, 0.27151654275407255]
[0.5686227354480835, 0.20979325360116918, 0.41795367528452854, 0.6860058585789381, 0.25969874636294243, 0.3285464977855428, 0.5999579688394171, 0.4151960837685723, 0.9951317222604745, 0.46859183299562845]
[0.3783579341561787, 0.6599424089628657, 0.6756485785361779, 0.6579640744721366, 0.48120694336419967, 0.6699040599082838, 0.47013107179808233, 0.7897054241420943, 0.2579579777685834, 0.6984153759229584]
[0.2119275007054927, 0.9073692839679351, 0.7542551775514874, 0.7304899190222118, 0.8934351608341778, 0.3759023101892295, 0.06554942790036467, 0.3782961369793859, 0.24940028099595324]
[0.4792099212751727, 0.19767858701604624, 0.5162174625431372, 0.019386808204984396, 0.8516704517647707, 0.669500238721486, 0.10275441692328746, 0.7262108016346217, 0.27466330949461903, 0.5114110217107879]
[0.4736295383577239, 0.5594249849728473, 0.13118135938803988, 0.6614701297724721, 0.31594547319531097, 0.25538415218610466, 0.38813802091478633, 0.9779182451444076, 0.1292926086118813, 0.7164415691309892]
[0.3356971055614466, 0.9960279356408579, 0.9018106146850836, 0.7056349705349919, 0.8677843649824027, 0.713199273910345, 0.4500223204462691, 0.16791339066028255, 0.8083384665780848, 0.06577552973825185]
[0.8653470673875016, 0.7158305241304749, 0.9241323423093925, 0.1598556330050731, 0.4655566657028062, 0.3714871548975628, 0.30538053450909164, 0.14440821986341756, 0.9057775689030844, 0.5559513690214988]
[0.29634069084155035, 0.7219891196591361, 0.4799806341569959, 0.43549102434868503, 0.675185724281284, 0.5623786004405398, 0.24797232810209235, 0.755702079954496, 0.6220137098402305, 0.31865744944572993]
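The warning and the list(...) entries in the output above both come from passing lists of different lengths to np.array. The sketch below reproduces that shape of result on two short hypothetical lists; d0 and d1 here are made-up values, not the real data. It passes dtype=object explicitly, because recent NumPy versions raise a ValueError instead of a warning when it is omitted.

```python
import numpy as np

# Hypothetical columns after dropwhile: the lengths differ
d0 = [0.5, 0.1, 0.3]
d1 = [0.4, 0.9]

# dtype=object states explicitly that each entry is a Python list
ragged = np.array([d0, d1], dtype=object)

print(ragged.shape)  # one-dimensional: one list per entry
print(ragged)
```

Because the result is a one-dimensional array of list objects rather than a two-dimensional table, np.savetxt with fmt='%s' writes each list's text form, brackets included, on its own line.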

Workaround:

As described above, there seem to be several problems, so if you want to process columns 0 to 15 in sequence, it would be better to use a for ... in range() loop and pandas, as shown below.

work=[]
for i in range (16):
    work.append(list(dropwhile(lambda y: y < 0.2, data[:, i])))

df = pd.DataFrame(work).T.fillna('')

print(df)
df.to_csv('1214-945.csv', header=False, index=False)
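To see what this workaround produces, here is a small self-contained sketch on a hypothetical 3x2 array (the values are made up for illustration). It skips fillna('') and relies on to_csv writing missing values as empty fields by default, and it returns the CSV text as a string instead of writing a file.

```python
from itertools import dropwhile

import numpy as np
import pandas as pd

# Hypothetical 3x2 stand-in for the CSV contents
data = np.array([[0.1, 0.3],
                 [0.5, 0.1],
                 [0.1, 0.4]])

work = []
for i in range(data.shape[1]):  # use the real column count instead of a fixed 16
    work.append(list(dropwhile(lambda y: y < 0.2, data[:, i])))

# DataFrame pads the shorter lists with NaN; .T puts each list back into a column
df = pd.DataFrame(work).T

csv_text = df.to_csv(header=False, index=False)
print(csv_text)
```

Each original column stays a column, and columns that lost leading values end with empty fields rather than brackets.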


2022-09-30 19:10

If you use itertools.dropwhile(), it may take some time to run.

itertools.dropwhile(predicate, iterable)

Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element. Note, the iterator does not produce any output until the predicate first becomes false, so it may have a lengthy start-up time.
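A short illustration of that behavior on made-up values: only the leading run of elements below 0.2 is dropped, and later small values are kept.

```python
from itertools import dropwhile

values = [0.1, 0.05, 0.3, 0.1, 0.5]

# Drops 0.1 and 0.05, then returns everything from 0.3 onward
kept = list(dropwhile(lambda y: y < 0.2, values))
print(kept)  # [0.3, 0.1, 0.5]

# If the predicate never becomes false, the whole input is consumed
all_small = list(dropwhile(lambda y: y < 0.2, [0.1, 0.1, 0.1]))
print(all_small)  # []
```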

For example, if an array of type numpy.ndarray has 10^8 (100 million) elements and all elements are 0.1, all elements of the array will be scanned.

import timeit
from itertools import dropwhile
import numpy as np

z = np.array([0.1] * int(1e8))

print(timeit.timeit(
  'list(dropwhile(lambda y: y < 0.2, z))',
  globals=globals(), number=1))
=>
17.09322979053482 seconds

I don't know if itertools.dropwhile() is responsible for "no sign of ending after a day" because I don't know your computer environment (CPU performance, memory size, etc.) or the actual data size.

By the way, itertools.dropwhile() can also be implemented with a simple for loop. You can speed that loop up further by using Numba (A High Performance Python Compiler).

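from numba import njit
import numpy as np
import timeit

@njit
def dropwhile_njit(lst):
  # Return the slice starting at the first element that is not < 0.2
  for i in range(len(lst)):
    if not (lst[i] < 0.2):
      return lst[i:]
  # Every element was < 0.2 (or the input was empty), so nothing is kept
  return lst[:0]

z = np.array([0.1] * int(1e8))

print(timeit.timeit(
  'dropwhile_njit(z)',
  globals=globals(), number=1))

=>
0.20109811995644122 seconds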

The Numba version is about 85 times faster (for a full scan of a giant array).

17.09322979053482 / 0.20109811995644122 ≈ 85
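If installing Numba is not an option, plain NumPy can do the same job without a Python-level loop. This is a different technique from the one in this answer, offered only as a sketch: dropwhile_numpy and its threshold parameter are hypothetical names, and no timing was measured for it.

```python
import numpy as np

def dropwhile_numpy(arr, threshold=0.2):
    """Return arr from the first element that is not below threshold."""
    # Mirrors "not (x < threshold)", so NaN values count as elements to keep
    keep = ~(arr < threshold)
    if not keep.any():
        # Every element is below the threshold: nothing is kept
        return arr[:0]
    # argmax on a boolean array gives the index of the first True
    return arr[np.argmax(keep):]

sample = np.array([0.1, 0.05, 0.3, 0.1, 0.5])
result = dropwhile_numpy(sample)
print(result)

empty = dropwhile_numpy(np.array([0.1, 0.1, 0.1]))
print(len(empty))
```

Unlike the loop versions, this builds a full boolean mask before slicing, so it always scans the whole array even when the first element already passes.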


2022-09-30 19:10



© 2024 OneMinuteCode. All rights reserved.