Using the method from your previous answer, I wrote a program that makes each column of the data stored in a CSV file start from its first value of 0.2 or more, but it takes a very long time to run and shows no sign of finishing even after a day. What's wrong?
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import matplotlib.pyplot as plt
from itertools import dropwhile
%matplotlib inline
data='1214.csv'
data=pd.read_csv(data)
data=np.array(data)
from itertools import dropwhile, zip_longest
d0 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 0])]
d1 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 1])]
d2 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 2])]
d3 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 3])]
d4 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 4])]
d5 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 5])]
d6 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 6])]
d7 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 7])]
d8 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 8])]
d9 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 9])]
d10 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 10])]
d11 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 11])]
d12 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 12])]
d13 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 13])]
d14 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 14])]
d15 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 15])]
NewData = np.array([d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d13, d14, d15])
print(NewData)
np.savetxt('1214-945.csv', NewData, fmt='%s', delimiter=',')
The column indices run from 0 to 15, which means 16 columns.
If the data has only 15 columns, the first of the following two lines fails with an IndexError and NewData is never created, so the two lines after them (print and np.savetxt) probably fail as well.
d15 = [x for x in dropwhile(lambda y: y < 0.2, data[:, 15])]
NewData = np.array([d0, d1, d2, d3, d4, d5, d6, d7, d8, d9, d10, d11, d13, d14, d15])
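One way to avoid the hard-coded index is to take the column count from the array itself. The following is only a minimal sketch, assuming data is a 2-D numeric array like the one produced by np.array(pd.read_csv(...)); the helper name drop_leading_below and the small sample array are made up for illustration.

```python
from itertools import dropwhile

import numpy as np


def drop_leading_below(data, threshold=0.2):
    """Apply dropwhile to every column that actually exists in `data`."""
    n_cols = data.shape[1]  # number of columns really present
    return [
        list(dropwhile(lambda y: y < threshold, data[:, i]))
        for i in range(n_cols)
    ]


# Hypothetical 3-row, 2-column sample standing in for the CSV contents.
sample = np.array([[0.1, 0.5],
                   [0.3, 0.1],
                   [0.1, 0.9]])
columns = drop_leading_below(sample)
print(len(columns))  # one list per column
```

Because the loop bound comes from data.shape[1], the same code runs whether the file has 15, 16, or 17 columns.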
Addendum:
For example, instead of reading the CSV file as below,
data='1214.csv'
data=pd.read_csv(data)
data=np.array(data)
even if you simulate it with random numbers like this, with 10 to 1000 rows of data in 17 columns, the processing itself completes.
data = []
for _ in range(1000):
    data.append(np.random.uniform(0, 1, 17))
data = np.array(data)
However, the warning below appears and the resulting NewData display looks odd, so I don't think it is the result you wanted.
Considering the previous Q&A, the rows and columns end up swapped, and the output is not valid CSV data because every line of the CSV file has [ and ] at its beginning and end.
Sample results (with 10 rows of data)
Warning
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
NewData
[list([0.8964406887196086, 0.1289855921390296, 0.9821250613116147, 0.7277624380256356, 0.597179945506056, 0.5883628593677697, 0.4664218180732566, 0.04616575288635895, 0.9831.80518863595, 0.9831.80288225)]
list ([0.81459734967449, 0.07064141016207148, 0.5683665463277392, 0.5798712084908151, 0.04482624113860545, 0.14102892907620534, 0.39305425436742103, 0.36942030150742966, 0.512619106492389, 0.67641181747)
list ([0.7591268875733456, 0.11453918216754855, 0.4647680491351407, 0.040091866778552476, 0.1429846985183626, 0.6627834295485381, 0.0903669991356253, 0.810514260026902, 0.9931642800156654)
list ([0.8625025870468651, 0.3509634752035785, 0.039771391720511695, 0.141294678559875, 0.9785141897412436, 0.7871799464751338, 0.4392150886107685, 0.8797371371672483, 0.8847149683486721])
list ([0.497940921882306, 0.9082257949953394, 0.5833805332031865, 0.47877232055889385, 0.08331201995212023, 0.4006076893255164, 0.47959612630114455, 0.5886402830771224, 0.9980262909358327, 0.52242107568902)
list ([0.27197735039392024, 0.575555020643584366, 0.6911546654769439, 0.6995193500503479, 0.11448204495653014, 0.28069236013054844, 0.2778123126787495, 0.8300432017199777, 0.5017304070162142, 0.118235735192451924519245192451945.]
list ([0.5923816654094234, 0.42428317216701694, 0.21513107003802912, 0.3246311643857014, 0.8250821738073256, 0.10344631269777493, 0.038894047272288956, 0.467108730492769, 0.94249296790788388388])
list ([0.930484556882399, 0.0949142996713167, 0.197681927245144, 0.6532485659646121, 0.07650063989252631, 0.4641428636069759, 0.46430807358621857, 0.8777035315326964, 0.807913312569, 0.27151654275405405]
list ([0.5686227354480835, 0.20979325360116918, 0.41795367528452854, 0.6860058585789381, 0.25969874636294243, 0.3285464978554288, 0.599596888394171, 0.4151960837685723, 0.99513226604745, 0.4685945]
list([0.3783579341561787, 0.6599424089628657, 0.6756485785361779, 0.6579640744721366, 0.48120694336419967, 0.6699040599082838, 0.47013107179808233, 0.7897054241420943, 0.257977685834, 0.69841537598415925]
list ([0.2119275007054927, 0.9073692839679351, 0.7542551775514874, 0.7304899190222118, 0.8934351608341778, 0.37590231018995, 0.06554942790036467, 0.3782961369793859, 0.24940028099595324])
list ([0.4792099212751727, 0.767858701604624, 0.5162174625431372, 0.019386808204984396, 0.8516704517647707, 0.6695238721486, 0.10275441692328746, 0.7262108016346217, 0.274663309461903, 0.511411021778779]
list ([0.4736295383577239, 0.5594249849728473, 0.13118135935938803988, 0.6614701297724721, 0.31594547319531097, 0.25538415218610466, 0.38813802091478633, 0.9779182451444076, 0.12926086118813, 0.716446993099)
list([0.3356971055614466, 0.9960279356408579, 0.9018106146850836, 0.7056349705349919, 0.8677843649824027, 0.713199273910345, 0.4500223204462691, 0.16791339066028255, 0.808338765780848, 0.06552978285185)
list ([0.8653470673875016, 0.7158305241304749, 0.9241323423093925, 0.1598556330050731, 0.4655566657028062, 0.3714871548975628, 0.30538053450909164, 0.14440821986341756, 0.9057775689030844, 0.555959481488)
list ([0.29634069084155035, 0.7219891196591361, 0.4799806341569959, 0.43549102434868503, 0.675185724281284, 0.5623786004405398, 0.24797232810209235, 0.755702079954496, 0.6220137098402305, 0.318744944944]
CSV file
[0.8964406887196086, 0.1289855921390296, 0.9821250613116147, 0.7277624380256356, 0.597179945506056, 0.5883628593677697, 0.46642181807325156, 0.046165752886358935, 0.9833551108804288,80265926]
[0.81459734967449, 0.07064141016207148, 0.5683665463277392, 0.5798712084908151, 0.04482624113860545, 0.14102892907620534, 0.39305425436742103, 0.36942030150742966, 0.512619106492389, 0.6764118175476419]
[0.7591268875733456, 0.11453918216754855, 0.4647680491351407, 0.040091866778552476, 0.1429846985183626, 0.6627834295485381, 0.09036694991356253, 0.810514260026902, 0.9931642800156654]
[0.8625025870468651, 0.3509634752035785, 0.039771391720511695, 0.141294678559875, 0.9785141897412436, 0.7871799464751338, 0.4392150886107685, 0.8797371371672483, 0.8847149683486721]
[0.497940921882306, 0.9082257949953394, 0.5833805332031865, 0.47877232055889385, 0.08331201995212023, 0.4006076893255164, 0.47959612630114945, 0.5886402830771224, 0.9980262909358327, 0.5224210756830902]
[0.27197735039392024, 0.5755502064358436, 0.6911546654769439, 0.6995193500503479, 0.11448204495653014, 0.28069236013054844, 0.27781231267874995, 0.8300432017199777, 0.5017304070162142, 0.11823577351924519]
[0.5923816654094234, 0.42428317216701694, 0.21513107003802912, 0.3246311643857014, 0.8250821738073256, 0.10344631269777493, 0.038894047272288956, 0.467108730492769, 0.9424929679078838]
[0.9304845568823439, 0.0949142996713167, 0.197681927245144, 0.6532485659646121, 0.07650063989252631, 0.4641428636069759, 0.4930807358621857, 0.8777035315326964, 0.8079133125265369, 0.27151654275407255]
[0.5686227354480835, 0.20979325360116918, 0.41795367528452854, 0.6860058585789381, 0.25969874636294243, 0.3285464977855428, 0.5999579688394171, 0.4151960837685723, 0.9951317222604745, 0.46859183299562845]
[0.3783579341561787, 0.6599424089628657, 0.6756485785361779, 0.6579640744721366, 0.48120694336419967, 0.6699040599082838, 0.47013107179808233, 0.7897054241420943, 0.2579579777685834, 0.6984153759229584]
[0.2119275007054927, 0.9073692839679351, 0.7542551775514874, 0.7304899190222118, 0.8934351608341778, 0.3759023101892295, 0.06554942790036467, 0.3782961369793859, 0.24940028099595324]
[0.4792099212751727, 0.19767858701604624, 0.5162174625431372, 0.019386808204984396, 0.8516704517647707, 0.669500238721486, 0.10275441692328746, 0.7262108016346217, 0.27466330949461903, 0.5114110217107879]
[0.4736295383577239, 0.5594249849728473, 0.13118135938803988, 0.6614701297724721, 0.31594547319531097, 0.25538415218610466, 0.38813802091478633, 0.9779182451444076, 0.1292926086118813, 0.7164415691309892]
[0.3356971055614466, 0.9960279356408579, 0.9018106146850836, 0.7056349705349919, 0.8677843649824027, 0.713199273910345, 0.4500223204462691, 0.16791339066028255, 0.8083384665780848, 0.06577552973825185]
[0.8653470673875016, 0.7158305241304749, 0.9241323423093925, 0.1598556330050731, 0.4655566657028062, 0.3714871548975628, 0.30538053450909164, 0.14440821986341756, 0.9057775689030844, 0.5559513690214988]
[0.29634069084155035, 0.7219891196591361, 0.4799806341569959, 0.43549102434868503, 0.675185724281284, 0.5623786004405398, 0.24797232810209235, 0.755702079954496, 0.6220137098402305, 0.31865744944572993]
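The odd display and the brackets in the file both follow from the lists having different lengths once the leading values are dropped. A small sketch with made-up numbers (the three short lists are hypothetical, not taken from the data above) shows what NumPy builds from such input when dtype=object is given explicitly, and what the '%s' format then produces for each element:

```python
import numpy as np

# Hypothetical columns after dropwhile: their lengths differ.
d0 = [0.9, 0.1, 0.8]
d1 = [0.5, 0.6]
d2 = [0.7]

# With dtype=object NumPy builds a 1-D array whose elements are Python lists,
# not a 2-D table of numbers.
ragged = np.array([d0, d1, d2], dtype=object)
print(ragged.shape)      # (3,)
print(ragged.dtype)      # object
print(type(ragged[0]))   # <class 'list'>

# Formatting one element with '%s' yields the list's text form, brackets included.
print('%s' % ragged[1])  # [0.5, 0.6]
```

Each element is a whole list, so writing it with fmt='%s' puts one bracketed list on each line instead of separate numeric fields.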
Workaround:
As shown above, there seem to be several problems, so if you want to process columns 0 to 15 in sequence, it would be better to use a for ... in range() loop and pandas, as shown below.
work = []
for i in range(16):
    work.append(list(dropwhile(lambda y: y < 0.2, data[:, i])))
df = pd.DataFrame(work).T.fillna('')
print(df)
df.to_csv('1214-945.csv', header=False, index=False)
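To see what the transpose and fillna('') steps do, here is a small worked sketch on a made-up 4-row, 3-column array (the values are hypothetical and only stand in for the CSV data):

```python
from itertools import dropwhile

import numpy as np
import pandas as pd

# Hypothetical 4-row, 3-column input standing in for the CSV data.
data = np.array([[0.1, 0.5, 0.1],
                 [0.3, 0.1, 0.1],
                 [0.1, 0.9, 0.4],
                 [0.7, 0.2, 0.1]])

work = [list(dropwhile(lambda y: y < 0.2, data[:, i]))
        for i in range(data.shape[1])]

# DataFrame(work) holds one row per original column, padded with NaN;
# .T turns those rows back into columns and fillna('') blanks the padding.
df = pd.DataFrame(work).T.fillna('')
print(df)
```

Each column keeps its values from the first one that is not below 0.2, and the shorter columns end in empty cells rather than bracketed lists.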
As for itertools.dropwhile(), it may take some time to run.
itertools.dropwhile(predicate, iterable)
Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element. Note, the iterator does not produce any output until the predicate first becomes false, so it may have a lengthy start-up time.
For example, if an array of type numpy.ndarray has 10^8 (100 million) elements and all elements are 0.1, all elements of the array will be scanned.
import timeit
from itertools import dropwhile
import numpy as np
z = np.array([0.1] * int(1e8))
print(timeit.timeit(
    'list(dropwhile(lambda y: y < 0.2, z))',
    globals=globals(), number=1))
=>
17.09322979053482 seconds
I don't know whether itertools.dropwhile() is responsible for "no sign of ending after a day", because I don't know your computer environment (CPU performance, memory size, etc.) or the actual data size.
itertools.dropwhile() can also be implemented with a simple for loop. You can also speed it up by using Numba: A High Performance Python Compiler.
from numba import njit
import numpy as np
import timeit
@njit
def dropwhile_njit(lst):
    # Return the slice starting at the first element that is not below 0.2.
    for i in range(len(lst)):
        if not (lst[i] < 0.2):
            return lst[i:]
    # Every element was below 0.2, so nothing is kept (same as dropwhile).
    return lst[:0]
z = np.array([0.1] * int(1e8))
print(timeit.timeit(
    'dropwhile_njit(z)',
    globals=globals(), number=1))
=>
0.20109811995644122 seconds
17.093 / 0.2011 ≈ 85
By the way, the Numba version is about 85 times faster (for a full scan of a giant array).
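If installing Numba is not an option, the same "drop the leading values" step can also be written with plain NumPy operations. This is a different technique from both itertools.dropwhile() and the Numba loop, shown only as a sketch; the function name dropwhile_numpy is made up, and its speed was not measured here.

```python
import numpy as np


def dropwhile_numpy(arr, threshold=0.2):
    """Drop leading elements while they are below `threshold`."""
    keep = ~(arr < threshold)      # True where the predicate is false
    if not keep.any():             # the predicate never became false
        return arr[:0]             # nothing is kept, like dropwhile
    first = int(np.argmax(keep))   # index of the first True
    return arr[first:]


z_small = np.array([0.1, 0.1, 0.5, 0.1, 0.9])
print(dropwhile_numpy(z_small))
```

Note that, unlike the early-exit loop, this version always builds a boolean mask over the whole array, so it trades extra memory for avoiding a Python-level loop.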