I want to extract the rows of a NumPy array that satisfy a certain condition.
I loaded the dataset with pandas and converted it to the following NumPy array:
[[0.2 0. 0. ... 0. 2. 2. ]
[0.2 0. 0. ... 0. 2. 2. ]
[0.55 0. 0. ... 0. 0. 0. ]
[0.43 0. 0. ... 0. 2. 2. ]
[0.21 0. 0. ... 0. 2. 1. ]
[0.37 0. 0. ... 0. 2. 2. ]]
I would like to split the data into separate arrays based on the last element (0, 1, or 2) of each row. How do I do that?
For example:
print(train0)
[[0.55 0. 0. ... 0. 0. 0. ]]
print(train1)
[[0.21 0. 0. ... 0. 2. 1. ]]
print(train2)
[[0.2 0. 0. ... 0. 2. 2. ]
[0.2 0. 0. ... 0. 2. 2. ]
[0.43 0. 0. ... 0. 2. 2. ]
[0.37 0. 0. ... 0. 2. 2. ]]
I think it can be done in a simple, brute-force way: loop over the rows and collect them into a dict keyed by the values to group on (here, the last three elements of each row).
>>> import numpy as np
>>> train = np.random.randint(0, 3, (15, 5))
>>> train
array([[1, 0, 2, 0, 2],
       [0, 1, 0, 1, 2],
       [0, 2, 0, 1, 0],
       [0, 2, 0, 0, 1],
       [2, 0, 0, 1, 0],
       [2, 2, 0, 0, 0],
       [2, 2, 0, 1, 0],
       [1, 2, 1, 0, 0],
       [1, 2, 2, 1, 1],
       [2, 2, 2, 0, 1],
       [0, 2, 0, 0, 2],
       [0, 0, 0, 1, 2],
       [2, 1, 2, 2, 0],
       [0, 1, 1, 2, 1],
       [0, 2, 2, 0, 0]])
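>>> t = dict()
>>> for row in train:
...     cond = tuple(row[-3:])  # key: the last three elements of the row
...     t[cond] = t.get(cond, [])
...     t[cond].append(row)
...
>>> for k, v in t.items():
...     t[k] = np.array(t[k])  # turn each list of rows back into a 2-D array
...     print('--', k)
...     print(t[k])
...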
-- (2, 0, 2)
[[1 0 2 0 2]]
-- (0, 1, 2)
[[0 1 0 1 2]
[0 0 0 1 2]]
-- (0, 1, 0)
[[0 2 0 1 0]
[2 0 0 1 0]
[2 2 0 1 0]]
-- (0, 0, 1)
[[0 2 0 0 1]]
-- (0, 0, 0)
[[2 2 0 0 0]]
-- (1, 0, 0)
[[1 2 1 0 0]]
-- (2, 1, 1)
[[1 2 2 1 1]]
-- (2, 0, 1)
[[2 2 2 0 1]]
-- (0, 0, 2)
[[0 2 0 0 2]]
-- (2, 2, 0)
[[2 1 2 2 0]]
-- (1, 2, 1)
[[0 1 1 2 1]]
-- (2, 0, 0)
[[0 2 2 0 0]]
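If only the last element of each row matters, as in the question, a boolean mask on the last column may be enough. The following is a minimal sketch under assumptions: the small `train` array is a hypothetical stand-in for the truncated data in the question, and the names `labels` and `groups` are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for the question's array (the real one is truncated).
# The last column holds the class value 0, 1, or 2.
train = np.array([
    [0.20, 0.0, 2.0, 2.0],
    [0.20, 0.0, 2.0, 2.0],
    [0.55, 0.0, 0.0, 0.0],
    [0.43, 0.0, 2.0, 2.0],
    [0.21, 0.0, 2.0, 1.0],
    [0.37, 0.0, 2.0, 2.0],
])

labels = train[:, -1]  # last element of every row

# One sub-array per distinct value in the last column.
groups = {int(v): train[labels == v] for v in np.unique(labels)}

train0, train1, train2 = groups[0], groups[1], groups[2]
print(train0)
print(train1)
print(train2)
```

Building a dict keyed by the class value avoids hard-coding the number of classes; unpacking into `train0`, `train1`, `train2` only works when all three values actually occur in the data.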
If the data is in a pandas DataFrame, I would create a column that combines the last three columns and then call groupby on that column.
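The pandas approach could look roughly like the sketch below. The column names `a` to `e`, the helper column `key`, and the seeded random data are assumptions made for illustration; they do not come from the original dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the example above: 15 rows, 5 columns of 0/1/2.
rng = np.random.default_rng(0)
train = rng.integers(0, 3, size=(15, 5))

# Column names are made up for this sketch.
df = pd.DataFrame(train, columns=["a", "b", "c", "d", "e"])

# Combine the last three columns into a single string key, e.g. "201".
df["key"] = df["c"].astype(str) + df["d"].astype(str) + df["e"].astype(str)

# Group on the combined key and convert each group back to a NumPy array.
groups = {
    key: group.drop(columns="key").to_numpy()
    for key, group in df.groupby("key")
}

for key, rows in groups.items():
    print("--", key)
    print(rows)
```

Concatenating the values as strings is unambiguous here only because every value is a single digit; for wider values, passing the list of columns directly, as in `df.groupby(["c", "d", "e"])`, avoids the helper column altogether.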