I would like to separate the letters and numbers in a pandas DataFrame column into new columns.

Hello, I'm a beginner learning pandas DataFrames in Python. I want to separate the letters and numbers in a specific column of a df. From what I found, this seems to be done with the re module (import re), but I don't know what to do when a value mixes numbers and letters.

Given a df like the one below, I want to drop the leading "e" from idx4 and split the remaining letters and numbers into new columns idx5, idx6, and idx7.

a1  idx3       idx4
b   ex1.x.1     ex100f
c   ex1.x       ew200
d   ex2.x.2     ed200
e   ex2.x.3 
f   ex1.x.2 
g   ex2.x       ed200f
h   ex2.x.4     ed300f
i   ex2.x.5 
j   ex1.x.3     ew200
k   ex3.x   
l   ex2.x.6     ex300
m   ex2.x.7     ed200f

I want to turn the df above into the following:

a1  idx3     idx4   idx5    idx6    idx7
b   ex1.x.1  ex100f   x      100        f
c   ex1.x    ew200    w      200    
d   ex2.x.2  ed200    d      200    
e   ex2.x.3             
f   ex1.x.2             
g   ex2.x    ed200f   d      200        f
h   ex2.x.4  ed300f   d      300        f
i   ex2.x.5             
j   ex1.x.3  ew200    w      200    
k   ex3.x               
l   ex2.x.6  ex300    x      300    
m   ex2.x.7  ed200f   d      200        f

python dataframe

2022-09-20 15:11

1 Answer

In a very simple case, where every value has the same layout, plain string slicing is enough, as shown below. If the data is more irregular, you will need to write a parsing function and apply it to the column.

First, here is a very simple example.

>>> import pandas as pd
>>> df = pd.DataFrame({"col":["ex100f", "ew200", "ed200", None, "ed200f", "ew200"] })
>>> df
      col
0  ex100f
1   ew200
2   ed200
3    None
4  ed200f
5   ew200
>>> df['a'] = df['col'].str[1]
>>> df
      col     a
0  ex100f     x
1   ew200     w
2   ed200     d
3    None  None
4  ed200f     d
5   ew200     w
>>> df['b'] = df['col'].str[2:5]
>>> df
      col     a     b
0  ex100f     x   100
1   ew200     w   200
2   ed200     d   200
3    None  None  None
4  ed200f     d   200
5   ew200     w   200
>>> df['c'] = df['col'].str[5:]
>>> df
      col     a     b     c
0  ex100f     x   100     f
1   ew200     w   200      
2   ed200     d   200      
3    None  None  None  None
4  ed200f     d   200     f
5   ew200     w   200
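
The slicing above only works while every value has exactly one letter after the "e" and exactly three digits. For less regular values, a regular expression can be a reasonable next step. The following is a minimal sketch, not the only way to do it: it uses pandas' Series.str.extract with named groups, and the pattern itself is an assumption about the data (a leading "e", then one or more letters, then one or more digits, then optional trailing letters). The column names idx4 to idx7 are taken from the question.

```python
import pandas as pd

# Sample values taken from the idx4 column in the question; None stands for the empty cells.
df = pd.DataFrame({
    "idx4": ["ex100f", "ew200", "ed200", None, None, "ed200f"],
})

# Assumed layout: leading "e", letters, digits, optional trailing letters.
# Each named group becomes a column with the same name.
pattern = r"^e(?P<idx5>[A-Za-z]+)(?P<idx6>\d+)(?P<idx7>[A-Za-z]*)$"

# str.extract returns a DataFrame with one column per named group;
# rows that are missing or do not match the pattern come back as missing values.
parts = df["idx4"].str.extract(pattern)

# Attach the extracted columns to the original frame.
df = df.join(parts)
print(df)
```

The extracted idx6 values are still strings. If numbers are needed, pd.to_numeric(df["idx6"]) should convert them, with the missing rows staying missing.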

2022-09-20 15:11
