First of all, the input is a plain .txt file. Its contents are raw data, and when opened in Excel everything ends up in a single row (1 row, 9 columns).
Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
That is the content of the data.
I ran this in Jupyter.
The code is as follows.
import pandas as pd
f= pd.read_csv('test.txt',sep='delimiter',header=None, encoding='cp949', engine='python')
f2= f.drop([0,1,2,3,4],axis=0)
I imported pandas, read the data into f, and then created f2 by dropping the first five rows so that only the data rows remain.
Next, I want to split each row into columns like this:
Observation time: 2015-04-01 00:15:00, 2015-04-01 00:30:00, 2015-04-01 00:45:00, 2015-04-01 01:00:00
Water temperature (℃): 20.6, 20.6, 20.6, 20.6
To do that, I tried
f3= f2.split(' , ')
or
f3 = f2 ['Filming data'] = f2 ['Observation time', 'Water temperature', 'Salt', 'Electric conductivity'].str.split(' ',3)
The first split fails with AttributeError: 'DataFrame' object has no attribute 'split'.
The second attempt, which tries to build the columns by name, fails with KeyError: ('Observation time', 'Water temperature', 'Salt', 'Electric conductivity').
What I want is to split each line on commas, drop the RB and K fields, and combine the remaining values into a single table with one observation per row.
python pandas jupyter-notebook
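The AttributeError above comes from calling split on the whole DataFrame; string methods are only available per column, through the .str accessor of a Series. A minimal sketch of that idea follows, assuming f2 has a single string column labelled 0 (which is what header=None with a non-matching separator produces). The small f2 built here is a stand-in for the question's data, and the output column names are hypothetical.

```python
import pandas as pd

# Stand-in for the question's f2: one string column (label 0), one raw line per row.
f2 = pd.DataFrame({0: [
    "2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K",
    "2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K",
]})

# Split the single column on commas; expand=True returns one column per field (0..9).
parts = f2[0].str.split(",", expand=True)

# Keep the timestamp and the three measurements; positions 2, 3, 5, 6, 8, 9 hold RB/K.
f3 = parts[[0, 1, 4, 7]].copy()
f3.columns = ["time", "water_temp_c", "salinity_psu", "conductivity"]  # hypothetical names

# The split values are still strings, so convert the measurements to floats.
f3 = f3.astype({"water_temp_c": float, "salinity_psu": float, "conductivity": float})
print(f3)
```

The KeyError in the second attempt has a related cause: f2 has no columns with those names, only the single column 0, so selecting by those labels cannot succeed.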
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> text = '''Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K'''
>>> from io import StringIO
>>> with open("temp.csv", "wt", encoding="cp949") as f:
    f.write(text)
305
>>> with open("temp.csv", "rt", encoding="cp949") as f:
    text = f.read()
    print(text)
    print('-'*10)
    text = text.replace(",RB,K", "")
    print(text)
Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
----------
Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,17.31,25.7
2015-04-01 00:15:00,20.6,17.31,25.7
2015-04-01 00:30:00,20.6,17.31,25.7
2015-04-01 01:00:00,20.6,17.31,25.7
>>> import pandas as pd
>>> df = pd.read_csv(StringIO(text), skiprows=4)
>>> df
      Observation time  Water temperature (℃)  Salt (PSU)  Electrical conductivity (ms/m)
0  2015-04-01 00:00:00                   20.6       17.31                            25.7
1  2015-04-01 00:15:00                   20.6       17.31                            25.7
2  2015-04-01 00:30:00                   20.6       17.31                            25.7
3  2015-04-01 01:00:00                   20.6       17.31                            25.7
>>>
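As a variation on the transcript above, the RB/K fields can also be skipped while reading, without rewriting the text first. This is only a sketch under a few assumptions: the file always has four metadata lines and one header line, the measurements always sit at field positions 0, 1, 4 and 7, and the column names used here are hypothetical replacements for the original header.

```python
from io import StringIO

import pandas as pd

# Sample with the same layout as the question's file: 4 metadata lines,
# 1 header line, then data rows with 10 comma-separated fields each.
raw = """Filming materials
Filming station = xx
Installation Location = xxxx
Observation item = xxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
"""

df = pd.read_csv(
    StringIO(raw),           # for the real file: "test.txt", encoding="cp949"
    skiprows=5,              # skip the 4 metadata lines and the original header line
    header=None,             # the remaining lines are all data
    usecols=[0, 1, 4, 7],    # timestamp plus the three measurements; RB/K are never loaded
    names=["time", "water_temp_c", "salinity_psu", "conductivity"],  # hypothetical names
    parse_dates=["time"],    # turn the timestamp strings into datetimes
)
print(df)
```

Selecting by position is brittle if the file layout changes, so the text-replacement approach above may be the safer choice when the flag fields are not guaranteed to stay in the same places.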
'''test.txt
TIME,TEMPER,rb,k,PSU,rb,k,MS/M,rb,k
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
'''
import pandas as pd
f = pd.read_csv('test.txt', sep=',', header=0, encoding='utf8', engine='python')  # header=0: use the first line as column names
print(f)
'''
                  TIME  TEMPER  rb  k    PSU rb.1 k.1  MS/M rb.2 k.2
0  2015-04-01 00:15:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
1  2015-04-01 00:30:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
2  2015-04-01 01:00:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
'''
f = f.drop(['rb','k','rb.1','k.1','rb.2','k.2'], axis=1)
print(f)
'''
                  TIME  TEMPER    PSU  MS/M
0  2015-04-01 00:15:00    20.6  17.31  25.7
1  2015-04-01 00:30:00    20.6  17.31  25.7
2  2015-04-01 01:00:00    20.6  17.31  25.7
'''
# REF: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Please refer to the official documentation.
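Listing every rb/k column by hand works for this file, but the list has to change whenever the number of measurements does. A possible alternative, sketched here on the assumption that the flag columns are always named rb or k (plus the .1, .2 suffixes pandas adds to duplicate header names), is to filter the columns by a pattern. The names is_flag and kept are only illustrative.

```python
from io import StringIO

import pandas as pd

# Same layout as the test.txt shown above, held in memory for the example.
raw = """TIME,TEMPER,rb,k,PSU,rb,k,MS/M,rb,k
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
"""

# Duplicate header names are renamed on read: rb, rb.1, rb.2 and k, k.1, k.2.
f = pd.read_csv(StringIO(raw), header=0)

# True for columns named rb or k, with or without a numeric suffix.
is_flag = f.columns.str.match(r"^(rb|k)(\.\d+)?$")

# Keep only the columns that are not flags.
kept = f.loc[:, ~is_flag]
print(kept)
```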