The data is a raw .txt file of water temperature observations. Opened in Excel, each record is contained in a single row (1 row, 9 columns).
Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
That is the content of the data.
I ran this in Jupyter.
The code is as follows.
import pandas as pd
f= pd.read_csv('test.txt',sep='delimiter',header=None, encoding='cp949', engine='python')
f2= f.drop([0,1,2,3,4],axis=0)
I imported pandas and read the data into f, then created f2 by dropping the leading metadata rows so that only the data lines remain.
Next, I want to split the data into columns like this:
'Observation time'
2015-04-01 00:15:00, 2015-04-01 00:30:00, 2015-04-01 00:45:00, 2015-04-01 01:00:00
'Water temperature (℃)'
20.6, 20.6, 20.6, 20.6
To split each row this way, I tried
f3= f2.split(' , ')
or
f3 = f2 ['Filming data'] = f2 ['Observation time', 'Water temperature', 'Salt', 'Electric conductivity'].str.split(' ',3)
The first split raised AttributeError: 'DataFrame' object has no attribute 'split'.
The second raised KeyError: ('Observation time', 'Water temperature', 'Salt', 'Electric conductivity') when I tried to select those columns.
What I want to do is split each line on commas, drop the RB and K fields, and combine the remaining values into a single DataFrame with one record per row.
python pandas jupyter-notebook
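For reference, the AttributeError appears because split is a string operation: pandas exposes it on a single column through the .str accessor, not on the DataFrame itself. The KeyError appears because f2 has no columns with those names. The following is a minimal sketch only, assuming f2 holds each raw line in one column labelled 0; the sample frame and the output column names (time, temperature, salinity, conductivity) are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for f2: a single column (label 0) holding each raw line,
# which is what reading with a separator that never matches produces.
f2 = pd.DataFrame({0: [
    "2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K",
    "2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K",
]})

# .str.split works on a Series (one column); expand=True returns one column per field.
parts = f2[0].str.split(",", expand=True)

# Keep positions 0, 1, 4 and 7; the RB and K fields are left out.
f3 = parts[[0, 1, 4, 7]].copy()
f3.columns = ["time", "temperature", "salinity", "conductivity"]  # assumed names

# The split pieces are still strings, so convert them explicitly.
f3["time"] = pd.to_datetime(f3["time"])
numeric_cols = ["temperature", "salinity", "conductivity"]
f3[numeric_cols] = f3[numeric_cols].astype(float)

print(f3)
```

Selecting by position avoids having to know the header text at all, which helps when the header line is in a different encoding or contains extra spaces.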
Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> text = '''Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K'''
>>> from io import StringIO
>>> with open("temp.csv", "wt", encoding="cp949") as f:
    f.write(text)
305
>>> with open("temp.csv", "rt", encoding="cp949") as f:
    text = f.read()
    print(text)
    print('-'*10)
    text = text.replace(",RB,K", "")
    print(text)
Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
----------
Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,17.31,25.7
2015-04-01 00:15:00,20.6,17.31,25.7
2015-04-01 00:30:00,20.6,17.31,25.7
2015-04-01 01:00:00,20.6,17.31,25.7
>>> import pandas as pd
>>> df = pd.read_csv(StringIO(text), skiprows=4)
>>> df
      Observation time   water temperature (℃)   salt (PSU)   electrical conductivity (ms/m)
0  2015-04-01 00:00:00                    20.6        17.31                             25.7
1  2015-04-01 00:15:00                    20.6        17.31                             25.7
2  2015-04-01 00:30:00                    20.6        17.31                             25.7
3  2015-04-01 01:00:00                    20.6        17.31                             25.7
>>>
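The interactive session above can also be written as a small script. This is only a sketch under the same assumptions as the answer (four metadata lines before the header, and every value followed by the literal fields ",RB,K"); the helper name load_observations and the shortened sample text are made up for illustration.

```python
from io import StringIO

import pandas as pd


def load_observations(raw_text):
    """Hypothetical helper: strip the ",RB,K" fields, then parse the rest as CSV."""
    cleaned = raw_text.replace(",RB,K", "")
    return pd.read_csv(
        StringIO(cleaned),
        skiprows=4,             # four metadata lines precede the header line
        skipinitialspace=True,  # ignore the space that follows each comma
        parse_dates=[0],        # parse the first column as timestamps
    )


# Shortened sample in the same layout as the file in the question.
sample = """Filming materials
Filming station = xx
Installation Location = xxxx
Observation item = xxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
"""

df = load_observations(sample)
print(df)
print(df.dtypes)
```

For a real file, the text would be read first with open(path, encoding="cp949") as in the session above and then passed to the helper.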
'''test.txt
TIME,TEMPER,rb,k,PSU,rb,k,MS/M,rb,k
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
'''
import pandas as pd
f = pd.read_csv('test.txt', sep=',', header=0, encoding='utf8', engine='python')  # header=0: the first line supplies the column names
print(f)
'''
                  TIME  TEMPER  rb  k    PSU rb.1 k.1  MS/M rb.2 k.2
0  2015-04-01 00:15:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
1  2015-04-01 00:30:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
2  2015-04-01 01:00:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
'''
f = f.drop(['rb','k','rb.1','k.1','rb.2','k.2'], axis=1)
print(f)
'''
                  TIME  TEMPER    PSU  MS/M
0  2015-04-01 00:15:00    20.6  17.31  25.7
1  2015-04-01 00:30:00    20.6  17.31  25.7
2  2015-04-01 01:00:00    20.6  17.31  25.7
'''
# REF: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Please refer to the official documentation.
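As a possible variation on the drop step, read_csv can skip the unwanted columns while reading by passing their positions to usecols. A minimal sketch, assuming the same test.txt layout as above; the data is held in a string here only so the example runs on its own.

```python
from io import StringIO

import pandas as pd

# Same layout as test.txt above, kept in memory for the example.
sample = """TIME,TEMPER,rb,k,PSU,rb,k,MS/M,rb,k
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
"""

# Positions 0, 1, 4 and 7 are TIME, TEMPER, PSU and MS/M;
# the rb and k columns are never loaded.
df = pd.read_csv(StringIO(sample), usecols=[0, 1, 4, 7])
print(df)
```

This avoids depending on the renamed duplicates (rb.1, k.1, and so on) that pandas generates for repeated header names.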