I have a question about handling raw data in Python.

First of all, the file is a txt file. Its contents are raw data, and in Excel they all sit in a single row (1 row, 9 columns).

 Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K

That is the content of the data.

I am working in Jupyter.

The code is as follows.

import pandas as pd
f= pd.read_csv('test.txt',sep='delimiter',header=None, encoding='cp949', engine='python')
f2= f.drop([0,1,2,3,4],axis=0)

I imported pandas and read the data into f, then created f2 by dropping the leading rows so that only the data remains. Next, I want to split each row into columns, so that 'Observation time' holds 2015-04-01 00:15:00, 2015-04-01 00:30:00, 2015-04-01 00:45:00, 2015-04-01 01:00:00 and 'water temperature (℃)' holds 20.6, 20.6, 20.6, 20.6, and so on. To split the rows this way, I tried:

f3= f2.split(' , ')
or
f3 = f2 ['Filming data'] = f2 ['Observation time', 'Water temperature', 'Salt', 'Electric conductivity'].str.split(' ',3)

The first split raised AttributeError: 'DataFrame' object has no attribute 'split'. The second attempt, where I tried to build the columns, raised KeyError: ('Observation time', 'Water temperature', 'Salt', 'Electric conductivity').

What I want to do is split each row on the commas, drop the RB and K fields, and combine the remaining values into a single DataFrame with one observation per row and one value per column.


python
pandas
jupyter-notebook

2022-09-20 20:47

2 Answers

Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> text = '''Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K'''
>>> from io import StringIO

>>> with open("temp.csv", "wt", encoding="cp949") as f:
    f.write(text)


305

>>> with open("temp.csv", "rt", encoding="cp949") as f:
    text = f.read()
    print(text)
    print('-'*10)
    text = text.replace(",RB,K", "")
    print(text)


Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
----------
Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,17.31,25.7
2015-04-01 00:15:00,20.6,17.31,25.7
2015-04-01 00:30:00,20.6,17.31,25.7
2015-04-01 01:00:00,20.6,17.31,25.7

>>> import pandas as pd

>>> df = pd.read_csv(StringIO(text), skiprows=4)
>>> df
      Observation time   water temperature (℃)   salt (PSU)   electrical conductivity (ms/m)
0  2015-04-01 00:00:00                    20.6        17.31                             25.7
1  2015-04-01 00:15:00                    20.6        17.31                             25.7
2  2015-04-01 00:30:00                    20.6        17.31                             25.7
3  2015-04-01 01:00:00                    20.6        17.31                             25.7
>>> 
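
For reference, the same steps can be written as a plain script without the temporary file. This is only a sketch under a few assumptions: the sample text is pasted inline instead of being read from the real file, every quality flag is literally ",RB,K", and skipinitialspace and parse_dates are additions that were not in the session above.

```python
from io import StringIO

import pandas as pd

# Sample text copied from the question; in practice, read it from the file instead.
raw = """Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K"""

# Assumption: every flag pair is literally ",RB,K".
# Other flag codes would need a different cleanup step.
cleaned = raw.replace(",RB,K", "")

df = pd.read_csv(
    StringIO(cleaned),
    skiprows=4,                        # the four metadata lines above the header
    skipinitialspace=True,             # drop the space after each comma in the header
    parse_dates=["Observation time"],  # turn the first column into datetimes
)

print(df.dtypes)
print(df)
```

With skipinitialspace the column names come out without a leading space, so df["salt (PSU)"] works directly.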

2022-09-20 20:47

'''test.txt
TIME,TEMPER,rb,k,PSU,rb,k,MS/M,rb,k
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
'''
import pandas as pd
f = pd.read_csv('test.txt', sep=',', encoding='utf8', engine='python')  # the first line is the header, so header=None is not passed
print(f)
'''
                  TIME  TEMPER  rb  k    PSU rb.1 k.1  MS/M rb.2 k.2
0  2015-04-01 00:15:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
1  2015-04-01 00:30:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
2  2015-04-01 01:00:00    20.6  RB  K  17.31   RB   K  25.7   RB   K
'''
f = f.drop(['rb','k','rb.1','k.1','rb.2','k.2'], axis=1)
print(f)
'''
                  TIME  TEMPER    PSU  MS/M
0  2015-04-01 00:15:00    20.6  17.31  25.7
1  2015-04-01 00:30:00    20.6  17.31  25.7
2  2015-04-01 01:00:00    20.6  17.31  25.7
'''
# REF: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
Please refer to the official documentation.
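
An alternative that avoids listing the flag columns by name, sketched against the file layout shown in the question (four metadata lines, then a header row that names only 4 of the 10 fields). The column positions 0, 1, 4, 7 and the short column names are assumptions made for this example, and the sample is read from a string rather than from the real file.

```python
from io import StringIO

import pandas as pd

# Sample in the layout from the question: 4 metadata lines, 1 header line, then data.
raw = """Filming materials
Filming station = xx
Installation Location = xxxxxxxxxxxxxxxx
Observation item = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Observation time, water temperature (℃), salt (PSU), electrical conductivity (ms/m)
2015-04-01 00:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:15:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 00:30:00,20.6,RB,K,17.31,RB,K,25.7,RB,K
2015-04-01 01:00:00,20.6,RB,K,17.31,RB,K,25.7,RB,K"""

df = pd.read_csv(
    StringIO(raw),         # for the real file: 'test.txt' with encoding='cp949'
    skiprows=5,            # metadata lines plus the header row, which names only 4 fields
    header=None,           # no header left after skipping, so columns are numbered
    usecols=[0, 1, 4, 7],  # assumed positions of time and the three measurements
)

# Short names chosen for this sketch; they are not taken from the file.
df.columns = ["time", "temperature", "salinity", "conductivity"]

print(df)
```

Selecting by position keeps the RB and K columns from ever being loaded, so no drop step is needed afterwards.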

2022-09-20 20:47