Extracting the rows that exceed a certain condition from the original simulation data ← "This part is complete"
I was then trying to cluster the data and extract the destinations, but an error occurred at this step.
·The original data is in Excel
·Open Python from the command prompt
·Write the program in Notepad and save it as hoge.py
·Run it from the command prompt
C:\datasyori>python hoge.py
latitude longitude
0 35.693590 139.712202
1 35.693497 139.712096
2 35.693217 139.712261
3 35.693549 139.712430
4 35.693621 139.712501
Traceback (most recent call last):
  File "C:\Users\mable\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 769, in _validate_tuple_indexer
    self._validate_key(k, i)
  File "C:\Users\mable\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 1378, in _validate_key
    raise ValueError(f"Can only index by location with a [{self._valid_types}]")
ValueError: Can only index by location with a [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array]

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\datasyori\hoge.py", line 95, in <module>
    Cn = C.iloc[Tn, 0]
  File "C:\Users\mable\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 961, in __getitem__
    return self._getitem_tuple(key)
  File "C:\Users\mable\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 1458, in _getitem_tuple
    tup = self._validate_tuple_indexer(tup)
  File "C:\Users\mable\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\indexing.py", line 771, in _validate_tuple_indexer
    raise ValueError(
ValueError: Location based indexing can only have [integer, integer slice (START point is INCLUDED, END point is EXCLUDED), listlike of integers, boolean array] types
# Extracting destinations
from matplotlib import pyplot as plt
from sklearn import datasets, preprocessing
from sklearn.cluster import KMeans
import numpy as np
import pandas as pd
import cartopy.crs as ccrs
import cartopy.io.shapereader as shpreader
pd.set_option('display.max_rows', 600)
# Load preprocessed csv
yomi = pd.read_csv("simulationkai.csv")
df = pd.read_csv("simulationkai.csv", usecols=["longitude", "latitude"])
# Convert to DataFrame
print(df.head())
# Data shaping
X = df
# Clustering
cls = KMeans(n_clusters=4)
result = cls.fit(X)
X['cluster'] = result.labels_
PC = pd.DataFrame(X['cluster'])
PC
df.head()
# Add cluster (cluster number) to yomi's data frame
yomi['cluster_id'] = PC
yomi
# Save yomi (original data with cluster numbers added) to allclsdata.csv
yomi.to_csv("allclsdata.csv")
D = X.sort_values(by="cluster")
D = D.drop_duplicates(subset='cluster')
D
# Count the number of data points in each cluster
V = X['cluster'].value_counts()
V
# Save each cluster number and its data count to clsvalue.csv
V.to_csv("clsvalue.csv")
# Check the cluster centroids
C = pd.DataFrame(result.cluster_centers_)
C
C.iloc[0, :]
lat = X['latitude'].tolist()
lon = X['longitude'].tolist()
clat = C[0].tolist()
clon = C[1].tolist()
# For the data in each cluster, remove duplicates of the same subject, count what remains, and write the counts to CSV one by one.
from csv import writer
# pp = pd.DataFrame
# ppi = pd.DataFrame
# Extract only the data of the Nth cluster from yomi with a while statement
i = 0
while i <= 3:
    yomic = yomi[yomi['cluster_id'] == i]
    # Remove duplicate subject ids from the Nth cluster's df
    yomics = yomic.drop_duplicates(subset=['id_questionnaire'])
    # Add the number of rows of the Nth processed data to the CSV
    # file = [i, len(yomics)]
    # ppi = pp.append([file], ignore_index=True)
    # ppi.to_csv("pp.csv")
    list_data = [i, len(yomics)]
    with open('pp.csv', 'a', newline='') as f_object:
        writer_object = writer(f_object)
        writer_object.writerow(list_data)
        f_object.close()
    i = i + 1
# else:
#     ppi.to_csv("pp.csv")
# Save the number of people in pp.csv in descending order to pps.csv
PP = pd.read_csv("pp.csv", names=["cls", "people"])
T = PP.sort_values(by=["people"], ascending=False)
T.to_csv("pps.csv")
PP.to_csv("pp.csv")
# Take the cluster numbers from pps.csv above in order and extract the coordinates for each number from C.
num = 0
while num <= 3:
    Tn = T.iloc[num, 0]
    # Tno = Tn + 1
    Cn = C.iloc[Tn, 0]
    Cn2 = C.iloc[Tn, 1]
    list_data2 = [Tn, Cn, Cn2]
    with open('point.csv', 'a', newline='') as f_object:
        writer_object = writer(f_object)
        writer_object.writerow(list_data2)
        f_object.close()
    num = num + 1
dfh = pd.read_csv("point.csv", names=["cluster_id", "latitude", "longitude"])
B = pd.read_csv("pps.csv", usecols=["people"])
# dfh2 = pd.DataFrame(B['people'])
dfh['people'] = B
dfh.to_csv("point.csv")
The error seems to be about the type of the value used to compute Cn, but I don't understand it well because I haven't studied enough.
Python 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v. 1929 64bit (AMD64)] on win32
python
First, regarding the error information: the error occurs at line 95,
Cn = C.iloc[Tn, 0]
and the message says that location-based indexing (.iloc[]) can only accept an integer, an integer slice, a list-like of integers, or a boolean array.
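To make the message concrete, here is a minimal sketch of what .iloc accepts and rejects as a row position. The small DataFrame is a hypothetical stand-in for the cluster-center table C in the question, not the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cluster-center table C (values are made up)
C = pd.DataFrame([[35.69, 139.71], [35.70, 139.72]])

ok_plain = C.iloc[1, 0]             # plain Python int: accepted
ok_numpy = C.iloc[np.int64(1), 0]   # NumPy integer, as read from an integer column: accepted

try:
    C.iloc["cls", 0]                # a string is not a valid position
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok_plain, ok_numpy)
print(error_message)
```

So the question is less about Cn itself and more about what kind of value Tn holds when it is passed to .iloc.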
So, just before line 95, I inserted
print(num, Tn, type(Tn))
and, after creating suitable input data (simulationkai.csv) and running the script, found the following:
·There is no error on the first run, and Tn takes integer values (1, 0, 2, 3)
·The error occurs from the second run onward, where Tn is 'cls', of type str (string)
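The second observation can be reproduced in isolation. The sketch below assumes a simplified pp.csv that already starts with a header line (the real leftover file also contains an index column written by to_csv, which is omitted here); the counts are made up.

```python
import io

import pandas as pd

# Hypothetical, simplified contents of a pp.csv left over from an earlier run
leftover = io.StringIO("cls,people\n0,5\n1,9\n")

# names=... tells pandas the file has no header, so the header line becomes a data row
PP = pd.read_csv(leftover, names=["cls", "people"])
T = PP.sort_values(by=["people"], ascending=False)
Tn = T.iloc[0, 0]

print(PP)
print(repr(Tn), type(Tn))
```

Because the header text makes both columns hold strings, sorting in descending order puts the 'people' row first, and the value taken from it is the string 'cls' rather than a cluster number.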
From this, I suspected that the csv file created by the first run was affecting the result, which led me to line 76:
with open('pp.csv', 'a', newline='') as f_object:
Because this line is also executed on the first pass of the while loop (i=0), the rows are appended ('a') to the pp.csv left over from a previous run. That leftover file begins with the header row written by PP.to_csv("pp.csv"), so read_csv with names=["cls", "people"] reads 'cls' as a data value. When I changed the while loop so that only the first pass overwrites the file ('w'), as shown below, the error no longer occurred.
with open('pp.csv', 'w' if i==0 else 'a', newline='') as f_object:
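As a rough check of this fix, the sketch below runs the write, read and re-save sequence twice in a temporary directory. The helper write_counts and the counts list are hypothetical; only the conditional open mode and the read_csv/to_csv calls follow the code in the question.

```python
import os
import tempfile
from csv import writer

import pandas as pd


def write_counts(path, counts):
    """Write one [cluster, people] row per cluster, truncating the file on the first row."""
    for i, people in enumerate(counts):
        mode = "w" if i == 0 else "a"  # overwrite on the first pass, append afterwards
        with open(path, mode, newline="") as f_object:
            writer(f_object).writerow([i, people])


counts = [5, 9, 3, 7]  # hypothetical people counts for clusters 0..3

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pp.csv")
    for run in range(2):  # simulate running the script twice
        write_counts(path, counts)
        PP = pd.read_csv(path, names=["cls", "people"])
        PP.to_csv(path)  # as in the question: re-saves with a header and an index column
    T = PP.sort_values(by=["people"], ascending=False)
    top_cluster = T.iloc[0, 0]

print(PP)
print(top_cluster, type(top_cluster))
```

On the second run the file is truncated before new rows are written, so PP again has four integer rows and the top cluster is an integer. The question's code appends to point.csv in the same way, so it may need the same treatment; opening the file once with 'w' before the loop would be another option.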