I have an Excel file, and removing the duplicate rows is proving harder than I expected because there are more conditions than I thought.
1. The data looks like this. I converted it to a CSV file, loaded it with pandas, and successfully converted it to a NumPy array. Let's call this data.
2. Next, I kept only the rows whose value in the second-to-last column is between 20180101 and 20180630 inclusive, removing the other rows with np.delete. At this point the shape is (489, 8). Let's call the resulting array data1.
3. After pulling out only the station column with NumPy indexing, I used np.unique with return_index=True to get the index of each station's first occurrence. For example, the first station, ADJ, appears three times, so its first position is 0; BBK comes next and also appears three times, so the first position of BBK is 3. The resulting 1-D array has length 144. Let's call it idx.
4. Using idx, which holds the position of the first occurrence of each unique station, I want to extract the corresponding rows from data1, the array created in step 2. For example, if idx = [0,3,6,...], can I pick out row 0, row 3, row 6, and so on from data1 and collect them into an array of shape (144, 8)?
I'd appreciate your help.
```python
>>> import pandas as pd
>>> df = pd.DataFrame({"station": ["AJD", "AJD", "BBK", "BBK"],
...                    "channel": ["HGE", "HGN", "HGE", "HGE"],
...                    "network": ["KG", "KG", "KG", "KG"],
...                    "lat": [34.74, 34.74, 35.57, 35.57],
...                    "lon": [126.12, 126.12, 129.43, 129.43],
...                    "ele": [100, 100, 100, 100],
...                    "st": [20140101, 20140101, 20170131, 20140101],
...                    "end": [99991231, 99991231, 99991231, 20170130]})
>>> df
  station channel network    lat     lon  ele        st       end
0     AJD     HGE      KG  34.74  126.12  100  20140101  99991231
1     AJD     HGN      KG  34.74  126.12  100  20140101  99991231
2     BBK     HGE      KG  35.57  129.43  100  20170131  99991231
3     BBK     HGE      KG  35.57  129.43  100  20140101  20170130
>>> # Keep the first row of each station; "station" becomes the index
>>> df.groupby("station").first()
        channel network    lat     lon  ele        st       end
station
AJD         HGE      KG  34.74  126.12  100  20140101  99991231
BBK         HGE      KG  35.57  129.43  100  20170131  99991231
>>> # reset_index() turns "station" back into an ordinary column
>>> df.groupby("station").first().reset_index()
  station channel network    lat     lon  ele        st       end
0     AJD     HGE      KG  34.74  126.12  100  20140101  99991231
1     BBK     HGE      KG  35.57  129.43  100  20170131  99991231
```
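The date filter from step 2 of the question can probably be done in pandas as well, before dropping the duplicates. The following is only a sketch under assumptions: it treats "st" as the date column (the question only says "second-to-last column"), and the bounds `lo` and `hi` are picked to fit this toy data, not the real 20180101 to 20180630 range.

```python
import pandas as pd

# Same toy data as in the session above; the real file would come from pd.read_csv
df = pd.DataFrame({"station": ["AJD", "AJD", "BBK", "BBK"],
                   "channel": ["HGE", "HGN", "HGE", "HGE"],
                   "network": ["KG", "KG", "KG", "KG"],
                   "lat": [34.74, 34.74, 35.57, 35.57],
                   "lon": [126.12, 126.12, 129.43, 129.43],
                   "ele": [100, 100, 100, 100],
                   "st": [20140101, 20140101, 20170131, 20140101],
                   "end": [99991231, 99991231, 99991231, 20170130]})

# Assumed bounds for the toy data (both ends inclusive, like the question's range)
lo, hi = 20140101, 20161231

# Step 2: keep only rows whose "st" value falls inside [lo, hi]
in_range = df[df["st"].between(lo, hi)]

# Steps 3-4: keep the first remaining row of each station
first_per_station = (in_range
                     .drop_duplicates(subset="station", keep="first")
                     .reset_index(drop=True))
print(first_per_station)
```

Unlike `groupby("station").first()`, which takes the first non-null value per column, `drop_duplicates` keeps whole rows, so it matters only if the data contains missing values.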
Is this what you want?
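If you would rather stay with the NumPy arrays from the question, indexing data1 with idx should already give what step 4 asks for. A minimal sketch, assuming data1 is a 2-D array with the station in column 0; the small array here is a made-up stand-in for the real (489, 8) data:

```python
import numpy as np

# Made-up stand-in for data1; only the station column matters for this step
data1 = np.array([
    ["ADJ", "HGE", 20180101],
    ["ADJ", "HGN", 20180101],
    ["ADJ", "HGZ", 20180101],
    ["BBK", "HGE", 20180215],
    ["BBK", "HGN", 20180215],
    ["BBK", "HGZ", 20180215],
], dtype=object)

# Step 3: index of the first occurrence of each unique station
stations = data1[:, 0].astype(str)
_, idx = np.unique(stations, return_index=True)

# np.unique orders idx by station name; sorting keeps the original row order
idx = np.sort(idx)

# Step 4: integer-array indexing selects exactly the rows listed in idx
result = data1[idx]
print(idx)           # positions of the first ADJ row and the first BBK row
print(result.shape)  # one row per unique station, all columns kept
```

With the real data, `data1[idx]` would have shape (144, 8), since idx has 144 entries and every column is kept.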