For each ID, I would like to count the number of people whose last n month amount exceeds をn0,000.

Here's what I want to do:

For each ID, we have sales data for the last year, and we would like to count how many people have the most recent n-month amount exceeding \n0,000.

For example, I would like to count the number of people whose amount in the last three months exceeds 170,000 yen in a row (not including those who have one month less than 170,000 yen), 160,000 yen in a row, 150,000 yen in a row, and 10,000 yen from the starting point.

(In the above example, people who exceed 170,000 yen in a row will naturally exceed 160,000 yen in a row, so overlapping IDs will be allowed.)

I think it would be easy to use the for statement, but I don't know how to write it, so I would appreciate it if you could let me know.

python python3

2022-09-29 21:56

2 Answers

There seems to be no further additions or updates, but the following articles will be helpful.

Pandas DataFrame select rows based on values of multiple columns where names are specified in a list
Pandas excludes (deleted), replaces (fill in) and extracts (br/a>
How to use Python's range function
How to extract elements in iloc-Pandas (how to use loc, iloc, iat, at)
How to use Python's built-in functions all(), any()
Get rows, columns, and all elements (size) in pandas
How to use Python's f string (formatted string literal)

The first KINGAKU_S column is ID
The following column is the data from the last month, followed by x months ago
NaN is treated as a 0 circle
> > > >xx yen included

Assuming and complementing the above, you can do it with the following programs:

import pandas as pd
data=pd.read_csv('data.csv')#### CSV read, header, encoding, etc. if necessary, use read_excel() for EXCEL
data.fillna(0.0, replace=True)##### Set 'NaN' to '0.0'

months = 3####n months or specified
startvalue = 170000#### Processing Start Amount

for threshold in range (startvalue, 0, -10000): #### Decline in 10,000 yen increments
    count=len(data[data.iloc[:,1:months+1].apply(lambdax:all([v>=float(threshold) for vinx]),axis=1)])####Total
    print(f'{threshold:>7}: {count:>6}')##### Display



		
		
			

				

					
				

				
					2022-09-29 21:56



	
		
Consider, for example, data like this

The moon is next to me
Vertical (line beginning) as ID
Take out the minimum value for three consecutive months and see if it exceeds a certain value.
For February to April, we will decide that April is the applicable month.

(In other words, the number of cases in January and February is zero) 
import pandas as pd
import numpy as np

# First of all, we have random values for the experiment.
df = pd.DataFrame(np.random.randint(1000,20000,(4,12)))
df.rename (columns=lambda v:f'{v+1}month', index=lambda v:f'ID_{v:03d}', replace=True)
display(df)

# (For the time being) Display of values above 8000 and the number of cases
b8k = df.rolling(3,axis=1).min()>=8000
display(df[b8k].fillna('))
b8k.sum()
January 0

February 0

March 0

April 1

May 0

June 2

July 2

August 1

September 2

October 1

November 1

December 1

dtype:int64


		
		
			

				

					
				

				
					2022-09-29 21:56
				
			
		
	
			
			If you have any answers or tips



		

	
		Popular Tags
	
	python x 4647
android x 1593
java x 1494
javascript x 1427
c x 927
c++ x 878
ruby-on-rails x 696
php x 692
python3 x 685
html x 656
	


	
		Popular Questions
	
	
	550 Who developed the "avformat-59.dll" that comes with FFmpeg?

	551 rails db:create error: Could not find mysql2-0.5.4 in any of the sources

	742 When building Fast API+Uvicorn environment with PyInstaller, console=False results in an error

	984 In Java servlet, when SHA-256 sends WW-Authenticate header for digest authentication, the client does not return the result.

	551 PHP ssh2_scp_send fails to send files as intended