I want to extrapolate (interpolate) data using resample.

Asked 2 years ago, Updated 2 years ago, 45 views

I have one year of data recorded every hour.
I'd like to interpolate it into minute-by-minute data.
I only know how to interpolate up to 23:00 on the last day.
Please tell me how to fill in the values from 23:01 to 23:59.

I think this is really basic, but I can't find a good way to do it, so I'd appreciate your help.

df2=df.resample('1T').interpolate()

Annual data

0   2019-01-01 00:00:00 356 0.121
1   2019-01-01 01:00:00 326 0.196
2   2019-01-01 02:00:00 313 0.257
3   2019-01-01 03:00:00 307 0.265
4   2019-01-01 04:00:00 307 0.195
... ... ... ...
8755    2019-12-31 19:00:00 55  0.151
8756    2019-12-31 20:00:00 28  0.090
8757    2019-12-31 21:00:00 348 0.036
8758    2019-12-31 22:00:00 205 0.047
8759    2019-12-31 23:00:00 179 0.140

Run Results

2019-01-01 00:00:00 356.000000  0.12100 
2019-01-01 00:01:00 355.500000  0.12225 
2019-01-01 00:02:00 355.000000  0.12350 
2019-01-01 00:03:00 354.500000  0.12475 
2019-01-01 00:04:00 354.000000  0.12600 
... ... ... ...
2019-12-31 22:56:00 180.733333  0.13380 
2019-12-31 22:57:00 180.300000  0.13535 
2019-12-31 22:58:00 179.866667  0.13690 
2019-12-31 22:59:00 179.433333  0.13845 
2019-12-31 23:00:00 179.000000  0.14000

python python3 pandas

2022-09-30 19:23

2 Answers

When upsampling with pandas.DataFrame.resample, one approach is to add a row of NaN values just beyond one of the endpoints beforehand.

# If the dtype of the index is datetime64:
df.loc[np.datetime64('2020-01-01 00:00')] = np.nan
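The padding step can be sketched as follows. This is only an illustration under assumptions: the three rows are taken from the tail of the question's table, and the column names a and b are hypothetical because the question does not show them.

```python
import numpy as np
import pandas as pd

# Stand-in for the last three hourly rows of the question's data.
# The column names "a" and "b" are assumptions for illustration.
df = pd.DataFrame(
    {"a": [348.0, 205.0, 179.0], "b": [0.036, 0.047, 0.140]},
    index=pd.date_range("2019-12-31 21:00", periods=3,
                        freq=pd.Timedelta(hours=1)),
)

# Add an all-NaN row one hour past the last observation.
padded = df.copy()
padded.loc[pd.Timestamp("2020-01-01 00:00")] = np.nan

# Upsample to a one-minute grid without filling anything yet.
minutely = padded.resample("1min").asfreq()

# The grid now reaches 2020-01-01 00:00, so 23:01 to 23:59 exist as rows.
print(minutely.index[-1])
# Those rows are still NaN; how to fill them is a separate decision.
print(minutely.loc[pd.Timestamp("2019-12-31 23:30")])
```

At this point the rows after 23:00 only exist as empty slots. If the 2020-01-01 00:00 row is not wanted in the final result, it can be dropped after filling, for example with iloc[:-1].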

The real problem is the interpolation itself. I think the important question is how pandas.DataFrame.interpolate should fill the NaN values added this way.

In general, extrapolating time series data requires assumptions about how the data behaves. For example, the default method='linear' of interpolate will not produce the desired result: it fills trailing NaN values by repeating the last valid value rather than continuing any trend.
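To make the point about assumptions concrete, here is a rough sketch that compares the default linear fill with one possible assumption, namely that the slope of the last observed hour simply continues. That assumption is mine for illustration only and may well be wrong for real data; the sample values are one column from the tail of the question's table.

```python
import numpy as np
import pandas as pd

# One column from the last three hourly rows of the question's data.
idx = pd.date_range("2019-12-31 21:00", periods=3, freq=pd.Timedelta(hours=1))
obs = pd.Series([348.0, 205.0, 179.0], index=idx)

# Pad with NaN one hour past the end, then upsample to one minute.
padded = obs.copy()
padded.loc[pd.Timestamp("2020-01-01 00:00")] = np.nan
minutely = padded.resample("1min").asfreq()

# Default method="linear": the tail after 23:00 is held at the last value.
flat = minutely.interpolate()

# ASSUMPTION: the trend of the last observed hour continues unchanged.
t0, t1 = obs.index[-2], obs.index[-1]
slope_per_min = (obs.iloc[-1] - obs.iloc[-2]) / ((t1 - t0) / pd.Timedelta(minutes=1))
tail = flat.index > t1
minutes_past = ((flat.index[tail] - t1) / pd.Timedelta(minutes=1)).to_numpy()
trend = flat.copy()
trend[tail] = obs.iloc[-1] + slope_per_min * minutes_past

# Compare the two fills for the last minute of the year.
last_minute = pd.Timestamp("2019-12-31 23:59")
print(flat.loc[last_minute], trend.loc[last_minute])
```

The two results differ only after 23:00, which is exactly the region where no data constrains the answer. Which fill is acceptable depends on what the measured quantity is.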

If the extrapolation method you need is one that SciPy implements, you can use it by passing a method such as slinear or quadratic to interpolate. For more information, see the interpolate documentation.

Also, I haven't tested this thoroughly, but I think you need to pay attention to the order in which values are filled, depending on where the NaN values are. Related issues:


2022-09-30 19:23

Is it appropriate to fill this gap by extrapolation in the first place?

You say you can only interpolate up to 23:00, but the data is missing only for the one hour from 23:00 to 24:00 on 2019-12-31; the same hour is covered on every other day.

The actual data for 2020-01-01 00:00:00 should exist, because it is already a past record at the time of processing.
Wouldn't it be meaningless to ignore that data and "extrapolate" instead?

It might be possible to simulate the last hour and compare the result with the actual data, but then the question is how to build the simulation, and I don't know whether there is an objective answer.

Before interpolating, you can append the one actual record for 2020-01-01 00:00:00 to the end of the data and then interpolate.

If having the 2020-01-01 00:00:00 row in the result causes a problem, for example in a later step, you can delete just that one row from the result.
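A minimal sketch of this append, interpolate, and delete sequence follows. The values for 2020-01-01 00:00:00 are placeholders I made up so the example runs; they must be replaced with the real record. The column names a and b are also hypothetical.

```python
import pandas as pd

# Last two hourly rows from the question's table; column names are assumed.
idx = pd.date_range("2019-12-31 22:00", periods=2, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"a": [205.0, 179.0], "b": [0.047, 0.140]}, index=idx)

# PLACEHOLDER values: substitute the actual record for 2020-01-01 00:00:00.
next_row = pd.DataFrame(
    {"a": [150.0], "b": [0.100]},
    index=[pd.Timestamp("2020-01-01 00:00")],
)

# Append the extra record, then interpolate to one-minute data as before.
extended = pd.concat([df, next_row])
minutely = extended.resample("1min").interpolate()

# Drop the 2020-01-01 00:00 row so the result ends at 23:59.
result = minutely.iloc[:-1]

print(result.tail(3))
```

With a real end point in place, 23:01 to 23:59 are ordinary interpolated values, so no extrapolation assumption is needed.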

If the data is published as annual sets (one year's worth each), the 2020 set may not have been available because that year was not yet complete. But since you only need the first record of January 1, wouldn't it be possible to obtain just that?

If that doesn't work, you can switch to the data available on the site below. Then you can check and adjust the program. In the meantime, it would be worth checking whether the data for January 1, 2020 can be obtained.
Meteorological Agency | Search past weather data
Meteorological Agency | Download past weather data


2022-09-30 19:23



© 2024 OneMinuteCode. All rights reserved.