I have one year of data recorded every hour.
I'd like to interpolate it to one-minute resolution.
With the code below, I can only interpolate up to 23:00 on the last day.
Please tell me how to interpolate from 23:01 to 23:59 as well.
I think this is really basic, but I can't find a good way to do it, so I'd appreciate your help.
df2=df.resample('1T').interpolate()
Annual data
0 2019-01-01 00:00:00 356 0.121
1 2019-01-01 01:00:00 326 0.196
2 2019-01-01 02:00:00 313 0.257
3 2019-01-01 03:00:00 307 0.265
4 2019-01-01 04:00:00 307 0.195
... ... ... ...
8755 2019-12-31 19:00:00 55 0.151
8756 2019-12-31 20:00:00 28 0.090
8757 2019-12-31 21:00:00 348 0.036
8758 2019-12-31 22:00:00 205 0.047
8759 2019-12-31 23:00:00 179 0.140
Run Results
2019-01-01 00:00:00 356.000000 0.12100
2019-01-01 00:01:00 355.500000 0.12225
2019-01-01 00:02:00 355.000000 0.12350
2019-01-01 00:03:00 354.500000 0.12475
2019-01-01 00:04:00 354.000000 0.12600
... ... ... ...
2019-12-31 22:56:00 180.733333 0.13380
2019-12-31 22:57:00 180.300000 0.13535
2019-12-31 22:58:00 179.866667 0.13690
2019-12-31 22:59:00 179.433333 0.13845
2019-12-31 23:00:00 179.000000 0.14000
When upsampling with pandas.DataFrame.resample, one approach is to add a NaN row just outside the endpoint beforehand.
# If the index dtype is datetime64:
df.loc[np.datetime64('2020-01-01 00:00')] = np.nan
The real issue is the interpolation itself. I think the important point is how pandas.DataFrame.interpolate should fill the NaN added this way.
In general, extrapolating time series data requires knowing what characteristics the data is assumed to have. For example, the default method='linear' for interpolate will not produce the desired result: it fills trailing NaN values by repeating the last valid value.
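A minimal sketch of what the default does with a trailing NaN row, using only the last three hourly rows from the question. The column names `a` and `b` are placeholders, since the question does not show the real ones, and the frame is assumed to have a DatetimeIndex.

```python
import numpy as np
import pandas as pd

# Last three hourly rows from the question; "a" and "b" are placeholder column names.
idx = pd.date_range("2019-12-31 21:00", periods=3, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"a": [348.0, 205.0, 179.0], "b": [0.036, 0.047, 0.140]}, index=idx)

# Add an all-NaN row just past the last timestamp so the resampled index covers 23:01-23:59.
df.loc[pd.Timestamp("2020-01-01 00:00")] = np.nan

# Default method="linear": the trailing NaN values are filled with the last valid value.
df_linear = df.resample("min").interpolate()

print(df_linear.loc["2019-12-31 23:58":])
```

The rows from 23:01 onward now exist, but they simply hold the 23:00 values, which is probably not what is wanted.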
If the extrapolation you want is within what SciPy implements, you can use it by specifying a method such as slinear or quadratic in interpolate. For more information, see the interpolate documentation.
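As a hedged sketch of the SciPy-backed route: pandas forwards extra keyword arguments of interpolate to SciPy, so method='slinear' together with fill_value='extrapolate' should continue the slope of the last segment past the final valid point. This assumes SciPy is installed and that the index is a DatetimeIndex. The column names `a` and `b` are placeholders, and whether a straight-line continuation is meaningful depends on what the data represents.

```python
import numpy as np
import pandas as pd

# Last three hourly rows from the question; "a" and "b" are placeholder column names.
idx = pd.date_range("2019-12-31 21:00", periods=3, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"a": [348.0, 205.0, 179.0], "b": [0.036, 0.047, 0.140]}, index=idx)

# NaN row just past the endpoint, as described above.
df.loc[pd.Timestamp("2020-01-01 00:00")] = np.nan

# Upsample to one-minute rows first, then interpolate against the index values.
df_min = df.resample("min").asfreq()

# fill_value="extrapolate" is passed through to scipy.interpolate.interp1d.
df_slinear = df_min.interpolate(method="slinear", fill_value="extrapolate")

print(df_slinear.loc["2019-12-31 23:58":])
```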
Also, I haven't tested this thoroughly, but I think you need to pay attention to the order in which values are filled, depending on where the NaN values are. Related issues:
Is extrapolation a good idea in the first place?
You say you can only interpolate up to 23:00, but the hour from 23:00 to 24:00 is missing only on 2019-12-31; for every other day the data exists.
The actual data for 2020-01-01 00:00:00 should exist, because it is already a past record at the time of processing.
Wouldn't it be pointless to ignore that data and extrapolate instead?
You could simulate the last hour and compare it with the actual data, but then the question becomes how to build the simulation, and I don't know whether there is an objective answer.
Instead, you can append the one actual record for 2020-01-01 00:00:00 to the end of the data, and then interpolate.
If having the 2020-01-01 00:00:00 row in the result causes a problem, for example in a later step, you can delete just that one row from the result.
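The append-then-trim idea might look like the sketch below. The values used for 2020-01-01 00:00 (170.0 and 0.200) are made-up placeholders standing in for the real record, and the column names `a` and `b` are also assumptions.

```python
import pandas as pd

# Last two hourly rows from the question; "a" and "b" are placeholder column names.
idx = pd.date_range("2019-12-31 22:00", periods=2, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"a": [205.0, 179.0], "b": [0.047, 0.140]}, index=idx)

# Hypothetical stand-in for the actual 2020-01-01 00:00 record (not real data).
next_row = pd.DataFrame(
    {"a": [170.0], "b": [0.200]}, index=[pd.Timestamp("2020-01-01 00:00")]
)

# Append the extra record, interpolate to one-minute steps, then drop that record again.
df_ext = pd.concat([df, next_row])
df2 = df_ext.resample("min").interpolate().iloc[:-1]

print(df2.tail())
```

With the real record in place of the placeholder, 23:01 to 23:59 are interpolated in the same way as every other hour of the year.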
As annual data (one year's worth), I suppose the 2020 data is not available because that year is not complete yet, but since you only need January 1st, wouldn't it be possible to get the data for just that day?
If that doesn't work, you can switch to the data available on the site below. Then you can check and adjust the program. In the meantime, I will try to find out whether the data for January 1, 2020 can be obtained.
Meteorological Agency | Search past weather data / Meteorological Agency | Download past weather data