phv

Photovoltaic Short-Term Power Forecasting Competition

Jiaxiang Li, Ruiqi Wu, Xiaosong Jin 2023-02-06

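Model Ensembling

We experimented with ensembling the following:

Neural network models
XGBoost models
Time series models
Ensembling based on a probabilistic model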
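As a rough illustration of how predictions from several models can be combined, the sketch below blends them with weights proportional to the inverse of each model's validation MAE. This weighting scheme, the function names, and the toy numbers are assumptions made for illustration; they are not taken from wushen.ipynb or note.Rmd, and the probabilistic ensembling used in the project may differ.

```python
import numpy as np

def inverse_error_weights(val_preds, y_val):
    """Weight each model by the inverse of its validation MAE (hypothetical scheme)."""
    maes = np.array([np.mean(np.abs(np.asarray(p) - y_val)) for p in val_preds])
    inv = 1.0 / np.maximum(maes, 1e-12)  # guard against a perfect model
    return inv / inv.sum()

def blend(preds, weights):
    """Weighted average of per-model predictions; preds has shape (n_models, n_samples)."""
    return np.average(np.asarray(preds), axis=0, weights=weights)

# Toy validation data: three hypothetical models with different biases
y_val = np.array([1.0, 2.0, 3.0, 4.0])
val_preds = [y_val + 0.1, y_val - 0.4, y_val + 0.2]

w = inverse_error_weights(val_preds, y_val)
blended = blend(val_preds, w)
blended_mae = np.mean(np.abs(blended - y_val))
```

In practice the weights would be estimated on a held-out period and then applied to the test-set predictions of each model.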

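Conclusion

In this competition our primary approach was a neural network model, and we finished in 52nd place. Our feature engineering covered time-related variables, squared terms, cubic terms, ratios, rolling SMA (simple moving average), rolling variance, PCA principal components, test-set predictions of actual irradiance, NMF-derived variables, and prophet. Our model ensembling covered neural network models, XGBoost models, time series models, and ensembling based on a probabilistic model.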

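Photovoltaic Short-Term Power Forecasting Competition

This project is the final write-up of our entry in the Photovoltaic Short-Term Power Forecasting Competition organized by 国能日新. Our team name is PHotoVoltaic (phv), and we finished in 52nd place.

During the competition we tried a range of feature engineering and model ensembling techniques to improve model performance. For feature engineering, we added time-related variables, squared terms, cubic terms, ratios, rolling SMA, rolling variance, PCA principal components, test-set predictions of actual irradiance, NMF-derived variables, and prophet. For ensembling, we tried neural network models, XGBoost models, time series models, and ensembling based on a probabilistic model.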
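A few of the simpler features listed above (polynomial terms, a ratio, rolling SMA, rolling variance, and a time-related variable) can be sketched with pandas as follows. The column names `irradiance` and `temperature`, the window length, and the 15-minute sampling are hypothetical; the actual competition columns and settings are not stated here.

```python
import numpy as np
import pandas as pd

def add_features(df, col="irradiance", ref="temperature", window=4):
    """Add polynomial, ratio, rolling, and time features for one column (illustrative)."""
    out = df.copy()
    out[f"{col}_sq"] = out[col] ** 2          # squared term
    out[f"{col}_cu"] = out[col] ** 3          # cubic term
    # Ratio; zero denominators become NaN instead of inf
    out[f"{col}_per_{ref}"] = out[col] / out[ref].replace(0, np.nan)
    roll = out[col].rolling(window=window, min_periods=window)
    out[f"{col}_sma{window}"] = roll.mean()   # rolling simple moving average
    out[f"{col}_var{window}"] = roll.var()    # rolling sample variance (ddof=1)
    out["hour"] = out.index.hour              # time-related variable
    return out

idx = pd.date_range("2023-01-01 08:00", periods=8, freq="15min")
df = pd.DataFrame(
    {
        "irradiance": [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0],
        "temperature": [5.0, 5.0, 10.0, 10.0, 0.0, 20.0, 20.0, 20.0],
    },
    index=idx,
)
feats = add_features(df)
```

The first `window - 1` rows of the rolling columns are NaN by construction, so they need to be dropped or imputed before model fitting.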

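Our main implementation is a neural network model; see the Python notebook wushen.ipynb. The XGBoost ensembling is in the R code note.Rmd. We also used trelliscope for EDA: it is convenient for interactive exploration, but it is not suited to deployment and makes results hard to share.

In the end, our model performed reasonably well and finished in 52nd place.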

EDA

We used trelliscope, which is convenient for interactive exploration but is not suited to deployment and makes results hard to share.

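Room for Future Work

Deep learning methods

We could use dilated convolutions (A. van den Oord et al. 2016a; A. van den Oord et al. 2016b; Sprangers, Schelter, and Rijke 2022; Kechyn et al. 2018). This approach has also been applied elsewhere, for example to audio spectra and long time series.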
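To make the idea concrete, here is a minimal NumPy sketch of a causal dilated 1-D convolution, the building block used in WaveNet-style models. It is not a trainable network: the kernel values, the dilation schedule (1, 2, 4), and the function names are assumptions chosen only to show how stacking dilated layers enlarges the receptive field.

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """y[t] = sum_j kernel[j] * x[t - j * dilation], with zeros before the series start."""
    x = np.asarray(x, dtype=float)
    pad = (len(kernel) - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])   # left-pad so no future value is used
    y = np.zeros_like(x)
    for j, w in enumerate(kernel):
        start = pad - j * dilation            # shift j * dilation steps into the past
        y += w * xp[start:start + len(x)]
    return y

def receptive_field(kernel_size, dilations):
    """Number of input steps that can influence one output of the stacked layers."""
    return 1 + (kernel_size - 1) * sum(dilations)

x = np.arange(1.0, 9.0)                       # 1, 2, ..., 8
h = x
for d in (1, 2, 4):                           # doubling dilations, as in WaveNet
    h = causal_dilated_conv1d(h, kernel=[1.0, 1.0], dilation=d)

rf = receptive_field(2, [1, 2, 4])
```

With a kernel of ones, the last output after three layers is the sum of all eight inputs, which shows that three layers of kernel size 2 already cover eight time steps.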

XGBoost

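Because the organizers changed the dataset and the evaluation function during the competition, we could not reproduce our earlier predictions. As a result, we did not ensemble the neural network with XGBoost, which is something to watch for in our next competition.
We could adopt a more principled window-based feature extraction scheme (Elsayed et al. 2021) and consider a multi-task framework such as MT-GBM (Ying et al. 2022) to improve model performance.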
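One way to read the window-based idea is to turn the series into a tabular regression problem: each row holds the last few target values plus the covariates at the forecast step, and a gradient boosting model is then fit on those rows. The sketch below only builds the feature matrix; it is written in the spirit of Elsayed et al. (2021) rather than as a reproduction of their method, and the function name, window length, and toy data are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_window_dataset(series, covariates, window):
    """Flatten the last `window` target values plus the current covariates into one row."""
    series = np.asarray(series, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    # Row i holds series[i : i + window]; its target is series[i + window]
    lags = sliding_window_view(series, window)[:-1]
    y = series[window:]
    X = np.hstack([lags, covariates[window:]])
    return X, y

power = np.arange(10.0)                                      # toy target series
cov = np.column_stack([np.arange(10.0) * 2.0, np.ones(10)])  # toy covariates
X, y = make_window_dataset(power, cov, window=3)
```

The resulting `X` and `y` can be passed to any tabular regressor, including XGBoost.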

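EDA and feature engineering

We need to do thorough EDA: observe how the response variable fluctuates over time and check for outliers.
In feature engineering, to fit nonlinear relationships, we can use the more efficient Ramsey’s RESET test; see GitHub for details.
We can also draw on the problem of shifted predicted values to detect possible underfitting, and address it with the methods in the model calibration section.
Because there are four photovoltaic panels and each is a time series, an LSTM could be trained here; see "6 Neural Network Applications".
Since PCA was considered as a clustering feature, DTW (Salvador and Chan 2007; Izakian, Pedrycz, and Jamal 2015) and TS-PCA (Chang, Guo, and Yao 2018) should be considered as well.
Since prophet was considered, its neural-network counterpart, NeuralProphet (Triebe et al. 2021), should be used as well.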
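For reference, the DTW distance mentioned above can be computed with the classic dynamic-programming recurrence. The sketch below is the plain O(nm) version with an absolute-difference local cost; it is not the linear-time FastDTW approximation of Salvador and Chan (2007), and the toy series are made up for illustration.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference local cost."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

a = [0, 1, 2, 1, 0]
b = [0, 0, 1, 2, 1, 0]   # same shape as a, with the first point repeated
c = [2, 2, 2, 2, 2]      # flat series

d_ab = dtw_distance(a, b)
d_ac = dtw_distance(a, c)
```

Here `d_ab` is zero because warping absorbs the repeated point, while `d_ac` stays large. This is the property that makes DTW useful for clustering series, such as the four panels, whose shapes are similar but shifted in time.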

Code of Conduct

Please note that the ‘phv’ project is released with a [Contributor Code of Conduct](CODE_OF_CONDUCT.md). By contributing to this project, you agree to abide by its terms.

License

Chang, Jinyuan, Bin Guo, and Qiwei Yao. 2018. “Principal Component Analysis for Second-Order Stationary Vector Time Series.” *The Annals of Statistics* 46 (5). <https://doi.org/10.1214/17-aos1613>.

Elsayed, Shereen, Daniela Thyssens, Ahmed Rashed, Hadi Samer Jomaa, and Lars Schmidt-Thieme. 2021. “Do We Really Need Deep Learning Models for Time Series Forecasting?” *arXiv Preprint arXiv:2101.02118*.

Izakian, Hesam, Witold Pedrycz, and Iqbal Jamal. 2015. “Fuzzy Clustering of Time Series Data Using Dynamic Time Warping Distance.” *Engineering Applications of Artificial Intelligence* 39: 235–44.

Kechyn, Glib, Lucius Yu, Yangguang Zang, and Svyatoslav Kechyn. 2018. “Sales Forecasting Using WaveNet Within the Framework of the Kaggle Competition.” *arXiv: Learning*.

Oord, Aaron van den, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, and Koray Kavukcuoglu. 2016a. “Wavenet: A Generative Model for Raw Audio.” *arXiv Preprint arXiv:1609.03499*.

Oord, Aaron van den, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. 2016b. “Conditional Image Generation with PixelCNN Decoders.” *Neural Information Processing Systems*.

Salvador, Stan, and Philip Chan. 2007. “Toward Accurate Dynamic Time Warping in Linear Time and Space.” *Intelligent Data Analysis* 11 (5): 561–80.

Sprangers, Olivier, Sebastian Schelter, and Maarten de Rijke. 2022. “Parameter-Efficient Deep Probabilistic Forecasting.” *International Journal of Forecasting*.

Triebe, Oskar, Hansika Hewamalage, Polina Pilyugina, Nikolay Laptev, Christoph Bergmeir, and Ram Rajagopal. 2021. “NeuralProphet: Explainable Forecasting at Scale.” <https://arxiv.org/abs/2111.15397>.

Ying, ZhenZhe, Zhuoer Xu, Weiqiang Wang, and Changhua Meng. 2022. “MT-GBM: A Multi-Task Gradient Boosting Machine with Shared Decision Trees.” *arXiv Preprint arXiv:2201.06239*.



**License**

MIT © [Jiaxiang Li;Ruiqi Wu;Xiaosong Jin](LICENSE.md)
