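MATLAB resource: stock price prediction based on fast Fourier transform regression, Machine Learning Nanodegree capstone project, code for study and reference only (.zip) - CSDN Library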

110 files in total

csv: 83

png: 21

py: 2


2 views · uploaded 2023-10-18 15:32:17 · 4.25MB ZIP

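This project explores regression analysis based on the fast Fourier transform (FFT) in MATLAB as a way to predict stock prices. The FFT is an efficient method for computing the discrete Fourier transform and is widely used in signal processing, image analysis, machine learning, and other fields. In this Machine Learning Nanodegree capstone project, we look at how the FFT can be combined with a regression model to offer a new approach to stock price prediction.

First, the fast Fourier transform itself. The FFT is an algorithm for computing the discrete Fourier transform (DFT): it converts a time-domain signal into the frequency domain and reveals the signal's frequency components. In financial data analysis, a time series such as a stock price can be treated as a signal, and the FFT lets us analyze its underlying periodicity and trend.

Stock price prediction usually involves time series analysis, and the FFT can help identify periodic patterns in price changes. In this project, we may first preprocess the historical price data by removing outliers and smoothing it, then apply the FFT to move the data into the frequency domain. There we can see which frequencies contribute most to price fluctuations and extract them as input features for the regression model.

Next, we build a regression model. Regression analysis is a statistical method for studying the relationship between a dependent variable (such as the stock price) and one or more independent variables. In this case, we may use linear regression, decision trees, random forests, support vector machines (SVM), or neural networks. These models predict future stock prices from the frequency-domain features produced by the FFT.

MATLAB is a powerful tool for mathematical and engineering computation, and its functions and toolboxes, such as the neural network toolbox and the statistics and machine learning toolbox, make it convenient to build and train models. During the project, the model has to be trained, validated, and tested, with its parameters tuned to optimize predictive performance. Common performance metrics include mean squared error (MSE), mean absolute error (MAE), and the R^2 score.

To improve the stability and accuracy of the predictions, we may also bring in other techniques such as feature selection, ensemble learning (for example Bagging and Boosting), or blending several models. During training, we need to avoid both overfitting and underfitting, and use cross-validation to assess how well the model generalizes.

The project code should contain clear comments so that learners can understand the purpose and implementation of each step. Through this project, learners can pick up the basic concepts of the FFT and regression models, see how to combine the two techniques in a MATLAB environment, and learn how to apply them to a practical problem. This Machine Learning Nanodegree capstone project offers a hands-on opportunity to understand how the fast Fourier transform and regression analysis can be used to predict stock prices, while building programming and data analysis skills in MATLAB.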
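The pipeline described above (find the dominant frequencies with the FFT, feed them to a regression model, score the forecast with R^2) can be sketched in a few lines. This is a minimal illustration only, written in Python because the notebooks in the archive are Python. It runs on synthetic data, every helper name (`dominant_frequencies`, `design_matrix`, `fit_fft_regression`, `predict`, `r_squared`) is hypothetical, and it is not the project's `StockRegressor` implementation. It assumes a plain least-squares fit on a linear trend plus one sine/cosine pair per selected frequency.

```python
import numpy as np


def dominant_frequencies(residual, k):
    """Return the k strongest non-constant frequencies (cycles per sample)."""
    spectrum = np.fft.rfft(residual)
    freqs = np.fft.rfftfreq(len(residual))
    # Skip bin 0 (the constant term) and rank the remaining bins by amplitude.
    order = np.argsort(np.abs(spectrum[1:]))[::-1][:k] + 1
    return freqs[order]


def design_matrix(t, freqs):
    """Columns: intercept, linear trend, then a sin/cos pair per frequency."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t]
    for f in freqs:
        cols.append(np.sin(2 * np.pi * f * t))
        cols.append(np.cos(2 * np.pi * f * t))
    return np.column_stack(cols)


def fit_fft_regression(prices, k=1):
    """Pick k frequencies from the detrended series, then fit by least squares."""
    t = np.arange(len(prices))
    slope, intercept = np.polyfit(t, prices, 1)
    residual = prices - (slope * t + intercept)
    freqs = dominant_frequencies(residual, k)
    coef, *_ = np.linalg.lstsq(design_matrix(t, freqs), prices, rcond=None)
    return freqs, coef


def predict(t, freqs, coef):
    """Evaluate the fitted trend-plus-sinusoids model at time indices t."""
    return design_matrix(t, freqs) @ coef


def r_squared(actual, predicted):
    """Coefficient of determination between actual and predicted values."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - np.mean(actual)) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic "price" series (an assumption for illustration, not real data):
# a linear trend, one 50-sample cycle, and a little noise.
rng = np.random.default_rng(0)
t_all = np.arange(400)
series = (100 + 0.05 * t_all
          + 5 * np.sin(2 * np.pi * t_all / 50)
          + rng.normal(0, 0.5, t_all.size))

train, test = series[:300], series[300:]
freqs, coef = fit_fft_regression(train, k=1)
forecast = predict(t_all[300:], freqs, coef)
score = r_squared(test, forecast)
print("selected frequencies:", freqs)
print("R^2 on the held-out period:", score)
```

The model only ever sees time indices as input, which matches the idea of predicting from a date range alone. Real prices are far less periodic than this synthetic series, so scores on market data should be expected to be much lower.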


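MATLAB resource: stock price prediction based on fast Fourier transform regression, Machine Learning Nanodegree capstone project, code for study and reference only.zip (110 files)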

Stock-AAPL-1995-12-27-2017-09-05.csv 408KB

Stock-BMY-2000-01-01-2017-08-27.csv 316KB

Stock-GOOG-1995-12-27-2017-09-05.csv 266KB

Stock-GLD-1995-12-27-2017-09-05.csv 234KB

Stock-GLD-2010-01-01-2014-12-16.csv 95KB

Stock-GLD-2010-01-01-2014-12-05.csv 94KB

Stock-GLD-2010-01-01-2014-08-27.csv 89KB

Stock-GLD-2012-01-01-2015-05-07.csv 64KB

Stock-GOOG-2011-01-01-2014-01-15.csv 62KB

Stock-AAPL-2011-01-01-2014-01-15.csv 59KB

Stock-GLD-2011-01-01-2014-01-15.csv 59KB

Stock-IBM-2011-01-01-2014-01-15.csv 59KB

Stock-XOM-2011-01-01-2014-01-15.csv 54KB

Stock-GLD-2012-01-01-2014-08-27.csv 51KB

Stock-GLD-2013-01-01-2015-09-06.csv 50KB

Stock-GLD-2015-01-01-2017-08-27.csv 49KB

Stock-GLD-2013-01-01-2015-08-07.csv 49KB

Stock-BMY-2007-05-31-2010-02-04.csv 47KB

Stock-BMY-2007-03-02-2009-11-06.csv 47KB

Stock-BMY-2007-06-30-2010-03-06.csv 47KB

Stock-BMY-2007-05-01-2010-01-05.csv 47KB

Stock-BMY-2007-04-01-2009-12-06.csv 47KB

Stock-GLD-2005-03-02-2007-11-07.csv 47KB

Stock-GLD-2005-01-01-2007-09-08.csv 47KB

sp500.csv 47KB

Stock-GLD-2005-01-31-2007-10-08.csv 47KB

Stock-BMY-2007-01-31-2009-10-07.csv 47KB

Stock-GLD-2005-04-01-2007-12-07.csv 47KB

Stock-BMY-2007-01-01-2009-09-07.csv 47KB

Stock-GLD-2005-05-01-2008-01-06.csv 47KB

Stock-GLD-2005-05-31-2008-02-05.csv 47KB

Stock-GLD-2005-06-30-2008-03-06.csv 47KB

Stock-GLD-2005-07-30-2008-04-05.csv 46KB

Stock-GLD-2005-09-28-2008-06-04.csv 46KB

Stock-GLD-2005-10-28-2008-07-04.csv 46KB

Stock-GLD-2005-08-29-2008-05-05.csv 46KB

Stock-GLD-2006-01-26-2008-10-02.csv 46KB

Stock-GLD-2005-11-27-2008-08-03.csv 46KB

Stock-GLD-2006-02-25-2008-11-01.csv 46KB

Stock-GLD-2005-12-27-2008-09-02.csv 46KB

Stock-GLD-2006-04-26-2008-12-31.csv 46KB

Stock-GLD-2006-03-27-2008-12-01.csv 46KB

Stock-GLD-2006-05-26-2009-01-30.csv 46KB

Stock-GLD-2006-06-25-2009-03-01.csv 46KB

Stock-GLD-2013-01-01-2015-01-05.csv 38KB

Stock-AAPL-2012-07-24-2014-03-31.csv 33KB

Stock-AAPL-2012-06-24-2014-03-01.csv 33KB

Stock-AAPL-2012-05-25-2014-01-30.csv 33KB

Stock-AAPL-2013-01-20-2014-09-27.csv 32KB

Stock-AAPL-2012-08-23-2014-04-30.csv 32KB

Stock-AAPL-2012-09-22-2014-05-30.csv 32KB

Stock-AAPL-2013-02-19-2014-10-27.csv 32KB

Stock-AAPL-2012-12-21-2014-08-28.csv 32KB

Stock-AAPL-2012-11-21-2014-07-29.csv 32KB

Stock-AAPL-2012-10-22-2014-06-29.csv 32KB

Stock-AAPL-2013-03-21-2014-11-26.csv 32KB

Stock-AAPL-2014-04-15-2015-12-21.csv 32KB

Stock-AAPL-2014-03-16-2015-11-21.csv 32KB

Stock-AAPL-2013-04-20-2014-12-26.csv 32KB

Stock-AAPL-2014-02-14-2015-10-22.csv 32KB

Stock-AAPL-2014-01-15-2015-09-22.csv 32KB

Stock-AAPL-2014-06-14-2016-02-19.csv 32KB

Stock-AAPL-2014-05-15-2016-01-20.csv 32KB

Stock-AAPL-2013-12-16-2015-08-23.csv 32KB

Stock-AAPL-2013-05-20-2015-01-25.csv 32KB

Stock-AAPL-2013-10-17-2015-06-24.csv 32KB

Stock-AAPL-2013-06-19-2015-02-24.csv 32KB

Stock-AAPL-2013-11-16-2015-07-24.csv 32KB

Stock-AAPL-2013-07-19-2015-03-26.csv 32KB

Stock-AAPL-2013-09-17-2015-05-25.csv 32KB

Stock-AAPL-2013-08-18-2015-04-25.csv 31KB

Stock-GLD-2008-04-20-2009-12-26.csv 29KB

Stock-GLD-2008-03-21-2009-11-26.csv 29KB

Stock-GLD-2008-06-19-2010-02-24.csv 29KB

Stock-GLD-2008-02-20-2009-10-27.csv 29KB

Stock-GLD-2008-05-20-2010-01-25.csv 29KB

Stock-GLD-2008-01-21-2009-09-27.csv 29KB

Stock-GLD-2007-12-22-2009-08-28.csv 29KB

Stock-GLD-2007-10-23-2009-06-29.csv 29KB

Stock-GLD-2007-11-22-2009-07-29.csv 29KB

Stock-GLD-2007-08-24-2009-04-30.csv 29KB

Stock-GLD-2007-09-23-2009-05-30.csv 29KB

Stock-GLD-2007-07-25-2009-03-31.csv 29KB

StockRegressor.ipynb 1.51MB

StockRegressor User Interface.ipynb 670KB

README.md 69KB

StockRegressor User Interface.md 32KB

output_14_11.png 96KB

output_51_3.png 95KB

output_14_17.png 93KB

output_14_29.png 91KB

output_45_1.png 90KB

output_14_23.png 86KB

output_14_1.png 86KB

output_49_3.png 86KB

output_43_1.png 80KB

output_14_5.png 77KB

output_39_1.png 74KB

output_35_2.png 73KB

output_41_2.png 69KB


# Machine Learning Nanodegree
## Capstone Project
### Project: Stock Price Prediction

**Disclaimer**: all historical stock price data were downloaded from Yahoo Finance.

**Disclaimer**: lstm.py was provided as part of the project files.

---

# Definition

## Problem Statement

As already stated in the "Problem Statement" of the Capstone project description in this area, the task is to build a predictor that uses historical data from online sources to try to predict future prices. The only input to the ML model's prediction should be the date range, and nothing else. The predicted prices should be compared against the available prices for the same date range in the testing period.

## Metrics

The metric used for this project is the R^2 score between the actual prices in the testing period and the prices predicted by the model for the same period.

Another, indicative metric could also be used: the percent difference in absolute value between real and predicted prices. However, for machine learning purposes (training and testing), R^2 scores are the more reliable measure.

---

# Analysis

## Data Exploration

First, let's explore the data by downloading stock prices for Google. For that purpose, I have built a special class called StockRegressor, which can download the data and store it in a Pandas DataFrame.

The first step is to import the class.
```python
%matplotlib inline

import numpy as np
# initializing numpy seed so that we get reproducible results, especially with Keras
np.random.seed(0)

import time
import datetime
from calendar import monthrange

import pandas as pd
from IPython.display import display
from IPython.display import clear_output

from statsmodels.tsa.arima_model import ARIMA
from sklearn.metrics import mean_squared_error

import warnings
warnings.filterwarnings('ignore')

from StockRegressor import StockRegressor
from StockRegressor import StockGridSearch

import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (15,8)
```

### The First StockRegressor Object

Getting our first batch of historical price data ...

After downloading the prices from the Yahoo Finance web services, the StockRegressor instance below saves the historical prices into the pricing_info DataFrame. As a first processing step, we changed the index of the DataFrame from 'dates' to 'timeline', which is an integer index. An integer index is easier to process, because the dates correspond to trading dates and are not sequential: they do not include weekends or holidays, as seen in the gap below between 02 Sep 2016 and 06 Sep 2016, which must have corresponded to a long weekend (Labor Day?).

> **Note:** There might be a bug in the Pandas library that causes an intermittent error with the Yahoo Finance web call. The bug can be traced to the file /anaconda/envs/**your_environment**/lib/python3.5/site-packages/pandas/core/indexes/datetimes.py, at line 1050. This line causes the error: "if this.freq is None:".
> Another if condition should be inserted before that, to test for the "freq" attribute, such as: "if hasattr(this, 'freq'):"

> **Note:** The fixed datetimes.py file is included with the submission

```python
stock = StockRegressor('GOOG', dates= ['2014-10-01', '2016-04-30'])
display(stock.pricing_info[484:488])
```

    Getting pricing information for GOOG for the period 2014-10-01 to 2016-09-27
    Found a pricing file with wide range of dates, reading ... Stock-GOOG-1995-12-27-2017-09-05.csv

<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;"> <th></th> <th>Open</th> <th>High</th> <th>Low</th> <th>Close</th> <th>Adj Close</th> <th>Volume</th> <th>dates</th> <th>timeline</th> </tr>
    <tr> <th>timeline</th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> <th></th> </tr>
  </thead>
  <tbody>
    <tr> <th>484</th> <td>769.250000</td> <td>771.020020</td> <td>764.299988</td> <td>768.780029</td> <td>768.780029</td> <td>925100</td> <td>2016-09-01</td> <td>484</td> </tr>
    <tr> <th>485</th> <td>773.010010</td> <td>773.919983</td> <td>768.409973</td> <td>771.460022</td> <td>771.460022</td> <td>1072700</td> <td>2016-09-02</td> <td>485</td> </tr>
    <tr> <th>486</th> <td>773.450012</td> <td>782.000000</td> <td>771.000000</td> <td>780.080017</td> <td>780.080017</td> <td>1442800</td> <td>2016-09-06</td> <td>486</td> </tr>
    <tr> <th>487</th> <td>780.000000</td> <td>782.729980</td> <td>776.200012</td> <td>780.349976</td> <td>780.349976</td> <td>893700</td> <td>2016-09-07</td> <td>487</td> </tr>
  </tbody>
</table>
</div>

```python
stock.adj_close_price['dates'].iloc[stock.testing_end_date]
```

    Timestamp('2016-07-13 00:00:00')

### The Impact of the 'Volume' Feature

The next step would be to eliminate all the columns that are not needed. The columns 'Open', 'High', 'Low', 'Close' will all be discarded, because we will be working with the 'Adj Close' prices only. For 'Volume', let's explore the relevance below.
From the table and graph below, we conclude that Volume has very little correlation with prices, so we will drop it from the discussion from now on. There might be evidence of some correlation between spikes in Volume and abrupt changes in prices. That would be logical, since higher trading volumes might lead to larger price fluctuations. However, these spikes in volume happen on the same day as the price changes, and so have little predictive power. This might be a topic for future exploration.

---

```python
from sklearn.preprocessing import MinMaxScaler

scaler_volume = MinMaxScaler(copy=True, feature_range=(0, 1))
scaler_price = MinMaxScaler(copy=True, feature_range=(0, 1))

# keep only 'Adj Close' and 'Volume'
prices = stock.pricing_info.copy()
prices = prices.drop(labels=['Open', 'High', 'Low', 'Close', 'dates', 'timeline'], axis=1)

# Series.reshape is not available in current pandas, so reshape the underlying arrays
scaler_volume.fit(prices['Volume'].values.reshape(-1, 1))
scaler_price.fit(prices['Adj Close'].values.reshape(-1, 1))

# scale both columns to [0, 1]; ravel() flattens the (n, 1) result back into a column
prices['Volume'] = scaler_volume.transform(prices['Volume'].values.reshape(-1, 1)).ravel()
prices['Adj Close'] = scaler_price.transform(prices['Adj Close'].values.reshape(-1, 1)).ravel()

print("\nCorrelation between Volume and Prices:")
display(prices.corr())

prices.plot(kind='scatter', x='Adj Close', y='Volume')
```

    Correlation between Volume and Prices:

<div>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;"> <th></th> <th>Adj Close</th> <th>Volume</th> </tr>
  </thead>
  <tbody>
    <tr> <th>Adj Close</th> <td>1.00000</td> <td>-0.06493</td> </tr>
    <tr> <th>Volume</th> <td>-0.06493</td> <td>1.00000</td> </tr>
  </tbody>
</table>
</div>

    <matplotlib.axes._subplots.AxesSubplot at 0x1100e5ac8>

![png](imgs/output_12_3.png)

## Exploratory Visualization

Now let's explore the historical pricing. For that purpose, we have built two special-purpose functions into the StockRegressor class.

The first plotting function shows the "learning_df" DataFrame. This is the dataframe used to store all "workspace" data, i.e.
dates, indexes, prices, predictions of multiple algorithms. The second plotting function, which will be used less frequently, plots prices with the Bollinger bands. This is for pricing exploration only.

Below, we call those two functions. As we haven't trained the StockRegressor, the plot_learning_data_frame() function will show the learning_df dataframe with only the pricing, and a vertical r
