# Deep learning networks for stock market analysis and prediction (Paper Summary)

Here is the link to the paper.

### Summary

The authors consider 2 main problems:

1. Predicting intraday stock returns from lagged returns
2. Predicting the covariance matrix using the predicted stock returns

### Dataset

The dataset consists of 38 stocks from the Korean KOSPI market, with prices sampled every 5 minutes. The data covers 2010-01-04 to 2014-12-30. The first 80% of the sample (2010-01-04 to 2013-12-24) is used for training. At each timestamp, the algorithm has access to the last 10 log returns of each stock. The log return is computed as $r_t = \ln(S_t/S_{t-\Delta{t}})$, where $S_t$ is the stock price at time $t$ and $\Delta{t}$ is 5 minutes. The sample contains a total of 1239 trading days and 73,041 five-minute returns (excluding the first ten returns of each day) for each stock.
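To make the input construction concrete, here is a minimal Python sketch of the log-return formula and of stacking the last 10 returns of every stock into one feature vector. The helper names (`log_returns`, `lagged_input`) and the toy prices are illustrative assumptions, not taken from the paper, and the sketch treats the prices as coming from a single trading day.

```python
import math

def log_returns(prices):
    """Return r_t = ln(S_t / S_{t - dt}) for consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

def lagged_input(returns_by_stock, t, n_lags=10):
    """Concatenate the n_lags returns ending at index t (inclusive) for every stock."""
    features = []
    for returns in returns_by_stock:
        features.extend(returns[t - n_lags + 1 : t + 1])
    return features

# Toy data (hypothetical): 2 stocks, 12 five-minute prices each.
prices_a = [100.0 + i for i in range(12)]
prices_b = [50.0 * (1.01 ** i) for i in range(12)]
returns = [log_returns(prices_a), log_returns(prices_b)]

# Each return series has 11 entries (indices 0..10), so t=10 is the latest one.
x = lagged_input(returns, t=10)
```

With 2 stocks the vector `x` has 2 × 10 = 20 entries; with the 38 stocks of the dataset the same construction would give 380 entries.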

### Data Preprocessing

The authors explore various preprocessing techniques. Preprocessed data is fed into the neural network in the prediction stage.

- RawData: No preprocessing. Raw returns as a vector of size 38 × 10 = 380.
- PC200: PCA with output dimension 200.
- PC380: PCA with output dimension 380.
- AE400: Sparse autoencoder with output dimension 400. (The autoencoder has one hidden layer of size 400.)
- AE800: Sparse autoencoder with output dimension 800.

### Intraday Stock Return Prediction Approaches

A neural network with 2 hidden layers is compared against a univariate autoregressive model with 10 lagged variables. The hidden layers have sizes 200 and 100, respectively. Since this is a regression model, the final output is a scalar.

$h_1 = \mathrm{ReLU}(W_1u_t + b_1)$

$h_2 = \mathrm{ReLU}(W_2h_1 + b_2)$

$\hat{r}_{i,t+1} = W_3h_2 + b_3$
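The three equations above amount to a plain feed-forward pass. The sketch below spells it out in standard-library Python, assuming $u_t$ is the 380-dimensional RawData input; the random initialisation and the helper names (`affine`, `init_layer`, `predict`) are placeholders for illustration only and say nothing about how the authors trained the network.

```python
import random

def relu(v):
    """Element-wise max(0, x)."""
    return [max(0.0, x) for x in v]

def affine(W, b, v):
    """Compute W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]

def init_layer(n_out, n_in, rng):
    # Small random weights and zero biases: illustrative initialisation only.
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def predict(u_t, params):
    """Forward pass: two ReLU hidden layers, then a linear scalar output."""
    (W1, b1), (W2, b2), (W3, b3) = params
    h1 = relu(affine(W1, b1, u_t))
    h2 = relu(affine(W2, b2, h1))
    return affine(W3, b3, h2)[0]

rng = random.Random(0)
# Layer sizes from the summary: 380 inputs (assumed RawData) -> 200 -> 100 -> 1.
params = [init_layer(200, 380, rng), init_layer(100, 200, rng), init_layer(1, 100, rng)]
u_t = [rng.gauss(0.0, 0.001) for _ in range(380)]  # stand-in for real lagged returns
r_hat = predict(u_t, params)  # predicted next return for one stock
```

The output layer has no activation, which is what makes the prediction an unbounded real number suitable for regression.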

### Stock Return Results

| Method | NMSE |
| --- | --- |
| AR(10) | 0.9655 |
| ANN (RawData) | 0.9937 |
| DNN (RawData) | 0.9629 |
| DNN (PCA380) | 0.9660 |
| DNN (RBM400) | 0.9702 |
| DNN (AE400) | 0.9638 |

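NMSE is the normalized mean squared error, defined as

$$NMSE = \frac{\frac{1}{N}\sum_{t}(r_{i,t+1} - \hat{r}_{i,t+1})^2}{var(r_{i,t+1})}$$

where $N$ is the number of evaluated returns and $var()$ is the variance.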
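As a rough illustration of how to read these numbers, the sketch below computes NMSE as the mean squared error divided by the variance of the realized returns. Using the population variance (`statistics.pvariance`) is an assumption here, and the sample values are made up.

```python
from statistics import pvariance

def nmse(actual, predicted):
    """Mean squared error normalized by the (population) variance of the targets."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return mse / pvariance(actual)

# Hypothetical realized returns.
actual = [0.01, -0.02, 0.005, 0.0, -0.01]

# Baseline that always predicts the mean of the realized returns.
mean_forecast = [sum(actual) / len(actual)] * len(actual)

baseline_score = nmse(actual, mean_forecast)  # equals 1 up to rounding
perfect_score = nmse(actual, actual)          # exactly 0
```

Under this definition a constant forecast equal to the sample mean scores 1, so the values around 0.96 to 0.99 in the table indicate only a small improvement over that baseline.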