Why Time Series Prediction is Hard
Stock price prediction is fundamentally a time series problem: we want to predict future values from a sequence of historical observations. Traditional machine learning algorithms such as linear regression or random forests struggle with raw time series data because, applied directly, they treat each data point independently and ignore the sequential dependencies that are crucial in financial markets.
NEPSE stock prices in particular exhibit complex temporal patterns:
- Short-term momentum (price tends to continue in the same direction over days)
- Weekly seasonality (trading volumes and volatility vary by day of week)
- Long-term mean reversion (extreme moves tend to reverse over months)
- Earnings-season effects (increased volatility around company reporting periods)
To capture these multi-timescale patterns simultaneously, we need a model architecture that can remember information across different time horizons. Enter LSTM neural networks.
What is an LSTM?
A Long Short-Term Memory (LSTM) network is a type of Recurrent Neural Network (RNN) designed to learn long-range dependencies in sequential data. Introduced by Hochreiter & Schmidhuber in 1997, LSTMs mitigate the "vanishing gradient" problem that plagued earlier RNNs, in which gradients shrink as they are propagated back through time, so the network effectively forgets information from more than a few time steps back.
The LSTM Cell Architecture
Each LSTM cell contains three "gates" that control what information is remembered, forgotten, and output:
- Forget Gate (f): Decides what information from the previous cell state to discard. For stock prediction, this might "forget" pre-crash data once a market recovery is well underway.
- Input Gate (i): Decides what new information to store in the cell state. It learns to identify and memorize important price events like breakouts, earnings surprises, or sector rotations.
- Output Gate (o): Decides what part of the cell state to output as the hidden state, which feeds into the next time step and the final prediction layer.
f_t = σ(W_f · [h_{t-1}, x_t] + b_f) # Forget gate
i_t = σ(W_i · [h_{t-1}, x_t] + b_i) # Input gate
C̃_t = tanh(W_C · [h_{t-1}, x_t] + b_C) # Candidate cell state
C_t = f_t * C_{t-1} + i_t * C̃_t # Cell state update
o_t = σ(W_o · [h_{t-1}, x_t] + b_o) # Output gate
h_t = o_t * tanh(C_t) # Hidden state output
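To make the gate arithmetic concrete, here is a minimal NumPy sketch of a single LSTM time step. It follows the standard formulation, including the forget gate f_t = σ(W_f · [h_{t-1}, x_t] + b_f). The function name `lstm_step`, the dictionary layout of the weights, and the toy sizes are illustrative assumptions, not ASHVA's production code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """Run one LSTM time step.

    W maps a gate name to a matrix of shape (hidden, hidden + inputs);
    b maps a gate name to a bias vector of shape (hidden,).
    """
    z = np.concatenate([h_prev, x_t])      # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])     # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])     # input gate
    c_hat = np.tanh(W["C"] @ z + b["C"])   # candidate cell state
    c_t = f_t * c_prev + i_t * c_hat       # cell state update
    o_t = sigmoid(W["o"] @ z + b["o"])     # output gate
    h_t = o_t * np.tanh(c_t)               # hidden state output
    return h_t, c_t

# Toy sizes (assumed): 5 inputs per day, 4 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 5, 4
W = {g: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for g in "fiCo"}
b = {g: np.zeros(n_hid) for g in "fiCo"}

# Carry the hidden and cell state across a 60-step toy sequence.
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x_t in rng.normal(size=(60, n_in)):
    h, c = lstm_step(x_t, h, c, W, b)
```

The cell state `c` is the long-term memory that the forget and input gates edit at every step, while `h` is what the next layer or prediction head actually sees.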
How We Train LSTM on NEPSE Data
Step 1: Data Collection & Preparation
Our training data includes 5+ years of NEPSE daily OHLCV data for all listed companies, adjusted for:
- Stock splits and bonus shares
- Right offerings and FPOs
- Delistings and symbol changes
- Trading halts (filled with appropriate missing value handling)
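As an illustration of why these adjustments matter, the sketch below back-adjusts prices for a bonus share issue using the conventional factor 1 / (1 + bonus ratio). The function `adjust_for_bonus`, the bar layout, and the sample numbers are hypothetical. A real pipeline would also adjust open, high, and low, and would handle rights offerings with their own formula.

```python
from datetime import date

def adjust_for_bonus(bars, ex_date, bonus_ratio):
    """Back-adjust bars dated before ex_date for a bonus share issue.

    bars: list of dicts with 'date', 'close', and 'volume' keys.
    bonus_ratio: 0.10 means one bonus share for every ten held.
    """
    factor = 1.0 / (1.0 + bonus_ratio)
    adjusted = []
    for bar in bars:
        if bar["date"] < ex_date:
            adjusted.append({
                "date": bar["date"],
                "close": bar["close"] * factor,    # scale old prices down
                "volume": bar["volume"] / factor,  # scale old share counts up
            })
        else:
            adjusted.append(dict(bar))  # on or after the ex-date: unchanged
    return adjusted

# Hypothetical example: a 10% bonus goes ex on the second day.
bars = [
    {"date": date(2024, 1, 1), "close": 550.0, "volume": 1000},
    {"date": date(2024, 1, 2), "close": 500.0, "volume": 1200},
]
adjusted = adjust_for_bonus(bars, ex_date=date(2024, 1, 2), bonus_ratio=0.10)
```

Without the adjustment, the raw closes show a drop of about 9% that no shareholder actually experienced. After it, the two closes line up and the model does not learn a false crash.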
Step 2: Feature Engineering
Raw OHLCV data alone is insufficient. We engineer 40+ additional features for each stock and time step:
- Technical indicators: RSI(14), MACD, Bollinger Bands, ATR, OBV, VWAP
- Relative features: Stock return vs NEPSE index, sector-relative strength
- Volume features: Volume Z-score, volume relative to 30-day average
- Market structure: Trend state (up/down/sideways), distance from 52-week high/low
- Temporal features: Day of week, days until next trading halt, time since last circuit breaker
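Two of the listed features can be sketched in a few lines of standard-library Python. This version of RSI uses simple averages of gains and losses over the window; many charting packages use Wilder's smoothing instead, so values may differ slightly. The function names, the handling of flat windows, and the 30-day default are assumptions for illustration.

```python
from statistics import mean, pstdev

def rsi(closes, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(max(ch, 0.0) for ch in changes) / period
    avg_loss = sum(max(-ch, 0.0) for ch in changes) / period
    if avg_loss == 0:
        # No losing days: 100 if there were gains, 50 for a flat window (assumed convention).
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

def volume_zscore(volumes, window=30):
    """Z-score of the latest volume against the preceding `window` days."""
    history = volumes[-window - 1:-1]
    if len(history) < window:
        raise ValueError("need at least window + 1 volumes")
    sd = pstdev(history)
    if sd == 0:
        return 0.0  # constant history: treat the latest value as unremarkable
    return (volumes[-1] - mean(history)) / sd
```

Both functions look only at data up to and including the current day, which matters here: a feature that peeks at future bars would leak information into training.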
Step 3: Sequence Construction
The model looks back 60 trading days (approximately 3 months) to make predictions about the next 5, 10, and 20 trading days. This "lookback window" is a hyperparameter we optimized through extensive cross-validation on NEPSE data.
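The windowing step can be sketched as follows. Each sample pairs the previous 60 days of features with forward returns at 5, 10, and 20 trading days. The function name `make_sequences`, the choice of simple forward returns as targets, and the toy data are assumptions; the article does not specify the exact target definition.

```python
import numpy as np

LOOKBACK = 60
HORIZONS = (5, 10, 20)

def make_sequences(features, closes, lookback=LOOKBACK, horizons=HORIZONS):
    """Build supervised samples from one stock's history.

    features: array of shape (T, F); closes: array of shape (T,).
    Returns X of shape (N, lookback, F) and y of shape (N, len(horizons)),
    where y holds the forward return at each horizon.
    """
    T = len(closes)
    max_h = max(horizons)
    X, y = [], []
    for end in range(lookback, T - max_h + 1):
        last = end - 1  # most recent day included in the window
        X.append(features[end - lookback:end])
        y.append([closes[last + h] / closes[last] - 1.0 for h in horizons])
    return np.array(X), np.array(y)

# Toy data (assumed): 100 days, 8 features, steadily rising closes.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 8))
closes = np.arange(100.0, 200.0)
X, y = make_sequences(features, closes)
```

With 100 days of history this yields 21 samples, because each one needs 60 days behind it and 20 days ahead of it.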
Step 4: Model Architecture
Our production LSTM ensemble consists of:
- Primary LSTM: 3-layer stacked LSTM with 256, 128, and 64 units, dropout regularization (0.2), and batch normalization
- Attention mechanism: Multi-head self-attention to dynamically weight which historical time steps are most relevant for each prediction
- Output heads: Three separate prediction heads: a directional classifier (up/down/flat), a magnitude regressor (% change), and a volatility estimator (confidence range)
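To show what "weighting historical time steps" means, here is a simplified NumPy sketch of scaled dot-product self-attention over a sequence of LSTM hidden states. It uses a single head with random projection matrices. The production model is described as multi-head, and the dimensions and names here are assumptions for illustration only.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(H, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention.

    H: hidden states of shape (T, d), one row per time step.
    Returns the attended outputs (T, d_v) and the weights (T, T).
    """
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

# Toy input (assumed): 60 time steps of 64-dimensional hidden states.
rng = np.random.default_rng(0)
T, d = 60, 64
H = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))

context, weights = self_attention(H, Wq, Wk, Wv)
latest_weights = weights[-1]  # how much the latest day attends to each past day
```

In a trained model, `latest_weights` would concentrate on the days the network found most informative, for example a breakout a few weeks earlier, rather than treating all 60 days equally.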
Step 5: Training & Validation
We use a walk-forward validation strategy: we train on all data up to a specific date, validate on the next 3 months, then roll forward and repeat. This mimics real-world deployment and guards against the lookahead bias that plagues many backtests, because the model is never evaluated on data that precedes its training cutoff.
The model is retrained every Sunday night on the latest available data, ensuring it captures recent market regime changes — crucial in NEPSE where macroeconomic events (Nepal Rastra Bank decisions, government budget, FPO approvals) can rapidly shift market dynamics.
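A walk-forward split with an expanding training window can be sketched as a small generator. The function name, the 500-day initial window, and the 60-day validation block (roughly three months of trading days) are assumed values for illustration.

```python
def walk_forward_splits(n_days, initial_train, val_size, step=None):
    """Yield (train_indices, val_indices) pairs with an expanding training window.

    Validation always starts where training ends, so no future data is seen.
    """
    step = step or val_size
    train_end = initial_train
    while train_end + val_size <= n_days:
        yield range(0, train_end), range(train_end, train_end + val_size)
        train_end += step

# Toy setup (assumed): 1,000 trading days, 500-day initial window,
# 60-day validation blocks.
splits = list(walk_forward_splits(n_days=1000, initial_train=500, val_size=60))
for train_idx, val_idx in splits:
    assert train_idx[-1] < val_idx[0]  # training data strictly precedes validation
```

Unlike a shuffled k-fold split, every fold here respects time order, which is the property that matters for a backtest.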
LSTM vs Traditional Forecasting Methods
We compared our LSTM ensemble against several baseline methods on out-of-sample NEPSE data (2023–2024):
| Method | Directional Accuracy | MAE (%) |
|---|---|---|
| Buy & Hold NEPSE | 52% | — |
| ARIMA | 54% | 3.2% |
| Random Forest | 61% | 2.8% |
| Single LSTM | 74% | 1.9% |
| ASHVA LSTM Ensemble | 87% | 1.2% |
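For reference, the two metrics in the table can be computed as in the sketch below. It treats direction as the sign of the percentage change, which is a simplification of the up/down/flat classes, and the sample numbers are invented purely to show the arithmetic.

```python
def directional_accuracy(predicted, actual):
    """Share of predictions whose sign (up vs. not up) matches the outcome."""
    hits = sum((p > 0) == (a > 0) for p, a in zip(predicted, actual))
    return hits / len(actual)

def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and realized % change."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Invented sample: predicted vs. realized % change for four stocks.
predicted = [1.5, -0.8, 0.3, -2.0]
actual = [1.0, 0.4, 0.9, -1.0]

accuracy = directional_accuracy(predicted, actual)  # 3 of 4 directions match
mae = mean_absolute_error(predicted, actual)        # in percentage points
```

The two numbers answer different questions: directional accuracy says how often the call was right, while MAE says how far off the predicted size of the move was.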
What LSTM Can and Cannot Do
LSTM Strengths
- Captures complex non-linear temporal patterns in OHLCV data
- Learns to recognize regime changes (bull/bear transitions) that resemble those seen in historical sequences
- Scales effectively across hundreds of NEPSE stocks simultaneously
- Continuously improves with more data through regular retraining
LSTM Limitations
- Cannot predict truly unprecedented "black swan" events (COVID, geopolitical shocks)
- Performance degrades during regime changes until the model is retrained
- Requires large amounts of clean, consistent historical data to train effectively
- Computationally expensive to train — production inference requires optimized infrastructure
Access ASHVA's LSTM Signals
Get daily AI-generated NEPSE trade signals powered by our LSTM ensemble — with confidence scores and risk levels.