Bitcoin Price Prediction Using CNN-LSTM Models

The financial landscape has evolved rapidly in recent years, with investors increasingly shifting from traditional savings to dynamic assets such as stocks, bonds, and cryptocurrencies. Among these, Bitcoin stands out for its extreme volatility, nonlinearity, and non-stationarity, characteristics that make price prediction both challenging and valuable. With the advancement of deep learning, models such as Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs) have become widely used tools for time series forecasting. This article explores how combining the two architectures into a CNN-LSTM hybrid reduced Bitcoin price prediction error from a MAPE of 10.14% (base LSTM) to 4.74% in our experiments.

Understanding LSTM: Capturing Temporal Dependencies

LSTM is a specialized type of Recurrent Neural Network (RNN) designed to overcome the vanishing gradient problem and effectively capture long-term dependencies in sequential data. Unlike standard RNNs, LSTMs use a gating mechanism—comprising the input gate, forget gate, and output gate—to regulate information flow, enabling selective retention or discarding of past data.

How LSTM Works

  1. Forget Gate
    Determines which information from the previous cell state should be discarded. It uses a sigmoid function to output values between 0 (completely forget) and 1 (fully retain):

    $$ f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) $$

  2. Input Gate
    Decides what new information will be stored in the cell state. It consists of a sigmoid layer (to filter values) and a tanh layer (to create candidate values):

    $$ i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) $$

    $$ \tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) $$

  3. Output Gate
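    Controls how much of the updated cell state is exposed as the hidden state. The cell state is first updated by combining the forget and input gates, and the output gate then filters it ($\odot$ denotes element-wise multiplication):

    $$ C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t $$

    $$ o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) $$

    $$ h_t = o_t \odot \tanh(C_t) $$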

This architecture allows LSTM to maintain relevant historical context—ideal for financial time series where trends and patterns unfold over time.
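
The gate equations above can be traced in a few lines of plain Python. This is a minimal sketch of a single LSTM time step, not the model used in the study: the function names (`lstm_step`, `affine`), the layout of `params` (one weight matrix and bias per gate, acting on the concatenated vector [h_{t-1}, x_t]), and the toy dimensions are assumptions made for illustration. It also includes the standard cell-state update, C_t = f_t * C_{t-1} + i_t * C̃_t (element-wise), which connects the three gates.

```python
import math


def sigmoid(z):
    """Logistic function; maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W @ v + b for a matrix W stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_i for row, b_i in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step. `params` maps a gate name to a (W, b) pair."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    (W_f, b_f), (W_i, b_i) = params["f"], params["i"]
    (W_c, b_c), (W_o, b_o) = params["c"], params["o"]

    f_t = [sigmoid(a) for a in affine(W_f, z, b_f)]      # forget gate
    i_t = [sigmoid(a) for a in affine(W_i, z, b_i)]      # input gate
    c_hat = [math.tanh(a) for a in affine(W_c, z, b_c)]  # candidate values
    o_t = [sigmoid(a) for a in affine(W_o, z, b_o)]      # output gate

    # Element-wise cell-state update, then the filtered hidden state.
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Toy setup (hypothetical): hidden size 2, input size 1, all weights zero.
hidden, inputs = 2, 1
zero_gate = ([[0.0] * (hidden + inputs) for _ in range(hidden)], [0.0] * hidden)
params = {name: zero_gate for name in ("f", "i", "c", "o")}

h_t, c_t = lstm_step([0.7], [0.0, 0.0], [1.0, -1.0], params)
print(c_t)  # [0.5, -0.5]
```

With all weights at zero, every gate outputs 0.5 and the candidate values are 0, so the previous cell state is simply halved. In a trained network the gates vary per time step, which is what lets the cell keep or discard information selectively.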

Empirical Analysis: Building the LSTM Model

Data Selection and Features

We used daily Bitcoin price data from Nasdaq spanning September 11, 2016, to September 10, 2021, a total of 1,826 daily observations. In addition to closing prices, six technical indicators were incorporated as input features.

A sliding window approach with a size of 10 days was applied to structure the time series for prediction.
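
A sliding window of this kind can be sketched as follows. The function name `make_windows` and the choice of the next day's value as the prediction target are assumptions; the article specifies only the window size of 10. The example uses a single placeholder series, whereas the study's inputs also included technical indicators.

```python
def make_windows(series, window=10):
    """Split a series into (inputs, targets) with a sliding window.

    Each input holds `window` consecutive values; its target is the value
    that immediately follows them (an assumed one-step-ahead setup).
    """
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(series[start:start + window])
        targets.append(series[start + window])
    return inputs, targets


prices = list(range(100, 115))  # 15 placeholder values, not real prices
X, y = make_windows(prices, window=10)
print(len(X), y[0])  # 5 110
```

Under these assumptions, a series of 1,826 daily observations yields 1,816 input windows, because the first 10 days have no complete window preceding them.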

Model Architecture and Evaluation

The LSTM model consisted of:

Model performance was evaluated using Mean Absolute Percentage Error (MAPE):

$$ \text{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| \times 100 $$

The initial MAPE was 10.14%, indicating moderate accuracy. Visual analysis revealed noticeable lag in tracking sudden price movements—a known limitation of pure LSTM models in volatile markets.
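
The MAPE formula above translates directly into code. The sketch below is illustrative; the function name `mape` and the sample numbers are made up and are not taken from the study's data.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, expressed in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


actual = [100.0, 200.0, 400.0]      # made-up values
predicted = [110.0, 180.0, 400.0]   # off by 10%, 10%, and 0%
score = mape(actual, predicted)
print(round(score, 2))  # 6.67
```

Because each error is divided by the actual value, MAPE is scale-independent, which is convenient for an asset whose price level changed by an order of magnitude over the sample period.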

Introducing CNN: Extracting Spatial and Local Features

While LSTM excels at modeling sequences, CNN is renowned for extracting local patterns through convolutional filters—originally developed for image recognition but now widely used in time series analysis.
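
To make "extracting local patterns" concrete, the sketch below slides one small filter along a short series. The filter here is hand-picked to respond to upward jumps; in a CNN the filter weights are learned during training. The name `conv1d` and the sample values are illustrative assumptions. As in most deep learning libraries, the operation shown is technically a cross-correlation (the kernel is not flipped).

```python
def conv1d(series, kernel):
    """Slide `kernel` over `series` without padding ("valid" mode)."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, series[i:i + k]))
        for i in range(len(series) - k + 1)
    ]


prices = [10.0, 10.0, 10.0, 14.0, 14.0, 14.0]  # flat, jump, flat
jump_detector = [-1.0, 0.0, 1.0]               # hand-picked, not learned
feature_map = conv1d(prices, jump_detector)
print(feature_map)  # [0.0, 4.0, 4.0, 0.0]
```

The output is nonzero only where the window straddles the jump, which is the sense in which a convolutional filter reacts to local structure regardless of where in the sequence it occurs.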

CNN Architecture Overview

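A typical CNN includes convolutional layers that slide learned filters over the input, pooling layers that downsample the resulting feature maps, and fully connected layers that map the extracted features to an output.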

To enhance performance, this study employed:

An additional feature—rate of change for each technical indicator—was introduced to enrich input data.
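
A rate-of-change feature can be computed as in the sketch below. The article does not state the look-back period or the exact formula, so the one-step relative change used here, along with the name `rate_of_change`, is an assumption.

```python
def rate_of_change(values, period=1):
    """Relative change versus the value `period` steps earlier.

    Entries without an earlier value (or with a zero base) are None.
    """
    out = [None] * min(period, len(values))
    for t in range(period, len(values)):
        prev = values[t - period]
        out.append((values[t] - prev) / prev if prev != 0 else None)
    return out


indicator = [50.0, 55.0, 44.0, 44.0]  # made-up indicator values
roc = rate_of_change(indicator)
print(roc)  # [None, 0.1, -0.2, 0.0]
```

The leading `None` entries would need to be dropped or filled before the feature is passed to a model.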

CNN Model Performance

The optimized CNN architecture included:

Using the same dataset split (80% training, 10% validation, 10% testing), the CNN achieved a MAPE of 9.29%, outperforming the base LSTM model (10.14%). It responded more quickly to abrupt changes but showed vertical errors, that is, over- and undershooting of the price level, which we attribute to overfitting on short-term fluctuations.
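
An 80/10/10 split for time series is usually taken in chronological order so that the model is never evaluated on data older than what it was trained on. The article does not say how the split was ordered, so the chronological cut below is an assumption, as is the function name `chronological_split`.

```python
def chronological_split(samples, train_pct=80, val_pct=10):
    """Cut an ordered sequence into train/validation/test without shuffling."""
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


days = list(range(1826))  # stand-in for 1,826 daily observations
train, val, test = chronological_split(days)
print(len(train), len(val), len(test))  # 1460 183 183
```

Integer percentages are used instead of floating-point fractions so the cut points are exact.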

The Hybrid Solution: CNN-LSTM Model

Recognizing the complementary strengths of both models, we developed a CNN-LSTM hybrid that leverages the CNN's extraction of local features and the LSTM's modeling of long-term temporal dependencies.

Model Integration Strategy

The hybrid model follows a three-step process:

  1. Train CNN and LSTM separately to extract optimal features.
  2. Assign weights (α for CNN, β for LSTM) based on individual MAPE scores—lower error receives higher weight.
  3. Combine outputs via weighted sum:

    $$ y_{\text{hybrid}} = \alpha \cdot y_{\text{CNN}} + \beta \cdot y_{\text{LSTM}} $$

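    After extensive testing, optimal weights were found to be α = 0.1, β = 0.9. These tuned weights favor the LSTM even though the optimized CNN has the lower standalone MAPE (7.09% versus 8.20%), which indicates that the final weights came from tuning rather than from the MAPE ranking alone.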
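
The weighted sum, together with one plausible way of searching for the weights, can be sketched as follows. The article does not describe its search procedure, so the grid search over α with β = 1 − α is an assumption, as are the names `combine` and `search_alpha`. The data is synthetic and deliberately constructed so that the two models' errors cancel at α = 0.1; it says nothing about the study's actual predictions.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def combine(cnn_pred, lstm_pred, alpha, beta):
    """Weighted sum of the two models' predictions."""
    return [alpha * c + beta * l for c, l in zip(cnn_pred, lstm_pred)]


def search_alpha(actual, cnn_pred, lstm_pred, steps=10):
    """Try alpha = 0/steps ... steps/steps with beta = 1 - alpha; keep the lowest MAPE."""
    best = None
    for k in range(steps + 1):
        alpha = k / steps
        score = mape(actual, combine(cnn_pred, lstm_pred, alpha, 1.0 - alpha))
        if best is None or score < best[1]:
            best = (alpha, score)
    return best


# Synthetic example: the "CNN" overshoots by 9, the "LSTM" undershoots by 1.
actual = [100.0, 102.0, 101.0, 105.0]
cnn_pred = [a + 9.0 for a in actual]
lstm_pred = [a - 1.0 for a in actual]

best_alpha, best_score = search_alpha(actual, cnn_pred, lstm_pred)
print(best_alpha)  # 0.1
```

The example illustrates why the best weights need not follow the ranking of standalone errors: when two models err in opposite directions, the blend that cancels their errors can assign a small weight to either one. Weights should be chosen on the validation set, not the test set, so that the reported test error stays unbiased.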

White noise test p-values were also used to assess confidence levels in predictions.
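
The article does not name the white noise test it used. One common choice is the Ljung-Box test applied to the prediction residuals, and the sketch below assumes that test. It uses only the standard library, so the p-value relies on the closed form of the chi-square survival function for an even number of degrees of freedom; the function name `ljung_box` and the sample data are illustrative.

```python
import math


def ljung_box(residuals, lags=10):
    """Return (Q, p_value) for the Ljung-Box test; `lags` must be even here."""
    if lags % 2 != 0:
        raise ValueError("this sketch only supports an even number of lags")
    n = len(residuals)
    if n <= lags:
        raise ValueError("need more residuals than lags")
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)

    q = 0.0
    for k in range(1, lags + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        q += rho_k * rho_k / (n - k)
    q *= n * (n + 2)

    # P(chi-square with `lags` degrees of freedom > q), closed form for even df.
    half = q / 2.0
    p_value = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(lags // 2)
    )
    return q, p_value


trend = [float(t) for t in range(50)]  # strongly autocorrelated "residuals"
q_stat, p_value = ljung_box(trend, lags=10)
print(p_value < 0.05)  # True: the white-noise hypothesis is rejected
```

A large p-value means the residuals are statistically indistinguishable from white noise, which suggests the model has captured the predictable structure in the series. A small p-value, as in the trending example, means autocorrelation remains. In practice a library routine such as `acorr_ljungbox` in statsmodels would normally be used instead of a hand-written version.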

Comparative Results and Performance Evaluation

| Model | MAPE (%) |
|---|---|
| Base LSTM | 10.14 |
| Base CNN | 9.29 |
| Optimized LSTM | 8.20 |
| Optimized CNN | 7.09 |
| CNN-LSTM | 4.74 |

The final CNN-LSTM model achieved a MAPE of 4.74%, about one third lower than the best standalone model (optimized CNN, 7.09%) and less than half the base LSTM's 10.14%.

Graphical comparisons and residual analysis confirm fewer and smaller errors, although extreme market events still pose challenges due to Bitcoin’s inherent unpredictability.

Frequently Asked Questions (FAQ)

Q: Why combine CNN and LSTM instead of using one model alone?
A: CNN excels at identifying local patterns and spatial hierarchies in data, while LSTM captures long-term temporal dynamics. Together, they provide a more comprehensive understanding of complex time series like Bitcoin prices.

Q: What makes Bitcoin price prediction so difficult?
A: Bitcoin exhibits high volatility, nonlinear behavior, and sensitivity to external factors like regulatory news and macroeconomic shifts. These make it resistant to traditional linear forecasting methods.

Q: Can this model predict sudden market crashes or rallies?
A: While the CNN-LSTM model improves short-term forecasting accuracy, predicting black swan events remains challenging without incorporating real-time sentiment or news data.

Q: Is technical analysis alone sufficient for accurate predictions?
A: Technical indicators provide valuable historical context, but integrating on-chain data, trading volume, and market sentiment can further enhance model robustness.

Q: How often should the model be retrained?
A: Given Bitcoin’s evolving market structure, retraining every 3–6 months with updated data helps the model adapt to new trends and maintain predictive power.

Conclusion

This study demonstrates that hybrid deep learning models can offer a meaningful edge in cryptocurrency price forecasting. By integrating CNN’s feature extraction capability with LSTM’s sequential modeling strength, the proposed CNN-LSTM architecture achieves a MAPE of 4.74%, compared with 7.09% for the optimized CNN and 8.20% for the optimized LSTM. The results suggest that ensemble approaches are well suited to the complexities of digital asset markets.

Future work could explore integrating external variables such as social media sentiment, blockchain metrics, or macroeconomic indicators to further refine predictions. As AI continues to evolve, models like CNN-LSTM will play an increasingly vital role in shaping data-driven investment strategies.
