AI-Based Financial Forecasting Model using Random Forest
Objective:
The objective of this model is to accurately forecast monthly financial metrics—specifically Credit, Debit, and Balance—for each account group across an upcoming fiscal year. By analyzing historical balance sheet data, the model learns seasonal and temporal trends using Random Forest Regression and lag-based features. This enables organizations to proactively plan budgets, monitor cash flow, and identify financial risks or anomalies in advance. The forecasts are generated month- wise and exported to Excel, providing a structured and easily interpretable format for finance teams. Additionally, a user-friendly Gradio application facilitates quick predictions by uploading input files and selecting the desired fiscal year.
Data Source:
- Historical Data: Two Excel files containing monthly balance sheet data:
- Balance sheet_Monthly_FY22-23.xlsx
- Balance sheet_Monthly_FY23-24.xls
- Input Format: Each file contains multiple sheets, one for each month, with financial values for various account groups.
Data Preprocessing:
- Extracts month and year from sheet names.
- Cleans and consolidates data across fiscal years into one DataFrame.
- Converts Current Period (INR) to numeric values.
- Maps months to integers and encodes account names numerically.
- Builds a time series DataFrame with one row per account per month.
Model Architecture:
- Model Used: RandomForestRegressor from scikit-learn.
- Lag-Based Feature Engineering: For each account, 12 lag features are created (e.g., values
from the past 12 months) to capture historical trends. - Forecasting Window: 12-month forecast for each fiscal year using the trained model.
- Rolling Forecast: Model predictions for each year are appended to the historical data for
cumulative forecasting.
Gradio Application:
An interactive Gradio app is integrated to:
- Allow users to upload Excel files.
- Specify a target fiscal year (e.g., “2026–27”).
- Automatically generate:
- A downloadable Excel file with predicted values.
- Line plots comparing historical vs. predicted values for key accounts.
Output:
- Main Forecast File: Forecasted_Monthly_Amounts.xlsx
- Monthly Excel Sheets by Fiscal Year: Auto-generated in the Forecast_Files_By_Fiscal_Year/
folder. - Line Graphs: Auto-saved and visualized to show trends for top accounts.
Technology Used:
- Python: The core programming language used to develop the entire forecasting pipeline
and user interface. - Pandas: Utilized for efficient data manipulation, cleaning, and transformation of Excel-based
financial records. - NumPy: Supports numerical computations and array-based operations essential for feature
engineering and modeling. - scikit-learn: Provides the Random Forest Regressor and tools for training, prediction, and
evaluation of the forecasting model. - Matplotlib: Used for visualizing trends and comparing actual vs. predicted financial values
over time. - Seaborn: Enhances the aesthetics of plots for deeper insight into feature relationships and
distribution patterns. - OpenPyXL: Enables programmatic creation and export of structured Excel files containing
forecasted financial results. - Gradio: Powers the interactive web interface that allows users to upload data and generate
predictions without coding.