Automate financial data extraction from balance sheets and profit & loss statements with Python and export results to Excel.
Introduction
Financial institutions generate large volumes of data in the form of balance sheets and profit & loss (P&l) statements. Extracting relevant information manually can be time-consuming and prone to errors. In this guide, we will show you how to automate the process using Python.
Why Automate Financial Data Extraction?
- Speeds up data processing
- Reduces human errors
- Enables quick financial analysis
- Exports results directly to Excel for easy interpretation
Steps to Extract Data from Balance Sheets and P&l Statements
- Read financial data from text files.
- Use Python’s re module to extract key financial values.
- Compare data between current and previous months.
- Store extracted data in an Excel file for further analysis.
Python Code for Financial Data Extraction
Below is the Python script to automate the extraction and comparison of financial data:
import openpyxl
import re
# Define file locations
current_month_balance_sheet = 'C:\\FinancialData\\BalanceSheet_October.txt'
last_month_balance_sheet = 'C:\\FinancialData\\BalanceSheet_September.txt'
current_month_pl_account = 'C:\\FinancialData\\ProfitLoss_October.txt'
last_month_pl_account = 'C:\\FinancialData\\ProfitLoss_September.txt'
# Define key financial phrases
balance_sheet_key_phrases = ['Total Term Deposits', 'Total Demand Deposits', 'Total Advances - Agriculture']
pl_account_key_phrases = ['Total Expenses', 'Total Income', 'Printing And Stationery']
# Initialize dictionaries for financial data
current_balance_data = {}
last_balance_data = {}
current_pl_data = {}
last_pl_data = {}
# Function to extract financial data from text files
def extract_data(filename, key_phrases):
extracted_data = {}
with open(filename, 'r') as file:
data = file.read()
for phrase in key_phrases:
match = re.search(fr'{phrase}\s+([\d,]+\.\d+)', data)
if match:
extracted_data[phrase] = float(match.group(1).replace(',', ''))
return extracted_data
# Extract data from financial statements
current_balance_data = extract_data(current_month_balance_sheet, balance_sheet_key_phrases)
last_balance_data = extract_data(last_month_balance_sheet, balance_sheet_key_phrases)
current_pl_data = extract_data(current_month_pl_account, pl_account_key_phrases)
last_pl_data = extract_data(last_month_pl_account, pl_account_key_phrases)
# Compare and store differences
results = {key: [current_balance_data.get(key, 0), last_balance_data.get(key, 0),
current_balance_data.get(key, 0) - last_balance_data.get(key, 0)] for key in balance_sheet_key_phrases}
results.update({key: [current_pl_data.get(key, 0), last_pl_data.get(key, 0),
current_pl_data.get(key, 0) - last_pl_data.get(key, 0)] for key in pl_account_key_phrases})
# Save results to Excel
workbook = openpyxl.Workbook()
sheet = workbook.active
sheet.append(['Key Phrase', 'Current Month', 'Last Month', 'Difference'])
for key, values in results.items():
sheet.append([key] + values)
workbook.save('Financial_Report.xlsx')
print("Results saved to Financial_Report.xlsx")
Conclusion
This Python script simplifies the process of extracting and analyzing financial statements. By automating data extraction and storing results in Excel, financial analysts can focus on deeper insights rather than manual data entry.
Try this script to streamline your financial reporting process today!