How I Built a Bank Statement Converter to Turn PDFs into Excel for QuickBooks

The accounting story behind StatementSheet

Posted on May 30, 2025

Tired of manually copying and pasting bank statements into Excel?
I was too. That’s why I created StatementSheet, an automatic bank statement converter that transforms PDF files into Excel format for QuickBooks and beyond.

1. The Problem: Manual Bookkeeping Pain

It all began one evening in February 2024, in London, United Kingdom. At that time I was working as a freelance Full Stack Java/Angular developer for one of my clients in the financial sector. I'd been running my own Private Limited Company (LTD) since March 2023.

Like many business owners, I do my own bookkeeping. That evening I have some bookkeeping to do in Intuit QuickBooks: more specifically, I have to enter the financial transactions in the bank journals.

To do this I need all my company bank statements, including Business Account statements, Credit Card statements and Cheque statements. So I go to the Barclays Bank UK PLC website to download the documents for the last 6 months.

2. A QuickBooks Feature Discovery

At the same time, my curiosity led me to discover a QuickBooks feature that lets me import my transactions directly from a file in Excel (XLSX) or CSV format.

All this to save time and because all developers are notoriously lazy 😄.

Great, all you have to do is obtain a file in one of these formats and you're done!

So I went to my bank's website to check whether I could export each month, and found that it was impossible: only the PDF format was available 😫. It's 2024 and I'm thinking this must be a joke.

3. The Nightmare of Copy-Pasting Bank Statements from PDF

Being a stubborn person, I want a usable export by any means necessary. I create an Excel file and start copying and pasting blocks of the transaction table from my first PDF statement into it.

PDF data:

[Image: Bank statement PDF with five transactions for January 8–10]
Here we see 5 transaction lines for 8, 9 and 10 January.
Each line is divided into 5 columns (Date, Description, Money out, Money in, Balance).

Copying a line from the PDF to Excel gives:

[Image: Bank statement PDF to Excel result]
Here we can see that each line from the PDF lands in a single Excel column.
Date and Description are merged.
The amounts are shown on a single line.
When Money in is empty for a transaction, nothing in the pasted text marks the missing value.

Here is the desired result:

[Image: Bank statement PDF to Excel desired result]

The light at the end of the tunnel

So I reorganise this for the first page, which contains a total of 10 lines of transactions. I'm wasting an incredible amount of time... 20 minutes later I can finally run an import test on QuickBooks.

Some adjustments are necessary:
1. Set the dates in "DD/MM/YYYY" format.
2. Put a date on every line of the Excel file.
3. Enter the amounts in the correct format, using "," as the decimal separator.
4. Delete the "Balance" column.
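
For readers who like to see it in code, the four adjustments can be sketched in Python. This is only an illustration under assumptions, not StatementSheet's actual code: the row layout (dicts with `date`, `description`, `money_out`, `money_in` and `balance` keys), the "8 Jan" input date style, the English month names and the function names are all hypothetical.

```python
from datetime import datetime


def to_comma_decimal(amount):
    """Turn "1,234.50" into "1234,50"; keep empty cells empty."""
    amount = amount.strip()
    if not amount:
        return ""
    # Drop thousands separators, then swap the decimal point for a comma.
    return amount.replace(",", "").replace(".", ",")


def normalise_rows(rows, year):
    """Apply the four import adjustments to rows copied from a statement.

    Assumed (hypothetical) input: dicts with the keys date, description,
    money_out, money_in and balance, where date looks like "8 Jan" or is
    empty when the transaction shares the date of the row above.
    """
    cleaned = []
    last_date = None
    for row in rows:
        raw_date = row["date"].strip()
        if raw_date:
            parsed = datetime.strptime(f"{raw_date} {year}", "%d %b %Y")
            # 1. Dates in DD/MM/YYYY format.
            last_date = parsed.strftime("%d/%m/%Y")
        if last_date is None:
            raise ValueError("the first row has no date")
        cleaned.append({
            # 2. Every line gets a date: reuse the previous one if missing.
            "date": last_date,
            "description": row["description"].strip(),
            # 3. Amounts use "," as the decimal separator.
            "money_out": to_comma_decimal(row["money_out"]),
            "money_in": to_comma_decimal(row["money_in"]),
            # 4. The "balance" key is simply not copied over.
        })
    return cleaned
```

The year is passed in separately because, in this assumed layout, the statement lines only carry the day and the month.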

I realise that I can't enter all the pages of all the PDFs manually and repeat this process as it would take me too long.

It must be possible to automatically convert these bank statements into Excel format.

4. No Existing Tools Worked

I go to Google and search for existing solutions. I come across generic converters and more specialised ones, but none of them converts my PDF statements accurately. I also try a few different prompts in ChatGPT, without success. The developer in me will have to get back to work.

If I've had this problem, other people have too.

I talk to some freelance IT and accountancy friends and they say, "We also manually copy and paste the data into Excel. Then we import these files into our accounting software". They use Sage, Cegid and SAP software and have business account statements from HSBC UK, Lloyds Bank and NatWest.

💡 What if I created a tool that automatically extracts financial transactions from a bank statement and formats them neatly in an Excel file?

5. How I Built a Bank Statement to Excel Converter

The first step is to find the top and bottom coordinates of the transaction table in the PDF file, using its first and last cells, so that only that area is extracted. The resulting data frame then has to be cleaned up, which is no easy task.
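
To make the idea of locating the table more concrete, here is a rough Python sketch. It assumes a PDF library has already returned the page's text fragments with their positions. The `Word` type, the `find_table_bounds` function, the header labels used for detection and the `end_marker` text are all hypothetical choices for this illustration, and a real statement layout will need its own rules.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str   # a text fragment as extracted from the page
    x: float    # left edge, in PDF points
    top: float  # distance from the top of the page, in PDF points


# Assumption: the table starts on the line holding these column titles.
HEADER_LABELS = {"Date", "Description", "Balance"}


def find_table_bounds(words, end_marker, tolerance=2.0):
    """Return (top, bottom) of the transaction table, or None if no header."""
    # Group fragments into visual lines: a fragment joins the current line
    # when its top is within `tolerance` of that line's first fragment.
    lines = []
    for word in sorted(words, key=lambda w: (w.top, w.x)):
        if lines and abs(lines[-1][0].top - word.top) <= tolerance:
            lines[-1].append(word)
        else:
            lines.append([word])

    top = None
    for line in lines:
        texts = {w.text for w in line}
        if top is None:
            if HEADER_LABELS <= texts:
                top = line[0].top
        elif end_marker in texts:
            return top, line[0].top
    if top is None:
        return None
    # No end marker on this page: fall back to the last line of the page.
    return top, lines[-1][0].top
```

The returned pair can then be used to crop the page to the table area before extracting rows.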

For a given transaction, the "Description" column may be on several rows, producing a row with no empty column and rows with several empty columns. These rows must be merged in the correct order.
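
The merging step can be sketched as follows. This is a simplified illustration, not the converter's real logic: it assumes each row is a list of five strings in the order Date, Description, Money out, Money in, Balance, and that a row whose only filled cell is the description continues the row just above it. The `merge_continuation_rows` name is hypothetical.

```python
DESCRIPTION = 1  # index of the Description column in the assumed layout


def merge_continuation_rows(rows):
    """Fold description-only rows into the transaction row above them."""
    merged = []
    for row in rows:
        others_empty = all(
            not cell.strip()
            for index, cell in enumerate(row)
            if index != DESCRIPTION
        )
        extra_text = row[DESCRIPTION].strip()
        if merged and others_empty and extra_text:
            # Continuation row: append its text to the previous description.
            merged[-1][DESCRIPTION] += " " + extra_text
        else:
            # Regular row: copy it so the input list is left untouched.
            merged.append(list(row))
    return merged
```

Because the rows are processed top to bottom, the pieces of a description are joined in the order they appear on the statement.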

In addition, tab characters in the "Description" column, particularly between the payment method and the description, can create a new empty column between "Date" and "Description".
The "Date" and "Description" columns can also end up merged into one.
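
Both column problems can be handled row by row, as in this small Python sketch. It rests on assumptions that may not hold for every statement: a clean row has exactly five cells, dates look like "8 Jan", and the `repair_row` helper is a hypothetical name used only here.

```python
import re

# Assumption: a date is a day number followed by a three-letter month.
DATE_PREFIX = re.compile(r"^(\d{1,2} [A-Za-z]{3})\s+(.+)$")
EXPECTED_COLUMNS = 5  # Date, Description, Money out, Money in, Balance


def repair_row(row):
    """Return the row with the Date and Description columns put right."""
    cells = [cell.strip() for cell in row]
    if len(cells) == EXPECTED_COLUMNS + 1 and cells[1] == "":
        # Case 1: an extra empty cell sits between Date and Description.
        del cells[1]
    elif len(cells) == EXPECTED_COLUMNS - 1:
        # Case 2: Date and Description ended up in the same cell.
        match = DATE_PREFIX.match(cells[0])
        if match:
            cells[0:1] = [match.group(1), match.group(2)]
    return cells
```

Rows that already have five cells pass through unchanged, apart from having surrounding whitespace trimmed.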

In the next few articles I'll explain in more detail the process of converting PDF bank statements into Excel files, as well as other problems you may encounter.