Why generic LLMs fail at financial spreading (+ alternatives)
As LLMs have expanded their capabilities, investors have experimented with just how well a chatbot can perform in the underwriting process. But while tools like ChatGPT or Claude are great at summarizing text, they struggle to work with Excel files and to perform second-order analysis, such as deconstructing a borrower’s assumptions.
The issue is that a generic LLM sees a spreadsheet as a static document rather than a set of interconnected calculations. If the tool can't see the formulas or the links between tabs, it can’t perform financial analysis; instead, it’s just making a statistical guess.
Underwriting teams need a system that actually understands how a model is constructed and how the numbers flow through it. This article explains why generic LLMs fail the Excel test and how F2 handles the deep math and complex formatting that text-based models miss.
Why text-based LLMs break in spreadsheets
The technical challenges that generic LLMs face in financial spreading stem from how these models see the data. LLMs are trained to predict the next token in a sequence of text. Spreadsheets, however, are not just text; they are long chains of logic with formula dependencies.
Here are the three core reasons generic LLMs fail at financial spreading.
Formula blindness
When a generic LLM ingests a spreadsheet, the file is typically converted into a text representation (e.g., CSV or Markdown) first. That conversion strips away the formula layer. The model sees, for example, "1,000,000" in a cell, but loses the logic that produced it.
We call this primary analysis: reading the text output of a cell and assuming it can be used as input for subsequent analysis. However, it’s the secondary analysis — understanding how those cells are calculated — that’s required for AI to truly understand the data.
Without the ability to see the underlying formula in a cell, a generic LLM cannot tell you what happens to, let’s say, net income if revenue drops by 10%, because it doesn't know that those two cells are mathematically linked.
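The revenue example can be made concrete with a minimal sketch. It is not how any particular LLM or F2 processes files; the three-cell model, its figures, and the helper names are all hypothetical. It compares a 10% revenue drop applied to a model that still has its formula with the same drop applied to a values-only export.

```python
# Hypothetical three-cell model. A cell holds either a number or a formula
# (a function of the other cells). Names and figures are illustrative only.
model = {
    "revenue": 1_000_000,
    "costs": 600_000,
    "net_income": lambda c: c["revenue"] - c["costs"],
}


def evaluate(cells):
    """Compute every cell; formulas here depend only on hard-coded cells."""
    values = {name: v for name, v in cells.items() if not callable(v)}
    for name, cell in cells.items():
        if callable(cell):
            values[name] = cell(values)
    return values


def drop_revenue(cells, pct):
    """Return a copy of the cells with revenue reduced by pct percent."""
    shocked = dict(cells)
    shocked["revenue"] = cells["revenue"] * (100 - pct) // 100
    return shocked


# Text export: only computed values survive, as in a CSV or Markdown dump.
flat = evaluate(model)

# With the formula intact, the shock flows through to net income.
with_formulas = evaluate(drop_revenue(model, 10))["net_income"]

# In the flattened copy, net income is just a number and never updates.
values_only = evaluate(drop_revenue(flat, 10))["net_income"]
```

In this toy model, net income falls to 300,000 when the formula is preserved, while the flattened copy still reports the stale 400,000.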
F2’s agentic system, Excel Intelligence, reads the formula chain, understands the spreadsheet structure, and ensures that the spread is a dynamic model — not a static snapshot.
The cross-sheet problem
Private market financials are dense and complex. You might have the P&L on the first tab, the debt schedule on the fourth, and cash flow adjustments on the seventh — all of which need to be treated as a single, interconnected system.
To spread financials correctly, an agent must retrieve, compare, and synthesize data across the entire file. If you are verifying the interest expense, you aren't just looking at the P&L; you are checking the debt schedule to ensure the math adds up.
Generic LLMs have a limited context window and struggle to keep track of relationships across multiple tabs. That can lead to hallucinations, where the model guesses a value because it has lost track of where the source data lives.
F2 solves this by maintaining the connective tissue of the entire workbook. The system maps the relationships between tabs, ensuring that when it pulls a number for your spread, it understands exactly which supporting schedule that number came from and how it affects the rest of the model.
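As a rough illustration of what mapping relationships between tabs can involve (a sketch under stated assumptions, not F2's implementation), the snippet below scans formula strings for cross-sheet references and records which cell depends on which supporting schedule. The workbook contents, the `CROSS_REF` pattern, and the `cross_sheet_links` helper are hypothetical, and the pattern only covers simple single-cell references.

```python
import re

# Hypothetical workbook: {sheet name: {cell address: value or formula string}}.
workbook = {
    "P&L": {"B5": 900000, "B10": "='Debt Schedule'!D8", "B12": "=B5-B10"},
    "Debt Schedule": {"D6": 5000000, "D7": 0.08, "D8": "=D6*D7"},
    "Cash Flow": {"C4": "='P&L'!B12+Adjustments!B3"},
    "Adjustments": {"B3": 25000},
}

# Matches 'Quoted Sheet'!A1 or Sheet!A1 (single cells only, not ranges).
CROSS_REF = re.compile(r"(?:'([^']+)'|([A-Za-z0-9_]+))!\$?([A-Z]{1,3})\$?(\d+)")


def cross_sheet_links(wb):
    """List ((sheet, cell), (source sheet, source cell)) for each cross-sheet ref."""
    links = []
    for sheet, cells in wb.items():
        for address, content in cells.items():
            if not (isinstance(content, str) and content.startswith("=")):
                continue  # hard-coded value, nothing to trace
            for match in CROSS_REF.finditer(content):
                source_sheet = match.group(1) or match.group(2)
                source_cell = match.group(3) + match.group(4)
                links.append(((sheet, address), (source_sheet, source_cell)))
    return links


links = cross_sheet_links(workbook)

# A reference that points at a missing sheet or cell is worth flagging.
dangling = [src for _, src in links if src[1] not in workbook.get(src[0], {})]
```

Here the interest expense in `P&L!B10` is linked back to `Debt Schedule!D8`, so a check of that figure knows which schedule to open.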
Probabilistic vs. deterministic chain of reasoning
LLMs are probabilistic: they generate each next token based on likelihood.
In institutional-grade investment analysis, a model that predicts its output from probabilities can return a figure that looks plausible but is wrong, and the same question can yield different answers on different runs. This tendency to “hallucinate” is a core reason investors are skeptical of using AI in their analysis.
F2’s platform takes a different approach. It deconstructs borrower calculations, extracts the raw data, and then uses a deterministic calculation engine to derive financial metrics with accuracy and auditability.
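To illustrate what a deterministic calculation step looks like in general (a minimal sketch, not F2's engine), the function below derives two common credit metrics from already-extracted figures using exact decimal arithmetic. The choice of metrics, the `credit_metrics` name, and the input figures are assumptions made for the example.

```python
from decimal import Decimal


def credit_metrics(ebitda, interest_expense, total_debt):
    """Derive ratios from extracted figures; same inputs always give same outputs."""
    ebitda = Decimal(ebitda)
    interest_expense = Decimal(interest_expense)
    total_debt = Decimal(total_debt)
    if ebitda == 0 or interest_expense == 0:
        raise ValueError("EBITDA and interest expense must be non-zero")
    cents = Decimal("0.01")  # round ratios to two decimal places
    return {
        "interest_coverage": (ebitda / interest_expense).quantize(cents),
        "leverage": (total_debt / ebitda).quantize(cents),
    }


# Hypothetical extracted figures, passed as strings to avoid float rounding.
metrics = credit_metrics("2400000", "400000", "6000000")
```

Because the arithmetic is ordinary code rather than generated text, rerunning it on the same extracted inputs reproduces the same ratios, which is what makes the result auditable.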
How specialized agents handle the complex analysis that generic LLMs miss
To solve these problems, you need a specialized spreadsheet architecture. F2 has built a proprietary agentic system designed for the models that investors rely on.
Here’s how it works:
- Structure encoding: It maps the spatial relationships between cells (headers, sub-headers, totals).
- Logic tracing: It parses formula chains to understand how values flow through the workbook.
- Handling complexity: It can interpret merged cells, hidden rows, and cross-sheet references that standard data extraction tools often fail to handle.
This allows the system to reconstruct a borrower’s financials with cell-level precision. It doesn't just copy the data; it understands the financial logic intended by the borrower’s accountant.
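Logic tracing can be sketched in a few lines. This is an illustration under simplifying assumptions, not F2's parser: the `formulas` sheet, the `REF` pattern, and the helper names are hypothetical, and the pattern handles only same-sheet, single-cell references. The idea is to walk a formula chain from an output cell back to the hard-coded inputs that feed it.

```python
import re

# Hypothetical single-sheet model: cell address -> formula or hard-coded input.
formulas = {
    "B2": "1000000",   # revenue (input)
    "B3": "600000",    # operating costs (input)
    "B4": "=B2-B3",    # EBITDA
    "B5": "80000",     # interest expense (input)
    "B6": "=B4-B5",    # pre-tax income
    "B7": "=B6*0.75",  # net income at an assumed 25% tax rate
}

# Matches A1-style references such as B2 or $B$2.
REF = re.compile(r"\$?([A-Z]{1,3})\$?(\d+)")


def precedents(cell):
    """Cells referenced directly by the formula in `cell` (empty for inputs)."""
    content = formulas[cell]
    if not content.startswith("="):
        return []
    return [col + row for col, row in REF.findall(content)]


def trace_inputs(cell, seen=None):
    """Depth-first walk from `cell` back to the hard-coded inputs that feed it."""
    seen = set() if seen is None else seen
    if cell in seen:
        return set()  # already visited on another path
    seen.add(cell)
    direct = precedents(cell)
    if not direct:
        return {cell}
    inputs = set()
    for upstream in direct:
        inputs |= trace_inputs(upstream, seen)
    return inputs


net_income_inputs = sorted(trace_inputs("B7"))
```

In this toy sheet, net income in `B7` traces back to `B2`, `B3`, and `B5`, so a change to any of those inputs is known to affect it.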
Giving analysts a way to verify the data in seconds
If an analyst can’t prove exactly where a number came from, they shouldn’t submit it to the Investment Committee. Without a way to trace every value back to the source, you end up double-checking the entire spread manually, which defeats the purpose of using AI in the first place.
F2 is built to solve this tedious second-guessing by giving you the evidence you need to stand behind your numbers with complete conviction.
Visual verification back to the source cell (or paragraph)
Every number the system generates is backed by a direct citation. If you’re looking at a revenue figure and it feels off, you don't have to go hunting through the data room to find the source.
- Instant verification: Every number is clickable.
- Visual proof: Clicking a value opens the original source document — whether it’s a PDF or a complex Excel file — and highlights the exact cell range or line item in yellow.
- Context matters: This lets you confirm that the system pulled from, for example, the "Actuals" column rather than a "Budget" or "Pro Forma" projection.
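One way to picture a citation-backed number is as a value that carries its source location with it. The data structure below is a hypothetical sketch, not F2's schema: the `Citation` and `SpreadValue` classes, their fields, and the file names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    document: str      # source file the value was pulled from
    location: str      # cell range or page/line reference within that file
    column_label: str  # header of the source column, e.g. "Actuals"


@dataclass(frozen=True)
class SpreadValue:
    line_item: str
    value: int
    citation: Citation


def flag_non_actuals(values, expected="Actuals"):
    """Return line items whose source column is not the expected one."""
    return [v.line_item for v in values if v.citation.column_label != expected]


# Hypothetical spread with one value pulled from the wrong column.
spread = [
    SpreadValue("Revenue", 1_000_000,
                Citation("borrower_model.xlsx", "P&L!C5", "Actuals")),
    SpreadValue("EBITDA", 250_000,
                Citation("borrower_model.xlsx", "P&L!D14", "Budget")),
]

suspect = flag_non_actuals(spread)
```

Keeping the citation attached to the value means a check like "was this pulled from Actuals?" becomes a lookup rather than a search through the data room.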
Maintaining an audit trail across borrower revisions
One of the biggest pain points in underwriting is the constant stream of updated files. Normally, when a borrower submits a "v2" or "v3" of their model, the analyst must restart the spreading process from scratch.
F2 handles version control by separating the analytical logic from the data source:
- Intelligent swapping: You can simply upload the new files and tell the agent, "I just uploaded three new files. Use the latest data to redo this report."
- Logic preservation: The system replaces the underlying numbers while preserving your original normalization and mapping.
- Continuous traceability: Even with new files, the yellow-highlighted citations remain, so the audit trail stays unbroken from the first draft to the final Investment Committee memo.
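The separation of analytical logic from the data source can be sketched as follows. This is a simplified illustration, not F2's implementation: the `MAPPING` table, the `build_spread` helper, and the v1 and v2 figures are hypothetical. The mapping is defined once, and each borrower revision is run through it so that every value records which file version it came from.

```python
# Analyst-defined mapping from template line items to borrower labels.
# It is defined once and reused for every revision of the borrower's file.
MAPPING = {
    "Revenue": ["Net sales", "Service revenue"],
    "Operating expenses": ["SG&A", "R&D"],
}


def build_spread(mapping, source, version):
    """Apply the mapping to one file version, recording where each value came from."""
    spread = {}
    for line_item, labels in mapping.items():
        missing = [label for label in labels if label not in source]
        if missing:
            raise KeyError(f"{version} is missing {missing}")
        spread[line_item] = {
            "value": sum(source[label] for label in labels),
            "sources": [(version, label) for label in labels],
        }
    return spread


# Hypothetical figures from two revisions of the borrower's model.
v1 = {"Net sales": 800, "Service revenue": 200, "SG&A": 300, "R&D": 100}
v2 = {"Net sales": 850, "Service revenue": 210, "SG&A": 310, "R&D": 100}

spread_v1 = build_spread(MAPPING, v1, "v1")
spread_v2 = build_spread(MAPPING, v2, "v2")  # same logic, new numbers
```

Raising an error when a mapped label is missing from a new version is a deliberate choice in this sketch: a silent zero would break the audit trail the mapping is meant to preserve.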
These features are designed to remove doubt among underwriting teams about the product's accuracy and to give them confidence that any output is backed by audit- and defense-ready source citations.
Conclusion
For institutional investors, a general-purpose chatbot is not enough. When a system treats a workbook as flat text, it loses the logic, the math, and the trust required for underwriting.
By replacing statistical guessing with a specialized architecture that understands the complexity of your models, F2 transforms a deal team’s role in three specific ways:
- Zero low-value work: No more manually keying data into internal templates or building out multi-year P&Ls from scratch.
- Instant verification: The burden of manual verification disappears because every cell in your finished spread is clickable and linked directly to its original source.
- Better judgment: Analysts can spend their time more wisely by leveraging the full context of an application to make informed investment decisions.
The goal isn't to replace the analyst, but to provide them with a stronger foundation. By solving the spreading problem, F2 enables deal teams to stop processing data and start making better investment decisions.
