

It’s not self-hostable, but do you mean something like this? https://calendarbudget.com/
I played a bit with the basic concept of identifying and categorizing merchants: I imported a transaction CSV into Google Sheets and wrote a custom function that called the OpenAI API, basically just passing the raw merchant string along with “What category of business is this?”. It did well. The next step would have been to check the answer against a predefined list of possible categories. I didn’t compare any models or other platforms, though. This was last year, so I might play with it again.
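For anyone curious what that “compare to a predefined list” step could look like, here is a minimal sketch in TypeScript. Everything in it is an assumption for illustration: the category list, the prompt wording beyond the quoted question, and the `ask` callback (a stand-in for whatever function actually calls the model; this is not the OpenAI client API and not the original Sheets function).

```typescript
// Hypothetical category list; swap in whatever your budget actually uses.
const CATEGORIES = ["Groceries", "Dining", "Fuel", "Utilities", "Shopping", "Other"] as const;
type Category = (typeof CATEGORIES)[number];

// Build the prompt: the raw merchant string plus the allowed answers.
function buildCategoryPrompt(merchant: string): string {
  return (
    `What category of business is this? Merchant: "${merchant}". ` +
    `Answer with exactly one of: ${CATEGORIES.join(", ")}.`
  );
}

// Map a free-text reply onto the allowed list; anything unexpected becomes "Other".
function toCategory(reply: string): Category {
  const cleaned = reply.trim().toLowerCase().replace(/[."']/g, "");
  const match = CATEGORIES.find((c) => c.toLowerCase() === cleaned);
  return match ?? "Other";
}

// `ask` is a placeholder for the real model call (assumed: prompt in, text out).
type AskModel = (prompt: string) => Promise<string>;

async function categorize(merchant: string, ask: AskModel): Promise<Category> {
  return toCategory(await ask(buildCategoryPrompt(merchant)));
}
```

The point of `toCategory` is that the model’s answer is never trusted as-is: a reply outside the list falls back to "Other" instead of creating a new category by accident.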
I found this, which is overkill for personal use but does a good job of laying out this sort of application: https://midday.ai/updates/automatic-reconciliation-engine/
“Instead of just comparing text strings, we use 768-dimensional vector embeddings to capture the semantic meaning of transactions and receipts.
// Generate embeddings for transaction data
const transactionText = prepareTransactionText({
  name: transaction.name,
  counterpartyName: transaction.counterpartyName,
  merchantName: transaction.merchantName,
  description: transaction.description
});
const embedding = await generateEmbeddings([transactionText]);
These embeddings allow our system to understand that “AMZN MKTP” and “Amazon Marketplace Purchase” refer to the same thing, even though the text strings are completely different. The system learns patterns like:
You’re missing the point: that would require sitting down and manually doing that for every conceivable payee. Walmart is just an example. The value of any sort of “intelligent” component would be for this to happen automatically and seamlessly for the user. Hell, the AI layer could just be “write regex for all the possible similar payees across these documents”.
Yep, that’s exactly the sort of thing I’m thinking about here. And it doesn’t even need to be a full-on chat-style LLM, just some decent NLP that can recognize that WALMART, WAL-MART, and WMART are all the same thing and label them.
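To make that concrete, here is a rough sketch of the non-LLM route in TypeScript. It is plain string normalization plus edit distance (Levenshtein), not real NLP and not embeddings, and the function names, the merchant list, and the 0.7 similarity threshold are all made up for illustration.

```typescript
// Uppercase and drop everything except letters, so "WAL-MART #1234" becomes "WALMART".
function normalizePayee(raw: string): string {
  return raw.toUpperCase().replace(/[^A-Z]/g, "");
}

// Classic Levenshtein distance, keeping only the previous row of the table.
function editDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Return the best-matching known merchant, or null if nothing is similar enough.
// minSimilarity is an arbitrary starting point, not a tuned value.
function labelPayee(raw: string, known: string[], minSimilarity = 0.7): string | null {
  const key = normalizePayee(raw);
  let best: string | null = null;
  let bestScore = 0;
  for (const name of known) {
    const target = normalizePayee(name);
    const longest = Math.max(key.length, target.length);
    if (longest === 0) continue;
    const score = 1 - editDistance(key, target) / longest;
    if (score >= minSimilarity && score > bestScore) {
      best = name;
      bestScore = score;
    }
  }
  return best;
}

const merchants = ["Walmart", "Target"];
console.log(labelPayee("WAL-MART #1234", merchants));
console.log(labelPayee("WMART", merchants));
console.log(labelPayee("SHELL OIL 5744", merchants));
```

This still needs a list of known merchants to match against, and a character-level threshold will misfire on very short names or on strings with extra words attached, so treat it as a baseline rather than the “intelligent” layer itself.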
But for some reason this question brings out all the people who make assumptions and want to give financial advice or talk about the AI image they saw last year with 6 fingers.
I’m just presenting that as an “is this what you mean?”. If it is, then perhaps a FOSS or self-hostable version exists, or the community might be interested in one existing.