Mastodawn

today I'm thinking about the "don't use floating point for money" advice I hear all the time. It obviously has a lot of truth to it.

But -- Excel/Google Sheets uses floating point for all of its calculations, people use spreadsheets for money calculations all the time, and it generally seems to work just fine -- the results get rounded for display.

So I'm trying to figure out if there's a more nuanced guideline than "never use floating point for money".

Show thread

Walt Mankowski Apr 13, 2023

@b0rk My guess is that most spreadsheets don't iterate enough on their floating point calculations for it to make a significant different.

Show thread

Paul Rosen Apr 13, 2023

@b0rk I've heard "never use floating point for money in javascript". But that is hard because as soon as something is divided it is floating point. But we keep prices in pennies, never dollars.

Show thread

Nelson Minar Apr 13, 2023

@paulrosen even worse, even integers are floating point in Javascript although it's no big deal on numbers smaller than 9 quadrillion. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER

Number.MAX_SAFE_INTEGER - JavaScript | MDN

The Number.MAX_SAFE_INTEGER static data property represents the maximum safe integer in JavaScript (253 – 1).

MDN Web Docs

Show thread

Daniel Lo Nigro 🇦🇺Apr 13, 2023

@paulrosen @b0rk I hope that one day we have a fixed point number type in JS. It could be implemented using a BigInt for the value plus a number for where the decimal point should go, but BigInt has a bunch of limitations. Not being able to JSON encode it is a major one.

Show thread

/me Apr 14, 2023

@paulrosen
@b0rk
Not true. Everything in JavaScript is a float. JS only has a single number type called Number (wasm excluded), which is a 64 bit float.

Show thread

Summer Oct 7, 2023

@paulrosen For integer division, can't always be sure that the result doesn't need rounding, so its somewhat hinting that you should use something that provides the right rounding method for the work you are doing (rounding up or down to a specific number, rounding certain transactions in certain directions, etc)

And then additionally needing to keep track of the decimal position too. It becomes complicated and deserving of a library of functions and wrappers.

Show thread

@[email protected]Apr 13, 2023

@b0rk I guess the important thing to remember is that errors (can) compound, so floating point is fine if you're not doing any complex or repeated calculations.

Time-limited budgets is probably fine, but holding accounts and calculating compounding interest and stuff, nah, just move to counting cents/cent fractions in an integer or something.

Show thread

Graham Sutherland / Polynomial Apr 13, 2023

@b0rk Excel violates the IEEE float spec by not implementing denormals, which helps a little in this specific case because it removes one of the common causes of error accumulation.

Show thread

Graham Sutherland / Polynomial Apr 13, 2023

@b0rk The main situation where floating point numbers go wrong is if you're dealing with very large multiples of precise values, or if you need consistency of results when you calculate something in different ways, especially if you're using a float32.

For most organisations this isn't a problem, because we're talking about sums in the millions. For financial organisations dealing with huge transaction volumes, or very precise pricing models, it's something that needs to be considered.

Show thread

Graham Sutherland / Polynomial Apr 13, 2023

@b0rk I also think the common examples around floating point causing problems in finance are a little misleading. If you're using a double precision float, the actual calculations are probably not going to be skewed by a meaningful amount. What more commonly rears its head is comparison problems.

If you perform a calculation and then compare the result to some well-defined number, a small precision error can lead to unexpected outcomes. if (x <= y) might fail because x accumulated a tiny error.

Show thread

Graham Sutherland / Polynomial Apr 13, 2023

@b0rk you have to do trickery like `if (y - x > epsilon)` to compensate for these issues, and doing this reliably can be quite hard work.

Show thread

Jernej Simončič �Apr 13, 2023

@gsuberland @b0rk About 20 years ago my father mentioned that he had to do some calculations by hand, because the results in Excel were wrong by a few percent.

Show thread

Julia Evans Apr 13, 2023

@jernej__s @gsuberland I think the "20 years ago" thing here is key -- I'd guess that floating point numbers 20 years ago were 32-bit, and 32-bit floats are SUPER different from 64-bit floats

You start getting money inaccuracies in 32-bit floats at around $100,000. But in 64-bit floats, you have to go up to ~$1,000,000,000,000,000 to see similar issues (1 quadrillion dollars)

Show thread

Gasper Zejn Apr 13, 2023

@b0rk @jernej__s @gsuberland For the love of good engineering, please stop thinking money is integers.

If you want to be precise on two digits, you need four digits. This brings you down to 100 billion.

Not that high of a number any more.

Show thread

DCoder 🇱🇹❤🇺🇦Apr 13, 2023

@gsuberland @b0rk Microsoft documents Excel’s float handling and possible problems, with examples: https://learn.microsoft.com/en-us/office/troubleshoot/excel/floating-point-arithmetic-inaccurate-result

Floating-point arithmetic may give inaccurate result in Excel - Microsoft 365 Apps

Discusses that floating-point arithmetic may give inaccurate results in Excel.

Show thread

Shane Kerr Apr 13, 2023

@dcoderlt @gsuberland @b0rk Wow what a great read! Filled with lovely tidbits like, "Even common decimal fractions, such as decimal 0.0001, cannot be represented exactly in binary. (0.0001 is a repeating binary fraction that has a period of 104 bits)."

104 bits! You can't just casually drop something like that in the middle of a text without further comment, but there it is.... 😄

Show thread

filipa mv Apr 13, 2023

@gsuberland @b0rk

my read on this help article is that a) excel is a generic numbers tool, and b) Msft paid someone to think very carefully about the failure modes

but if you’re a random developer adding up invoices,

a) you have a much narrower set of requirements & can tailor the data structure accordingly, and b) you might not have time to think very carefully about floats

https://learn.microsoft.com/en-us/office/troubleshoot/excel/floating-point-arithmetic-inaccurate-result

Floating-point arithmetic may give inaccurate result in Excel - Microsoft 365 Apps

Discusses that floating-point arithmetic may give inaccurate results in Excel.

Show thread

Julia Evans Apr 13, 2023

@gsuberland TIL thanks!

Show thread

Chad Geidel Apr 13, 2023

@b0rk It doesn't specifically state "Excel" but VBA's currency type is a fixed point 64 bit integer. Maybe if you specify "currency" as the column it uses this math?

https://learn.microsoft.com/en-us/office/vba/language/reference/user-interface-help/currency-data-type

Currency data type

Office VBA reference topic

Show thread

Nate Apr 13, 2023

@chadgeidel @b0rk

If you have the cell set as the accounting or currency vba scripts by reading from a sheet do get the 64bit fixed point currency type

Show thread

Toby Apr 13, 2023

@b0rk I prefer to think of it as “try and store the smallest unit possible” for money.

Sure I could store pounds and pence; but easiest to store pence and do some formatting.

Show thread

Dan Apr 13, 2023

@b0rk sql server money type appears to be floats where the min/max are bounded at ten thousandths remaining accurate. That's a reasonable guideline I think.

Show thread

Jordi Rosell Apr 13, 2023

@b0rk 0.00000001 BTC is the smallest unit

Show thread

chrysn Apr 13, 2023

@b0rk People use Excel for all kinds of things it's not good for – the renaming of the MARCH1 and SEPT1 genes is a testament to that.
If it's just for your own budgeting or a travel ledger, it's fine. But when being off by a cent means that the tax office will have a closer look at your books, you better do the calculations by the letter of the tax code, and "float" are not something I've ever seen in there.

Show thread

chrysn Apr 13, 2023

@b0rk For precision and clarification: I'm not regularly dealing with the tax code, but conveniently the German term "Gleitkomma" is nowhere near as flexibly used as the English "float", and I didn't find it in any legal Austrian legal text except school curricula.

Show thread

Drew 🐘Apr 13, 2023

@b0rk I think people who say that often mean "don't use floating point for financial transactions" and I think that's good advice. If you're a stock analyst or a manager doing budget projections then the assumptions the spreadsheet are based on introduce many orders of magnitude more error than the inexactness of floating point adds. Transactions are different because they can lead to liability and loss of customer trust. No one wants to give money to a website that shows $0.10000000000000002 :)

Show thread

Howard Chu @ Symas Apr 13, 2023

@b0rk Well, if every number you're feeding in is already fixed to 1cent resolution, and you're not doing many division ops on the values, there's nothing to worry about. If you're doing chained division ops, that's pretty weird, for money.

Show thread

Benny Powers 🇨🇦️🇮🇱️Apr 13, 2023

@b0rk Excel also admits that date formatting significantly skewed genetics research, but won't fix it because "the geneticists are holding it wrong."

https://timharford.com/2021/07/the-tyranny-of-spreadsheets/

> Or say you’re a genetics researcher typing in the name of a gene such as “Membrane Associated Ring-CH-Type Finger 1”, or March1 for short, or perhaps the Sept1 gene. You can imagine what Excel does next. One study estimated that 20 per cent of all genetics papers had errors caused by Excel’s autocorrect.

The Tyranny of Spreadsheets

Early last October my phone rang. On the line was a researcher calling from Today, the BBC’s agenda-setting morning radio programme. She told me that something strange had happened, and she hoped I…

Tim Harford

Show thread

renee on the internet Apr 13, 2023

@b0rk Plenty of domains, like various kinds of financial analytics, can happily use floats for money. Biggest issue is when you're actually dealing with real dollars and cents (or whatever the atom of your subject currency is) - you're rounding out real money.

I've taken to saying "don't use floats for accounting"' instead

Show thread

Random Damage 🌻Apr 13, 2023

@b0rk the flip side to this is the Buffett Overflow from a couple of years ago.

I think the important takeaway is "use whatever numeric format works".

64-bit integers with units of hundredths of a cent is probably the "bestest" we have now, but that's not always going to be available

Show thread

Boba Yaga Apr 13, 2023

@b0rk I would guess Excel is using base-10 floats and not base-2, which goes a long way to help.

Show thread

Richard Barrell Apr 13, 2023

@bobayaga @b0rk last I checked nah, Excel is using ieee754 doubles.

I think one small factor is that financial *forecasting* just isn't really very precise in the first place. Small factor because forecasting isn't the main use of Excel, even though as I understand it that's what Excel was originally intended for.

Bigger factor may be that nobody actually really cares about single penny rounding errors. They're nothing compared to shrinkage.

Show thread

Josh Apr 13, 2023

@b0rk use floating point for money, just don’t do it 100 billion times

Show thread

Michael Richardson Apr 13, 2023

@b0rk Don't use spreadsheets for money. It's not safe for dozens of reasons.

Show thread

Nelson Minar Apr 13, 2023

@b0rk this is a really interesting and smart question.

Show thread

Mike Loukides Apr 13, 2023

@nelson @b0rk It depends on what you're doing. My father was an accountant for the phone company, and he'd worry about being pennies off in a billion-dollar figure. Errors of that size accumulate quickly with floating point. It's a problem (probably not a big one, but that's what accountants do) if it shows up on a corporate tax document. It's all about precision.

OTOH, if you're doing a sales projection, your audience really only cares about two (maybe three) significant digits.

Show thread

Emily S Apr 13, 2023

@mikeloukides @nelson @b0rk the place I've seen this matter was a telecoms billing system that was written by someone who didn't know this. Lots and lots of very small amounts adding up. Bonus points for having the itemised listing showing the customer how the maths has gone wrong too.

Show thread

Shane Kerr Apr 13, 2023

@Emily_S @mikeloukides @nelson @b0rk hold it... wasn't that how Richard Pryor's character got rich in Superman 3?

https://youtu.be/N7JBXGkBoFc

Superman III — Richard Pryor — half cents and Ferrari scenes

YouTube

Show thread

Kenneth Finnegan Apr 13, 2023

@b0rk "never use floating point for money, and never use Excel for processing data where the stakes matter"

Show thread

Kevin Karhan

Apr 13, 2023

@kwf @b0rk I'd like to add more to it: Never use #Excel for anything - period!

#LibreOffice #Calc does the same but better...

Show thread

Till O'Rly

Apr 15, 2023

@kkarhan @kwf @b0rk
When in doubt use CSV and Jupyter/Pandas/(Matplotlib).

Show thread

Kevin Karhan

Apr 16, 2023

@mrgl @kwf @b0rk I prefer TSV's (using tabs instead of commas) because I follow Genevan Nomenclature on Mathematics and that standardizes the use of comma instead of dot as seperator for non-intregers 1st decimal.

Show thread

Sophie Schmieg Apr 13, 2023

@kwf @b0rk
This post being boosted by @soatok made me assume it was about cryptography for a second

Show thread

Rob Apr 13, 2023

@b0rk My first reaction to this is kinda condescending, which is just “Don’t use excel or google sheets for money if you actually need accuracy” and I’m debating how much that’s genuine advice I’d stand behind.

Possibly losing pennies or dollars or thousands of dollars to rounding errors is gonna be small potatoes compared to other errors caused by the difficult-to-debug nature of spreadsheets (see for example https://www.science.org/content/article/one-five-genetics-papers-contains-errors-thanks-microsoft-excel).

One in five genetics papers contains errors thanks to Microsoft Excel

Show thread

Sam Tobin-Hochstadt Apr 13, 2023

@simrob @b0rk I think regardless of whether _you_ would stand behind that advice, in practice people who work with money professionally all use Excel all the time for it, including in complicated ways at large financial institutions.

Show thread

sayrer Apr 13, 2023

@samth @simrob @b0rk There have been many studies showing how many spreadsheets have errors. But it's true that people use them, I like to say it's always a good sign when square people with the Thinkpads show up.

A funny thing about the JP Morgan / Frank suit is that someone at the bank noticed their customer list was the Excel maximum row count.

Show thread

Sam Tobin-Hochstadt Apr 13, 2023

@sayrer @simrob @b0rk Definitely lots of excel errors, but I don't know of any that either indicate a higher or lower rate than software, or errors due to binary floats.

Show thread

sayrer Apr 13, 2023

@samth @simrob @b0rk there have been some studies that indicate a very high error rate, but I’m not sure how they got the data.

Floating point in Excel is a famous footgun:

https://en.wikipedia.org/wiki/Numeric_precision_in_Microsoft_Excel

and then you can dig in here:

https://learn.microsoft.com/en-US/office/troubleshoot/excel/floating-point-arithmetic-inaccurate-result

They also have weird cases where numbers are treated as dates, and things like that.

Numeric precision in Microsoft Excel - Wikipedia

Show thread

sayrer Apr 13, 2023

@samth @simrob @b0rk you can also find decimal/money types in most databases. IBM always has them: https://www.ibm.com/docs/en/informix-servers/12.10?topic=esqlc-numeric-data-types

Numeric data types

Show thread

tobychev Apr 14, 2023

@sayrer
One source of that I've seen is the "Enron corpus" spreadsheets harvested from all the emails published after Enron went bankrupt in 2001. That collection of spreadsheets had an error in about 24 % of all reviewed documents.
@samth @simrob @b0rk

Show thread

Rob Apr 13, 2023

@samth …yes, obviously?

Show thread

Sam Tobin-Hochstadt Apr 13, 2023

@simrob Right, and I think that situation calls for (a) humility from people who don't do that about whether maybe the people who do are actually more-expert at their area and (b) an answer to Julia's question as to why it works out.

Show thread

Julia Evans Apr 13, 2023

@simrob I'm struggling to see a real world use case where you would lose even 1 cent to a floating point error in a spreadsheet

like I'm sure it exists! But I personally do not know how to make a spreadsheet that does a remotely realistic calculation that has such an error

Show thread

Nathan Sorenson Apr 13, 2023

@b0rk @simrob when I was working in finance the biggest problem related to rounding and spreadsheets was that numbers often were copied around between sheets, to and from proposals, etc, and those would be the displayed rounding not the underlying floating point. So not floating point's fault but more software UX and process/provenance issues.

Show thread

Nathan Sorenson Apr 13, 2023

@b0rk @simrob I basically agree that floating point has been made too big of a scapegoat and lack of version control, easy-to-ruin data provenance, and sloppy auto coercions are excel's bigger flaws.

Show thread

Ryan Cavanaugh Apr 13, 2023

@b0rk probably an easy heuristic is "Is it OK if all amounts are randomly off by 1 cent"? It's fine to be off by 1 cent if you're calculating a paycheck, but it's not OK to split a $100 transaction into $50.00 and $50.01 and have money just go missing / appear out of nowhere. Spreadsheets are rarely used in the latter case where exact accounting identities must be upheld

Show thread

wrosecrans Apr 13, 2023

@b0rk Never use floats for calculations with other people's money.

You won't get into a law suit if you miscalculate your own budget. If you are responsible for somebody else's money, there is a much higher standard of care, and much higher return on CYA investments.

Show thread

nebunez Apr 13, 2023

@b0rk
https://www.red-gate.com/hub/product-learning/sql-prompt/the-dangers-of-using-float-or-real-datatypes

This article is regarding SQL, but a relevant read.

The Dangers of using Float or Real Datatypes | Redgate

Floating point datatypes accommodate very big numbers but sacrifice precision. They are handy for some types of scientific calculations, but are dangerous when used more widely, because they can introduce big rounding errors.

Redgate

Show thread

BjarniBjarniBjarni 🙊 🇮🇸 🍏Apr 13, 2023

@b0rk As I understand this advice, proper bookkeeping abhors rounding - and truncating!

Traditional money isn't infinitely divisible, you can't pay someone 1/4th of a cent - it's not allowed (in the U.S. a half-cent did exist tho).

But worse, a float will just start losing money if the amount gets too large - it has limited precision: 1e16 + 1 == 1e16.

So using a data type which will at times round or truncate behind our backs and without warning, is just incompatible with sound bookkeeping?

Show thread

BjarniBjarniBjarni 🙊 🇮🇸 🍏Apr 13, 2023

@b0rk Also for context, I think this advice dates from when you ran the risk of a float being relatively small.

A single precision float (32 bits) only has 24 bits of precision, which is small enough that you will run into trouble with real life amounts pretty quickly.