View previous topic :: View next topic |
Author |
Message |
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3768 Location: Rasi, Finland
|
Posted: Wed Aug 21, 2024 7:16 pm Post subject: bc vs. awk |
|
|
I found this rather interesting: Code: | zucca@NBLK-WAX9X ~ $ time echo -e "scale=5\n1.00001^100000" | bc
2.71826
real 0m11.629s
user 0m11.599s
sys 0m0.005s
zucca@NBLK-WAX9X ~ $ time awk 'BEGIN {print 1.00001^100000}'
2.71827
real 0m0.005s
user 0m0.002s
sys 0m0.003s | ... My guess is awk takes some shortcuts by rounding while calculating the formula, while bc doesn't drop a single digit of accuracy behind the curtains.
What do other think (or know)? _________________ ..: Zucca :..
My gentoo installs: | init=/sbin/openrc-init
-systemd -logind -elogind seatd |
Quote: | I am NaN! I am a man! |
|
|
Back to top |
|
|
szatox Advocate
Joined: 27 Aug 2013 Posts: 3477
|
Posted: Wed Aug 21, 2024 8:02 pm Post subject: |
|
|
Quote: | echo -e "scale=7\n1.00001^100000" | bc
2.7182682
| I think awk wins this round
Anyway, AFAIR bc uses base10 for its calculations, and calculates with arbitrary precision. It basically calculates things the same way a human with a pen and paper would.
Apparently, it just truncated the output without properly rounding it, which kinda makes sense given the above. _________________ Make Computing Fun Again |
|
Back to top |
|
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3768 Location: Rasi, Finland
|
Posted: Wed Aug 21, 2024 8:33 pm Post subject: |
|
|
szatox wrote: | AFAIR bc uses base10 for its calculations, and calculates with arbitrary precision. It basically calculates things the same way a human with a pen and paper would. | That really would explain the execution time.
Btw... The result is eerily close to Euler's number.
EDIT: Do'h! That's one way to calculate it: _________________ ..: Zucca :..
My gentoo installs: | init=/sbin/openrc-init
-systemd -logind -elogind seatd |
Quote: | I am NaN! I am a man! |
|
|
Back to top |
|
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20508
|
Posted: Wed Aug 21, 2024 10:32 pm Post subject: |
|
|
szatox wrote: | Anyway, AFAIR bc uses base10 for its calculations, and calculates with arbitrary precision. It basically calculates things the same way a human with a pen and paper would.
Apparently, it just truncated the output without properly rounding it, which kinda makes sense given the above. | I presumed most people were taught to round up/down*. It seems odd to me to just ignore "those other numbers."
* I've also learned that there different rules to rounding, which seem very strange. I don't remember what they are now... maybe it had to do with money or data analysis? _________________ Quis separabit? Quo animo? |
|
Back to top |
|
|
flexibeast Guru
Joined: 04 Apr 2022 Posts: 473 Location: Naarm/Melbourne, Australia
|
|
Back to top |
|
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20508
|
Posted: Thu Aug 22, 2024 4:06 am Post subject: |
|
|
Yikes. I had no idea.
I wish I could remember the context of whatever rounding it was.
"Directed rounding to an integer" mentions financial calculations, but I'm almost certain it was decimal rounding.
"Rounding half away from zero" is a possibility. Also called commercial rounding. It mentions "currency conversions and price roundings."
"Rounding half to even" seems very possible. "Banker's rounding" has a certain familiarity to it. Also that it is "the default rounding mode used in IEEE 754 operations for results in binary floating-point formats."
I'm leaning toward the last one, but I really don't recall what rabbit hole I went down. _________________ Quis separabit? Quo animo? |
|
Back to top |
|
|
Zucca Moderator
Joined: 14 Jun 2007 Posts: 3768 Location: Rasi, Finland
|
Posted: Thu Aug 22, 2024 10:38 am Post subject: |
|
|
pjp wrote: | szatox wrote: | Anyway, AFAIR bc uses base10 for its calculations, and calculates with arbitrary precision. It basically calculates things the same way a human with a pen and paper would.
Apparently, it just truncated the output without properly rounding it, which kinda makes sense given the above. | I presumed most people were taught to round up/down*. It seems odd to me to just ignore "those other numbers." | So if bc rounds "incorrectly" but awk does it "correctly" and is pretty accurate (anyone know how to add more precision to awk's math operations?), then does the proposed base10 calculating really add up that much overhead? I mean performing around 5500 times slower is quite significant. _________________ ..: Zucca :..
My gentoo installs: | init=/sbin/openrc-init
-systemd -logind -elogind seatd |
Quote: | I am NaN! I am a man! |
|
|
Back to top |
|
|
lars_the_bear Guru
Joined: 05 Jun 2024 Posts: 537
|
Posted: Thu Aug 22, 2024 11:04 am Post subject: |
|
|
I think BC calculates powers by repeated multiplication. Since it's working to arbitrary precision, once you have a arbitrary-precision multiplication, you can get an arbitrary-precision power just by repeating it. I'm not sure what other way there is to get a true power to a fixed number of digits without internal rounding. In any event, bc seems to take twice as long if I make the power term twice as large; but I guess you'd need to look at the source code to be sure that's what's going on.
I believe gawk uses regular floating-point math, and (I would guess) calculates powers by multiplying logarithms. Such an operation would take, I guess, a fixed time. You can do the same thing with bc, and it's pretty fast:
Code: |
time echo -e "scale=20; e(l(1+ 1/100000)*100000)" | bc -l
2.71826823717448944150
real 0m0.002s
user 0m0.001s
sys 0m0.001s
|
BR, Lars. |
|
Back to top |
|
|
pjp Administrator
Joined: 16 Apr 2002 Posts: 20508
|
Posted: Thu Aug 22, 2024 4:06 pm Post subject: |
|
|
Zucca wrote: | So if bc rounds "incorrectly" but awk does it "correctly" and is pretty accurate (anyone know how to add more precision to awk's math operations?), then does the proposed base10 calculating really add up that much overhead? I mean performing around 5500 times slower is quite significant. | awk uses "%.6g" by default: Code: | $ time awk 'BEGIN {printf "%.20g\n", 1.00001^100000}'
2.7182682371975284141
real 0m0.002s
user 0m0.001s
sys 0m0.000s | And even that only does 19 decimal digits, so that's odd.
I would have guessed bc to be "better" since that's it's sole purpose. Apparently gawk can do 100bits of precision. Compared to lars' use of bc, awk's output differs notably after 10 digits: Code: | awk
2.7182682371 975284141
bc
2.7182682371 7448944150 |
And I think that issue is related to the rabbit hole I went down regarding rounding. _________________ Quis separabit? Quo animo? |
|
Back to top |
|
|
wanne32 n00b
Joined: 11 Nov 2023 Posts: 71
|
Posted: Fri Aug 23, 2024 2:34 pm Post subject: |
|
|
You can use maxima to calculate the exact number without rounding:
Code: | maxima --batch-string='(1/100000+1)^100000;' |
Or floor to 21 digits:
Code: | maxima --batch-string='floor((1/100000+1)^100000*100000000000000000000)/100000000000000000000;' |
So 2.71826823717448966803... would be right.
So bc with 20digits is only slightly better since it also seems to implement the power function on its own, rounding intermediate results.
acalc does a really decent job:
Code: | calc '1.00001^100000'
~2.71826823717448966804 |
While the glibc is blazing fast but has really bad precision:
Code: | #include <stdio.h>
#include <math.h>
int main()
{ printf("%0.20lf",pow(1.00001,100000.0)); } |
Code: | gcc -Wall a.c ; time ./a.out
2.71826823719229748733
real 0m0.001s
user 0m0.001s
sys 0m0.000s |
This affects more or less all programming languages,non-math programs like python/libreoffice calc, kcalc... |
|
Back to top |
|
|
sublogic Apprentice
Joined: 21 Mar 2022 Posts: 280 Location: Pennsylvania, USA
|
Posted: Sat Aug 24, 2024 12:21 am Post subject: |
|
|
wanne32 wrote: | While the glibc is blazing fast but has really bad precision:
Code: | #include <stdio.h>
#include <math.h>
int main()
{ printf("%0.20lf",pow(1.00001,100000.0)); } |
Code: | gcc -Wall a.c ; time ./a.out
2.71826823719229748733
real 0m0.001s
user 0m0.001s
sys 0m0.000s |
This affects more or less all programming languages,non-math programs like python/libreoffice calc, kcalc... |
There is already a loss of precision in the binary representation of 1.00001 . When taking a power of a number so close to 1 the log1p() function can be useful: Code: | #include <stdio.h>
#include <math.h>
int main(int argc, char *argv[])
{
printf("%0.20le", 1.00001-1);
puts("\t1.00001-1");
printf("%0.20lf", pow(1.00001, 100000.0));
puts("\t\tpow(1.00001, 100000.0)");
printf("%0.20lf", exp(log1p(.00001)*100000.0));
puts("\t\texp(log1p(.00001)*100000.0)");
exit(0);
} | Result: Code: | 1.00000000000655120402e-05 1.00001-1
2.71826823719229748733 pow(1.00001, 100000.0)
2.71826823717448995410 exp(log1p(.00001)*100000.0) | The log1p() version gets about 16 decimals right, which is machine precision for double precision floating-point. |
|
Back to top |
|
|
dmpogo Advocate
Joined: 02 Sep 2004 Posts: 3454 Location: Canada
|
Posted: Sat Aug 24, 2024 8:38 am Post subject: |
|
|
I remember when I was a graduate student in the 80-s, the younger brother of my classmate was writing a bachelors thesis ( kind of diploma thesis) in math department which was about algorithms and precision of computer evaluation of trigonometric functions .... It is all a non-trivial business when you work close to machine precision. |
|
Back to top |
|
|
wanne32 n00b
Joined: 11 Nov 2023 Posts: 71
|
Posted: Sat Aug 24, 2024 10:00 am Post subject: |
|
|
sublogic wrote: | Code: | 2.71826823717448995410 exp(log1p(.00001)*100000.0) |
| I see why that works. Being near zero is much more exact than near one. But I would never had the idea by myself. Respect for that. |
|
Back to top |
|
|
eccerr0r Watchman
Joined: 01 Jul 2004 Posts: 9841 Location: almost Mile High in the USA
|
Posted: Sat Aug 24, 2024 1:43 pm Post subject: |
|
|
Yes, bc is not using the FPU or even CPU register floating or fixed point math (like FDIV or ADDL instructions) because these arbitrary precision numbers can and will exceed 32 or 64-bit registers of the CPU. So it rather manually handle each digit as a character. Larger numbers with more digits will require more CPU time. Similar calculation times should be seen with using GMP.
Awk, perl, C, etc., etc. will handle numbers with CPU/FPU floating point routines and whatever's in glibc. They also make shortcuts but unless you're using a Pentium with the FDIV bug, were designed to be accurate within the precision (number of bits in a cpu register) asked and no more. Some floating point operations require Newton's method to calculate and thus iterative, but it takes roughly the same amount of time no matter how "large" the number is. Yes some operations even use look up tables but that's how it's done to take as little time (and silicon) as possible to do the math. _________________ Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching? |
|
Back to top |
|
|
|