Channel: Arbitrary large unsigned integers - Code Review Stack Exchange

Answer by Quuxplusone for Arbitrary large unsigned integers


Looks like fairly clean code, but I see plenty of places it could be improved. Letting my eyes skim over it and stop on the weird bits...

hugeint *hugeint_ladd_cutoverflow(size_t n, const hugeint *const *summands,
        unsigned int *residue);
hugeint *hugeint_ladd(size_t n, const hugeint *const *summands);
hugeint *hugeint_vadd(size_t n, va_list ap);
hugeint *hugeint_add(size_t n, ...);

This is weird. You have a sane-looking sub function, and a sane-looking mult function —

hugeint *hugeint_sub(const hugeint *self, const hugeint *diff);
hugeint *hugeint_mult(const hugeint *hi, const hugeint *factor);

— yet somehow when it comes to the simplest possible operation, add, you have all these weird helper functions and no way to just add two numbers? I'm looking for

hugeint *hugeint_add(const hugeint *a, const hugeint *b);

Having to write that function signature made me notice that your sub function's parameters are named self and diff, which makes no sense to me. self is a name usually reserved for the "this pointer" in OO style, so from the signature of this function I'd be expecting the equivalent of

auto hugeint_sub(auto self, auto diff) {
    self -= diff;
    return self;
}

But that's not what the function does — nor what I'd expected it to do before looking at the parameter names. So basically, your parameter names are misleading. The English names for them would be minuend and subtrahend, but personally I'd go with a and b. I think everyone knows what sub(a, b) means. :)


So let's look at that weird N-argument add function. Does it work?

hugeint *hugeint_ladd(size_t n, const hugeint *const *summands)
{
    unsigned int res;
    hugeint *result = hugeint_ladd_cutoverflow(n, summands, &res);
    if (res)
    {
        size_t i = result->n;
        result = hugeint_expand(result);
        result->e[i] = res;
    }
    return result;
}

We can tell instantly that it can't possibly work, because it's got this thing called residue that's basically a "carry flag" for the addition. The reason we know it can't be right is that this carry flag doesn't have enough bits to store an arbitrarily large number of carries! Try adding together 4294967297 copies of a two-element hugeint whose elements are both 4294967295 and see what happens. One bit of carry is only enough for 2 addends, and 32 bits of carry is only enough for 4294967296 addends. You have three ways out: use a hugeint's worth of carry; or get rid of this whole "carry" business and expand the result as you go; or get rid of the notion that it's a good idea to add arbitrarily many addends and just make a hugeint_add(a, b) that does the obvious thing.
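Here's roughly what I mean by "the obvious thing". This is only a sketch: I'm assuming a layout with a length n followed by a flexible array e of 32-bit units stored least significant first, which may not match your actual struct, and I've swapped unsigned int for uint32_t so the widths are explicit. With exactly two addends the carry out of each position is 0 or 1, so a single 64-bit accumulator is enough.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout (hypothetical, for illustration only). */
typedef struct {
    size_t n;        /* number of units stored in e */
    uint32_t e[];    /* units, least significant first */
} hugeint;

/* Returns a newly allocated a + b, or NULL if allocation fails. */
hugeint *hugeint_add(const hugeint *a, const hugeint *b)
{
    assert(a != NULL && b != NULL);
    size_t longer = a->n > b->n ? a->n : b->n;

    /* One extra unit always suffices for the final carry of a 2-operand add. */
    hugeint *r = calloc(1, sizeof *r + (longer + 1) * sizeof r->e[0]);
    if (r == NULL) return NULL;

    uint64_t carry = 0;
    for (size_t i = 0; i < longer; ++i) {
        uint64_t sum = carry;
        if (i < a->n) sum += a->e[i];
        if (i < b->n) sum += b->e[i];
        r->e[i] = (uint32_t)sum;   /* low 32 bits */
        carry = sum >> 32;         /* 0 or 1 */
    }
    r->e[longer] = (uint32_t)carry;
    r->n = longer + (carry != 0);  /* drop the extra unit if it stayed zero */
    return r;
}
```

If you still want the N-ary version, it can be a loop over this function; the result then grows as it goes, and no fixed-width residue is needed.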


Speaking of trouble with carries, let's look at the shift-left and shift-right functions. They're interesting because you're using hugeint for the shift count itself, which is unlike any programming language or instruction set I'm aware of. The shift count can only sanely get as high as the log of the left-hand operand, which is to say, it had better fit into size_t. But okay, let's look at right-shift and find the place where you do "If the right-hand operand is bigger than 4 billion, just set the result to 0 and return"...

    if (count) hugeint_decrement(&count);
} while (count && !hugeint_isZero(count));

...Oh dear. You should definitely give this one a rewrite. At minimum, it should know that bit-shifting by a multiple of 8 is equivalent to a single memmove, and bit-shifting by any other number can be reduced to a memmove plus 1–7 loop iterations.
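A sketch of the shape I have in mind, under some assumptions: the units are 32 bits wide and stored least significant first, the shift count is a plain size_t, and the name limbs_shift_right is made up for this example. I do the memmove in whole units rather than bytes, because a byte-granular memmove only works if the machine is little-endian; the leftover 1 to 31 bits are then handled in a single pass over the array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define UNIT_BITS 32

/* Shifts the n-unit array e (least significant unit first) right by
   `count` bits, in place. Hypothetical helper, not part of the reviewed API. */
void limbs_shift_right(uint32_t *e, size_t n, size_t count)
{
    assert(e != NULL);
    size_t units = count / UNIT_BITS;             /* whole-unit part */
    unsigned bits = (unsigned)(count % UNIT_BITS); /* leftover 0..31 bits */

    if (units >= n) {             /* everything is shifted out */
        memset(e, 0, n * sizeof *e);
        return;
    }
    if (units > 0) {
        /* Whole-unit shift: one memmove, then zero the vacated top units. */
        memmove(e, e + units, (n - units) * sizeof *e);
        memset(e + (n - units), 0, units * sizeof *e);
    }
    if (bits != 0) {
        /* Sub-unit shift: pull the low bits of the next unit down. */
        for (size_t i = 0; i + 1 < n; ++i) {
            e[i] = (uint32_t)((e[i] >> bits) | (e[i + 1] << (UNIT_BITS - bits)));
        }
        e[n - 1] >>= bits;
    }
}
```

The cost is proportional to the length of the array, not to the value of the shift count, and the "count is huge" case falls out of the first test for free.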


Speaking of bit-widths, why are you using unsigned int as your unit data type instead of uint64_t or __uint128_t (a GCC/Clang extension, not standard C)? I guarantee that on a 64-bit machine uint64_t would be faster for most of what you're doing.


The other confusing thing about that right-shift code is:

if (positions) {
    if (hugeint_isZero(positions)) return result;
    count = hugeint_clone(positions);
} else {
    count = 0;
}

Here we're assigning 0 (an int) to count (a pointer to hugeint). This isn't going to do what we wanted! ...Or rather, it is, but only via a very bizarre path. Here 0 means NULL, which, because of the if (count) elsewhere in the function, means "do only 1 iteration". So this 0 semantically means 1!

The ultimate effect of all this spaghetti code is to make it so that if the user accidentally passes in a null pointer —

hugeint *result = hugeint_shiftright(a, NULL);

— then his operand will be shifted by 1. If this behavior is desirable, it should be provided as a separate function:

hugeint *result = hugeint_shiftright_by_one(a);

Providing two separate functions for the two separate functionalities is not only good for the reader's sanity, it's also going to be more efficient, because you won't have all those ifs confusing the CPU's branch predictor.


Getting back to add, I see this hugeint_expand function that you're calling all over the place. It doubles the length of the underlying array, which is way overkill. What you should be doing is computing the correct length for the array and then allocating exactly that much: that is, not

for (i = 0; i < n; ++i) {
    while (summands[i]->n > result->n) {
        result = hugeint_expand(result);
    }
}

but rather

size_t result_n = result->n;
for (size_t i = 0; i < n; ++i) {
    /* standard C has no max(), so compare by hand */
    if (summands[i]->n > result_n) {
        result_n = summands[i]->n;
    }
}
result = hugeint_expand(result, result_n);

That is, hugeint_expand needs to take a parameter that tells it the size of the array you want. Compare this API to std::vector::reserve in C++; it should look very very similar.
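For illustration, a sized version could look something like the following. I'm guessing at the struct layout again (a length n plus a flexible array e of 32-bit units), and I've called it hugeint_reserve to keep it distinct from your existing hugeint_expand; both the name and the layout are assumptions on my part.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed layout (hypothetical, for illustration only). */
typedef struct {
    size_t n;        /* number of units stored in e */
    uint32_t e[];    /* units, least significant first */
} hugeint;

/* Grows hi to exactly new_n units, zero-filling the new high units.
   Returns the (possibly moved) object, or NULL on failure, in which
   case hi is untouched and still owned by the caller. */
hugeint *hugeint_reserve(hugeint *hi, size_t new_n)
{
    assert(hi != NULL);
    if (new_n <= hi->n) return hi;    /* already big enough */

    /* Refuse sizes whose byte count would overflow size_t. */
    if (new_n > (SIZE_MAX - sizeof *hi) / sizeof hi->e[0]) return NULL;

    hugeint *grown = realloc(hi, sizeof *hi + new_n * sizeof hi->e[0]);
    if (grown == NULL) return NULL;

    memset(grown->e + grown->n, 0, (new_n - grown->n) * sizeof grown->e[0]);
    grown->n = new_n;
    return grown;
}
```

Since this struct has no separate capacity field, the sketch changes n itself, which makes it behave more like std::vector::resize than reserve. If you add a capacity field, you can keep n as the logical length and only touch the capacity here.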


There's definitely more to critique, but this should give you some ideas, anyway.

