## Overloaded arithmetic operators

### basic arithmetic

```python
z = x + 1
z = [1] - x
z = x * np.random.rand(3)
z = numfi([1,0,0.1234],1,21,15) / x
```
For the basic arithmetic operators of a numfi object `x`, if the second operand `y` is not numfi, it is first converted to numfi using `x` as a template, and then the arithmetic proceeds. The operation is carried out as 64-bit floating-point arithmetic via `numfi.double`, and the result is converted back to numfi with the proper s/w/f (the new s/w/f follows the rules below), i.e.:
```python
y = x + 1 <==> y = numfi(x.double + numfi(1,like=x).double, s, w, f)
```
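In plain Python (no numfi), the convert-back step can be sketched roughly like this; `quantize` is a hypothetical helper assuming round-to-nearest and saturation, which are only two of the possible `RoundingMethod`/`OverflowAction` settings:

```python
def quantize(v, s, w, f):
    """Quantize a real value to s/w/f fixed point with round-to-nearest
    and saturation (a simplified model of numfi's convert-back step)."""
    lo = -2 ** (w - 1) if s else 0              # smallest stored integer
    hi = 2 ** (w - 1) - 1 if s else 2 ** w - 1  # largest stored integer
    i = min(hi, max(lo, round(v * 2 ** f)))     # scale, round, saturate
    return i / 2 ** f

x = quantize(0.1234, 1, 16, 15)   # x stored as s16/15
y = quantize(x + 1.0, 1, 17, 15)  # x + 1 done in double, converted back (s17/15 per the rules below)
```

Here `x + 1.0` is exact in float64, so only the final quantization step can change the value.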
If either operand `x` or `y` has `FullPrecision=False`, the result keeps the old s/w/f unchanged; otherwise it follows the new s/w/f rules below. (In either case, `RoundingMethod` and `OverflowAction` are kept unchanged.)
### New s/w/f rules

#### Signed/Unsigned

```python
s = 1 if x.s or y.s else 0
```

The answer is unsigned (s=0) if and only if every operand is unsigned; otherwise the answer is signed.

#### ADD/SUB

```python
f = max(x.f, y.f)
w = max(x.i, y.i) + f + s + (1 if (x.s==y.s) else 2) # need extra bit to convert unsigned to signed
```

#### MUL

```python
w = x.w + y.w
f = x.f + y.f
```

#### DIV

```python
w = max(x.w, y.w)
f = x.f - y.f
```
Division in fixed-point arithmetic has many different definitions; here we use the same definition as Matlab. Other division definitions will be added in a future version.
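The rules above can be collected into one small helper. This is a sketch, not numfi's implementation, and it assumes the integer-bit count is `x.i == x.w - x.f - x.s` (sign bit excluded):

```python
def result_swf(op, xs, xw, xf, ys, yw, yf):
    """New (s, w, f) for full-precision arithmetic, following the rules
    above. Assumes integer bits i = w - f - s (sign bit excluded)."""
    s = 1 if xs or ys else 0
    if op in ('add', 'sub'):
        f = max(xf, yf)
        xi, yi = xw - xf - xs, yw - yf - ys          # integer bits of each operand
        w = max(xi, yi) + f + s + (1 if xs == ys else 2)
    elif op == 'mul':
        w, f = xw + yw, xf + yf
    elif op == 'div':
        w, f = max(xw, yw), xf - yf
    else:
        raise ValueError(f"unknown op: {op}")
    return s, w, f
```

For example, adding two s16/15 values yields s17/15, and multiplying s16/8 by u16/10 yields s32/18.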
### other arithmetic

```python
y = -x <==> numfi(-x.double, like=x)
y = x ** 0.5 <==> numfi(x.double ** 0.5, like=x)
y = x % 3 <==> numfi(x.double % 3, like=x)
```

Arithmetic operators other than add/sub/mul/div use `numfi.double` as the operand.
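As a rough plain-Python sketch (the `q` helper below is hypothetical, modeling an s16/15 operand with round-to-nearest and saturation), `x ** 0.5` is evaluated in double and then re-quantized with the operand's own format, as `like=x` implies:

```python
F = 15                                                   # fraction bits of an s16/15 operand
q = lambda v: max(-2**15, min(2**15 - 1, round(v * 2**F))) / 2**F

x = q(0.5)          # 0.5 is exactly representable in s16/15
y = q(x ** 0.5)     # sqrt computed in floating point, result re-quantized like x
```

The result differs from the true square root by at most half a least significant bit of the s16/15 format.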
### bitwise operation

```python
y = ~x <==> numfi((~x.int) * x.precision, like=x)
y = x & 0b101 <==> numfi((x.int & 0b101) * x.precision, like=x)
y = x | 0b100 <==> numfi((x.int | 0b100) * x.precision, like=x)
y = x ^ 0b001 <==> numfi((x.int ^ 0b001) * x.precision, like=x)
y = x << 4 <==> numfi((x.int << 4) * x.precision, like=x)
y = x >> 2 <==> numfi((x.int >> 2) * x.precision, like=x)
```

Bitwise operators use `numfi.int` as the operand, then convert the result back to a floating-point value and finally to a numfi object.
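For example (plain Python, no numfi; assuming an s16/15 format where `x.precision == 2**-15` and `x.int` is the stored integer):

```python
PREC = 2 ** -15                      # x.precision for an s16/15 value

def to_int(v):                       # stored-integer view, i.e. x.int
    return round(v / PREC)

x = 7 * PREC                         # stored integer is 7 (0b111)
y_and = (to_int(x) & 0b101) * PREC   # x & 0b101 -> stored integer 0b101
y_shl = (to_int(x) << 4) * PREC      # x << 4    -> stored integer 112
```

The mask and shift act on the raw stored bits, so shifting left by 4 multiplies the real value by 16.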
### logical comparison

```python
y = x > 0.5 <==> x.double > 0.5
y = x >= 0.5 <==> x.double >= 0.5
y = x == x <==> x.double == x
y = x != [0,0,0] <==> x.double != [0,0,0]
y = x <= np.ones(3) <==> x.double <= np.ones(3)
y = x < [1,1,1] <==> x.double < [1,1,1]
```

Logical comparison operators use `numfi.double` as the operand; the result is a `numpy.ndarray(dtype=bool)`.
### numpy.ndarray ufunc functions

```python
y = np.sin(x) <==> numfi(np.sin(x.double))
y = np.sum(x) <==> numfi(np.sum(x.double))
y = x.mean() <==> numfi(x.double.mean())
# y.s == x.s, y.w == x.w
# y.f == numfi.get_best_precision(y, x.s, x.w) if x.FullPrecision else x.f
```

ufunc functions use `numfi.double` as the operand, then convert the result back to a numfi object. If `FullPrecision=True`, the best fraction precision is found from the floating-point result; if not, the result keeps the same s/w/f as the operand.

Note that this new s/w/f behavior is different from Matlab's, since in Matlab each ufunc has its own embedded `@fi` version, which is not possible in python/numpy.
Example: `np.cos(numfi(0))` is s16/14 and equals 1 exactly, which needs two integer bits to represent (remember the upper limit with one integer bit is `1 - 2**-f`, not 1). But in Matlab, `cos(fi(0))` is s16/15 and equals 0.999969482421875, because Matlab's `cos` function (`R2023a\toolbox\fixedpoint\fixedpoint\+embedded\@fi\cos.m`) explicitly sets the answer to s16/15. This makes sense since `-1 <= cos <= 1`, but sacrifices some precision when the answer is exactly 1.
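The s16/14 result can be reproduced with a small sketch of the best-precision search (a hypothetical stand-in for `numfi.get_best_precision`; it simply picks the largest fraction length that does not overflow):

```python
def best_fraction_length(v, s, w):
    """Largest fraction length f such that v fits in an s/w/f format
    without overflow (simplified: searches f from w down to 0)."""
    hi = 2 ** (w - 1) - 1 if s else 2 ** w - 1   # largest stored integer
    lo = -2 ** (w - 1) if s else 0               # smallest stored integer
    for f in range(w, -1, -1):                   # prefer the most fraction bits
        if lo <= round(v * 2 ** f) <= hi:
            return f
    raise OverflowError(f"{v} does not fit in {w} bits")

# cos(0) == 1.0 needs f = 14 in a signed 16-bit word -> s16/14,
# while 1 - 2**-15 still fits with f = 15 -> s16/15
```

With f = 15 the largest representable value is 32767/32768, just below 1, which is why the fraction length must drop to 14 to hold 1 exactly.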
## How numfi does fixed-point arithmetic

In theory, to accurately simulate fixed-point arithmetic bit by bit, we should store integers in memory and perform the fixed-point arithmetic in the integer domain. However, there are some problems with this approach:

- There may be extra steps during arithmetic. For example, to compute `a(s16/8) + b(s16/10)` in the integer domain, we must first left shift `a` by 2 to align the radix points.
- It is not compatible with other functions like `np.sin(x)`/`plt.plot(x)`/etc. They will use the data in memory, which is the integer representation instead of the 'real-value' floating-point data. We would have to write extra code like `np.sin(x.double)` to get the correct answer, which not only makes porting code between floating and fixed point harder, but also introduces extra computation.
numfi uses 'real-value' `np.float64` as the underlying data type instead of integers, so numfi's 'fixed-point arithmetic' actually happens in the floating-point domain with some quantization/overflow control. Since it is essentially floating-point arithmetic, under some extreme conditions the result may not be exactly the same as real fixed-point arithmetic using an integer representation (due to the precision limit of the floating-point data type).
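For instance, `float64` has a 53-bit significand, so a stored integer wider than 53 bits cannot survive the trip through the floating-point domain (the numbers below are illustrative, not taken from numfi):

```python
i = 2 ** 53 + 1              # a stored integer that needs 54 bits
v = float(i) * 2.0 ** -20    # 'real-value' storage, as in a hypothetical 54-bit format with f=20
back = round(v * 2 ** 20)    # round-trip back to the integer domain

# the low bit was already lost when i was converted to float64
```

For word lengths of 53 bits or fewer, every stored integer is exact in float64 and this problem cannot occur.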
Speed and compatibility are the reasons why numfi uses floating-point arithmetic to 'simulate' fixed-point arithmetic. Unlike on embedded chips, on a modern desktop CPU there is no significant performance difference between floating-point and integer arithmetic. And as a `numpy.ndarray` subclass, a numfi object using 'real-value' floating-point data can be used in any floating-point algorithm flawlessly, taking advantage of python's dynamic typing.