| www.delorie.com/gnu/docs/gmp/gmp_59.html | search |
![]() Buy GNU books! | |
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This chapter describes low-level GMP functions, used to implement the high-level GMP functions, but also intended for time-critical user code.
These functions start with the prefix mpn_.
The mpn functions are designed to be as fast as possible, not
to provide a coherent calling interface. The different functions have somewhat
similar interfaces, but there are variations that make them hard to use. These
functions do as little as possible apart from the real multiple precision
computation, so that no time is spent on things that not all callers need.
A source operand is specified by a pointer to the least significant limb and a limb count. A destination operand is specified by just a pointer. It is the responsibility of the caller to ensure that the destination has enough space for storing the result.
With this way of specifying operands, it is possible to perform computations on subranges of an argument, and store the result into a subrange of a destination.
A common requirement for all functions is that each source area needs at least one limb. No size argument may be zero. Unless otherwise stated, in-place operations are allowed where source and destination are the same, but not where they only partly overlap.
The mpn functions are the base for the implementation of the
mpz_, mpf_, and mpq_ functions.
This example adds the number beginning at s1p and the number beginning at s2p and writes the sum at destp. All areas have n limbs.
cy = mpn_add_n (destp, s1p, s2p, n) |
In the notation used here, a source operand is identified by the pointer to the least significant limb, and the limb count in braces. For example, {s1p, s1n}.
This is the lowest-level function for addition. It is the preferred function
for addition, since it is written in assembly for most CPUs. For addition of
a variable to itself (i.e., s1p equals s2p, use mpn_lshift
with a count of 1 for optimal speed.
This function requires that s1n is greater than or equal to s2n.
This is the lowest-level function for subtraction. It is the preferred function for subtraction, since it is written in assembly for most CPUs.
This function requires that s1n is greater than or equal to s2n.
The destination has to have space for 2*n limbs, even if the product's most significant limb is zero.
This is a low-level function that is a building block for general multiplication as well as other operations in GMP. It is written in assembly for most CPUs.
Don't call this function if s2limb is a power of 2; use mpn_lshift
with a count equal to the logarithm of s2limb instead, for optimal speed.
This is a low-level function that is a building block for general multiplication as well as other operations in GMP. It is written in assembly for most CPUs.
This is a low-level function that is a building block for general multiplication and division as well as other operations in GMP. It is written in assembly for most CPUs.
The destination has to have space for s1n + s2n limbs, even if the result might be one limb smaller.
This function requires that s1n is greater than or equal to s2n. The destination must be distinct from both input operands.
No overlap is permitted between arguments. nn must be greater than or equal to dn. The most significant limb of dp must be non-zero. The qxn operand must be zero.
mpn_tdiv_qr instead for best
performance.]
Divide {rs2p, rs2n} by {s3p, s3n}, and write the quotient at r1p, with the exception of the most significant limb, which is returned. The remainder replaces the dividend at rs2p; it will be s3n limbs long (i.e., as many limbs as the divisor).
In addition to an integer quotient, qxn fraction limbs are developed, and stored after the integral limbs. For most usages, qxn will be zero.
It is required that rs2n is greater than or equal to s3n. It is required that the most significant bit of the divisor is set.
If the quotient is not needed, pass rs2p + s3n as r1p. Aside from that special case, no overlap between arguments is permitted.
Return the most significant limb of the quotient, either 0 or 1.
The area at r1p needs to be rs2n - s3n + qxn limbs large.
The integer quotient is written to {r1p+qxn, s2n} and in addition qxn fraction limbs are developed and written to {r1p, qxn}. Either or both s2n and qxn can be zero. For most usages, qxn will be zero.
mpn_divmod_1 exists for upward source compatibility and is simply a
macro calling mpn_divrem_1 with a qxn of 0.
The areas at r1p and s2p have to be identical or completely separate, not partially overlapping.
mpn_tdiv_qr instead for best
performance.]
mpn_divexact_by3c takes an initial carry parameter, which can be the
return value from a previous call, so a large calculation can be done piece by
piece from low to high. mpn_divexact_by3 is simply a macro calling
mpn_divexact_by3c with a 0 carry parameter.
These routines use a multiply-by-inverse and will be faster than
mpn_divrem_1 on CPUs with fast multiplication but slow division.
The source a, result q, size n, initial carry i,
and return value c satisfy c*b^n + a-i = 3*q, where
\N\}, b=2^mp_bits_per_limb}. The
return c is always 0, 1 or 2, and the initial carry i must also
be 0, 1 or 2 (these are both borrows really). When c=0 clearly
q=(a-i)/3. When c!=0, the remainder (a-i) mod
3 is given by 3-c, because b == 1 mod 3 (when
mp_bits_per_limb is even, which is always so currently).
mp\_bits\_per\_limb limbs of q =
{s1p, s1n}/{s2p, s2n} mod 2^d at
rp, and returns the high d mod mp_bits_per_limb bits of
q.
{s1p, s1n} - q * {s2p, s2n} mod \N\{2
\GMPraise{s1n*mp\_bits\_per\_limb},
2^(s1n*mp\_bits\_per\_limb)} is placed at s1p. Since the
low floor(d)/mp\_bits\_per\_limb limbs of this
difference are zero, it is possible to overwrite the low limbs at s1p
with this difference, provided rp <= s1p.
This function requires that s1n * mp\_bits\_per\_limb
>= D, and that {s2p, s2n} is odd.
This interface is preliminary. It might change incompatibly in future revisions.
count must be in the range 1 to mp_bits_per_limb-1. The
regions {sp, n} and {rp, n} may overlap, provided
rp >= sp.
This function is written in assembly for most CPUs.
count must be in the range 1 to mp_bits_per_limb-1. The
regions {sp, n} and {rp, n} may overlap, provided
rp <= sp.
This function is written in assembly for most CPUs.
{s1p, s1n} must have at least as many bits as {s2p, s2n}. {s2p, s2n} must be odd. Both operands must have non-zero most significant limbs. No overlap is permitted between {s1p, s1n} and {s2p, s2n}.
{s1p, s1n} >= {s2p, s2n} is required, and both must be non-zero. The regions {s1p, s1n+1} and {s2p, s2n+1} are destroyed (i.e. the operands plus an extra limb past the end of each).
The cofactor r1 will satisfy r2*s1 + k*s2 = r1. The second cofactor k is not calculated but can easily be obtained from (r1 - r2*s1) / s2.
The most significant limb of {sp, n} must be non-zero. The areas {r1p, ceil(n)/2} and {sp, n} must be completely separate. The areas {r2p, n} and {sp, n} must be either identical or completely separate.
If the remainder is not wanted then r2p can be NULL, and in this
case the return value is zero or non-zero according to whether the remainder
would have been zero or non-zero.
A return value of zero indicates a perfect square. See also
mpz_perfect_square_p.
The most significant limb of the input {s1p, s1n} must be non-zero. The input {s1p, s1n} is clobbered, except when base is a power of 2, in which case it's unchanged.
The area at str has to have space for the largest possible number represented by a s1n long limb array, plus one extra character.
str[0] is the most significant byte and str[strsize-1] is the least significant. Each byte should be a value in the range 0 to base-1, not an ASCII character. base can vary from 2 to 256.
The return value is the number of limbs written to rp. If the most significant input byte is non-zero then the high limb at rp will be non-zero, and only that exact number of limbs will be required there.
If the most significant input byte is zero then there may be high zero limbs written to rp and included in the return value.
strsize must be at least 1, and no overlap is permitted between {str,strsize} and the result at rp.
It is required that there be a clear bit within the area at s1p at or beyond bit position bit, so that the function has something to return.
It is required that there be a set bit within the area at s1p at or beyond bit position bit, so that the function has something to return.
mpn_random generates
uniformly distributed limb data, mpn_random2 generates long strings of
zeros and ones in the binary representation.
mpn_random2 is intended for testing the correctness of the mpn
routines.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
| webmaster donations bookstore | delorie software privacy |
| Copyright © 2003 by The Free Software Foundation | Updated Jun 2003 |