masm: added procedure to load exemption points #320

Closed

hackaugusto
Contributor

The generated procedure pre-computes the values at compile time and then finds the correct exemption points at runtime by doing a binary search on the runtime trace length.

The alternatives, computing the points at runtime or creating a lookup table, would also have worked, but both would be slightly slower.

The data is kept on the stack for (1) testability and (2) performance, since memory loads/stores are no longer necessary.
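
As a rough illustration of the approach described above, the following Rust sketch precomputes one entry per supported trace length and then binary-searches the entries by trace length. It is not the emitted MASM: the names `Exemptions`, `build_table`, and `lookup` are hypothetical, the table lives in a `Vec` rather than on the VM stack, and the sketch assumes the Goldilocks base field with 7 as its multiplicative generator and trace lengths from $2^3$ to $2^{32}$.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1 (assumed base field).
const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Modular multiplication through a 128-bit intermediate.
fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Square-and-multiply exponentiation in the field.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Hypothetical table entry: both exemption points for one trace length.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Exemptions {
    trace_len: u64,
    e1: u64, // g^(trace_len - 1)
    e0: u64, // g^(trace_len - 2)
}

/// "Compile-time" step: one entry per power of two from 2^3 to 2^32,
/// sorted by trace length so that it can be binary-searched.
fn build_table() -> Vec<Exemptions> {
    (3..=32u32)
        .map(|k| {
            let n = 1u64 << k;
            // Generator of the subgroup of order n (7 is an assumed field generator).
            let g = pow(7, (P - 1) / n);
            Exemptions { trace_len: n, e1: pow(g, n - 1), e0: pow(g, n - 2) }
        })
        .collect()
}

/// "Runtime" step: binary search on the trace length.
fn lookup(table: &[Exemptions], trace_len: u64) -> Option<Exemptions> {
    let (mut lo, mut hi) = (0usize, table.len());
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        if table[mid].trace_len == trace_len {
            return Some(table[mid]);
        }
        if table[mid].trace_len < trace_len {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    None
}
```

With 30 entries the search needs at most 5 comparisons, which is where the saving over a runtime exponentiation would come from.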

@hackaugusto
Contributor Author

Relevant previous comment: #316 (comment)

@bobbinth bobbinth changed the base branch from next to hacka-z-caching June 21, 2023 01:00
@bobbinth bobbinth changed the base branch from hacka-z-caching to next June 21, 2023 01:01
@bobbinth
Contributor

Relevant previous comment: #316 (comment)

Based on this discussion, should we update the approach? If I'm understanding things correctly, we'd need to do the following:

  1. Update code in stdlib of Miden VM to save $g$ to some well-known memory location.
  2. Use this $g$ here to compute the two exemption points: $g^{N-1} = \frac{1}{g}$ using a division, and $g^{N-2} = \frac{g^{N-1}}{g}$, i.e. $g^{N-1}$ multiplied by $\frac{1}{g}$.
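
For reference, both identities follow from $g^N = 1$, which holds because $g$ generates a multiplicative subgroup of order $N$ (the trace length):

$$
g^{N-1} = g^{N} \cdot g^{-1} = g^{-1}, \qquad g^{N-2} = g^{N-1} \cdot g^{-1} = \left(g^{-1}\right)^{2}.
$$

So a single inversion followed by a single multiplication is enough to obtain both points.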

@Al-Kindi-0
Collaborator

Al-Kindi-0 commented Jun 21, 2023

We can use the binary-search-based approach that @hackaugusto came up with to look up the LDE domain generator $g_l$. Given that, the trace domain generator (this should be double-checked, but I am quite sure) is just $g := g_l^8$. Given $g$ and the trace length $N$, we can get $g^{N-1}$ as $g^{-1}$ and $g^{N-2}$ as $g^{-1} \cdot g^{-1}$.
Can you confirm the range of exponents for $g_l$, i.e. the range of LDE domain sizes, @bobbinth? I think the upper bound is $2^{32}$ and the lower bound is $8$ times the minimal trace length (which I believe is $1024$), i.e. $2^{13}$.
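
One way to double-check the claim $g := g_l^8$ is a small Rust sketch like the one below. It assumes the Goldilocks base field with 7 as its multiplicative generator and a blow-up factor of 8; the function name `trace_generator` is hypothetical and does not come from the PR.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1 (assumed base field).
const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Modular multiplication through a 128-bit intermediate.
fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Square-and-multiply exponentiation in the field.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Trace-domain generator from the LDE-domain generator, assuming a
/// blow-up factor of 8: three squarings give g_l^8.
fn trace_generator(g_lde: u64) -> u64 {
    let g2 = mul(g_lde, g_lde);
    let g4 = mul(g2, g2);
    mul(g4, g4)
}
```

For a trace length $N$, if `g_lde` has order $8N$, then `trace_generator(g_lde)` should satisfy $g^N = 1$ and $g^{N/2} = -1$, so its order is exactly $N$.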

Collaborator

@Al-Kindi-0 Al-Kindi-0 left a comment

Looks great @hackaugusto. I like the binary-search-based approach, and I wonder if we can apply it to the LDE domain instead to gain even more efficiency. (The LDE, or low-degree extension, domain has size equal to the trace domain size times a blow-up factor of $2^3$; it also has an offset, which is not important here.)

/// This procedure handles two power using conditional drops, instead of control flow with if
/// statements, since the former is slightly faster for the small number of instructions used. The
/// emitted code assumes the trace_length is at the top of the stack, and afer executing it will
/// leave leave the stack as follows:
Collaborator

minor nit: leave (single)

Contributor Author

fixed


/// Generate code to push the exemptions point to the top of the stack.
///
/// This procedure handles two power using conditional drops, instead of control flow with if
Collaborator

minor nit: "... handles two powers"

Contributor Author

fixed

@@ -287,6 +365,140 @@ impl<'ast> CodeGenerator<'ast> {
Ok(())
}

/// Emits code for the procedure `get_exemptions_points`.
///
/// The generated procedure contains the precomputed exeption points to be used when computing
Collaborator

minor nit: exception

Collaborator

or exemption (first time I notice they are close in Hamming distance 😄)

Contributor Author

fixed

fn gen_get_exemptions_points(&mut self) -> Result<(), CodegenError> {
// Notes:
// - Computing the exemption points on the fly would require 1 exponentiation to find the
// root-of-unity from the two-adacity, followed by another exponetiation to compute the
Collaborator

minor nit: adicity

Contributor Author

fixed

Comment on lines +378 to +489
// - For the range from powers 3 to 32 there are 30 unique values, which requires 8 words
// of data. Storing the data to memory requires pushing the 4 elements of a word to the
Collaborator

I think the ranges need to be updated, either for the current approach or for the other proposed one.

@hackaugusto hackaugusto force-pushed the hacka-masm-load-exemption-points branch from f67b5ef to 8c98c00 Compare June 22, 2023 07:32
@hackaugusto
Contributor Author

hackaugusto commented Jun 22, 2023

@bobbinth @Al-Kindi-0 to clarify: is the idea to do the following?

  1. rename get_exemption_points to something like get_constants
  2. the procedure will be:
# Input: [trace_len, ...]
# Output: [g, g', e1, e0]
#  where:
#   g is the LDE generator
#   g' is the domain generator
#   e1 is g'^{trace_len-1}
#   e0 is g'^{trace_len-2}
proc.get_constants
  # binary search ...
end

@hackaugusto hackaugusto force-pushed the hacka-masm-load-exemption-points branch from 8c98c00 to 4ad8636 Compare June 22, 2023 16:23
@hackaugusto
Contributor Author

@bobbinth @Al-Kindi-0 does anything else need to be done on this PR? This procedure is a dependency for the rest of the divisor code.

@hackaugusto hackaugusto force-pushed the hacka-masm-load-exemption-points branch from 4ad8636 to 5172df4 Compare June 26, 2023 05:37
@Al-Kindi-0
Collaborator

@bobbinth @Al-Kindi-0 to clarify: is the idea to do the following?

1. rename `get_exemption_points` to something like `get_constants`

2. the procedure will be:
# Input: [trace_len, ...]
# Output: [g, g', e1, e0]
#  where:
#   g is the LDE generator
#   g' is the domain generator
#   e1 is g'^{trace_len-1}
#   e0 is g'^{trace_len-2}
proc.get_constants
  # binary search ...
end

Yes, it makes sense to me. The main work of the procedure is finding the LDE domain generator corresponding to the LDE domain using the binary search approach. Once that is found, we can get the other 3 values with multiplications and an inversion. The trace domain generator is just the LDE domain generator raised to the power $2^3$, and the exemption points are found by a simple inversion and a multiplication.
If I compute correctly, the total cost is the cost of the binary search plus $(3 + 1)$ multiplications plus $1$ inversion in the field.
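
The cost estimate above can be traced through a minimal Rust sketch of the proposed procedure, taking the LDE generator as already found by the binary search. The `Constants` struct and `get_constants` function are hypothetical stand-ins for the `[g, g', e1, e0]` stack layout, the field is assumed to be Goldilocks, and the inversion is done here with Fermat's little theorem, whereas a VM would presumably use a native inversion instruction.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1 (assumed base field).
const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Modular multiplication through a 128-bit intermediate.
fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Square-and-multiply exponentiation in the field.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Field inversion via Fermat's little theorem: a^(p-2) = a^(-1).
fn inv(a: u64) -> u64 {
    pow(a, P - 2)
}

/// Hypothetical mirror of the proposed `[g, g', e1, e0]` output.
struct Constants {
    g_lde: u64,   // LDE domain generator (input, found by the binary search)
    g_trace: u64, // trace domain generator, g_lde^8
    e1: u64,      // g_trace^(trace_len - 1)
    e0: u64,      // g_trace^(trace_len - 2)
}

fn get_constants(g_lde: u64) -> Constants {
    // 3 multiplications (squarings): g' = g^8.
    let g2 = mul(g_lde, g_lde);
    let g4 = mul(g2, g2);
    let g_trace = mul(g4, g4);
    // 1 inversion: e1 = g'^(-1) = g'^(N-1).
    let e1 = inv(g_trace);
    // 1 multiplication: e0 = g'^(-2) = g'^(N-2).
    let e0 = mul(e1, e1);
    Constants { g_lde, g_trace, e1, e0 }
}
```

Counting the calls in `get_constants` gives the same tally as above: $3 + 1$ multiplications and $1$ inversion, on top of the search itself.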

@bobbinth
Contributor

  1. rename get_exemption_points to something like get_constants
  2. the procedure will be:
# Input: [trace_len, ...]
# Output: [g, g', e1, e0]
#  where:
#   g is the LDE generator
#   g' is the domain generator
#   e1 is g'^{trace_len-1}
#   e0 is g'^{trace_len-2}
proc.get_constants
  # binary search ...
end

I guess one question for me is whether there is a significant benefit to doing the binary search. If we are saving something like a dozen cycles or so, I'd still probably prefer a simple exponentiation because it makes the output much easier to reason about.

@hackaugusto
Contributor Author

hackaugusto commented Jun 26, 2023

If we are saving something like a dozen cycles or so

It is probably less than 100 cycles. Changing to the exponentiation approach.
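
A minimal Rust sketch of what the exponentiation approach could look like, under stated assumptions: the Goldilocks base field with two-adicity 32 and multiplicative generator 7, and a power-of-two trace length. The function names `root_of_unity` and `exemption_points` are hypothetical, and generated MASM would likely push the $2^{32}$-th root of unity as a constant instead of recomputing it.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1 (assumed base field).
const P: u64 = 0xFFFF_FFFF_0000_0001;
/// Largest k such that 2^k divides p - 1.
const TWO_ADICITY: u32 = 32;

/// Modular multiplication through a 128-bit intermediate.
fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// Square-and-multiply exponentiation in the field.
fn pow(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = mul(acc, base);
        }
        base = mul(base, base);
        exp >>= 1;
    }
    acc
}

/// Generator of the subgroup of order `trace_len`, obtained by squaring the
/// 2^32-th root of unity `32 - log2(trace_len)` times.
fn root_of_unity(trace_len: u64) -> u64 {
    assert!(trace_len.is_power_of_two() && trace_len <= 1u64 << TWO_ADICITY);
    let log_n = trace_len.trailing_zeros();
    // 2^32-th root of unity, derived from the assumed field generator 7.
    let mut g = pow(7, (P - 1) >> TWO_ADICITY);
    for _ in 0..(TWO_ADICITY - log_n) {
        g = mul(g, g);
    }
    g
}

/// Exemption points (g^(N-1), g^(N-2)) via one inversion and one multiplication.
fn exemption_points(trace_len: u64) -> (u64, u64) {
    let g = root_of_unity(trace_len);
    let e1 = pow(g, P - 2); // g^(-1) by Fermat's little theorem
    let e0 = mul(e1, e1);
    (e1, e0)
}
```

For example, `exemption_points(1 << 10)` squares the root 22 times before the inversion; the number of squarings depends only on the trace length.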
