-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DO-NOT-MERGE] .ef optimization experiments #12907
base: main
Are you sure you want to change the base?
Conversation
note: code quality is "hacky", pls do not consider for review yet, only for evaluation of the overall solution + validating the strategy |
now that the prototype 1 is out, let me rebase it to main as it is 20 days outdated... |
3ea3702
to
9e5031a
Compare
9e7c1b2
to
ae6c0bf
Compare
As part of #12907 I'll have other iterator-like sequence implementations. It makes sense to generalize the ErrEliasFanoIterExhausted and move it to a common package and reuse it, rather than making a bunch of IteratorExhausted-like for each implementation.
ae6c0bf
to
6d7f604
Compare
Current status:
Next steps:
|
@wmitsuda FYI: maybe next commands may help to test files |
928dc9d
to
b0c3db9
Compare
Current status:
so, we can conclude data is correct + size reduction is the expected one so far. |
from now on, I think we are ready to advance one more step towards making it stable:
|
f5dfa8c
to
b6db9b9
Compare
} | ||
|
||
// simple encoding | ||
if data[0]&0b11110000 == 0b10000000 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forgot to use the existing SimpleEncoding const... will also add one for the comparison mask
// from raw data | ||
func Count(baseNum uint64, data []byte) uint64 { | ||
// plain elias fano (legacy) | ||
if data[0]&0b10000000 == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats Simple Encoding constant value, probably zero bit could be just compared to 0b0 const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, because if the 8th bit is 0, meaning it is legacy elias fano encoding type, I'm using that entire byte as part of the following elias fano struct, so in theory the remaining 7 bits of data[0] will be used to decode elias fano.
anyway, will add more consts to clarify that comparison.
} | ||
|
||
// simple encoding | ||
if raw[0]&0b11110000 == 0b10000000 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whats that mask 0xf0? using const is better, also seems can use just 2 bits instead of 8, have something to encode up to 2^6? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because I'm reserving the range [0b10000000, 0b100001111] to SimpleEncoding. I'm using the it to encode at the same time the type (SimpleEncoding) + the number of elements (up to 16) in the same byte and save space.
So, the masking + comparison is saying: ignore the less significant 4 bits, if it is 0b1000????, then it is SimpleEncoding.
yeah, will add consts, forgot it :)
I didn't understand the part about 2^6?
it.Seek(uint64(v)) | ||
} | ||
return it | ||
} else if s.currentEnc == PlainEliasFano || s.currentEnc == RebasedEliasFano { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw since constants are growing could just do s.currentEnc < threshold. So maybe reorder them as PlainEF, RebasedEF, and then SimpleEnc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, problem here is that the consts are at the same time the type itself and the bit fragments that are used when encoding them, i.e., they are not 1, 2 and 3, but 0, 128 and 144.
I think I could do it, but I think the code now is clearer to read and understand?
Also, if we are going to merge it now, maybe when can even remove the plain elias-fano choice since it would not be used anymore in this part of code and have one less comparison.
Regarding comparing converted holesky vs. generated holesky (only snapshots dir), it seems now everything matches, except:
|
Given the overall results on holesky now seems satisfactory, now triggering a regeneration run on mainnet, should be done in < 1 week |
Status: mainnet sync from scratch still running after 1 week, at block 17.9M. note: mainnet sync performance noticeably drops after block 16M, going from +700Mgas/s -> 400-500Mgas/s. |
Status: mainnet sync from scratch done after 11 days, now proceeding to check data integrity. FYI: a few hours after reaching the tip I got: #13676 , but I think it has nothing to do with this change, so ignoring in this context. |
data seems good, diffs are similar to what I found on holesky. I have another node that is not converted, but new data is being generated in the new format, and new .ef files are equivalent to plain E3 ones. |
given our previous conversation about leaving this one post-E3 final, I'll keep merging main and running multiple rounds of sync from scratch on mainnet from now on in order to catch any regressions. |
Status:
in order to answer that question I conducted another experiment:
The expected result is:
if the above is true, then it means there are not errors introduced in other files by this PR and the currently published snapshots are not 100% correspondent to what should be produced by the current code (but it doesn't mean they are incorrect). the following comments will contain the result of this experiment for future reference. |
Results: (2) vs (3)
the attached diff shows the result is what we expected, hence validating our hypothesis this PR is correct. the only thing not expected was all the .kvei differs, but we'll assume it is not deterministic or it uses some seed/salt/etc we didn't properly configured at chain generation. |
Results: (1) vs (3)
noticed how many data files seems different, like regenerated .ef and .v files, making a direct comparison between published snapshots and regenerated ones unreliable, so I'll rely on (2) vs (3) to validate this PR. |
I'm taking this PR out of draft state in order to trigger the CI pipeline and because after the latest round of tests on mainnet I'm now confident about it's correctness. I'm leaving the do-not-merge tags because it is a one-way operation and that should be properly planned with the team. |
also, next step is merge recent main as current test was run against beta2. |
@AskAlexSharov I think I could use some preliminary review now if you have some time. We are not going to merge it yet, I've proposed an in-person session to discuss how the rollout will be coordinated for 3.1, but I think the code is good enough to be reviewed now so we don't have to rush last minute modifications later. |
commitment .kvei and .kvi - is index of .kv - also expected to be different |
If your PR changing .ef then .efi will also changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great.
left some cosmetic comments.
log.Println("Sumarizing idx files...") | ||
cEF := 0 | ||
for _, file := range files { | ||
if file.IsDir() || !strings.HasSuffix(file.Name(), ".ef") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if file.IsDir() || file.Type().IsRegular() ||`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: we have dir.ListFiles
helper
logger := debug.SetupCobra(cmd, "integration") | ||
|
||
// accessorDir := filepath.Join(datadirCli, "snapshots", "accessor") | ||
idxPath := filepath.Join(datadirCli, "snapshots", "idx") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: we have dirs := datadir.New(datadirCli)
and dirs
object will have all paths.
For example: ilepath.Join(datadirCli, "snapshots", "idx")
can replace by dirs.SnapIdx
|
||
import ( | ||
"io/fs" | ||
"log" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please use "github.com/erigontech/erigon-lib/log/v3"
package for logging.
can be used by log.Info("message", "some_key", "some_value", "another_key", "another_value")
.
and remove log
package use
return uint64(data[0]&SimpleEncodingSizeMask) + 1 | ||
} | ||
|
||
panic("unknown encoding") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's add a bit info to this message - to make it easier to debug in future
No description provided.