Commit ab3ae92

describe M1 and PGO
1 parent 7689b90 commit ab3ae92

1 file changed: +2 -0 lines changed

README.md

@@ -12,6 +12,8 @@ For an 80MB gzipped log file containing 915,427 JSON event objects (which is 1.0
This is... very good. For comparison, a Python script that used AWS Glue to do something similar took about _30 minutes_. My first approach of writing a `nom` parser-combinator to parse the User Agent field, instead of using a regex, took 18.7 seconds. Processing a gigabyte of almost a million JSON objects into useful histograms in less than 8 seconds just blows my mind. But then I figured out how to use Rayon, and now it can parse 8 gzipped log files in parallel on an 8-core MacBook Pro, and that's super fast.

+Then Rust got more optimized and Apple released the M1, and it got faster still. Finally, I found the [profile-guided optimization](https://doc.rust-lang.org/rustc/profile-guided-optimization.html) docs, and PGO improved the speed even more than I thought was still possible.
+
### Wait, _how_ fast?

~525 records/second/cpu in Python on Apache Spark in AWS Glue
