String::remove_matches O(n^2) -> O(n) #83515

tamird · 2021-03-26T14:54:00Z

Copy only non-matching bytes. Replace collection of matches into a
vector with iteration over rejections, exploiting the guarantee that we
only mutate parts of the haystack that have already been searched over.

r? @joshtriplett

tamird · 2021-03-29T16:53:32Z

cc @jcotton42 @pickfire

jcotton42 · 2021-03-29T16:55:37Z

Clever. Not sure why I didn't think of that.

library/alloc/src/string.rs

pickfire

Other than the mentioned comments, it looks good to me, but I'm not sure it improves performance. Maybe we should do a perf run?

tamird · 2021-03-29T17:19:21Z

Not sure what a perf run would do - does this function have good benchmark coverage? Seems like no and since it's new it's unlikely to show up in any other perf paths.

pickfire · 2021-03-31T03:04:50Z

> Not sure what a perf run would do - does this function have good benchmark coverage? Seems like no and since it's new it's unlikely to show up in any other perf paths.

No, it's not about benchmarks. It does have benchmarks, but a perf run is to check whether this affects compile time and the runtime of generated code.

Dylan-DPC-zz · 2021-04-19T09:52:29Z

r? @m-ou-se

m-ou-se · 2021-05-05T14:09:39Z

After this change, all bytes are still copied/moved several times (exactly the number of matches before them). With 100 matches, the last segment gets moved 100 times.

You cannot efficiently do this in reverse: The last segment cannot be put into the right place right away, because its new place still holds data that needs to be moved. Doing this in order starting at the first match means that every segment can directly be moved into the right position. To do so, you'd have to look at the start of the next match to find the end of the segment to move.
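The forward-order idea described above can be sketched as follows. This is a minimal illustration in safe Rust, not the code from this PR: the function name `remove_all` is hypothetical, it works on a copy of the bytes rather than in place on the `String`, and it collects the match positions first for simplicity. Each kept segment ends where the next match starts and is moved exactly once, straight to its final position.

```rust
/// Hypothetical helper (not the standard library implementation):
/// removes every occurrence of `needle`, moving each kept segment once.
fn remove_all(haystack: &str, needle: &str) -> String {
    if needle.is_empty() {
        return haystack.to_owned();
    }
    // Find the start of every (non-overlapping) match up front.
    let starts: Vec<usize> = haystack.match_indices(needle).map(|(i, _)| i).collect();

    let mut bytes = haystack.as_bytes().to_vec();
    let mut write = 0; // next free position in the compacted prefix
    let mut read = 0; // start of the next segment to keep

    for start in starts {
        // The segment to keep ends where the next match begins.
        bytes.copy_within(read..start, write);
        write += start - read;
        read = start + needle.len();
    }
    // Move the tail after the last match.
    let len = bytes.len();
    bytes.copy_within(read..len, write);
    write += len - read;
    bytes.truncate(write);

    // Whole matches were removed at char boundaries, so the result is valid UTF-8.
    String::from_utf8(bytes).expect("kept segments are valid UTF-8")
}
```

Because `write` never exceeds `read`, the destination of each segment has already been vacated by the time the segment is moved, which is why the forward direction works.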

tamird · 2021-05-05T15:14:38Z

@m-ou-se there's a kernel of truth to what you say, but the optimization you describe requires bookkeeping which does not exist before this change.

Consider this string:

[-----|match1|-----|match2|-----|match3|----|matchN|---]

When iterating in forward-order, all bytes after match1 will be copied, then all bytes after match2 will be copied, and so on. In order to do what you say, the implementation would have to defer copying the non-matching interstitial (e.g. between match1 and match2) until after the next match is found. I think it would be possible to implement that, but it's not in place today. Thoughts?
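To make the cost difference concrete, here is a small sketch that only counts bytes moved, without moving anything. The function name `bytes_moved` and the two-counter return value are assumptions made for illustration. The first counter models shifting the whole tail after every match; the second models moving only the interstitial up to the next match.

```rust
/// Hypothetical cost model: returns (bytes moved when the whole tail is
/// shifted after each match, bytes moved when only the segment up to the
/// next match is moved).
fn bytes_moved(haystack: &str, needle: &str) -> (usize, usize) {
    let mut shift_tail = 0;
    let mut segment_only = 0;
    let mut prev_end = 0;
    let mut seen_match = false;

    for (start, m) in haystack.match_indices(needle) {
        let end = start + m.len();
        // Tail-shifting strategy: everything after this match moves again.
        shift_tail += haystack.len() - end;
        // Segment strategy: only the bytes between the previous match and
        // this one move. Bytes before the first match never move.
        if seen_match {
            segment_only += start - prev_end;
        }
        prev_end = end;
        seen_match = true;
    }
    if seen_match {
        // The final segment after the last match moves once.
        segment_only += haystack.len() - prev_end;
    }
    (shift_tail, segment_only)
}
```

For `"ab-cd-ef-gh"` with needle `"-"`, the tail-shifting strategy moves 8 + 5 + 2 = 15 bytes, while the segment strategy moves 2 + 2 + 2 = 6. The first count grows with the number of matches times the string length; the second is bounded by the string length.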

m-ou-se · 2021-05-05T15:28:21Z

@tamird Yes, exactly. The current implementation and your implementation are both O(n²). Iterating forwards allows for an O(n) implementation, by only copying the part until the next match.

> would have to defer copying the non-matching interstitial (e.g. between match1 and match2) until after the next match is found. I think it would be possible to implement that, but it's not in place today.

The current implementation already finds all the matches and puts them all in a Vec before it moves anything, so all the information is already there.

bors · 2021-05-06T07:11:49Z

☔ The latest upstream changes (presumably #84266) made this pull request unmergeable. Please resolve the merge conflicts.

tamird · 2021-05-09T12:49:10Z

@m-ou-se how does this look?

m-ou-se · 2021-05-17T10:55:54Z

Very nice!

It'd be good to narrow the scope of the unsafe block by putting unsafe { .. } only around the copy_nonoverlapping and set_len lines, each with their own // SAFETY: comment.
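As an illustration of that style, here is a sketch with each unsafe operation in its own block and its own `// SAFETY:` comment. It is not the code under review: the function name `remove_in_place` is hypothetical, it goes through `String::as_mut_vec` because it lives outside the standard library, and it uses `ptr::copy` rather than `copy_nonoverlapping` because in this sketch the source and destination ranges can overlap.

```rust
use std::ptr;

/// Hypothetical helper: removes every occurrence of `needle` from `s` in place.
fn remove_in_place(s: &mut String, needle: &str) {
    if needle.is_empty() {
        return;
    }
    // Record the (start, end) byte ranges to keep before mutating anything.
    let mut keep: Vec<(usize, usize)> = Vec::new();
    let mut read = 0;
    for (start, m) in s.match_indices(needle) {
        keep.push((read, start));
        read = start + m.len();
    }
    keep.push((read, s.len()));

    // SAFETY: the kept ranges start and end on char boundaries, so once the
    // length is set below the buffer holds valid UTF-8 again. Nothing
    // between here and `set_len` can panic or observe the string.
    let bytes = unsafe { s.as_mut_vec() };
    let base = bytes.as_mut_ptr();
    let mut write = 0;

    for (start, end) in keep {
        let count = end - start;
        // SAFETY: `start + count == end <= len`, so the source is in bounds.
        // `write <= start`, so the destination is in bounds too. The ranges
        // may overlap, which `ptr::copy` permits.
        unsafe { ptr::copy(base.add(start), base.add(write), count) };
        write += count;
    }

    // SAFETY: `write <= len <= capacity`, and the first `write` bytes are
    // initialized.
    unsafe { bytes.set_len(write) };
}
```

Keeping the index arithmetic outside the unsafe blocks makes it easier to see which invariants each block actually relies on.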

tamird · 2021-05-17T13:09:54Z

Done. Replaced next_match with next_reject to simplify things, which also made it possible to stop batching matches up into a vector.
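The idea of iterating over rejections instead of collecting matches can be approximated on stable Rust with `str::split`, which yields exactly the non-matching segments. This is only an analogy: the function name `remove_by_rejections` is hypothetical, and unlike the change in this PR it builds a new `String` rather than compacting in place.

```rust
/// Hypothetical helper: concatenates the non-matching segments of `haystack`.
fn remove_by_rejections(haystack: &str, needle: &str) -> String {
    let mut out = String::with_capacity(haystack.len());
    // `split` yields the text between matches, i.e. the rejected spans,
    // one at a time; no vector of match positions is needed.
    for kept in haystack.split(needle) {
        out.push_str(kept);
    }
    out
}
```

Each kept segment is copied once as soon as it is found, so no bookkeeping about earlier matches is required.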

library/alloc/src/string.rs

Copy only non-matching bytes.

m-ou-se · 2021-06-06T12:41:19Z

@bors r+

bors · 2021-06-06T12:41:20Z

📌 Commit 977903b has been approved by m-ou-se

JohnTitor · 2021-06-07T05:10:35Z

Should be perf-sensitive, @bors rollup=never

bors · 2021-06-07T05:11:27Z

⌛ Testing commit 977903b with merge 5943cc133f74e8b9065e6c8c767c2506971a415b...

bors · 2021-06-07T08:01:41Z

💔 Test failed - checks-actions

rust-log-analyzer · 2021-06-07T08:01:46Z

A job failed! Check out the build log.

m-ou-se · 2021-06-07T11:43:46Z

> Should be perf-sensitive, @bors rollup=never

This function isn't used anywhere in the standard library or compiler.

m-ou-se · 2021-06-07T11:44:48Z

@bors retry

bors · 2021-06-07T13:03:50Z

⌛ Testing commit 977903b with merge cec1b46a23d8b88c4afe74607d7a1d799b3eb025...

JohnTitor · 2021-06-07T14:05:55Z

> This function isn't used anywhere in the standard library or compiler.

Oh sorry, my bad!

rust-log-analyzer · 2021-06-07T14:39:38Z

The job x86_64-apple failed! Check out the build log.

Possible cause of the failure (guessed by this bot):

      Memory: 14 GB
      Boot ROM Version: VMW71.00V.13989454.B64.1906190538
      Apple ROM Info: [MS_VM_CERT/SHA1/27d66596a61c48dd3dc7216fd715126e33f59ae7]Welcome to the Virtual Machine
      SMC Version (system): 2.8f0
      Serial Number (system): VMb/Qc+zq+Fn

hw.ncpu: 3
hw.byteorder: 1234
hw.memsize: 15032385536
---
failures:

---- [ui] ui/abi/abi-sysv64-arg-passing.rs stdout ----

error: test compilation failed although it shouldn't!
status: signal: 9
command: "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage2/bin/rustc" "/Users/runner/work/rust/rust/src/test/ui/abi/abi-sysv64-arg-passing.rs" "-Zthreads=1" "--target=x86_64-apple-darwin" "--error-format" "json" "-Ccodegen-units=1" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zemit-future-incompat-report" "-C" "prefer-dynamic" "-o" "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/test/ui/abi/abi-sysv64-arg-passing/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/Users/runner/work/rust/rust/build/x86_64-apple-darwin/native/rust-test-helpers" "-L" "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/test/ui/abi/abi-sysv64-arg-passing/auxiliary"
------------------------------------------

------------------------------------------
stderr:
---

Some tests failed in compiletest suite=ui mode=ui host=x86_64-apple-darwin target=x86_64-apple-darwin


command did not execute successfully: "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage0-tools-bin/compiletest" "--compile-lib-path" "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage2/lib" "--run-lib-path" "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage2/lib/rustlib/x86_64-apple-darwin/lib" "--rustc-path" "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/stage2/bin/rustc" "--src-base" "/Users/runner/work/rust/rust/src/test/ui" "--build-base" "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/test/ui" "--stage-id" "stage2-x86_64-apple-darwin" "--suite" "ui" "--mode" "ui" "--target" "x86_64-apple-darwin" "--host" "x86_64-apple-darwin" "--llvm-filecheck" "/Users/runner/work/rust/rust/build/x86_64-apple-darwin/llvm/build/bin/FileCheck" "--nodejs" "/usr/local/bin/node" "--npm" "/usr/local/bin/npm" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/Users/runner/work/rust/rust/build/x86_64-apple-darwin/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/Users/runner/work/rust/rust/build/x86_64-apple-darwin/native/rust-test-helpers" "--docck-python" "/usr/local/opt/python@3.9/bin/python3.9" "--lldb-python" "/usr/bin/python3" "--lldb-version" "lldb-1200.0.44.2\nApple Swift version 5.3.2 (swiftlang-1200.0.45 clang-1200.0.32.28)\n" "--lldb-python-dir" "/Applications/Xcode_12.4.app/Contents/SharedFrameworks/LLDB.framework/Resources/Python3" "--llvm-version" "12.0.1-rust-1.54.0-nightly" "--llvm-components" "aarch64 aarch64asmparser aarch64codegen aarch64desc aarch64disassembler aarch64info aarch64utils aggressiveinstcombine all all-targets analysis arm armasmparser armcodegen armdesc armdisassembler arminfo armutils asmparser asmprinter avr avrasmparser avrcodegen avrdesc avrdisassembler avrinfo binaryformat bitreader bitstreamreader bitwriter bpf bpfasmparser bpfcodegen bpfdesc bpfdisassembler bpfinfo cfguard codegen core coroutines coverage 
debuginfocodeview debuginfodwarf debuginfogsym debuginfomsf debuginfopdb demangle dlltooldriver dwarflinker engine executionengine extensions filecheck frontendopenacc frontendopenmp fuzzmutate globalisel hellonew hexagon hexagonasmparser hexagoncodegen hexagondesc hexagondisassembler hexagoninfo instcombine instrumentation interfacestub interpreter ipo irreader jitlink libdriver lineeditor linker lto mc mca mcdisassembler mcjit mcparser mips mipsasmparser mipscodegen mipsdesc mipsdisassembler mipsinfo mirparser msp430 msp430asmparser msp430codegen msp430desc msp430disassembler msp430info native nativecodegen nvptx nvptxcodegen nvptxdesc nvptxinfo objcarcopts object objectyaml option orcjit orcshared orctargetprocess passes powerpc powerpcasmparser powerpccodegen powerpcdesc powerpcdisassembler powerpcinfo profiledata remarks riscv riscvasmparser riscvcodegen riscvdesc riscvdisassembler riscvinfo runtimedyld scalaropts selectiondag sparc sparcasmparser sparccodegen sparcdesc sparcdisassembler sparcinfo support symbolize systemz systemzasmparser systemzcodegen systemzdesc systemzdisassembler systemzinfo tablegen target textapi transformutils vectorize webassembly webassemblyasmparser webassemblycodegen webassemblydesc webassemblydisassembler webassemblyinfo windowsmanifest x86 x86asmparser x86codegen x86desc x86disassembler x86info xray" "--cc" "" "--cxx" "" "--cflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"


failed to run: /Users/runner/work/rust/rust/build/bootstrap/debug/bootstrap --stage 2 test
Build completed unsuccessfully in 1:26:23

bors · 2021-06-07T14:40:21Z

💔 Test failed - checks-actions

bors · 2021-06-08T01:05:57Z

⌛ Testing commit 977903b with merge dda4a88...

bors · 2021-06-08T03:47:03Z

☀️ Test successful - checks-actions
Approved by: m-ou-se
Pushing dda4a88 to master...

rust-highfive assigned joshtriplett Mar 26, 2021

rust-highfive added the S-waiting-on-review label Mar 26, 2021

jcotton42 reviewed Mar 29, 2021

pickfire reviewed Mar 29, 2021

pickfire reviewed Mar 29, 2021

pickfire approved these changes Mar 29, 2021

tamird force-pushed the string-remove-matches-rev branch from 0e052e2 to ba528fb Compare March 29, 2021 17:17

JohnCSimon added S-waiting-on-review and removed S-waiting-on-review labels Apr 18, 2021

rust-highfive assigned m-ou-se and unassigned joshtriplett Apr 19, 2021

tamird force-pushed the string-remove-matches-rev branch from ba528fb to 4055e71 Compare May 9, 2021 12:48

tamird force-pushed the string-remove-matches-rev branch from 4055e71 to 38e8408 Compare May 9, 2021 12:50

tamird changed the title from "String::remove_matches in reverse" to "String::remove_matches O(n^2) -> O(n)" May 9, 2021

tamird force-pushed the string-remove-matches-rev branch from 38e8408 to 65adf91 Compare May 17, 2021 13:09

tmiasko reviewed Jun 6, 2021

tamird added 2 commits June 6, 2021 08:06

38013e7 Use iter::from_fn in String::remove_matches

977903b String::remove_matches O(n^2) -> O(n): Copy only non-matching bytes.

tamird force-pushed the string-remove-matches-rev branch from bc17b9c to 977903b Compare June 6, 2021 12:07

bors added S-waiting-on-review and removed S-waiting-on-bors labels Jun 7, 2021

bors added S-waiting-on-bors and removed S-waiting-on-review labels Jun 7, 2021

bors added S-waiting-on-review and removed S-waiting-on-bors labels Jun 7, 2021

bors added the merged-by-bors label Jun 8, 2021

bors merged commit dda4a88 into rust-lang:master Jun 8, 2021

rustbot added this to the 1.54.0 milestone Jun 8, 2021

tamird deleted the string-remove-matches-rev branch June 8, 2021 04:56
