Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profile: new column for time of the function without the time of its children functions #24056

Merged
merged 1 commit into from
Mar 27, 2025

Conversation

Eliyaan
Copy link
Contributor

@Eliyaan Eliyaan commented Mar 26, 2025

Adds a new column to the profiler output: time spent in the current function without the time spent in its children
Known problem: for multithreaded programs the time calculations are wrong and might show negative times.

Sample profiling output:

             1          0.034ms          0.007ms          34003ns rand__init
          1560          2.501ms          0.959ms           1603ns rand__PRNG_u32n
          1560          3.264ms          0.443ms           2092ns rand__PRNG_intn
          1560          4.015ms          0.432ms           2574ns rand__PRNG_int_in_range
          3120          1.064ms          0.419ms            341ns rand__PRNG_f64
          3120          2.355ms          0.678ms            755ns rand__PRNG_f64n
          3120          3.686ms          0.697ms           1181ns rand__PRNG_f64_in_range
             1          0.014ms          0.001ms          14173ns rand__new_default
          1560          0.137ms          0.137ms             88ns rand__int_in_range
          3120          0.186ms          0.186ms             60ns rand__f64_in_range
       7309361        605.565ms        605.565ms             83ns main__sqrt
       2015520        120.669ms        120.669ms             60ns main__Particle_update_pos
      11340456        647.577ms        647.577ms             57ns main__Particle_correct_constraints_square
       2015520         99.668ms         99.668ms             49ns main__Particle_accelerate
           312        107.600ms         49.533ms         344871ns main__App_init_opti_list
             1      10359.919ms          0.003ms    10359918908ns main__main
             2          0.000ms          0.000ms             86ns main__ceil
          3224       4097.998ms       2954.570ms        1271091ns main__App_solve_collisions
           403       4940.062ms        378.137ms       12258219ns main__App_compute_step
           403      10075.576ms         31.644ms       25001429ns main__on_frame
            83          0.015ms          0.014ms            177ns main__on_event
           312         10.171ms          1.276ms          32600ns main__App_spawn_parti
           312          0.082ms          0.082ms            264ns main__App_check_buttons

The effect is clearly seen on functions like main__main where no time is spent as it calls other functions.
It is useful when sorting on the 3rd column as functions that do not need to be optimized directly do not show:

          5642         96.640ms         96.640ms          17129ns fontstash__Context_draw_text
       2015520         99.668ms         99.668ms             49ns main__Particle_accelerate
       2015520        120.669ms        120.669ms             60ns main__Particle_update_pos
       2965680        203.247ms        203.247ms             69ns math__sinf
       2965680        210.164ms        210.164ms             71ns math__cosf
             1      10359.505ms        223.352ms    10359504750ns sokol__sapp__run
       5931360        270.991ms        270.991ms             46ns math__radians
           403       4940.062ms        378.137ms       12258219ns main__App_compute_step
       5965212        435.274ms        435.274ms             73ns sokol__sgl__v2f
           403        447.311ms        447.311ms        1109952ns sokol__sgl__draw
       7309361        605.565ms        605.565ms             83ns main__sqrt
      11340456        647.577ms        647.577ms             57ns main__Particle_correct_constraints_square
        250380       2232.434ms       1056.884ms           8916ns gg__Context_draw_polygon_filled
           403       2058.627ms       2058.627ms        5108255ns sokol__gfx__begin_pass
          3224       4097.998ms       2954.570ms        1271091ns main__App_solve_collisions

Copy link

Connected to Huly®: V_0.6-22447

@Eliyaan Eliyaan linked an issue Mar 26, 2025 that may be closed by this pull request
2 tasks
@Eliyaan
Copy link
Contributor Author

Eliyaan commented Mar 27, 2025

The failure seems to be caused by the chacha20 and crypto PRs

@spytheman
Copy link
Member

I've tried to add a case for the new column in vlib/v/slow_tests/profile/profile_test.v, but I could not figure out a reliable way check the values that would not have a lot of false positives in the CI 🤔 .

@spytheman
Copy link
Member

My test program, follows here (it works when run manually, and shows that the new column values are very useful):

import time
fn delay1() {
	time.sleep(10 * time.millisecond)
}
fn delay2() {
	time.sleep(20 * time.millisecond)
}
fn delay3() {
	time.sleep(30 * time.millisecond)
}
fn c() {
	delay1()
}
fn b() {
	c()
	delay2()
	c()
	mut s := 0
	for i := 0; i <= 100_000; i++ {
		s += i * 2
	}
	_ := s
}
fn a() {
	b()
	b()
	delay3()
}
fn main() {
	println('start')
	a()
	//	mut s := 0; for i:=0;i<=100_000;i++ { s += i*2 }; _ := s
	println('done')
}

with the following results on my machine:

#0 13:57:27 ^ profiler-without-children-time ~/v>./v -profile - run a.v |sort -nk2
             1          0.000ms          0.000ms            210ns gc_set_warn_proc
done
start
             1          0.004ms          0.004ms           3865ns builtin_init
             4          0.006ms          0.006ms           1524ns _write_buf_to_fd
             2          0.007ms          0.001ms           3351ns _writeln_to_fd
             2          0.007ms          0.001ms           3637ns println
             1          0.012ms          0.012ms          12411ns _v_malloc
             1          0.013ms          0.000ms          12803ns memdup
             1         30.066ms          0.001ms       30065782ns main__delay3
             2         40.165ms          0.002ms       20082447ns main__delay2
             4         40.291ms          0.003ms       10072769ns main__delay1
             4         40.293ms          0.001ms       10073134ns main__c
             2         81.115ms          0.657ms       40557389ns main__b
             7        110.516ms        110.516ms       15788062ns time__sleep
             1        111.182ms          0.001ms      111181607ns main__a
             1        111.193ms          0.004ms      111192886ns main__main
#0 13:58:59 ^ profiler-without-children-time ~/v>./v -profile - run a.v |sort -nk3
             1          0.000ms          0.000ms            359ns gc_set_warn_proc
done
start
             1          0.019ms          0.001ms          19205ns memdup
             1        111.499ms          0.001ms      111499132ns main__a
             1         30.090ms          0.001ms       30089805ns main__delay3
             2          0.011ms          0.001ms           5280ns _writeln_to_fd
             2          0.011ms          0.001ms           5723ns println
             4         40.352ms          0.001ms       10088081ns main__c
             2         40.167ms          0.002ms       20083276ns main__delay2
             4         40.351ms          0.003ms       10087718ns main__delay1
             1          0.008ms          0.007ms           7840ns builtin_init
             1        111.517ms          0.007ms      111517498ns main__main
             4          0.010ms          0.010ms           2399ns _write_buf_to_fd
             1          0.018ms          0.018ms          18300ns _v_malloc
             2         81.409ms          0.890ms       40704258ns main__b
             7        110.601ms        110.601ms       15800130ns time__sleep
#0 13:59:02 ^ profiler-without-children-time ~/v>

Copy link
Member

@spytheman spytheman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work, it is a very useful improvement for the -profile reports 🥳.

@spytheman spytheman merged commit a03d62e into vlang:master Mar 27, 2025
74 of 78 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

profile: new column with time per function call (excluding children's time)
2 participants