YOLO3 workload #5

Open
manox opened this issue Aug 10, 2021 · 7 comments

Comments

@manox

manox commented Aug 10, 2021

Hi @abejgonzalez,
In a Chipyard NVDLA integration PR conversation (ucb-bar/chipyard#505 (comment)), you said you had been able to run a YOLO3 workload.
Is this still possible, and if so, can you briefly explain how? Thank you!

@abejgonzalez
Collaborator

IIRC I just followed along with the original documentation here: https://github.com/CSL-KU/firesim-nvdla#running-yolov3-on-nvdla. You will probably have to modify the FireMarshal workload files and the FireSim config files a bit. For FireMarshal, to match the FireMarshal version in CY, inherit from the nvdla workload nvdla-base in this repo. For FireSim, to match the FireSim version in CY, point to a proper FireSim HW config.
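For illustration, a workload that inherits from nvdla-base could be generated roughly as follows. This is only a sketch: `name`, `base`, `overlay`, and `command` are FireMarshal workload options, but the workload name `yolov3-nvdla`, the base file name `nvdla-base.json`, and the `overlay` directory are assumptions that need to be checked against the actual repo layout. The command mirrors the `darknet-nvdla/solo.sh` invocation used later in this thread.

```python
import json

def make_workload(name, base, command, overlay=None):
    """Build a FireMarshal-style workload description as a plain dict."""
    workload = {"name": name, "base": base, "command": command}
    if overlay is not None:
        workload["overlay"] = overlay
    return workload

if __name__ == "__main__":
    cfg = make_workload(
        name="yolov3-nvdla",        # hypothetical workload name
        base="nvdla-base.json",     # assumed file name of the nvdla-base workload
        command="cd /root/darknet-nvdla && ./solo.sh",
        overlay="overlay",          # hypothetical directory containing darknet-nvdla
    )
    # Write this output to a .json file next to the base workload.
    print(json.dumps(cfg, indent=2))
```

The FireSim side (pointing to a matching HW config) is not covered by this sketch.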

@manox
Author

manox commented Sep 2, 2021

Thank you @abejgonzalez,
I ran into the problem that it needs GLIBC 2.26, but the generated Linux image has a higher version. Do I need to use an older Linux version now?

# cd darknet-nvdla/
# ./solo.sh 
./darknet: /lib/libpthread.so.0: version `GLIBC_2.26' not found (required by ./darknet)
./darknet: /lib/libm.so.6: version `GLIBC_2.26' not found (required by ./darknet)
./darknet: /lib/libc.so.6: version `GLIBC_2.26' not found (required by ./darknet)
./darknet: /lib/libc.so.6: version `GLIBC_2.26' not found (required by /root/darknet-nvdla/libodlalayer.so)
./darknet: /lib/libpthread.so.0: version `GLIBC_2.26' not found (required by /root/darknet-nvdla/libdarknet.so)
./darknet: /lib/libm.so.6: version `GLIBC_2.26' not found (required by /root/darknet-nvdla/libdarknet.so)
./darknet: /lib/libc.so.6: version `GLIBC_2.26' not found (required by /root/darknet-nvdla/libdarknet.so)
./darknet: /lib/libm.so.6: version `GLIBC_2.26' not found (required by /root/darknet-nvdla/libnvdla_runtime.so)
./darknet: /lib/libc.so.6: version `GLIBC_2.26' not found (required by /root/darknet-nvdla/libnvdla_runtime.so)
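To see which objects need rebuilding, the loader messages above can be grouped by the file that requires the missing version. A minimal Python sketch, assuming the console output has been saved as text; the function name `summarize_glibc_errors` is hypothetical, and the pattern only targets the message shape shown above.

```python
import re
from collections import defaultdict

# Matches lines such as:
# ./darknet: /lib/libc.so.6: version `GLIBC_2.26' not found (required by ./darknet)
PATTERN = re.compile(
    r"^(?P<prog>\S+): (?P<lib>\S+): version `(?P<ver>[^']+)' not found "
    r"\(required by (?P<req>[^)]+)\)$"
)

def summarize_glibc_errors(text):
    """Map each requiring object to the set of (library, version) pairs it is missing."""
    missing = defaultdict(set)
    for line in text.splitlines():
        m = PATTERN.match(line.strip())
        if m:
            missing[m.group("req")].add((m.group("lib"), m.group("ver")))
    return dict(missing)

if __name__ == "__main__":
    sample = (
        "./darknet: /lib/libm.so.6: version `GLIBC_2.26' not found "
        "(required by ./darknet)\n"
        "./darknet: /lib/libc.so.6: version `GLIBC_2.26' not found "
        "(required by /root/darknet-nvdla/libnvdla_runtime.so)\n"
    )
    for obj, libs in sorted(summarize_glibc_errors(sample).items()):
        print(obj, sorted(libs))
```

Applied to the full log, this would list `./darknet`, `libodlalayer.so`, `libdarknet.so`, and `libnvdla_runtime.so` as the objects carrying the `GLIBC_2.26` requirement.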

@abejgonzalez
Collaborator

Frankly I don't remember since this was so long ago. I would try to update the YOLO3 workload instead of going to a lower version of Linux... Sorry I can't be of more help.

@manox
Author

manox commented Sep 8, 2021

Thank you @abejgonzalez, I am grateful for any help. I managed to build the YOLO3 workload with the newer libraries. Now the execution hangs at the following point. Maybe you can give me a hint about where it could be failing.

# ./solo.sh 
learning_rate: Using default '0.001000'
momentum: Using default '0.900000'
decay: Using default '0.000100'
policy: Using default 'constant'
max_batches: Using default '0'
layer     filters    size              input                output
    0 offset: Using default '0.000000'
shifter: Using default '0'
post_offset: Using default '0.000000'
post_scale: Using default '1.000000'
outputs 692224 num_out 5537792
    1 odla          tensor 0  416 x 416 x   4   ->    52 x  52 x 256
odla          tensor 1  416 x 416 x   4   ->    26 x  26 x 512
odla          tensor 2  416 x 416 x   4   ->    13 x  13 x 255
odla          tensor 3  416 x 416 x   4   ->    13 x  13 x 256
    2 input layer 1 tensor 3
make_split_layer input layer index 1 tensor 3
split          tensor 3   13 x  13 x 256   ->    13 x  13 x 256
    3 out layer 5 tensor 0
    4 input layer 1 tensor 2
make_split_layer input layer index 1 tensor 2
split          tensor 2   13 x  13 x 255   ->    13 x  13 x 255
    5 post_offset: Using default '0.000000'
outputs 43095 num_out 43264
    6 yolo
    7 input layer 1 tensor 1
make_split_layer input layer index 1 tensor 1
split          tensor 1   26 x  26 x 512   ->    26 x  26 x 512
    8 odla          tensor 0   26 x  26 x 512   ->    26 x  26 x 255
odla          tensor 1   26 x  26 x 512   ->    26 x  26 x 128
    9 input layer 8 tensor 0
make_split_layer input layer index 8 tensor 0
split          tensor 0   26 x  26 x 255   ->    26 x  26 x 255
   10 post_offset: Using default '0.000000'
outputs 172380 num_out 173056
   11 yolo
   12 input layer 8 tensor 1
make_split_layer input layer index 8 tensor 1
split          tensor 1   26 x  26 x 128   ->    26 x  26 x 128
   13 out layer 2 tensor 0
   14 input layer 1 tensor 0
make_split_layer input layer index 1 tensor 0
split          tensor 0   52 x  52 x 256   ->    52 x  52 x 256
   15 odla          tensor 0   52 x  52 x 256   ->    52 x  52 x 255
   16 input layer 15 tensor 0
make_split_layer input layer index 15 tensor 0
split          tensor 0   52 x  52 x 255   ->    52 x  52 x 255
   17 post_offset: Using default '0.000000'
outputs 689520 num_out 692224
   18 yolo
Loading weights from yolov3-odla.cfg...Done!
#### input image size c=4 h=416 w=416
[  726.028421] INFO: task darknet:166 blocked for more than 120 seconds.
[  726.028689]       Tainted: G           O      5.7.0-rc3-58540-g66e8cf3 #3
[  726.028915] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  726.029164] darknet         D    0   166    165 0x00000000
[  726.032899] Call Trace:
[  726.035219] [<ffffffe005482a72>] __schedule+0x18a/0x416
[  726.040546] [<ffffffe005482d40>] schedule+0x42/0xb2
[  726.045392] [<ffffffe005485b6c>] schedule_timeout+0x1ba/0x24c
[  726.051139] [<ffffffe005483f1e>] wait_for_completion+0x6e/0x140
[  726.060449] [<ffffffdf85cc1a70>] nvdla_task_submit+0x44/0xa4 [opendla]
[  726.066216] [<ffffffdf85cc1dfe>] nvdla_submit+0xa4/0xf8 [opendla]
[  726.069659] [<ffffffe00512877e>] drm_ioctl_kernel+0x6e/0xaa
[  726.075198] [<ffffffe005128a98>] drm_ioctl+0x184/0x286
[  726.080326] [<ffffffe004f19c0a>] ksys_ioctl+0x144/0x61e
[  726.085529] [<ffffffe004f1a0f4>] sys_ioctl+0x10/0x18
[  726.090517] [<ffffffe004e011a4>] ret_from_syscall+0x0/0x2
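The trace above shows darknet blocked in `wait_for_completion`, called from `nvdla_task_submit` in the `opendla` module. The task is waiting for a completion inside the driver, but the trace alone does not establish why that completion never arrives. For longer logs, a small Python sketch like the following can pull out the frames and the first one that belongs to a loadable module. The helper names are hypothetical, and the pattern only targets the frame format shown above.

```python
import re

# Matches frames such as:
# [  726.060449] [<ffffffdf85cc1a70>] nvdla_task_submit+0x44/0xa4 [opendla]
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ \t]+\[(?P<mod>\w+)\])?"
)

def parse_frames(log):
    """Return (symbol, module) pairs in trace order; module is None for built-in code."""
    return [(m.group("sym"), m.group("mod")) for m in FRAME.finditer(log)]

def first_module_frame(frames):
    """Return the innermost frame that belongs to a loadable module, or None."""
    for sym, mod in frames:
        if mod is not None:
            return sym, mod
    return None

if __name__ == "__main__":
    sample = (
        "[  726.051139] [<ffffffe005483f1e>] wait_for_completion+0x6e/0x140\n"
        "[  726.060449] [<ffffffdf85cc1a70>] nvdla_task_submit+0x44/0xa4 [opendla]\n"
        "[  726.069659] [<ffffffe00512877e>] drm_ioctl_kernel+0x6e/0xaa\n"
    )
    print(first_module_frame(parse_frames(sample)))
```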

@Yuxin-Yu

Hi @manox. Have you fixed this GLIBC_2.26 problem?

@manox
Author

manox commented Jul 20, 2023

> Hi @manox. Have you fixed this GLIBC_2.26 problem?

That was quite a long time ago and I don't think I followed it up. Sorry.

@Yuxin-Yu

Hello @manox, I have resolved this issue. I previously used the official nvdla_runtime, which produced the GLIBC error. When I used the nvdla-workload/nvdla-base/build-umd.sh script to build my own nvdla_runtime, it ran normally.
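Before copying a rebuilt library to the target, it can help to check which GLIBC version tags it references. The Python sketch below is a rough equivalent of running `strings` on the file and filtering for `GLIBC_`. It scans raw bytes rather than parsing the ELF version tables, so treat it as a heuristic. The function names are hypothetical, and the library path in the usage comment is an assumption.

```python
import re

VERSION_TAG = re.compile(rb"GLIBC_(\d+)\.(\d+)(?:\.(\d+))?")

def glibc_versions(blob):
    """Return the sorted set of GLIBC_x.y[.z] version tags found in raw bytes."""
    found = set()
    for m in VERSION_TAG.finditer(blob):
        found.add(tuple(int(g) for g in m.groups() if g is not None))
    return sorted(found)

def glibc_versions_in_file(path):
    """Read a binary from disk and list the GLIBC version tags it mentions."""
    with open(path, "rb") as f:
        return glibc_versions(f.read())

# Example (path is an assumption):
# print(glibc_versions_in_file("libnvdla_runtime.so"))
```

If `(2, 26)` still shows up for a library built with build-umd.sh, that library was probably linked against a different toolchain than the one that produced the target's libc.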

3 participants