[Issue]: bh-hip hangs #3751

Open
jinz2014 opened this issue Feb 22, 2025 · 1 comment

Comments

@jinz2014

Problem Description

In the following code snippet, the program appears to hang while executing SummarizationKernel.

  for (step = 0; step < timesteps; step++) {
    BoundingBoxKernel<<<blocks * FACTOR1, THREADS1>>>(
        nnodes, nbodies, d_start, d_child, d_posMass, d_max, d_min,
        d_radius, d_bottom, d_step, d_blkcnt );

    ClearKernel1<<<blocks, 256>>>(nnodes, nbodies, d_child);

    TreeBuildingKernel<<<blocks * FACTOR2, THREADS2>>>(
        nnodes, nbodies, d_child, d_posMass, d_radius, d_bottom);

    ClearKernel2<<<blocks, 256>>>(nnodes, d_start, d_posMass, d_bottom);

    SummarizationKernel<<<blocks * FACTOR3, THREADS3>>>(
        nnodes, nbodies, d_count, d_child, d_posMass, d_bottom);

    SortKernel<<<blocks * FACTOR4, THREADS4>>>(
        nnodes, nbodies, d_sort, d_count, d_start, d_child, d_bottom);

    ForceCalculationKernel<<<blocks * FACTOR5, THREADS5>>>(
        nnodes, nbodies, dthf, itolsq, epssq, d_sort, d_child, d_posMass,
        d_vel, d_accVel, d_radius, d_step);

    IntegrationKernel<<<blocks * FACTOR6, THREADS6>>>(
        nbodies, dtime, dthf, d_posMass, d_vel, d_accVel);
  }
  hipDeviceSynchronize();
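
Because all launches in the loop above are asynchronous and the only synchronization comes after the loop, a hang can be hard to attribute to one kernel. A minimal debugging sketch, not part of the original program, is to synchronize after every launch and log the kernel name first. The `HIP_CHECK` macro, the `syncAndReport` helper, and `DummyKernel` are hypothetical names introduced only for illustration; the HIP runtime calls themselves (`hipGetLastError`, `hipDeviceSynchronize`, `hipGetErrorString`) are standard.

```cpp
// Debugging sketch (assumption: built with hipcc, e.g. hipcc sketch.cpp -o sketch).
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>

// Hypothetical macro: abort with file/line if a HIP runtime call fails.
#define HIP_CHECK(call)                                        \
  do {                                                         \
    hipError_t err_ = (call);                                  \
    if (err_ != hipSuccess) {                                  \
      fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,       \
              hipGetErrorString(err_));                        \
      exit(EXIT_FAILURE);                                      \
    }                                                          \
  } while (0)

// Hypothetical helper: log the kernel name, then wait for it to finish.
// If the program hangs, the last "waiting for ..." line names the kernel.
static void syncAndReport(const char* name) {
  HIP_CHECK(hipGetLastError());       // reports launch-time errors
  fprintf(stderr, "waiting for %s ... ", name);
  fflush(stderr);                     // make sure the line is visible before blocking
  HIP_CHECK(hipDeviceSynchronize());  // reports execution-time errors, or blocks on a hang
  fprintf(stderr, "done\n");
}

// Stand-in kernel so the sketch compiles on its own.
__global__ void DummyKernel(int* out) { out[threadIdx.x] = threadIdx.x; }

int main() {
  int* d_out = nullptr;
  HIP_CHECK(hipMalloc((void**)&d_out, 64 * sizeof(int)));

  DummyKernel<<<1, 64>>>(d_out);
  syncAndReport("DummyKernel");

  HIP_CHECK(hipFree(d_out));
  return 0;
}
```

In the loop above, the same pattern would mean calling `syncAndReport("SummarizationKernel")` directly after that launch, and likewise for the other kernels. The extra synchronization changes timing, so a hang that depends on overlapping work may behave differently under it.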

Operating System

Ubuntu 22.04.5 LTS (Jammy Jellyfish)

CPU

AMD Ryzen Threadripper 3970X 32-Core Processor

GPU

AMD Radeon RX 6900 XT

ROCm Version

rocm-6.3.2

ROCm Component

No response

Steps to Reproduce

  1. Hipify the CUDA code in https://github.com/zjin-lcf/HeCBench/tree/master/src/bh-cuda/main.cu

  2. Build the HIP program: hipcc -O3 main.cu -o main

  3. Run the HIP program: ./main 10000 10

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

No response

Additional Information

The program may generate a memory access fault; see zjin-lcf/HeCBench#104

@ppanchad-amd

Hi @jinz2014. An internal ticket has been created to investigate this issue. Thanks!
