Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edalize and dvsim #9514

Closed
davidp135 opened this issue Dec 3, 2021 · 35 comments
Closed

Edalize and dvsim #9514

davidp135 opened this issue Dec 3, 2021 · 35 comments

Comments

@davidp135
Copy link
Contributor

Hi,

I am looking at using Questa to simulate the designs within Opentitan. From what I understand dvsim does not support Questa. Looking around the repository there are a number of references to Edalize, but I can't see anywhere it is used.

It looks like Edalize supports the Questa simulator. Is Edalize currently being used for simulations at all?

Thanks

David

@tjaychen
Copy link

tjaychen commented Dec 3, 2021

@rswarbrick
@msfschaffner
@sriyerg

others can explain it better, but i think edalize is invoked by fusesoc. So we don't use it directly as part of dvsim.

@rswarbrick
Copy link
Contributor

rswarbrick commented Dec 3, 2021

We use fusesoc to gather up all the code that we want to simulate. Normally, fusesoc would then use edalize to invoke the backend EDA tool. However, we do things differently and grab the output from fusesoc then run the tool directly from dvsim's code. I'm afraid we don't have a Questa backend.

I doubt it would be particularly hard to add. You can probably just add a hw/dv/tools/dvsim/questa.hjson, maybe based on riviera.hjson in the same directory. I'm afraid we won't be able to help too much with debug though: I'm not sure how many of us have access to a Questa licence.

@sriyerg
Copy link
Contributor

sriyerg commented Dec 3, 2021

This is just a matter of adding questa.hjson setup. It was attempted before in #4635 (abandoned), which can be used as a starting point.

@davidp135
Copy link
Contributor Author

Thanks for the information. I have taken the questa.hjson from the #4635 ticket and have been trying to modify it to make it work. Unfortunately I am getting some errors. Firstly when running, util/dvsim/dvsim.py hw/ip/uart/dv/uart_sim_cfg.hjson -i uart_smoke --fixed-seed=1

I get the following:

** Error: ../src/lowrisc_dv_cip_lib_0/seq_lib/cip_base_vseq__shadow_reg_errors.svh(316): (vlog-13216) Arg. 'ptr' of 'csr_rd_check':  Illegal assignment to type 'class uvm_pkg::uvm_object' from type '<unknown>': Types are not assignment compatible.
** Error: ../src/lowrisc_dv_cip_lib_0/seq_lib/cip_base_vseq__shadow_reg_errors.svh(320): (vlog-13216) Arg. 'ptr' of 'csr_rd_check':  Illegal assignment to type 'class uvm_pkg::uvm_object' from type '<unknown>': Types are not assignment compatible.

Commenting out the csr_rd_check function call I get the error when it is used in cip_base_vseq__tl_errors

** Error: ../src/lowrisc_dv_cip_lib_0/seq_lib/cip_base_vseq__tl_errors.svh(334): (vlog-13216) Arg. 'ptr' of 'csr_rd_check':  Illegal assignment to type 'class uvm_pkg::uvm_object' from type '<unknown>': Types are not assignment compatible.

status_field in cip_base_vseq__shadow_reg_errors.svh and csr_field in cip_base_vseq__tl_errors.svh don't appear to be initialised anywhere. Do you have any ideas on how to fix this?

If I comment out that csr_rd_check call the simulation starts to run but I get a variety of assert failures which I am looking into.

@davidp135
Copy link
Contributor Author

This is all on the most up to date Questa software (21.4.1)

@rswarbrick
Copy link
Contributor

@cindychip / @weicaiyang: I think the shadow reg sequence and TL error sequences are yours, respectively. Any ideas?

@weicaiyang
Copy link
Contributor

IIUC, below case is similar to those 2 failed cases.

int a_array[dv_base_reg_field]; // dv_base_reg_field extends from uvm_object
foreach (a_array[i]) begin
  uvm_object obj = i;
end

We create an associative array using class object as index. In your cases, those associative arrays should be empty and foreach loop should not be even started.
looks like it's a compile error, assigning a child type to a parent type class should be fine. Seems like Questa doesn't recognize i type as dv_base_reg_field

@weicaiyang
Copy link
Contributor

I created a small case here.

It's passing with VCS and Xcelium but failing with Questa. It seems like a tool issue to me.

@davidp135
Copy link
Contributor Author

Thanks for the information and test case @weicaiyang . I have raised a support call with Siemens.

It does not look like there is an easy work around. If I comment out the csr_rd_check function calls for now, the design compiles but fails simulation. Could this be due to removing the those functions? Or will removing those functions just reduce the test coverage?

@weicaiyang
Copy link
Contributor

Thanks for the information and test case @weicaiyang . I have raised a support call with Siemens.

It does not look like there is an easy work around. If I comment out the csr_rd_check function calls for now, the design compiles but fails simulation. Could this be due to removing the those functions? Or will removing those functions just reduce the test coverage?

If you're running uart_smoke, those 2 lines are not even executed. The test should pass. What kind of error do you see? If it's an assertion error, can you check if it make sense or attach a wave here?

@davidp135
Copy link
Contributor Author

Thanks, I think I am a bit further through the error process now and have reached the same error as in #4377 where @svenka3 found a custom UVM compile was required to change UVM_REG_ADDR_WIDTH down to 32.

Siemens have got back to me about the error when using a class object as an index to an associative array. This is indeed a bug on their end. Hopefully they can fix that in the next Questa release.

@davidp135
Copy link
Contributor Author

Ah actually using @svenka3 's script gets around that. I now have 2 assert failures for tlul_assert.sv:281 and lots of failures for tlul_assert.sv:238:

# UVM_ERROR @   1650568 ps: (tlul_assert.sv:281) [ASSERT FAILED] dReadyKnown_A
# UVM_ERROR @   1660984 ps: (tlul_assert.sv:238) [ASSERT FAILED] pendingReqPerSrc_M

Plus a few of these:

# UVM_ERROR @ 2650552 ps: (tl_host_driver.sv:214) uvm_test_top.env.m_tl_agent_uart_reg_block.driver [uvm_test_top.env.m_tl_agent_uart_reg_block.driver] Cannot find request matching d_source 0x9d

for d_source values 7e and 9d. Annoying I can't currently get any internal waveforms due to I think having to pass the -mfcu flag into the vlog command. I do get wiggles externally in the interfaces however:
image

@weicaiyang
Copy link
Contributor

Ah actually using @svenka3 's script gets around that. I now have 2 assert failures for tlul_assert.sv:281 and lots of failures for tlul_assert.sv:238:

# UVM_ERROR @   1650568 ps: (tlul_assert.sv:281) [ASSERT FAILED] dReadyKnown_A
# UVM_ERROR @   1660984 ps: (tlul_assert.sv:238) [ASSERT FAILED] pendingReqPerSrc_M

Plus a few of these:

# UVM_ERROR @ 2650552 ps: (tl_host_driver.sv:214) uvm_test_top.env.m_tl_agent_uart_reg_block.driver [uvm_test_top.env.m_tl_agent_uart_reg_block.driver] Cannot find request matching d_source 0x9d

for d_source values 7e and 9d. Annoying I can't currently get any internal waveforms due to I think having to pass the -mfcu flag into the vlog command. I do get wiggles externally in the interfaces however: image

Thanks for the waveform. We should not have an unknown value on d_ready after reset. The other error may be related that as well. Can you try to trace down why this unknown happens? Unfortunately, we need to dump internal signals in order to debug this

@davidp135
Copy link
Contributor Author

I can now view the internal signals and have been looking at the unknown d_ready signal after reset issue.

It turns out this has already been looked at in #4427. The fixes there still work now and the UART smoke test will pass. That is:

Add clocking to the d_ready signal at line 57 of tl_host_driver.sv

invalidate_a_channel();
 // cfg.vif.h2d_int.d_ready <= 1'b0;
cfg.vif.host_cb.h2d_int.d_ready <= 1'b0;

and the other h2d signals line 233 onwards:

  function void invalidate_a_channel();
    // cfg.vif.h2d_int.a_opcode <= tlul_pkg::tl_a_op_e'('x);
    // cfg.vif.h2d_int.a_param <= '{default:'x};
    // cfg.vif.h2d_int.a_size <= '{default:'x};
    // cfg.vif.h2d_int.a_source <= '{default:'x};
    // cfg.vif.h2d_int.a_address <= '{default:'x};
    // cfg.vif.h2d_int.a_mask <= '{default:'x};
    // cfg.vif.h2d_int.a_data <= '{default:'x};
    // // The assignment to tl_type must have a cast since the LRM doesn't allow enum assignment of
    // // values not belonging to the enumeration set.
    // cfg.vif.h2d_int.a_user <= '{instr_type:prim_mubi_pkg::mubi4_t'('x), default:'x};
    // cfg.vif.h2d_int.a_valid <= 1'b0;

    cfg.vif.host_cb.h2d_int.a_opcode <= tlul_pkg::tl_a_op_e'('x);
    cfg.vif.host_cb.h2d_int.a_param <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_size <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_source <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_address <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_mask <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_data <= '{default:'x};
    // The assignment to tl_type must have a cast since the LRM doesn't allow enum assignment of
    // values not belonging to the enumeration set.
    cfg.vif.host_cb.h2d_int.a_user <= '{instr_type:prim_mubi_pkg::mubi4_t'('x), default:'x};
    cfg.vif.host_cb.h2d_int.a_valid <= 1'b0;
  endfunction : invalidate_a_channel

@weicaiyang it looks like the lack of cb is there on purpose. Here is the waveform that you asked for with h2d.d_ready and h2d_int.d_ready as it comes out of reset:
image

If I do as you suggested and drive all h2d signals both with and without host_cb the test also runs and passes.

What do you think is the route forward here? The changes are uploaded on my forked feature/questa_support branch.

Thanks

David

@weicaiyang
Copy link
Contributor

Thanks for trying that out.
Looks like Questa doesn't handle driving non-cb block like the other simulator. From my understanding, that should be OK and it should take effect immediately without waiting for next active edge. Can you check with the tool R&D team and see how they understand this behavior? Thanks

@davidp135
Copy link
Contributor Author

Sure, Il make a simplified example and send it off.

@davidp135
Copy link
Contributor Author

Hi, I have done a bit of digging and found that the issue isn't directly caused by the use of non clocking blocks. It is actually also to do with the use of packed structs for the tilelink signals.

In tl_host_driver.sv, h2d_int is of type tlul_pkg::tl_h2d_t. In reset the signals are set non clocking, and otherwise using clocking.

So at reset all of h2d_int is set at the same time, non clocking.

Then after this d_ready is set after d_ready_delay and all other h2d signals are set after a_valid_delay. For me d_ready_delay is longer than a_valid_delay. In the case that all signals in a packed struct have not been passed through a clocking block the value of d_ready is lost when the other h2d signals are set using a clocking block:
image

These issues looked to be only occuring because the h2d_int signals are combined into a packed struct then initialised with a clocking block at different times.

I have made a simple example in eda playground here: https://www.edaplayground.com/x/Xxbf

In dut_driver.sv I drive 2 sets of signals in the same way, but one set is inside a packed struct and the other is set of discrete signals.

With Questa I get this:
image

With VCS I get this:
image

Aldec Riviera behaves the same as Questa and Cadence Xcelium the same as VCS.

Il raise a ticket with Siemens now for Questa, but am not that hopeful for them changing the behaviour, plus it looks like Riviera will also have the same issues.

I assume non clocking assignments are only ever used at reset? Will there be any detrimental effect of adding a clocking assignment after the non clocking assignment after reset?

@weicaiyang
Copy link
Contributor

I assume non clocking assignments are only ever used at reset? Will there be any detrimental effect of adding a clocking assignment after the non clocking assignment after reset?

we need the valid signals to be known value right after reset. In top-level, we have multiple clock domain. If it's X after reset, it may cause X propagation.

It will be great if we can find a way that works for all the simulator. We basically need this clocking signal (valid) to be 0 right after reset, but reset is async and it is aligned with the clock.

@davidp135
Copy link
Contributor Author

Sure ok, hopefully they will come back with something good. I have just had an email saying they are forwarding the issue to their SV expert.

I have a few other bits I am looking at now:

Firstly run_pass_patterns in common_sim_cfg.hjson

The log file output must be a slightly different format to the other tools, in that there is # at the start of every line. For example

# [uvm_test_top.env.virtual_sequencer.uart_smoke_vseq]     2
# 
# 
# TEST PASSED CHECKS
#  _____         _                                  _ _ 
# |_   _|__  ___| |_   _ __   __ _ ___ ___  ___  __| | |
#   | |/ _ \/ __| __| | '_ \ / _` / __/ __|/ _ \/ _` | |
#   | |  __/\__ \ |_  | |_) | (_| \__ \__ \  __/ (_| |_|
#   |_|\___||___/\__| | .__/ \__,_|___/___/\___|\__,_(_)
#                     |_|                               

If I change
run_pass_patterns: ["^TEST PASSED (UVM_)?CHECKS$"] to
run_pass_patterns: ["^(# )?TEST PASSED (UVM_)?CHECKS$"]
the check for pass works. Is this an acceptable change for you? I assume it does not break the VCS implementation?

Like-wise for run_fail_patterns could change from

  run_fail_patterns:   ["^UVM_ERROR\\s[^:].*$",
                        "^UVM_FATAL\\s[^:].*$",
                        "^UVM_WARNING\\s[^:].*$",
                        "^Assert failed: ",
                        "^\\s*Offending '.*'",
                        "^TEST FAILED (UVM_)?CHECKS$",
                        "^Error:.*$"]  // ISS errors

to

  run_fail_patterns:   ["^(# )?UVM_ERROR\\s[^:].*$",
                        "^(# )?UVM_FATAL\\s[^:].*$",
                        "^(# )?UVM_WARNING\\s[^:].*$",
                        "^(# )?Assert failed: ",
                        "^\\s*Offending '.*'",
                        "^(# )?TEST FAILED (UVM_)?CHECKS$",
                        "^(# )?Error:.*$"]  // ISS errors

Secondly the uart_stress_all_with_rand_reset test
In uart_sim_cfg.hjson all tests: [... pass for me in Questa (currently with the clocking change in tl_host_driver.sv) apart from
uart_stress_all_with_rand_reset through running util/dvsim/dvsim.py hw/ip/uart/dv/uart_sim_cfg.hjson -i uart_stress_all_with_rand_reset --fixed-seed=1 --tool questa

I get two errors saying:
# UVM_ERROR @ 65277719193 ps: uvm_test_top.env.m_tl_agent_uart_reg_block.sequencer [uvm_test_top.env.virtual_sequencer.uart_common_vseq.tl_seq] Response queue overflow, response was dropped The second error is the same but at 89081989658 ps.

These are both on the falling edge of the dut reset input. There are about 6 other reset events through the test which do not cause a UVM_ERROR. Perhaps since this could be linked to the support call I already have with Siemens for the clocking/non-clocking stuff. Although the tl_i waveforms do look ok to me:
image

The invalidate_a_channel function in tl_host_driver.sv currently has the cb assignments set after the non-cb:

  function void invalidate_a_channel();
    // Set wth no clocking for immediate effect
    cfg.vif.h2d_int.a_opcode <= tlul_pkg::tl_a_op_e'('x);
    cfg.vif.h2d_int.a_param <= '{default:'x};
    cfg.vif.h2d_int.a_size <= '{default:'x};
    cfg.vif.h2d_int.a_source <= '{default:'x};
    cfg.vif.h2d_int.a_address <= '{default:'x};
    cfg.vif.h2d_int.a_mask <= '{default:'x};
    cfg.vif.h2d_int.a_data <= '{default:'x};
    // The assignment to tl_type must have a cast since the LRM doesn't allow enum assignment of
    // values not belonging to the enumeration set.
    cfg.vif.h2d_int.a_user <= '{instr_type:prim_mubi_pkg::mubi4_t'('x), default:'x};
    cfg.vif.h2d_int.a_valid <= 1'b0;

    // Set with clocking to prevent data loss in Questa and Riviera
    cfg.vif.host_cb.h2d_int.a_opcode <= tlul_pkg::tl_a_op_e'('x);
    cfg.vif.host_cb.h2d_int.a_param <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_size <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_source <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_address <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_mask <= '{default:'x};
    cfg.vif.host_cb.h2d_int.a_data <= '{default:'x};
    // The assignment to tl_type must have a cast since the LRM doesn't allow enum assignment of
    // values not belonging to the enumeration set.
    cfg.vif.host_cb.h2d_int.a_user <= '{instr_type:prim_mubi_pkg::mubi4_t'('x), default:'x};
    cfg.vif.host_cb.h2d_int.a_valid <= 1'b0;
  endfunction : invalidate_a_channel

Have you got any ideas about this error?

Thanks for all the help

@rswarbrick
Copy link
Contributor

For the run_pass_patterns / run_fail_patterns question: That sounds like a good idea to me. But I'd probably suggest generalising the fail patterns further to e.g. "^.*UVM_ERROR\\s[^:].*$" on general precautionary principles! You could always push those two changes up as a PR immediately: as you say, they shouldn't break anything and they are probably merge-able now.

@sriyerg
Copy link
Contributor

sriyerg commented Jan 7, 2022

I vaguely recall Questa having some kind of switch that disables prefixing all log lines with characters, but I could be wrong. Might be worth checking.

@weicaiyang
Copy link
Contributor

I get two errors saying:
UVM_ERROR @ 65277719193 ps: uvm_test_top.env.m_tl_agent_uart_reg_block.sequencer [uvm_test_top.env.virtual_sequencer.uart_common_vseq.tl_seq] Response queue overflow, response was dropped The second error is the same but at 89081989658 ps.

when reset occurs, tl_driver will immediately call finish_item(rsp) for all the request, so that these rsp will be put to the response queue. In the meanwhile, tl_seq will get the rsp from this queue. All of these happen at the same timestamp. Seems like in Questa, driver item put happens earlier than sequence item get, so the queue is overflow.

Can you try to remove this line or set it to -1? We have another check to make sure we don't receive more items than max_outstanding_req. Probably it's ok to remove it.

set_response_queue_depth(cfg.max_outstanding_req);

@davidp135
Copy link
Contributor Author

1
I can't find a switch for questa unfortunately. Il like the idea of the more generic run/fail patterns. Unfortunately "^.*UVM_ERROR\\s[^:].*$" will see the final output report as an error:

# Number of demoted UVM_FATAL reports  :    0
# Number of demoted UVM_ERROR reports  :    0
# Number of demoted UVM_WARNING reports:    0
# Number of caught UVM_FATAL reports   :    0
# Number of caught UVM_ERROR reports   :    0

I can't think of another way, unless there is a way of checking that reports : 0 doesn't exist at the end of the string?

2
Thanks @welcalyang, removing set_response_queue_depth(cfg.max_outstanding_req); works!

@rswarbrick
Copy link
Contributor

Ah! Good point about the run/fail patterns. Boo :-)

About the response queue: I don't see any reason not to make it unbounded.

@weicaiyang
Copy link
Contributor

Ah! Good point about the run/fail patterns. Boo :-)

About the response queue: I don't see any reason not to make it unbounded.

Agree with @rswarbrick. @davidp135 can you push a PR to make these changes? Thanks

@sriyerg
Copy link
Contributor

sriyerg commented Jan 10, 2022

1 I can't find a switch for questa unfortunately. Il like the idea of the more generic run/fail patterns. Unfortunately "^.*UVM_ERROR\\s[^:].*$" will see the final output report as an error:

# Number of demoted UVM_FATAL reports  :    0
# Number of demoted UVM_ERROR reports  :    0
# Number of demoted UVM_WARNING reports:    0
# Number of caught UVM_FATAL reports   :    0
# Number of caught UVM_ERROR reports   :    0

I can't think of another way, unless there is a way of checking that reports : 0 doesn't exist at the end of the string?

2 Thanks @welcalyang, removing set_response_queue_depth(cfg.max_outstanding_req); works!

The pass / fail patterns are set less generically precisely due to the issue you are seeing. If we make it generic, we may end up with more unforeseen issues down the road. I just realized that it was Aldec Riviera simulator that had the same problem (it would prefix KERNEL in front of all log lines which interfered with the pass / fail pattern detection). The Aldec Riviera AE made the following fix here: https://github.com/lowRISC/opentitan/blob/master/hw/dv/tools/riviera/riviera_run.do

If you could reach out to Questa AE, I think they can help provide a similar fix.

@davidp135
Copy link
Contributor Author

Good news, I have found a command to remove the "# " from the start of log file lines in Questa. So the current run and fail patterns should now work.

Sure, Il set up a PR for the response queue.

I have just heard back from Siemens about the clocking + non-clocking issue. They are asking if it would be possible to always set all signals in the struct at the same time rather than individually at different times. It looks to me like this would not be possible due to the difference in behaviour between d_ready and the a_* signals in h2d_int of type tlul_pkg::tl_h2d_t. Is this correct?

@weicaiyang
Copy link
Contributor

Good news, I have found a command to remove the "# " from the start of log file lines in Questa. So the current run and fail patterns should now work.

Sure, Il set up a PR for the response queue.

I have just heard back from Siemens about the clocking + non-clocking issue. They are asking if it would be possible to always set all signals in the struct at the same time rather than individually at different times. It looks to me like this would not be possible due to the difference in behaviour between d_ready and the a_* signals in h2d_int of type tlul_pkg::tl_h2d_t. Is this correct?

right, they follows the TLUL protocol and these signals won't always change at the same time.

@sriyerg
Copy link
Contributor

sriyerg commented Jan 13, 2022

Good news, I have found a command to remove the "# " from the start of log file lines in Questa. So the current run and fail patterns should now work.

Awesome, thanks!

@davidp135
Copy link
Contributor Author

I have set up a pull request to change the set_response_queue_depth to unbounded.

With the other Questa issues (clocking and associative arrays) Mentor have not yet changed their software. Hopefully they will be in the next Questa release. Shall I leave this issue open whilst I wait for this?

@sriyerg
Copy link
Contributor

sriyerg commented Jan 26, 2022

The original issue was about supporting Questa with dvsim, which seems to be resolved. Please make a PR that enables Questa with dvsim. Anything unrelated to this should be filed as a separate issue.

@davidp135
Copy link
Contributor Author

Sure, thanks for all the help.

@sriyerg
Copy link
Contributor

sriyerg commented Jan 27, 2022

Sorry if I was not clear - I do not see a PR yet that adds hw/dv/tools/dvsim/questa.hjson which works with dvsim. Please check that into our repo and then close this issue.

@sriyerg sriyerg reopened this Jan 27, 2022
@davidp135
Copy link
Contributor Author

Ah right, sure ok. I have raised a PR with the changes. It is number 10574. Thanks

@GregAC
Copy link
Contributor

GregAC commented Mar 24, 2022

#10574 now merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants