[clang-format] Don't split "DPI"/"DPI-C" in Verilog imports #66951

Merged
merged 4 commits into from
Sep 21, 2023
Changes from 2 commits
11 changes: 11 additions & 0 deletions clang/lib/Format/ContinuationIndenter.cpp
@@ -2270,7 +2270,18 @@ ContinuationIndenter::createBreakableToken(const FormatToken &Current,
    if (State.Stack.back().IsInsideObjCArrayLiteral)
      return nullptr;

    // The "DPI"/"DPI-C" in SystemVerilog direct programming interface imports
    // cannot be split, e.g.
    // `import "DPI" function foo();`
    StringRef Text = Current.TokenText;
    if (Style.isVerilog()) {
      const FormatToken *Prev = Current.getPreviousNonComment();
      if (Prev && Prev == State.Line->getFirstNonComment() &&
          Prev->TokenText == "import") {
        return nullptr;
      }
    }

@owenca (Contributor), Sep 21, 2023

Suggested change
-    // The "DPI"/"DPI-C" in SystemVerilog direct programming interface imports
-    // cannot be split, e.g.
-    // `import "DPI" function foo();`
-    StringRef Text = Current.TokenText;
-    if (Style.isVerilog()) {
-      const FormatToken *Prev = Current.getPreviousNonComment();
-      if (Prev && Prev == State.Line->getFirstNonComment() &&
-          Prev->TokenText == "import") {
-        return nullptr;
-      }
-    }
+    if (Style.isVerilog() && Current.Previous &&
+        Current.Previous->isOneOf(tok::kw_export, Keywords.kw_import)) {
+      return nullptr;
+    }
+    StringRef Text = Current.TokenText;

Shouldn't we handle export as well? Also, I don't think this is Verilog specific.

Edit: let’s just fix Verilog for now.

Contributor

isOneOf won't work here, since the token has the type of identifier rather than a keyword:

Unwrapped lines:
Line(0, FSC=0): identifier[T=125, OC=0, "import"] string_literal[T=125, OC=7, ""DPI-C""] identifier[T=125, OC=15, "function"] identifier[T=125, OC=24, "t"] identifier[T=125, OC=26, "foo"]
Line(0, FSC=0): l_paren[T=120, OC=29, "("] r_paren[T=125, OC=30, ")"] semi[T=125, OC=31, ";"]
Line(1, FSC=0): eof[T=125, OC=32, ""]
Run 0...
AnnotatedTokens(L=0, P=0, T=5, C=0):
 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=0 Name=identifier L=6 PPK=2 FakeLParens= FakeRParens=0 II=0x56180dc60ce8 Text='import'
 M=0 C=1 T=Unknown S=1 F=0 B=0 BK=0 P=23 Name=string_literal L=14 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='"DPI-C"'
 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=1 P=23 Name=identifier L=23 PPK=2 FakeLParens= FakeRParens=0 II=0x56180dc611f8 Text='function'
 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=23 Name=identifier L=25 PPK=2 FakeLParens= FakeRParens=0 II=0x56180dc9b370 Text='t'
 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=23 Name=identifier L=29 PPK=2 FakeLParens= FakeRParens=0 II=0x56180dc9b3a0 Text='foo'
----
AnnotatedTokens(L=0, P=0, T=5, C=1):
 M=0 C=0 T=VerilogMultiLineListLParen S=1 F=0 B=0 BK=0 P=0 Name=l_paren L=1 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text='('
 M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=140 Name=r_paren L=2 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=')'
 M=0 C=0 T=Unknown S=0 F=0 B=0 BK=0 P=23 Name=semi L=3 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=';'
----
AnnotatedTokens(L=1, P=0, T=5, C=0):
 M=0 C=0 T=Unknown S=1 F=0 B=0 BK=0 P=0 Name=eof L=0 PPK=2 FakeLParens= FakeRParens=0 II=0x0 Text=''
----

We should leave Current.getPreviousNonComment() to handle comments in the middle of the statement (something like import /*"DPI"*/ "DPI-C" ... comes to mind).

> Shouldn't we handle export as well?

Indeed, export "DPI-C" is also a valid construct, and thus, string literals after export should be exempt from breaking too.

> I don't think this is Verilog specific.

I'd suggest to address known cases with targeted exemptions to avoid surprises in random places.

      if (Style.isVerilog()) {
        const FormatToken *Prev = Current.getPreviousNonComment();
        if (Prev && Prev == State.Line->getFirstNonComment() &&
            (Prev->TokenText == "import" || Prev->TokenText == "export")) {
          return nullptr;
        }
      }
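
To make the two conditions in the snippet above concrete, here is a minimal, self-contained sketch. It does not use the real clang-format classes: `Token`, `linkTokens`, `firstNonComment`, and `isUnsplittableDpiString` are hypothetical stand-ins that only model the fields needed here, so treat it as an illustration of the check rather than the code in the PR.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for clang-format's FormatToken (assumption: only the
// fields needed for this illustration are modeled).
struct Token {
  std::string Text;
  bool IsComment = false;
  const Token *Previous = nullptr;

  // Walk backwards past comment tokens, mirroring the intent of
  // FormatToken::getPreviousNonComment().
  const Token *getPreviousNonComment() const {
    const Token *Tok = Previous;
    while (Tok && Tok->IsComment)
      Tok = Tok->Previous;
    return Tok;
  }
};

// Set each token's Previous pointer to its left neighbor on the line.
void linkTokens(std::vector<Token> &Line) {
  for (std::size_t I = 1; I < Line.size(); ++I)
    Line[I].Previous = &Line[I - 1];
}

// First token on the line that is not a comment, or nullptr if there is none.
const Token *firstNonComment(const std::vector<Token> &Line) {
  for (const Token &Tok : Line)
    if (!Tok.IsComment)
      return &Tok;
  return nullptr;
}

// True if Current directly follows (ignoring comments) an `import` or
// `export` that starts the line, i.e. the string must not be split.
bool isUnsplittableDpiString(const Token &Current,
                             const std::vector<Token> &Line) {
  const Token *Prev = Current.getPreviousNonComment();
  return Prev && Prev == firstNonComment(Line) &&
         (Prev->Text == "import" || Prev->Text == "export");
}
```

In this model, for the line `import /*"DPI"*/ "DPI-C" function t foo();` the string literal's `Previous` is the comment, so a check on `Previous` alone would miss it, while `getPreviousNonComment()` reaches `import`. A string that follows an `import` that is not first on the line is not exempted.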

Contributor

> isOneOf won't work here, since the token has the type of identifier rather than a keyword:

Have you tried it? It does work because not only is import a tok::identifier, it's also a Keywords.kw_import.

> We should leave Current.getPreviousNonComment() to handle comments in the middle of the statement (something like import /*"DPI"*/ "DPI-C" ... comes to mind).

I was aware of that, but we usually don't call getPreviousNonComment() unless a comment before a token makes sense in practice. Otherwise, we would have to write ugly and inefficient code to handle things like the following:

/* outer l_square */ [ /* inner l_square */ [ /* attribute */ unlikely /* inner r_square */ ] /* outer r_square */ ] // comment

> I'd suggest to address known cases with targeted exemptions to avoid surprises in random places.

I think this is also relevant to (at least) C++ import statements, e.g.:

`import "clang/include/clang/Format/Format.h";`

I still prefer that we make a general fix here but will leave it to @sstwcw.

Contributor

Can we please have any of the resolutions sooner?

This blocks quite a bit of testing.

For example, can we have this as a workaround, then let @sstwcw fix it cleanly?

Contributor

Please see my update in #66951 (comment).

@sstwcw (Contributor), Sep 21, 2023

> Also, I don't think this is Verilog specific.

For C++, it is already handled on lines 261 and 2166. I prefer fixing the Verilog problem by annotating the import lines instead of implementing said lines again. But if you think it is too much work for you, then I am also fine with your current fix.

Contributor Author

I've taken the suggestion and added a FIXME to use the C++ import infrastructure.

Contributor

> > isOneOf won't work here, since the token has the type of identifier rather than a keyword:

> Have you tried it? It does work because not only is import a tok::identifier, it's also a Keywords.kw_import.

Ah, right, I had tried it with tok::kw_import (and I missed the difference between this and your suggestion). Thanks for the clarification!
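
The distinction can be sketched with a simplified model. Everything below is hypothetical (`TokenKind`, `IdentifierInfo`, `IdentifierTable`, and `Token` are toy types, not the real clang classes); it only assumes what the debug output above shows, namely that the lexer reports `import` as an identifier, and that a keyword table can hold a pointer to that identifier's entry.

```cpp
#include <map>
#include <string>

// Toy lexer token kinds (assumption: kw_import stands in for tok::kw_import).
enum class TokenKind { identifier, string_literal, kw_import };

// Toy identifier entry; one unique object exists per spelling.
struct IdentifierInfo {
  std::string Name;
};

// Toy identifier table: repeated lookups of one spelling return the same
// pointer. std::map keeps element addresses stable across insertions.
class IdentifierTable {
  std::map<std::string, IdentifierInfo> Entries;

public:
  const IdentifierInfo *get(const std::string &Name) {
    return &Entries.emplace(Name, IdentifierInfo{Name}).first->second;
  }
};

struct Token {
  TokenKind Kind;
  const IdentifierInfo *II; // null for tokens that are not identifiers

  // Kind-based check: compares the lexer's token kind.
  bool is(TokenKind K) const { return Kind == K; }
  // Identifier-based check: compares the identifier entry by pointer.
  bool is(const IdentifierInfo *Info) const { return Info && Info == II; }
};
```

With an `import` token of kind `identifier`, the kind-based `is(TokenKind::kw_import)` is false, while the identifier-based check against the table's entry for `import` is true. That matches the behavior described above: the `tok::` check fails, the `Keywords.` check succeeds.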

> > We should leave Current.getPreviousNonComment() to handle comments in the middle of the statement (something like import /*"DPI"*/ "DPI-C" ... comes to mind).

> I was aware of that, but we usually don't call getPreviousNonComment() unless a comment before a token makes sense in practice. Otherwise, we would have to write ugly and inefficient code to handle things like the following:
>
> /* outer l_square */ [ /* inner l_square */ [ /* attribute */ unlikely /* inner r_square */ ] /* outer r_square */ ] // comment

I don't think it would add a lot of overhead (one branch on a happy path) or hinder readability of the code a lot (getPreviousNonComment() vs Prev), but I also don't think it's super important here.

// We need this to address the case where there is an unbreakable tail only
// if certain other formatting decisions have been taken. The
// UnbreakableTailLength of Current is an overapproximation in that case and
6 changes: 6 additions & 0 deletions clang/unittests/Format/FormatTestVerilog.cpp
@@ -1253,6 +1253,12 @@ TEST_F(FormatTestVerilog, StringLiteral) {
"xxxx"});)",
R"(x({"xxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxx ", "xxxx"});)",
getStyleWithColumns(getDefaultStyle(), 23));
// import "DPI"/"DPI-C" cannot be split.
verifyFormat(R"(import
"DPI-C" function t foo
();)",
R"(import "DPI-C" function t foo();)",
getStyleWithColumns(getDefaultStyle(), 23));
// These kinds of strings don't exist in Verilog.
verifyNoCrash(R"(x(@"xxxxxxxxxxxxxxxx xxxx");)",
getStyleWithColumns(getDefaultStyle(), 23));