TOML Cleanup and Improvements #4565

lkishalmi · 2022-08-30T04:55:36Z

Well, this one got a bit fatter than I had initially planned.

It applies the same cleanup that was done for the YAML and Dockerfile support.

Additionally, I've found the root cause of the IndexOutOfBoundsException: the lexer state was not fully saved and restored. I had to do an ugly reflection hack for that, but it seems to work.

As a second commit I added the parser Syntax Error output to the UI.

matthiasblaesing · 2022-08-30T12:30:18Z

ide/languages.toml/src/org/netbeans/modules/languages/toml/TomlLanguage.java

+    ),
+    @ActionReference(
+            path = "Loaders/text/x-toml/Actions",
+            id = @ActionID(category = "Edit", id = "org.openide.actions.PushAction"),


Typo?

Suggested change

-            id = @ActionID(category = "Edit", id = "org.openide.actions.PushAction"),
+            id = @ActionID(category = "Edit", id = "org.openide.actions.PasteAction"),

matthiasblaesing · 2022-08-30T12:39:58Z

ide/languages.toml/src/org/netbeans/modules/languages/toml/LexerInputCharStream.java

        }
        return c;
    }

+    //Marks are for buffering in ANTLR4, we do not really need them


If I read the documentation correctly, we actually need buffering/seekability of the consumed stream (the CharStream documentation declares this as required for lookahead), but the LexerInput already handles that, doesn't it?

Reading further, you actually buffer the whole file in memory, which kind of defeats the purpose of the CharStream implementation. getText can only look back to places that are protected by a mark, so the buffer is limited (assuming limited lookahead and marking).

LexerInput readText and readLength are scoped to the token currently being processed, whereas CharStream getIndex and getText work at the stream level. Also, getText is kind of an optional method. Fortunately, it is used in all lexers. I just discovered a problem with the previous implementation in the ANTLR lexer.

I'm tempted to look around the Lexing API and probably add a more ANTLR-friendly interface. We'll see. I do not think that this will be the final implementation. It is kind of good enough for now.

Seeing that ANTLR does not bother to implement its own interface (ANTLRInputStream) in an efficient way, I can't argue that this implementation is inefficient, so ignore that.
As for the temptation to change the lexer API to be "ANTLR" friendly: before going there, make sure you have a very good reason. At some point ANTLR will go away and we will be left with the fallout.

Well, throwing out the unmarked sections of the StringBuilder could be implemented one day, once marks are supported.

I looked around the Lexing API yesterday. Accessing the underlying buffers is not as easy as I had thought.
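To make the idea of discarding unmarked text concrete, here is a minimal sketch. It is not the code from this PR and does not implement the ANTLR `CharStream` interface; `MarkAwareBuffer` and all of its methods are hypothetical names, and `append` merely stands in for characters pulled from the `LexerInput`. It assumes that callers only ask `getText` for ranges that start at or after the earliest active mark.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: keeps only the text that is still reachable
// through an active mark or that has not been consumed yet.
final class MarkAwareBuffer {
    private final StringBuilder buf = new StringBuilder();
    private final Map<Integer, Integer> marks = new HashMap<>(); // marker id -> absolute position
    private int base = 0;       // absolute index of buf.charAt(0)
    private int index = 0;      // absolute index of the next unread char
    private int nextMarker = 0;

    // Stand-in for reading more characters from the underlying input.
    void append(CharSequence text) {
        buf.append(text);
    }

    // Consumes one char and returns it, or -1 when nothing is buffered.
    int read() {
        if (index - base >= buf.length()) {
            return -1;
        }
        return buf.charAt(index++ - base);
    }

    // Protects everything from the current position onwards.
    int mark() {
        int id = nextMarker++;
        marks.put(id, index);
        return id;
    }

    // Drops the protection and discards text nobody can reach anymore.
    void release(int marker) {
        marks.remove(marker);
        trim();
    }

    // Both bounds are absolute stream indexes; stop is inclusive.
    String getText(int start, int stopInclusive) {
        if (start < base) {
            throw new IllegalStateException("Text before index " + base + " was discarded");
        }
        return buf.substring(start - base, stopInclusive + 1 - base);
    }

    int retainedLength() {
        return buf.length();
    }

    private void trim() {
        int keepFrom = index;
        for (int pos : marks.values()) {
            keepFrom = Math.min(keepFrom, pos);
        }
        if (keepFrom > base) {
            buf.delete(0, keepFrom - base);
            base = keepFrom;
        }
    }
}
```

In this sketch trimming happens only on `release`, so memory use is bounded by the distance between the oldest active mark and the read position rather than by the file size.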

matthiasblaesing · 2022-08-30T12:53:04Z

ide/languages.toml/src/org/netbeans/modules/languages/toml/TomlLexer.java


-        LexerState(org.tomlj.internal.TomlLexer lexer) {
+        LexerState(org.tomlj.internal.TomlLexer lexer) throws IllegalArgumentException, IllegalAccessException {


Using reflection is an implementation detail here. I would catch IllegalAccessException and wrap it in a RuntimeException. In the future you will hopefully have access to the lexer state, and then the exceptions will not be thrown at all.
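A minimal sketch of that suggestion, under stated assumptions: `FakeLexer` and its private `mode` field are hypothetical stand-ins for the real tomlj lexer, whose internal fields are not shown in this conversation. The checked reflection exceptions are wrapped in `IllegalStateException`, which is a `RuntimeException`, so the constructor needs no `throws` clause.

```java
import java.lang.reflect.Field;

// Hypothetical stand-in for a lexer that hides part of its state.
class FakeLexer {
    private int mode = 0;

    void advance() {
        mode++;
    }

    int currentMode() {
        return mode; // exposed only so the sketch can be checked
    }
}

// Snapshot of the hidden state; reflection stays an internal detail.
final class LexerState {
    private static final Field MODE_FIELD = lookup("mode");

    private final int mode;

    LexerState(FakeLexer lexer) {
        try {
            this.mode = MODE_FIELD.getInt(lexer);
        } catch (IllegalAccessException ex) {
            // Callers should not have to know that reflection is used.
            throw new IllegalStateException("Cannot read lexer state", ex);
        }
    }

    void restore(FakeLexer lexer) {
        try {
            MODE_FIELD.setInt(lexer, mode);
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException("Cannot restore lexer state", ex);
        }
    }

    private static Field lookup(String name) {
        try {
            Field field = FakeLexer.class.getDeclaredField(name);
            field.setAccessible(true);
            return field;
        } catch (NoSuchFieldException ex) {
            throw new IllegalStateException("Unexpected lexer layout", ex);
        }
    }
}
```

If the lexer later exposes its state directly, only the bodies of the constructor and `restore` would need to change; the signature seen by callers stays the same.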

lkishalmi added the Editor label Aug 30, 2022

lkishalmi added this to the NB16 milestone Aug 30, 2022

lkishalmi requested review from mbien and matthiasblaesing August 30, 2022 04:55

matthiasblaesing reviewed Aug 30, 2022

lkishalmi added 2 commits August 30, 2022 17:17

TOML Support Improvement and Cleanup

15f0121

Report SyntaxErrors from TomlParser

987ec7d

lkishalmi force-pushed the toml-cleanup branch from 0fa326a to 987ec7d Compare August 31, 2022 00:18

A slightly better CharStream implementation

2931102

lkishalmi merged commit 52409ea into apache:master Sep 3, 2022
