Skip to content
This repository was archived by the owner on Aug 31, 2023. It is now read-only.

refactor(rome_bin): merge the CLI and Language Server #3044

Merged
merged 6 commits into from
Aug 12, 2022

Conversation

leops
Copy link
Contributor

@leops leops commented Aug 10, 2022

Summary

Fixes #3029

This change merges the existing command line client and language server binaries into a single rome_bin create that exposes all the existing CLI features but can also be spawned as a daemon and provide a backend for the features of the CLI and the editor extension.

Communication with the daemon server is done using the flavor of the JSON-RPC protocol defined in the LSP specification: https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#baseProtocol. Since this protocol defines a Content-Type header we could eventually use a non-JSON serialization format for communications between the CLI and the server, for instance bincode with an additional compression like Brotli if necessary.

The transport layer for the protocol is OS dependent, it uses a domain socket created in the global temporary directory on Unix systems, and a named pipe on Windows. This is abstracted away for client libraries by providing a rome __print_socket command that ensures the daemon is running, prints the OS-specific "socket name" to the standard output and exit. For node.js clients like the VSCode extension, the resulting socket name can simply be passed to net.connect to open a connection to the server regardless of the underlying operating system.

Importantly, despite allowing for multiple clients to connect to the same server daemon at the same time, each client will be connected to an isolated Language Server / Workspace instance. The implications of allowing concurrent access to a single workspace instance are complex, and it will require more work to get it right while the immediate goal of this change is only to provide the necessary infrastructure for JavaScript client libraries to connect to.

Test Plan

I've modified the CLI test suite to use "simulated" client-server communication, by creating an in-process instance of the server, wrapping it with an in-memory transport layer, and injecting a workspace client instance using this transport layer into the CLI. This has the advantage testing the protocol encoding and serialization while being lightweight since it doesn't require spawning a new process for every test, but doesn't actually test the OS-specific transport layer (which might be hard to test since it's technically a shared, global state on the machine running the tests)

@leops leops requested a review from a team August 10, 2022 16:41
@leops leops temporarily deployed to aws August 10, 2022 16:41 Inactive
@github-actions
Copy link

github-actions bot commented Aug 10, 2022

Copy link
Contributor

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the main motivation behind merging the CLI and LSP binaries? Or what workflows does it simplify?

@@ -0,0 +1,26 @@
[package]
name = "rome_bin"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this crate be simply be named "rome"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I though it could be unclear to have a crate named just rome since it doesn't really explain what it does, but I could at least rename the resulting binary


// Try to open a connection to an existing Rome server socket, or create an
// in-process Workspace server instance if no daemon process is found
let runtime = Runtime::new().expect("could not instantiate the Tokio runtime");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the reasons that tokio may fail? What do you think of returning a initialization error Termination instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initialization of the tokio runtime can fail if it cannot create its internal OS object (epoll on Linux, kqueue on macOS or IOCP on Windows), although it's really not supposed to fail if it ever does that's out of our control so it makes sense that this error should be a Termination instead of a panic

Comment on lines +19 to +29
fn main() -> Result<(), Termination> {
let mut args = Arguments::from_env();

if args.contains("__print_socket") {
print_server_socket()
} else if args.contains("__run_server") {
run_server_session()
} else {
run_cli_session(args)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some documentation around how the rome_bin is intended to be used?

Is it intentional that we have such "cryptic" arguments? Why not rome start for starting the server and rome stop to shut it down?

Or a rome debug or rome environment command to collect information about the system?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is intentional and not set on stone. Eventually we might have those commands: #3029 (comment)

}
}
_ => {
// Ignore unknown headers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be helpful to log unknown headers or someone might have a very frustrating experience when they e.g. have a typo in 'Content-Type'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added an eprintln call there, but in practice the SocketTransport should only be used to connect to a rome_lsp instance
and this branch should only ever be run if there's a version mismatch between the running daemon and the CLI (this is also the case for the Content-Type header, the current version of the server doesn't send it since its assumed to be JSON by default, but while I envision unsupported headers can simply be ignored by the CLI the socket should return an error if the server sends in a response with an encoding it doesn't support)

@leops
Copy link
Contributor Author

leops commented Aug 11, 2022

What's the main motivation behind merging the CLI and LSP binaries? Or what workflows does it simplify?

The main reason behind this merge is to allow the rome npm package we currently use to distribute the CLI to also expose a programmatic interface with import { ... } from 'rome'. This JavaScript API will internally need to defer it implementation to a prebuilt binary of our Rust code, and it didn't really make sense to distribute two different binaries in the same npm package (since pretty much all the code below the Workspace level would be duplicated between the two). But this is intended as an implementation detail, at the moment the daemon server part is not a part of the public command line interface, hence why this new merged binary consists of the public CLI (from the rome_cli crate) + the backend server logic layered on top through cryptic internal commands. But I agree this separation is artificial and we'll probably end up exposing the commands for controlling the daemon on the public CLI (making the rome_bin crate redundant)

@leops leops force-pushed the feature/daemon-server branch from 0fa4663 to 1343de0 Compare August 11, 2022 09:35
@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Aug 11, 2022

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: 3ec3ddd
Status: ✅  Deploy successful!
Preview URL: https://8640dd3a.tools-8rn.pages.dev
Branch Preview URL: https://feature-daemon-server.tools-8rn.pages.dev

View logs

@leops leops temporarily deployed to aws August 11, 2022 09:35 Inactive
@leops leops temporarily deployed to aws August 11, 2022 09:52 Inactive
}

async fn syntax_tree_request(&self, params: SyntaxTreePayload) -> LspResult<Option<Value>> {
async fn syntax_tree_request(&self, params: SyntaxTreePayload) -> LspResult<String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason of this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might not actually be needed anymore, the code registering the custom methods on the LSP server went through a few iterations before I settled on this design, and in one of them the Rust type system had troubles inferring the correct generic types for this return value

/// completion
pub async fn accept<I, O>(self, stdin: I, stdout: O)
where
I: AsyncRead + Unpin,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read a lot about Unpin and now here we use it. Would you mind explain a bit what's the use of Unpin here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Unpin requirement on the stdin type here is inherited from tower_lsp, I guess because it needs to move the stream inside an async block (the state of a future is pinned while it's executing, so moving any value inside of it would require its type to be unpin-able)


// supports_feature is special because it returns a bool instead of a Result
builder = builder.custom_method("rome/supports_feature", |server: &LSPServer, params| {
ready(Ok(server.session.workspace.supports_feature(params)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does ready do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The custom_method builder expects the provided callback to return a Future, so ready is used to construct a future that's immediately ready and returns the provided value (this would be equivalent to Promive.resolve(value) in JavaScript to create a promise that immediately resolves to the provided value)

@@ -5,7 +5,7 @@ use serde::{Deserialize, Serialize};
use tower_lsp::lsp_types::{TextDocumentIdentifier, Url};
use tracing::info;

pub const SYNTAX_TREE_REQUEST: &str = "rome/syntaxTree";
pub const SYNTAX_TREE_REQUEST: &str = "rome_lsp/syntaxTree";
Copy link
Contributor

@ematipico ematipico Aug 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think it would make sense to have two constants, one for rome/ and the other for rome_lsp/. So we can document them and teach when to use one or the other.

if line != "\r\n" {
return Err(RomeError::IoError(io::Error::new(
io::ErrorKind::Other,
"unexpected byte sequence",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have a better messages? From the website:

Unique and specific error messages. No generic error messages. This not only helps users understand what went wrong, but should provide maintainers with a unique call site and the necessary information to debug.

For example, here we are using a IoError, which is really generic and it's used in many callsites. While it makes sense to have this category, here we don't explain that the error occurred inside our internal transport protocol,

For this particular case we don't suggest why the sequence in unexpected, we don't print what we expected and how the maintainer/user should fix the error. Would it be possible to put more info/thought around the error messages of the protocol?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I overhauled the way errors are emitted in the socket transport, using anyhow to print structured error messages:

remote connection read task exited with an error

Caused by:
    0: failed to parse header from the remote workspace
    1: invalid value for Content-Type expected "application/vscode-jsonrpc", got "application/bincode"

@ematipico
Copy link
Contributor

Now that we merged LSP and CLI, and tokio will be part of the CLI, I am curios to know how big is now our final binary. Any stats?

@leops
Copy link
Contributor Author

leops commented Aug 11, 2022

Now that we merged LSP and CLI, and tokio will be part of the CLI, I am curios to know how big is now our final binary. Any stats?

This is only for Windows right now, but here are the binary size I'm getting:

Binary Approx. size Exact size
rome_cli 4,54 MiB 4 761 088 bytes
rome_lsp 8,17 MiB 8 573 952 bytes
rome_cli + rome_lsp 12,71 MiB 13 335 040 bytes
rome_bin 9,62 MiB 10 094 080 bytes

@leops leops linked an issue Aug 11, 2022 that may be closed by this pull request
@leops leops temporarily deployed to aws August 11, 2022 15:05 Inactive
@leops leops temporarily deployed to aws August 11, 2022 15:30 Inactive
Copy link
Contributor

@ematipico ematipico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for reviewing the error part! It looks way better now

@leops leops force-pushed the feature/daemon-server branch from 94efb88 to 3ec3ddd Compare August 12, 2022 07:37
@leops leops temporarily deployed to aws August 12, 2022 07:37 Inactive
@leops leops merged commit c2927dc into main Aug 12, 2022
@leops leops deleted the feature/daemon-server branch August 12, 2022 12:25
IWANABETHATGUY pushed a commit to IWANABETHATGUY/tools that referenced this pull request Aug 22, 2022
* refactor(rome_bin): merge the CLI and Language Server

* address PR review

* rename `run_cli` to `CliSession::run`

* fix CI

* address PR review

* improve the printing of transport errors
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement the low-level transport for the JavaScript API
3 participants