Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

specify RPATH when linking to libpq #225

Merged
merged 15 commits into from
Sep 15, 2021
Merged

specify RPATH when linking to libpq #225

merged 15 commits into from
Sep 15, 2021

Conversation

davepacheco
Copy link
Collaborator

@davepacheco davepacheco commented Sep 8, 2021

See #213 for context.

I don't think we should land this PR, but I wanted to keep it here for future reference. This was my best attempt at a generalizable pattern for linking to a native library that's not on the runtime linker's default search path. The basic idea is to have the *-sys crate that wraps the native library expose metadata that indicates where it found the native library. Then we use this at the "top level" (here in omicron) to pass an appropriate -R flag to the linker (wrapped in a -W flag because we're really passing it to gcc). This change depends on the "oxide" branch of https://github.com/oxidecomputer/pq-sys, which adds that metadata. (This dependency shows up as a "patch" section in the top level Cargo.toml.)

The good things about this approach:

  • We're still able to leverage the pq-sys crate's logic for locating the library and the resulting RPATH entry is necessarily the same.
  • The logic that adds the RPATH is at the top level, not in pq-sys. That's where it belongs -- see this comment.
  • This approach gives us the flexibility to decide that official builds will put the library at some known place on the deployed system, even if that doesn't match what where it is on the build machine. (We'd need to update the top-level build script logic to do this, and then pass the env var to pq-sys that causes it to find the library we want, but this seems straightforward.)

There's a bunch of ugly here:

  • For every omicron crate that this applies to, I had to add a build.rs script that checks for the pq-sys metadata and uses the cargo:rustc-link-arg option to emit the -R flag that we want.
  • The cargo:rustc-link-arg option is currently unstable, so we need to be on "nightly". It's stabilized in Rust 1.56, which is a couple of months away.
  • Metadata emitted by Cargo build scripts is only available to immediate dependents. We don't depend on pq-sys -- we depend on Diesel. To work around this, I added dependencies on pq-sys wherever we're currently depending on diesel. This relies on our dependency matching Diesel's, too -- if Diesel updated to one that was incompatible with our semver, I'm not sure what would happen.
  • All of this is copy/pasted across most of the crates in this repo. That's because omicron-common depends on diesel (see Use fewer features for the diesel dependency in omicron-common #221?), and most other things depend on omicron-common. We're deliberately trying to set RPATH at the top level, so naturally all of the affected top-level crates need this.

If we were to go with this approach, we could:

  • cleanup the duplicated logic in the build.rs files by putting it into a common crate and using that as a build-dependency
  • fix the XXX comments to explain the situation.
  • double-check the "nightly" toolchain we're using. As I write this up, I realize that I removed the notes about the unstable "asm" feature because my build worked without them, but that may be because I wasn't actually using propolis.

The real killer, though, is that "cargo:rustc-link-arg" is not used when building rustdoc tests. The net result is that builders of omicron still need to set RUSTDOCFLAGS if they want to run the test suite. All this ugliness just prevents them from having to set RUSTFLAGS.

@@ -145,8 +145,10 @@ async fn collection_task(
struct CollectionTask {
// Channel used to send messages from the agent to the actual task. The task owns the other
// side.
#[allow(dead_code)]
Copy link
Collaborator Author

@davepacheco davepacheco Sep 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bnaecker I added this here because these unused fields were generating clippy warnings, which were treated as errors by the clippy-lint target. It looks like these are unused but I wasn't sure what the right answer was here. It'd be good to remove these allow directives, one way or another.

I'm not sure why these only generated warnings now. I did update to a newer nightly, and maybe that one has different behavior here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, thanks for cc-ing me. I'm not sure why they never triggered errors before, either, but the updated toolchain seems likely.

These fields are there so that we can communicate with the collection task. For example, updating the interval on which metrics are collected from a producer is accomplished by sending something on the inbox channel. The join handle might be used if we want to stop collecting metrics from a producer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like either one is actually used in the code today, though. At least, I think that's what clippy is saying and I couldn't find a case that contradicted it.

@davepacheco
Copy link
Collaborator Author

After exploring the static linking approach under #234, I've come back around and think this is the most promising direction.

The problem I mentioned earlier regarding rustdoc tests is rust-lang/cargo#9895. It appears confirmed as a bug and I believe someone is working on fixing it. With that fixed, this will be a complete solution, even if it's ugly. In the meantime, we can just tell people to set RUSTDOCFLAGS. This is annoying, but at least the failure mode is only that test binaries are broken.

In terms of the ugliness: on @cbiffle's suggestion I refactored the build script logic into a new omicron-rpaths package in the same workspace. The remaining ugliness is (1) that nearly every crate in the workspace grows a dummy dependency on pq-sys in order to get its build script metadata, and (2) that same list of crates also grows an identical, tiny build.rs that just calls out to omicron-rpaths, and (3) we're on nightly until 1.56, but we were anyway for "asm".

The only thing here that I know needs to be updated before landing this change is the patch in the top level Cargo.toml. We should probably update the "oxide" branch in pq-sys and use that instead.

@davepacheco
Copy link
Collaborator Author

davepacheco commented Sep 15, 2021

Pending the CI checks and the change I mentioned in my last comment, I think this is ready for review.

Reviewers: I would suggest:

@bnaecker
Copy link
Collaborator

bnaecker commented Sep 15, 2021 via email

Copy link
Collaborator

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working in such depth to fix this. Frankly I had no idea this linkage would be so hairy prior to reading your work in #213 , but hopefully this makes life easier when we inevitably need to link against another native library in the future.

@luqmana I'm super interested in your thoughts here - from my point of view, this whole case is a good reason why rpaths should become a first-class citizen in cargo. Do you think there's anything we could/should do to signal boost this to the cargo team?

I still think an "ideal solution" - which would likely require language/cargo changes to be supported - would be one where a "-sys" crate can define explicit rpaths that are transitively picked up by anyone who depends, directly or indirectly, on that "-sys" crate. That said, I understand the concern about "implicit changes to linker-flags", but having a section like:

[build]
transitive_rpath = true

In a workspace seems much more natural than altering the dependency hierarchy to acquire / act on "DEP_" variables.

Regardless, thank you @dap - this solution looks like it'll cover our native linking use cases while protecting us from unexpected missing rpaths for the foreseeable future.

Comment on lines +69 to +79
* So we need to emit the linker argument here. How do we know what value to
* use? We take the approach used by most build systems: we use the path where
* the library was found on the build machine. But how do we know where that
* was? *-sys packages have non-trivial mechanisms for locating the desired
* library. We don't want to duplicate those here. Instead, we make use of
* metadata emitted by those build scripts, which shows up as "DEP_*"
* environment variables for us.
*
* **Important note:** In order for us to have metadata for these dependencies,
* we must *directly* depend on them. This may mean adding an otherwise unused
* dependency from the top-level package to the *-sys package.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting hack - I recall reading about the "DEP_" variables and playing around with them, but thought this was a non-starter because of the lack of transitiveness. Good idea just adding the direct dependency!

Comment on lines +143 to +145
* expect these to always be set, and if they're not, it's most likely that
* somebody has forgotten to include a required dependency. We want to tell
* them that rather than silently produce unrunnable binaries.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At build time especially, this seems like a reasonable inclination.

Copy link
Contributor

@luqmana luqmana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @davepacheco for the incredible depth in researching all the possible solutions! I think this is a reasonable approach right now barring any specific support from cargo/rust. Mirroring @smklein sentiments, this got incredibly hairy fast.

I think it'll probably be some time before we can realistically expect a first-class cargo solution there given the limited bandwidth of the cargo team (correct me if I'm wrong @steveklabnik). To really get it accelerated I'd think maybe writing up a pre-RFC and posting to rust-internals would be good. Our own issues and the ones linked above definitely indicate this isn't just a pain point for us. If we can come up with a reasonably scoped surface I'd be happy to help implement/drive it forward.

@davepacheco
Copy link
Collaborator Author

Thanks @luqmana and @smklein for taking a look and for all the help along the way.

I'd love to move the Cargo solution forward, too. There's a lot of good context in the Cargo issue. I'm not sure we have consensus internally on the best approach. It might be helpful to write down the goals and see if we can't find a common solution. I think @smklein wants a package to be able to encapsulate its RPATH requirements (correct me if that's wrong), and other folks (both on the Cargo issue and internally) want to specify this from the top level build. It'd help to write down the use cases where each is desirable. I expect we can come up with a solution where, for folks using pkg-config and its ilk that assume the build machine mirrors the production machine, this all "just works" (i.e., top-level builders don't need to do anything), while still allowing the folks who want to control this from the top level to do so.

@davepacheco davepacheco marked this pull request as ready for review September 15, 2021 22:43
@davepacheco davepacheco merged commit e34626f into main Sep 15, 2021
@davepacheco davepacheco deleted the issue-213-2 branch September 15, 2021 22:44
@smklein
Copy link
Collaborator

smklein commented Sep 15, 2021

I'd love to move the Cargo solution forward, too. There's a lot of good context in the Cargo issue. I'm not sure we have consensus internally on the best approach. It might be helpful to write down the goals and see if we can't find a common solution. I think @smklein wants a package to be able to encapsulate its RPATH requirements (correct me if that's wrong), and other folks (both on the Cargo issue and internally) want to specify this from the top level build. It'd help to write down the use cases where each is desirable. I expect we can come up with a solution where, for folks using pkg-config and its ilk that assume the build machine mirrors the production machine, this all "just works" (i.e., top-level builders don't need to do anything), while still allowing the folks who want to control this from the top level to do so.

I spent a while chatting with @rmustacc on this topic today, as I understand it:

The advantage of specifying the rpath at a binary / top-level point-of-view is that, at the point a binary is being produced, the target machine should be "known" - as should the target rpath. After all, the .dynamic section of the ELF file needs to be filled with something at this point. I'm definitely convinced that a binary which depends on pq-sys should be able to influence the rpath in a way that, in isolation, pq-sys may not be able to do on its own. One solution is very similar to the result of this PR: all top-level targets explicitly specify the native libraries on which they depend, even indirectly.

I 100% understand the need for top-level targets to be able to determine where rpaths should be uncovered - but it still seems like a bummer that these targets need to basically have hard-coded knowledge about all used native libraries, even if they're the result of a transitive crate dependency. That's the key part I'm wondering about - if the top-level build was responsible for specifying the rpath, but internal "*-sys" crates could emit the required rpaths in some well-known format that could be aggregated somewhere, I think this could accomplish the same goal.

Like, right now, the DEP_PQ_LIBDIRS environment variable gets passed from the pq-sys crate up to the top-level crates, but only because:

  • The top-level crates directly depend on pq-sys (artificially! just to see the environment variable)
  • omicron-rpaths has hardcoded this environment variable (within RPATH_ENV_VARS).

I could imagine a feature where - similar to how how the DEP_* variables are propagated upward from dependencies - a set of rpaths are bubbled up transitively from dependencies as an environment variable.

The top-level build could opt-in to using this environment variable (... or whatever mechanism cargo prefers), passing it to the linker, but IMO this would have some advantages over what we're being forced to do now:

  • No more direct dependencies on the "*-sys" crates! "omicron-nexus" wouldn't need to enumerate pq-sys explicitly, it could just indicate that "for each RPATH in TRANSITIVE_RPATHS, pass 'em to cargo:rustc-link-arg=-Wl,-R...".
  • Dependencies could change which "*-sys" crates they're using, and the top-level targets wouldn't need to be modified (explicitly).

This would basically change the framing of the problem to: "when you build a top-level target, you see all the rpaths recommended by your dependencies. What do you wanna do with 'em?"

With this setup, there would be the option to customize behavior arbitrarily, but also the potential for really concise semantic sugar to say "... yeah, give them all to the linker".

@davepacheco
Copy link
Collaborator Author

I think the basic idea makes sense. It's the details I'm not sure about: exactly what information do the *-sys crates expose, how do they get that information, how do they expose it, how do top level crates see that, and how do top level crates use it to control the final RPATH?

I still think it's essential that we articulate the two different use cases. Correct me if this is wrong, but I believe the one you're describing is: "my build machine matches my production machine, and I don't want to the top level crate to have to think about the RPATHs for native libraries that are pulled in by transitive dependencies." I think that's a common goal and important to support.

I'd like to have a concrete example of what I'll call the "release engineering" case. A simple way to package up Nexus might be to deliver /opt/oxide/bin/nexus with an RPATH pointing to /opt/ooce/pgsql-13/lib/amd64 so it can find libpq. That's easy, though. More interesting would be if we wanted to avoid depending on /opt/ooce and instead build an artifact that contains:

/opt/oxide/bin/nexus (RPATH: $ORIGIN/../lib)
/opt/oxide/lib/libpq.so

Now suppose some Nexus saga wants to use pg_dump (a PostgreSQL command built on libpq), and for whatever reason it needs a newer (or older, or just different) build of libpq. We might want to ship:

/opt/oxide/bin/nexus (RPATH: $ORIGIN/../lib/nexus)
/opt/oxide/bin/pg_dump (RPATH: $ORIGIN/../lib/pg_dump)
/opt/oxide/lib/nexus/libpq.so
/opt/oxide/lib/pg_dump/libpq.so

(openssl might be a better example than libpq.)

Is that the release engineering case? Is that too contrived?

@smklein
Copy link
Collaborator

smklein commented Sep 16, 2021

I think that's a super reasonable case - and while I want to make the "host = target" case easy, anything proposed should definitely be able to support that case of "the target machine may not look like my own".

This is super handwavey, but I think the case you mentioned could basically still be handled effectively if the rpath information emitted by "*-sys" crates was more structured.

As an example, suppose the "*-sys" crate could emit:

  • "This is a native library I need (by name, not path)"
  • "This is the path to the native library (on your host machine)"

At a top-level, a resolver in the build.rs file could be used to identify "when you're building this binary, you need libpq.so. Here's where you should look for that path on your target machine". This "resolver" would be bespoke for whatever target environment we're talking about - as an example, this could be where, for helios, we say "look in /opt/oxide/wherever" - but presumably that mapping of "libpq.so -> path on target to libpq.so" needs to happen somewhere.

Regarding "how does it get exposed up", I dunno. The rust book section on native libraries suggests the "key-value pairs" are basically the go-to.

Also, regarding "*-sys packages", the book mentions the following:

The set of *-sys packages provides a common set of dependencies for linking to native libraries. There are a number of benefits earned from having this convention of native-library-related packages:

  • Common dependencies on foo-sys alleviates the rule about one package per value of links.
  • Other -sys packages can take advantage of the DEP_NAME_KEY=value environment variables to better integrate with other packages. See the "Using another sys crate" example.
  • A common dependency allows centralizing logic on discovering libfoo itself (or building it from source).
  • These dependencies are easily overridable.

The language here implies - to me, at least - that the goal of the "*-sys" packages is to provide a single, centralized location regarding "how do we find the underlying native library" - or in their words:

centralizing logic on discovering libfoo itself (or building it from source)

emesare added a commit to Vector35/binaryninja-api that referenced this pull request Dec 31, 2024
When linking you must depend on the -sys crate.

This is because linker arguments (what says where to find binaryninjacore) are NOT transitive. The top level application crate MUST provide it.

For more information see:
- rust-lang/cargo#9554 (comment)
- oxidecomputer/omicron#225
emesare added a commit to Vector35/binaryninja-api that referenced this pull request Jan 25, 2025
Moves a bunch of stuff out of src/types.rs that did not belong:
- Confidence
- Variable
- Function specific stuff

- Refactored InstructionInfo, see the msp430 and riscv examples.
- Renamed Function::from_raw to Function::ref_from_raw and fixed places where the ref was incremented twice
- Fixed FunctionRecognizer leaking functions (see above)
- Fixed some apis incorrectly returning Result where Option is clearer
- Started to move destructured types to the From trait for converting to an from ffi types, see Location for an example
- Started to remove bad module level imports (importing std modules like mem all over the place)
- Moved some wrapper handle types to named handle field (this improves readability), see CoreArchitecture for an example
- Removed some unintuitive getters, this is bad practice for Rust code, just access the field directly, see DataVariable for an example
- General code cleanup, purposely did not run rustfmt, that will be a single seperate commit

More rust cleanup

- Fixed invalid views being able to invoke UB when dealing with databases
- Cleaned up some helper code in dwarf_import
- Fixed inverted is_null checks causing crashes! Oops!

More rust cleanup

Still a WIP, I think branch info is still invalid, need to figure out the issue there.

- Fixed some invalid Ref lifetimes when constructing indirectly, see Array<DataVariable> for example
- Added some more comments
- Removed some "magic" functions like MLIL Function::ssa_variables

There are still a bunch of invalid lifetimes that aren't crashing us due to the usage of those API's not living long enough. But they ARE an issue.

More rust cleanup

Trying to comment more TODO's as I go along.

- Renamed llil::Function to llil::LowLevelILFunction for clarity and consistency
- Take base structures by ref in StructureBuilder::set_base_structures to prevent premature drops
- Added more raw to wrapper conversions
- Removed UB prone apis
- Getting rid of more core module references, use std!
- Removed improper Guard usage in array impls for wrapper types with no context
- Untangling the UB of the Label api, still only half done :/

More rust cleanup

- Misc formatting
- Made Logger ref counted
- Fixed leaking name of logger every time something was logged
- Fixed the last (hopefully) of the unresolved labels
- Simplified some code

Fix leak in DebugInfo::AddType

componentArray was never freed

Add more HLIL related functions to rust

More rust cleanup

improve the CustomBinaryView init process

Canonicalize path in `create_delete_empty` test

Link core in rust

When linking you must depend on the -sys crate.

This is because linker arguments (what says where to find binaryninjacore) are NOT transitive. The top level application crate MUST provide it.

For more information see:
- rust-lang/cargo#9554 (comment)
- oxidecomputer/omicron#225

Remove vendored pdb crate

Use cargo to manage the git repo ref instead

Fix misc rustdoc warnings

Move actual plugins out of `rust/examples` and into `plugins`

This is where all shipped public plugins that are not arch/view/platform/lang will be at from now on

Originally they were in the rust workspace, meaning they all shared a Cargo.lock which is ill-advised.

More rust cleanup

- More clarification on plugin/executable requirements
- Made examples actually rust examples
- Add Display impl for InstructionTextToken
- Renamed feature "noexports" to "no_exports"

Move under_construction.png to assets directory

This really did bother me

Remove unneeded `extern crate bindgen`

Replace nop'd log::info with println

We don't register a compatible log sink so they will just get sent into the void

Move inline tests into tests directory

This is being done in the hopes of curbing the multi-thousand line files that will occur once we flesh out the tests

Format rust code

Update rust ci

Still need to add support for running tests in ci

More rust cleanup

- Architecture id's are now typed accordingly
- Fix some clippy lints
- Make instruction index public in LLIL
- Removed useless helper functions
- LLIL expressions and instruction indexes are now typed accordingly

Generate binaryninjacore-sys documentation

This should show binaryninjacore-sys alongside binaryninja crate

More rust cleanup

- Remove lazy_static dependency
- Remove hacky impl Debug for Type and just use the display impl
- Add more debug impls
- Reorder some top level namespace items
- Add first type test

Remove unneeded script helper in rust api

More rust cleanup

- Added main thread handler api
- Register a headless main thread handler by default in initialization
- Refactor QualifiedName to be properly owned
- Loosened some type constraints on some apis involving QualifiedName
- Fixed some apis that were crashing due to incorrect param types
- Removed extern crate cruft for log crate
- Simplified headless initialization using more wrapper apis
- Fixed segments leaking because of no ref wrapper, see BinaryViewExt::segment_at
- Added rstest to manage headless init in unit tests
- Added some more unit tests
- Refactored demangler api to be more ergonomic
- Fixed minidump plugin not building

More rust cleanup

- Fixup usage of QualifiedName in plugins
- Make QualifiedName more usable

Implement rust TypeParser

fix Platform TypeParser related functions

separate User and System implementations of TypeParserResult

Implement rust TypeContainer

More rust cleanup

- Hopefully fixed the rust.yml CI
- Added TypePrinter API (this is still WIP and will crash)
- Added TypeParser API
- Added TypeContainer API
- More work around QualifiedName apis

Oh your suppose to do this

Add workflow_dispatch trigger to rust.yml

More rust fixes

- Swapped some usage of raw 255 to MAX_CONFIDENCE, no one likes magic numbers
- New InstructionTextToken API, complete with owned data, this still needs a lot of testing.
- InstructionTextTokenKind describes a destructured InstructionTextToken, this should make token usage much clearer, some docs pending
- Added some misc Default and Debug impls
- Updated TypePrinter and architectures to use new InstructionTextToken API

Misc formatting changes

More rust cleanup

- Fixed MANY memory leaks (most due to QualifiedName)
- Made StructureBuilder more builder and less structure
- Fixed CMakeLists.txt that were globbing entire api, resulting in 100 second slowdown on cmake generation
- Added more Debug impl's
- Added some more tests
- Fixed TypeParserResult UB
- Moved the From impls to blank impl for clarity, we have multiple different variants of core to rust for some structures, it should be explicit which one you are choosing.
- PossibleValueSet should now be able to allocate so we can go from rust to core with those variants that require allocation
- Misc doc code formatting

Misc clippy lints and clippy CI

Co-authored-by: Michael Krasnitski <michael.krasnitski@gmail.com>

Fix typo in rust CI

Misc rust formatting

Fix misc typos and add typos to rust CI

Add cargo workspace

This will help tooling and external contributors get a map of the rust crates within binaryninja-api

More rust cleanup

- Format all rust plugins
- Fix some tests that were out of date
- Simplify WARP tests to only binaries, building object files from source is a pain
- Link to core in all rust plugins
- Fix some memory leaks
- Update warp insta snapshots
- Fix some misc clippy lints

Run rust tests in CI

This commit also coincides with the creation of the "testing" environment which exposes a BN_SERIAL secret for pulling a headless Binary Ninja

Install missing wayland dependency in github CI

Apparently its needed for linux file picker for the WARP integration

Set the BINARYNINJADIR so rust can find binaryninjacore in CI

The chances of this working are low

Misc remove unused dependency

Rust misc formatting fixes

Improve initialization in rust headless scripts

Provide sensible errors and validation to rust headless scripts, solves #5796

Add BN_LICENSE environment variable to rust CI

We pass the serial to download binary ninja, but we never provided the license for core initialization

Fix typo

More rust cleanup

- Improved binary view initialization (see init_with_opts)
- Allow floating license to free itself before initialization
- Add initialization unit test
- Add better Debug impls for some common types
- Use Path api for opening binary views, this is not breaking as it uses the AsRef impl
- Bump rayon dependency and constrain dependencies to x.x

Update readme and include that directly in the rustdocs

More rust documentation changes

Add format comment to InitializationOptions::with_license

Misc formatting and clippy lint allow

More rust cleanup

- Remove under_construction.png from the build.rs it has been removed
- Use InstructionIndex in more places
- Add missing PartialOrd and Ord impls for id types

More rust cleanup

- Make workflow cloning explicit
- Add workflow tests
- Add missing property string list getting for settings
- Remove IntoActivityName (see #6257)

More rust cleanup

This commit is half done

Misc rust formatting

More rust cleanup

- Renamed common name conflictions (I will put my justification in the PR)
- Fixed invalid instruction retrieval for LLIL
- Added common aliases for llil function, instruction and expression types (see my comment in the PR)
- Refactored the instruction retrieval for LLIL, MLIL and HLIL
- Added instruction index types to MLIL and HLIL
- Moved llil module to lowlevelil module (mlil and hlil will be moved as well)
- Added preliminary LLIL unit testing

Fix typos

Misc clippy fixes

More rust cleanup

- Normalized modules
- Split some code out into own files
- Fixed some UB in type archive and projects
- Improved API around type archives and projects substantially
- Added ProgressExecutor abstraction for functions which have a progress callback
- Improved background task documentation and added unit tests
- Added worker thread api and unit tests
- Moved some owned types to ref types, this is still not complete, but this is the path forward.
- Add external location/library accessors to the binary view
- Added some misc documentation
- Replaced mod.rs with the module name source file

Still need to normalize some paths and also update some documentation surrounding that change.

Update some tests and examples

Fix background task tests colliding

We were creating multiple background tasks with the same progress text on multiple threads

More rust cleanup

- Fixed progress executor freeing itself after one iteration
- Updated the last of the doc imports
- Moved mainthread to main_thread
- Made project creation and opening failable

We could probably AsRef<ProgressExecutor> to get around the allocation and free inside the function bodies, not high priority as those functions are long running anyways.

Move binary view initialization into function body for LLIL test

Normalize test file names

More rust cleanup

- Updated README to clarify offline documentation
- Refactored settings api
- Added settings example to show dumping settings value and specific properties
- Use the workspace to depend on binaryninja and binaryninjacore-sys
- Remove binaryninjacore-sys as a workspace member (its not really required)

Update workflow test to new settings api

More rust cleanup

- Rename Label to LowLevelILLabel
- Update the functions label map automatically

This fixed a major api blunder where the label map is returned as a reference and originally resulted in UB prone lifetime semantics. It was temporarily "fixed" with explicit label updates in the architecture implementation code. But that was less than ideal and was easy to mess up. Now the label map will be updated automatically as the location of labels is now tracked.

Misc clippy lints

More rust cleanup

- Get rid of RawFunctionViewType
- Add better Debug impl for Function

More rust cleanup

- Fixed the documentation icon using local files (thank you @mkrasnitski)
- Fixed labels being updated and overwriting the label location used to update the label map

More rust cleanup

- Added unit tests for MLIL and HLIL
- "Fixed" MLIL, LLIL, and HLIL having issues regarding Instruction vs Expression indexes
- Renamed CallingConvention to CoreCallingConvention and removed architecture generic
- Renamed CallingConventionBase to CallingConvention
- Simplified calling convention code and fixed some bugs with implicit registers
- Added impl Debug to MLIL and HLIL instructions

Still need to at some point add an Expression to MLIL and HLIL. We also might want to look into having the Instruction kind just return the expression kind.

Misc clippy lint

More rust cleanup

- Allow calling conventions to be registered for multiple architectures
- Swapped a unreachable statement to an unimplemented statement

More rust cleanup

- Fixed the issue with PDB types, this has caused me an insane amount of grief
- Fixed LLIL visit_tree missing LLIL_LOAD expressions
- Added LLIL visitor test
- Made all WARP file pickers use the rfd crate

Use the dev branch of Binary Ninja in rust CI

Misc rust fmt

More rust cleanup

- Refactored BinaryReader and BinaryWriter
- Added some warnings to high_level_il and medium_level_il modules that they are unstable
- Add BinaryReader and BinaryWriter tests
- Changed BinaryView len to return u64 (that is what the core returns)
- Misc formatting changes
- Remove extern uses in lib.rs

Add impl Debug for BinaryReader and BinaryWriter

Turn off broken tests

Add more info to the rust README.md

More rust cleanup

- Make EdgeStyle type not wrap raw
- Regression tests for WARP will run on all bins in the out dir now

impl rust Collaboration and Remote API

Fix typo

Update collaboration API

Makes collaboration more in line with the refactor. Still a lot of work to do.

Namely still need:
- Proper errors
- _with_opts functions
- More ergonomic api
- Better connection procedure
- Updated documentation
- A LOT of unit tests
- An example
- Typed id's for everything (i dont want BnString as the id!!!)
- NEED to refactor the progress callbacks into the new progress api, but we should pull in some of the stuff the collab progress has
- Elimination of apis that are dumb helpers

Separate out the rust testing and use pull_request_target

pull_request_target allows PR's to access the headless license, for this to be safe we need to prevent people from running the job.

To prevent the job from being ran we add an environment requirement on testing that a reviewer must review the code and then manually approve it to run.

More rust cleanup

- Use GroupId instead of u64
- Use ProgressCallback in place of ProgressExecutor
- Misc cleanup of FileMetadata
- Add `save_to_path` and `save_to_accessor` to save modified binaries
- Added binary_view unit tests
- Added collaboration unit tests
- Fixed a few issues with the collaboration apis
- Renamed Command registration functions so that there is no import ambiguity
- Split out RemoteUndoEntry
- Collaboration apis now have a explicit `_with_progress` set of apis
- Misc clippy lint fixes

Fix some typos

More rust cleanup

- Add extra info to README.md
- Refactor components api
- Add components unit test
emesare added a commit to Vector35/binaryninja-api that referenced this pull request Jan 25, 2025
Moves a bunch of stuff out of src/types.rs that did not belong:
- Confidence
- Variable
- Function specific stuff

- Refactored InstructionInfo, see the msp430 and riscv examples.
- Renamed Function::from_raw to Function::ref_from_raw and fixed places where the ref was incremented twice
- Fixed FunctionRecognizer leaking functions (see above)
- Fixed some apis incorrectly returning Result where Option is clearer
- Started to move destructured types to the From trait for converting to an from ffi types, see Location for an example
- Started to remove bad module level imports (importing std modules like mem all over the place)
- Moved some wrapper handle types to named handle field (this improves readability), see CoreArchitecture for an example
- Removed some unintuitive getters, this is bad practice for Rust code, just access the field directly, see DataVariable for an example
- General code cleanup, purposely did not run rustfmt, that will be a single seperate commit

More rust cleanup

- Fixed invalid views being able to invoke UB when dealing with databases
- Cleaned up some helper code in dwarf_import
- Fixed inverted is_null checks causing crashes! Oops!

More rust cleanup

Still a WIP, I think branch info is still invalid, need to figure out the issue there.

- Fixed some invalid Ref lifetimes when constructing indirectly, see Array<DataVariable> for example
- Added some more comments
- Removed some "magic" functions like MLIL Function::ssa_variables

There are still a bunch of invalid lifetimes that aren't crashing us due to the usage of those API's not living long enough. But they ARE an issue.

More rust cleanup

Trying to comment more TODO's as I go along.

- Renamed llil::Function to llil::LowLevelILFunction for clarity and consistency
- Take base structures by ref in StructureBuilder::set_base_structures to prevent premature drops
- Added more raw to wrapper conversions
- Removed UB prone apis
- Getting rid of more core module references, use std!
- Removed improper Guard usage in array impls for wrapper types with no context
- Untangling the UB of the Label api, still only half done :/

More rust cleanup

- Misc formatting
- Made Logger ref counted
- Fixed leaking name of logger every time something was logged
- Fixed the last (hopefully) of the unresolved labels
- Simplified some code

Fix leak in DebugInfo::AddType

componentArray was never freed

Add more HLIL related functions to rust

More rust cleanup

improve the CustomBinaryView init process

Canonicalize path in `create_delete_empty` test

Link core in rust

When linking you must depend on the -sys crate.

This is because linker arguments (what says where to find binaryninjacore) are NOT transitive. The top level application crate MUST provide it.

For more information see:
- rust-lang/cargo#9554 (comment)
- oxidecomputer/omicron#225

Remove vendored pdb crate

Use cargo to manage the git repo ref instead

Fix misc rustdoc warnings

Move actual plugins out of `rust/examples` and into `plugins`

This is where all shipped public plugins that are not arch/view/platform/lang will be at from now on

Originally they were in the rust workspace, meaning they all shared a Cargo.lock which is ill-advised.

More rust cleanup

- More clarification on plugin/executable requirements
- Made examples actually rust examples
- Add Display impl for InstructionTextToken
- Renamed feature "noexports" to "no_exports"

Move under_construction.png to assets directory

This really did bother me

Remove unneeded `extern crate bindgen`

Replace nop'd log::info with println

We don't register a compatible log sink so they will just get sent into the void

Move inline tests into tests directory

This is being done in the hopes of curbing the multi-thousand line files that will occur once we flesh out the tests

Format rust code

Update rust ci

Still need to add support for running tests in ci

More rust cleanup

- Architecture id's are now typed accordingly
- Fix some clippy lints
- Make instruction index public in LLIL
- Removed useless helper functions
- LLIL expressions and instruction indexes are now typed accordingly

Generate binaryninjacore-sys documentation

This should show binaryninjacore-sys alongside binaryninja crate

More rust cleanup

- Remove lazy_static dependency
- Remove hacky impl Debug for Type and just use the display impl
- Add more debug impls
- Reorder some top level namespace items
- Add first type test

Remove unneeded script helper in rust api

More rust cleanup

- Added main thread handler api
- Register a headless main thread handler by default in initialization
- Refactor QualifiedName to be properly owned
- Loosened some type constraints on some apis involving QualifiedName
- Fixed some apis that were crashing due to incorrect param types
- Removed extern crate cruft for log crate
- Simplified headless initialization using more wrapper apis
- Fixed segments leaking because of no ref wrapper, see BinaryViewExt::segment_at
- Added rstest to manage headless init in unit tests
- Added some more unit tests
- Refactored demangler api to be more ergonomic
- Fixed minidump plugin not building

More rust cleanup

- Fixup usage of QualifiedName in plugins
- Make QualifiedName more usable

Implement rust TypeParser

fix Platform TypeParser related functions

separate User and System implementations of TypeParserResult

Implement rust TypeContainer

More rust cleanup

- Hopefully fixed the rust.yml CI
- Added TypePrinter API (this is still WIP and will crash)
- Added TypeParser API
- Added TypeContainer API
- More work around QualifiedName apis

Oh your suppose to do this

Add workflow_dispatch trigger to rust.yml

More rust fixes

- Swapped some usage of raw 255 to MAX_CONFIDENCE, no one likes magic numbers
- New InstructionTextToken API, complete with owned data, this still needs a lot of testing.
- InstructionTextTokenKind describes a destructured InstructionTextToken, this should make token usage much clearer, some docs pending
- Added some misc Default and Debug impls
- Updated TypePrinter and architectures to use new InstructionTextToken API

Misc formatting changes

More rust cleanup

- Fixed MANY memory leaks (most due to QualifiedName)
- Made StructureBuilder more builder and less structure
- Fixed CMakeLists.txt that were globbing entire api, resulting in 100 second slowdown on cmake generation
- Added more Debug impl's
- Added some more tests
- Fixed TypeParserResult UB
- Moved the From impls to blank impl for clarity, we have multiple different variants of core to rust for some structures, it should be explicit which one you are choosing.
- PossibleValueSet should now be able to allocate so we can go from rust to core with those variants that require allocation
- Misc doc code formatting

Misc clippy lints and clippy CI

Co-authored-by: Michael Krasnitski <michael.krasnitski@gmail.com>

Fix typo in rust CI

Misc rust formatting

Fix misc typos and add typos to rust CI

Add cargo workspace

This will help tooling and external contributors get a map of the rust crates within binaryninja-api

More rust cleanup

- Format all rust plugins
- Fix some tests that were out of date
- Simplify WARP tests to only binaries, building object files from source is a pain
- Link to core in all rust plugins
- Fix some memory leaks
- Update warp insta snapshots
- Fix some misc clippy lints

Run rust tests in CI

This commit also coincides with the creation of the "testing" environment which exposes a BN_SERIAL secret for pulling a headless Binary Ninja

Install missing wayland dependency in github CI

Apparently its needed for linux file picker for the WARP integration

Set the BINARYNINJADIR so rust can find binaryninjacore in CI

The chances of this working are low

Misc remove unused dependency

Rust misc formatting fixes

Improve initialization in rust headless scripts

Provide sensible errors and validation to rust headless scripts, solves #5796

Add BN_LICENSE environment variable to rust CI

We pass the serial to download binary ninja, but we never provided the license for core initialization

Fix typo

More rust cleanup

- Improved binary view initialization (see init_with_opts)
- Allow floating license to free itself before initialization
- Add initialization unit test
- Add better Debug impls for some common types
- Use Path api for opening binary views, this is not breaking as it uses the AsRef impl
- Bump rayon dependency and constrain dependencies to x.x

Update readme and include that directly in the rustdocs

More rust documentation changes

Add format comment to InitializationOptions::with_license

Misc formatting and clippy lint allow

More rust cleanup

- Remove under_construction.png from the build.rs it has been removed
- Use InstructionIndex in more places
- Add missing PartialOrd and Ord impls for id types

More rust cleanup

- Make workflow cloning explicit
- Add workflow tests
- Add missing property string list getting for settings
- Remove IntoActivityName (see #6257)

More rust cleanup

This commit is half done

Misc rust formatting

More rust cleanup

- Renamed common name conflictions (I will put my justification in the PR)
- Fixed invalid instruction retrieval for LLIL
- Added common aliases for llil function, instruction and expression types (see my comment in the PR)
- Refactored the instruction retrieval for LLIL, MLIL and HLIL
- Added instruction index types to MLIL and HLIL
- Moved llil module to lowlevelil module (mlil and hlil will be moved as well)
- Added preliminary LLIL unit testing

Fix typos

Misc clippy fixes

More rust cleanup

- Normalized modules
- Split some code out into own files
- Fixed some UB in type archive and projects
- Improved API around type archives and projects substantially
- Added ProgressExecutor abstraction for functions which have a progress callback
- Improved background task documentation and added unit tests
- Added worker thread api and unit tests
- Moved some owned types to ref types, this is still not complete, but this is the path forward.
- Add external location/library accessors to the binary view
- Added some misc documentation
- Replaced mod.rs with the module name source file

Still need to normalize some paths and also update some documentation surrounding that change.

Update some tests and examples

Fix background task tests colliding

We were creating multiple background tasks with the same progress text on multiple threads

More rust cleanup

- Fixed progress executor freeing itself after one iteration
- Updated the last of the doc imports
- Moved mainthread to main_thread
- Made project creation and opening failable

We could probably AsRef<ProgressExecutor> to get around the allocation and free inside the function bodies, not high priority as those functions are long running anyways.

Move binary view initialization into function body for LLIL test

Normalize test file names

More rust cleanup

- Updated README to clarify offline documentation
- Refactored settings api
- Added settings example to show dumping settings value and specific properties
- Use the workspace to depend on binaryninja and binaryninjacore-sys
- Remove binaryninjacore-sys as a workspace member (its not really required)

Update workflow test to new settings api

More rust cleanup

- Rename Label to LowLevelILLabel
- Update the functions label map automatically

This fixed a major api blunder where the label map is returned as a reference and originally resulted in UB prone lifetime semantics. It was temporarily "fixed" with explicit label updates in the architecture implementation code. But that was less than ideal and was easy to mess up. Now the label map will be updated automatically as the location of labels is now tracked.

Misc clippy lints

More rust cleanup

- Get rid of RawFunctionViewType
- Add better Debug impl for Function

More rust cleanup

- Fixed the documentation icon using local files (thank you @mkrasnitski)
- Fixed labels being updated and overwriting the label location used to update the label map

More rust cleanup

- Added unit tests for MLIL and HLIL
- "Fixed" MLIL, LLIL, and HLIL having issues regarding Instruction vs Expression indexes
- Renamed CallingConvention to CoreCallingConvention and removed architecture generic
- Renamed CallingConventionBase to CallingConvention
- Simplified calling convention code and fixed some bugs with implicit registers
- Added impl Debug to MLIL and HLIL instructions

Still need to at some point add an Expression to MLIL and HLIL. We also might want to look into having the Instruction kind just return the expression kind.

Misc clippy lint

More rust cleanup

- Allow calling conventions to be registered for multiple architectures
- Swapped a unreachable statement to an unimplemented statement

More rust cleanup

- Fixed the issue with PDB types, this has caused me an insane amount of grief
- Fixed LLIL visit_tree missing LLIL_LOAD expressions
- Added LLIL visitor test
- Made all WARP file pickers use the rfd crate

Use the dev branch of Binary Ninja in rust CI

Misc rust fmt

More rust cleanup

- Refactored BinaryReader and BinaryWriter
- Added some warnings to high_level_il and medium_level_il modules that they are unstable
- Add BinaryReader and BinaryWriter tests
- Changed BinaryView len to return u64 (that is what the core returns)
- Misc formatting changes
- Remove extern uses in lib.rs

Add impl Debug for BinaryReader and BinaryWriter

Turn off broken tests

Add more info to the rust README.md

More rust cleanup

- Make EdgeStyle type not wrap raw
- Regression tests for WARP will run on all bins in the out dir now

impl rust Collaboration and Remote API

Fix typo

Update collaboration API

Makes collaboration more in line with the refactor. Still a lot of work to do.

Namely still need:
- Proper errors
- _with_opts functions
- More ergonomic api
- Better connection procedure
- Updated documentation
- A LOT of unit tests
- An example
- Typed id's for everything (i dont want BnString as the id!!!)
- NEED to refactor the progress callbacks into the new progress api, but we should pull in some of the stuff the collab progress has
- Elimination of apis that are dumb helpers

Separate out the rust testing and use pull_request_target

pull_request_target allows PR's to access the headless license, for this to be safe we need to prevent people from running the job.

To prevent the job from being ran we add an environment requirement on testing that a reviewer must review the code and then manually approve it to run.

More rust cleanup

- Use GroupId instead of u64
- Use ProgressCallback in place of ProgressExecutor
- Misc cleanup of FileMetadata
- Add `save_to_path` and `save_to_accessor` to save modified binaries
- Added binary_view unit tests
- Added collaboration unit tests
- Fixed a few issues with the collaboration apis
- Renamed Command registration functions so that there is no import ambiguity
- Split out RemoteUndoEntry
- Collaboration apis now have a explicit `_with_progress` set of apis
- Misc clippy lint fixes

Fix some typos

More rust cleanup

- Add extra info to README.md
- Refactor components api
- Add components unit test

Add testing and documentation to contributing section in README.md

Fix misc doc comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants