Need Moar Glass

The age of the ELF and DWARF is upon us.

The predominant ideology around public-permissionless blockchains is absolute code transparency, going as far as to question the right of closed-source projects to exist.

Even though data on platforms like Solana is inherently accessible to everyone, there's the harsh reality of (accidental) code obfuscation: the disassembly of a Solana program release build looks like something out of an average movie hacker scene.

Ahhh, I'm disassemblinggg…

And for a while, this has been the extent of the insights gained by looking at Solana programs. Now add nine digits' worth of assets under management to the mix. You start to face a problem: How can we ensure that programs consisting of tens of thousands of BPF instructions behave as we expect them to?

What we can see today

So, for the past months, buildoors pushed substantial upgrades to low-level tooling for Solana. Here are some of the tools that are available today to untangle on-chain programs.

Static Analysis

Ghidra is an open-source framework for reverse engineering developed by the NSA's Research Directorate (yeah, that NSA). Originally used for malware analysis, its wide range of supported architectures makes it ubiquitous wherever source code is missing.

A notable gap is support for BPF, the instruction set used by the Solana VM. SolDragon by Neodyme is starting to fix that. While not ready yet (as of 2022-04), it brings us closer to a decompiler and binary differ.

GitHub - neodyme-labs/SolDragon: Solana Ghidra Stuff (WIP)

Solana security.txt

Observability extends to the human element too.

You'd think that DeFi firms treated whitehats like their kneecaps depended on them (they kinda do). To the horrors of the researchers at Neodyme, disclosing vulnerabilities to a program deployer can be quite the challenge.

And so, the solana-security-txt standard was born. It ships a Rust macro to embed standardized contact info into Solana programs. The result is a dead-simple process to tell web3 devs what's wrong, analogous to the securitytxt.org standard for websites.
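To give a feel for what "embedded contact info" means on the reading side, here is a minimal std-only sketch that pulls key/value pairs out of a program binary. The marker strings and the NUL-separated layout are assumptions made for illustration, and `parse_security_txt` is a hypothetical helper, not part of the solana-security-txt crate.

```rust
// Assumed begin/end markers; check the solana-security-txt repo for the real format.
const BEGIN: &[u8] = b"=======BEGIN SECURITY.TXT V1=======\0";
const END: &[u8] = b"=======END SECURITY.TXT V1=======\0";

/// Returns the offset of the first occurrence of `needle` in `haystack`.
fn find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// Hypothetical helper: extracts NUL-separated key/value pairs
/// stored between the two markers somewhere in the binary.
fn parse_security_txt(elf: &[u8]) -> Option<Vec<(String, String)>> {
    let start = find(elf, BEGIN)? + BEGIN.len();
    let end = start + find(&elf[start..], END)?;

    // Drop the trailing NUL so that splitting yields no empty tail field.
    let mut body = &elf[start..end];
    if body.last() == Some(&0) {
        body = &body[..body.len() - 1];
    }

    let fields: Vec<String> = body
        .split(|b| *b == 0)
        .map(|f| String::from_utf8_lossy(f).into_owned())
        .collect();

    // Fields alternate key, value, key, value, ...
    Some(
        fields
            .chunks_exact(2)
            .map(|kv| (kv[0].clone(), kv[1].clone()))
            .collect(),
    )
}
```

Feeding it the raw bytes of a program dump would, under these assumptions, give you back pairs like `("contacts", "email:...")` to reach out to.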

GitHub - neodyme-labs/solana-security-txt: security.txt for Solana Contracts

Reproducible builds

Open-source smart contracts are not quite enough. We need to verify that the provided source code translates exactly into the bytecode deployed on-chain.

The Anchor framework recently gained Verifiable Builds, which catch even the tiniest bit flip.

Verification Alert shown in the Solana Explorer
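The core of such a check is a byte-for-byte comparison between a locally rebuilt binary and whatever sits on-chain. Below is a rough sketch of that idea, not Anchor's actual implementation. It assumes the on-chain copy may carry trailing zero padding, and `first_mismatch` is a name made up for this example.

```rust
/// Strips trailing zero bytes. Assumption: the on-chain account can be
/// larger than the ELF and padded with zeros at the end.
fn trim_trailing_zeros(bytes: &[u8]) -> &[u8] {
    let end = bytes.iter().rposition(|b| *b != 0).map_or(0, |i| i + 1);
    &bytes[..end]
}

/// Hypothetical helper: returns None if both binaries match (ignoring
/// padding), otherwise the offset of the first byte that differs.
fn first_mismatch(local: &[u8], onchain: &[u8]) -> Option<usize> {
    let a = trim_trailing_zeros(local);
    let b = trim_trailing_zeros(onchain);
    if a == b {
        return None;
    }
    // Either a byte differs, or one binary is a strict prefix of the other.
    let common = a.len().min(b.len());
    Some(a.iter().zip(b).position(|(x, y)| x != y).unwrap_or(common))
}
```

A real verifier would additionally pin the toolchain and build inside a container, since the comparison is only meaningful if the build itself is deterministic.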

Is this enough?

The aforementioned tools give us better visibility into Solana byte code and strong assurance that said bytes are derived from specific source code.

Yet, we need more glass!

Let's start at the byte code this time; consider this fictional scenario: You got some alpha about a vulnerability at the 28306th instruction of some contract, and go disassemble.

rbpf-cli -u disassembler ./program.elf

You look bewildered at this mix of bytes. You grep the source code for 0xf0f0f0f0f0f0f0f and nothing comes up. What's going on here? Why would a smart contract use such a weird integer?

It slowly dawns on you that the state of Solana low-level tooling is still all-or-nothing.

*record scratch*

Intermediate Build Steps

The above was a dramatic re-enactment of when I realized that we don't actually have a way to correlate BPF sub-routines to Rust functions.

Unfortunately, we're still ignoring the intermediate build steps and valuable DWARF debug symbols that would have mapped byte code to line numbers. We also can't interactively debug BPF code yet.

Introducing bpf.wtf

Shining light on Solana ELFs using DWARFs

Anatoly said it; so we're making it happen

Projects

The bpf.wtf team formed after we noticed we were hacking on the same ideas. As a loose group of devs, we'll ship various non-profit open-source contributions and public goods for Solana program devs & security researchers.

The following is a non-exhaustive list of the things we'll be working on to help support the security research ecosystem.

bpf.sol

Starting with bpf.sol – a series of writeups on the internals that make up the Solana program runtime. We'll try to release a post every two weeks, each documenting a part of the virtual machine as we descend the stack.


DWARF symbols

DWARF is the industry standard for debug info in ELF executables. It enriches a binary with various info that gets lost when compiling to machine code, such as symbol names, data type info, and mappings to source code (line numbers).
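The "mappings to source code" part boils down to a table from instruction addresses to file/line pairs. Here is a deliberately simplified sketch of how a lookup against such a table works. `LineEntry` and `lookup` are invented for illustration; real DWARF encodes this table as a compact state-machine program with extra details (like sequence ends) that are ignored here.

```rust
/// One row of a (simplified, hypothetical) line table:
/// "code starting at `addr` comes from `file`:`line`".
struct LineEntry {
    addr: u64,
    file: &'static str,
    line: u32,
}

/// Finds the row covering `pc`, i.e. the last row whose address is <= pc.
/// Assumes `table` is sorted by ascending address.
fn lookup(table: &[LineEntry], pc: u64) -> Option<&LineEntry> {
    // Binary search for the first row that starts *after* pc ...
    let idx = table.partition_point(|e| e.addr <= pc);
    // ... and step back one row; None if pc lies before the first row.
    idx.checked_sub(1).map(|i| &table[i])
}
```

With a table like this, "instruction at offset X" turns into "line N of some Rust file", which is exactly the correlation we were missing earlier.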

So to kick off, we've fixed Solana's LLVM fork to re-enable DWARF support for the bpfel+solana target. The ability to create debug symbols for Solana C or Rust on-chain programs is a first major upgrade in visibility.

[SOL] Debug sections relocations by jawilk · Pull Request #32 · solana-labs/llvm-project
This is an expansion of #25. I was discussing with @terorie how we might get complete debug info back in. After inspecting some objdumps I found ELF::R_BPF_64_ABS32 relocations in .debug_* sections ...
🧠
And now we know that the BPF snippet above was __popcountdi2!
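For the curious, this is roughly where a constant like 0xf0f0f0f0f0f0f0f (that is, 0x0f0f0f0f0f0f0f0f with its leading zero dropped) comes from. The sketch below is a textbook bit-parallel population count, not the exact compiler-rt source of `__popcountdi2`. The function name `popcount64` is ours.

```rust
/// Counts the set bits of a 64-bit integer without a loop.
/// Textbook variant; the compiler builtin may fold the final sums differently.
fn popcount64(x: u64) -> u32 {
    // Step 1: each 2-bit field now holds the number of bits set in it (0..=2).
    let x = x - ((x >> 1) & 0x5555_5555_5555_5555);
    // Step 2: sum neighbouring 2-bit fields into 4-bit fields (0..=4).
    let x = ((x >> 2) & 0x3333_3333_3333_3333) + (x & 0x3333_3333_3333_3333);
    // Step 3: sum neighbouring 4-bit fields into bytes (0..=8).
    // This is the "weird integer" from the disassembly.
    let x = (x + (x >> 4)) & 0x0f0f_0f0f_0f0f_0f0f;
    // Step 4: add up all eight bytes; the total lands in the top byte.
    (x.wrapping_mul(0x0101_0101_0101_0101) >> 56) as u32
}
```

Nobody writes these masks in a smart contract by hand. They get pulled in by the compiler whenever something like `count_ones()` has no native instruction on the target, which is why grepping the source turns up nothing.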

The VM maintainers at Solana Labs have helped us get our first contribution merged. Expect full debug info support in the next release of solana-bpf-tools.

Even with DWARF support, debug info is stripped from release builds by default because of binary bloat. In fact, some Rust programs produce larger release+debuginfo builds than just debug builds.

GDB and LLDB support

Next up, @wj has been working on a proof-of-concept connecting the rbpf virtual machine to a debugger via the GDB remote serial protocol.

Devs and hackers will be able to introspect every aspect of program execution (registers, stack, memory, read-only data) with per-instruction granularity.

We expect to polish and ship this feature in Q2/2022.
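As a taste of what speaking the GDB remote serial protocol involves, here is a small sketch of its packet framing: every command travels as `$<payload>#<checksum>`, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. The helper names are our own, and escaping of special characters inside payloads is left out.

```rust
/// Sum of all payload bytes, modulo 256.
fn checksum(payload: &[u8]) -> u8 {
    payload.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}

/// Wraps a payload into a `$payload#xx` packet.
/// Simplification: payloads containing '$', '#' or '}' would need escaping.
fn encode_packet(payload: &str) -> String {
    format!("${}#{:02x}", payload, checksum(payload.as_bytes()))
}

/// Unwraps a packet and returns the payload if the checksum matches.
fn decode_packet(packet: &str) -> Option<&str> {
    let body = packet.strip_prefix('$')?;
    let (payload, sum) = body.rsplit_once('#')?;
    if sum.len() != 2 || !sum.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let expected = u8::from_str_radix(sum, 16).ok()?;
    if checksum(payload.as_bytes()) == expected {
        Some(payload)
    } else {
        None
    }
}
```

For example, the "read all registers" command `g` goes over the wire as `$g#67`, and a stub acknowledging success replies with `$OK#9a`.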

A new age is upon us

This integration involves work on two fronts.

  • First, the "backend", i.e. the target being debugged, needs to be modified to accept debugger commands: setting breakpoints, interrupting execution, etc. User Sladuca made a lot of progress last year, though development appears to have stopped: https://github.com/solana-labs/solana/issues/14756
  • The frontends (GDB and LLDB) have to be taught the machine architecture and ABI details like stack frame layouts and calling convention.

The bpf.wtf project is continuing on both fronts, mainly focusing on LLDB.
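To illustrate what the backend side has to provide, here is a toy interpreter loop that can stop at breakpoints and be resumed. Everything in it (`ToyVm`, `Insn`, `Stop`) is made up for this post and has nothing to do with the actual rbpf API. It only shows the control flow a debugger stub needs.

```rust
use std::collections::HashSet;

/// Why execution stopped.
#[derive(Debug, PartialEq)]
enum Stop {
    Breakpoint(usize),
    Exited(u64),
}

/// Hypothetical two-instruction ISA, just enough to have something to step through.
#[derive(Clone, Copy)]
enum Insn {
    AddImm(usize, u64), // regs[dst] += imm
    Exit,               // return regs[0]
}

struct ToyVm {
    pc: usize,
    regs: [u64; 11],
    breakpoints: HashSet<usize>,
}

impl ToyVm {
    /// Executes exactly one instruction; returns the exit code on `Exit`.
    /// Panics if `pc` runs past the program (no bounds handling in this sketch).
    fn step(&mut self, prog: &[Insn]) -> Option<u64> {
        match prog[self.pc] {
            Insn::AddImm(dst, imm) => {
                self.regs[dst] = self.regs[dst].wrapping_add(imm);
                self.pc += 1;
                None
            }
            Insn::Exit => Some(self.regs[0]),
        }
    }

    /// Runs until the program exits or the next breakpoint is reached.
    /// Always executes at least one instruction, so resuming from a
    /// breakpoint does not immediately stop at the same spot again.
    fn resume(&mut self, prog: &[Insn]) -> Stop {
        loop {
            if let Some(code) = self.step(prog) {
                return Stop::Exited(code);
            }
            if self.breakpoints.contains(&self.pc) {
                return Stop::Breakpoint(self.pc);
            }
        }
    }
}
```

A debugger stub would sit around `resume`: on each `Stop`, it reports back to the frontend, serves register and memory reads from the paused VM, and waits for the next "continue" or "step" command.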

Visual debugger frontend

Let's get with the times – the GDB command line is nice, but off-putting to noobies (like me) due to its learning curve. One of our major milestones is to integrate a GUI-based debugger frontend with the Solana VM.

One such option is CodeLLDB, a Visual Studio Code plugin.

CodeLLDB - Visual Studio Marketplace
Extension for Visual Studio Code - A native debugger powered by LLDB. Debug C++, Rust and other compiled languages.

Solana VM on WebAssembly

💾
sqlana – it's all just database engineering

It's well known that Solana validators are beasts – infra people run them with 512GB of memory and powerful server CPUs. Still, the Sealevel smart contract runtime takes less than a millisecond of CPU time to actually execute code. The vast majority of validator resources are spent on moving accounts from/to memory. In theory, any isolated on-chain program can easily run on a Raspberry Pi.

With a bit of effort, we were able to wrap the Sealevel runtime in an (unreleased) portable C library creatively named libsealevel.

Client-side execution of contracts further enables ways to simulate transactions that are impossible with the RPC API. This includes features necessary for interactive debugging (single-stepping, breakpoints, machine introspection).

The team managed to create a build of the rbpf virtual machine targeting wasm32-unknown-unknown (wasm-pack) through a bit of refactoring. Once done, a Wasm build with a JS wrapper will be released as an NPM package.

wasm-pack building rbpf

Interactive Solana Explorer Debugger

To recap, we're working on …

  • Debug info (DWARF)
  • Debugging of historical executions
  • A visual debugging frontend
  • Contract execution outside of blockchain nodes

As you may have noticed, these milestones set us up to achieve our final goal:
>simulating/debugging Solana programs directly in the public Solana explorer.

As an unpaid/not-for-profit team of nerds, we're not sure if we'll ever reach the endgame. If it works out, you'll see us at Breakpoint. :)


Anyways, thanks for checking out our work! We hope you're as hyped as we are.

Follow us on Twitter for the latest developments and occasional shitposts. If you want to help out, please DM us.
