Updating Desktop Rust

Desktop software can be fast, local, and secure: critical qualities for good legal technology.

But, unlike web apps, desktop applications go stale and require updates for each new build.[1]

Each such update is a moment of truth for the application’s developers. For some period during the update, the application is almost unavoidably non-functional. If things go wrong, the user may be stuck with the current version forever. Worse, a bad update may brick the application altogether. And you’ve worked really, really hard to deliver great software in the first place.[2]

There are some Rust crates addressing this, but because it is so central to the user experience, this is again an operation Tritium wants to own.

So what’s a good approach? Let’s consider a few examples.

Background Update Daemon

Some applications run a background service which manages updates.

Adobe, for example, seems to have a separate auto-update binary called Adobe Acrobat Update Service that runs as a daemon. A separate service running at predictable intervals allows them to ensure pristine conditions for updates at off-peak hours, perhaps even with timing staggered to ensure sufficient bandwidth on the update server. They can manage atomic updates and rollbacks while the user sleeps.

Oh, and they also get routine passive telemetry about their installed base as a side effect.

But for legal technology, privacy is paramount, and users don’t expect Tritium to phone home when idle. An integrated drafting environment needs to be trusted with reading, editing and redlining confidential and trade secret documents. Surprising the user with separate named processes in Task Manager risks damaging that trust.

It also seems over-expansive from a security perspective to keep a possibly elevated, network-accessible application running in the background just to ensure your application remains up-to-date.

This approach is out.

Asynchronous Background Thread

The excellent Zed editor adopts a more nuanced approach.

Instead of a separate daemon, as far as I can tell, Zed spawns an auto-updater child thread that periodically phones home to check for updates while the application runs.

Here’s some of the Zed code. From auto_update.rs:

...

const POLL_INTERVAL: Duration = Duration::from_secs(60 * 60);

...

impl AutoUpdater {

    pub fn start_polling(&self, cx: &mut Context<Self>) -> Task<Result<()>> {
        cx.spawn(async move |this, cx| {
            loop {
                this.update(cx, |this, cx| this.poll(UpdateCheckType::Automatic, cx))?;
                cx.background_executor().timer(POLL_INTERVAL).await;
            }
        })
    }

    pub fn poll(&mut self, check_type: UpdateCheckType, cx: &mut Context<Self>) {
        if self.pending_poll.is_some() {
            return;
        }

        cx.notify();

        self.pending_poll = Some(cx.spawn(async move |this, cx| {
            let result = Self::update(this.upgrade()?, cx.clone()).await;
            this.update(cx, |this, cx| {
                this.pending_poll = None;
                if let Err(error) = result {
                    this.status = match check_type {
                        // Be quiet if the check was automated (e.g. when offline)
                        UpdateCheckType::Automatic => {
                            log::info!("auto-update check failed: error:{:?}", error);
                            AutoUpdateStatus::Idle
                        }
                        UpdateCheckType::Manual => {
                            log::error!("auto-update failed: error:{:?}", error);
                            AutoUpdateStatus::Errored {
                                error: Arc::new(error),
                            }
                        }
                    };

                    cx.notify();
                }
            })
            .ok()
        }));
    }

...

}

That’s pretty straightforward. We check for an update once an hour and handle errors.
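The poll itself ultimately downloads the release artifact and validates it before anything is installed. Here’s a minimal sketch of that download-and-verify step, assuming a hypothetical release URL and a published SHA-256 checksum; the names are illustrative, not Zed’s or Tritium’s actual API.

use anyhow::{bail, Context, Result};
use sha2::{Digest, Sha256};

// Hypothetical constants; a real updater would get these from a release manifest.
const RELEASE_URL: &str = "https://example.com/releases/app.zip";
const EXPECTED_SHA256: &str = "<checksum from the release manifest>";

fn download_and_verify(tmp_file: &std::path::Path) -> Result<()> {
    // Fetch the artifact into memory (fine for a modestly sized installer).
    let bytes = reqwest::blocking::get(RELEASE_URL)
        .context("update request failed")?
        .bytes()
        .context("failed to read update body")?;

    // Refuse to install anything that doesn't match the published checksum.
    if hex::encode(Sha256::digest(&bytes)) != EXPECTED_SHA256 {
        bail!("checksum mismatch; refusing to install update");
    }

    // Stage the verified bytes; applying them comes later.
    std::fs::write(tmp_file, &bytes).context("failed to stage update")?;
    Ok(())
}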

Ok, so we have an update downloaded and validated. What’s next?

On POSIX systems, in theory, we can just swap the files on disk. Except for dynamically loaded assets, this should work just fine. File locking is advisory, and a running process holds its executable by inode rather than by path, so you can rename a new binary over the old one while it runs and the user will get the freshest version on the next launch. You’ll want to at least handle those assets separately so that, for example, your user isn’t running version 1.2 in memory while reading the updated version 1.3 config from disk. But maybe it’s as simple as unpacking things and waiting for a restart.
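In Rust terms, a minimal sketch of that swap might look like this; the paths and permissions are illustrative.

use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

// Sketch of a POSIX-style replacement: stage the new binary next to the old
// one, then atomically rename over it. The running process keeps the old
// inode; the next launch picks up the new file.
fn replace_binary(target: &Path, new_bytes: &[u8]) -> std::io::Result<()> {
    let staged = target.with_extension("new"); // same directory, same filesystem
    fs::write(&staged, new_bytes)?;
    fs::set_permissions(&staged, fs::Permissions::from_mode(0o755))?;
    fs::rename(&staged, target)?; // rename(2) replaces the destination atomically
    Ok(())
}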

But what about Windows?

Windows generally treats running binaries as locked: any attempt to overwrite a running executable fails with a sharing violation. Again, the Zed source provides a reasonable solution.

From auto_update_helper:

pub(crate) fn run() -> Result<()> {
    let helper_dir = std::env::current_exe()?
        .parent()
        .context("No parent directory")?
        .to_path_buf();
    init_log(&helper_dir)?;
    let app_dir = helper_dir
        .parent()
        .context("No parent directory")?
        .to_path_buf();

    log::info!("======= Starting Zed update =======");
    let (tx, rx) = std::sync::mpsc::channel();
    let hwnd = create_dialog_window(rx)?.0 as isize;
    let args = parse_args(std::env::args().skip(1));
    std::thread::spawn(move || {
        let result = perform_update(app_dir.as_path(), Some(hwnd), args.launch);
        tx.send(result).ok();
        unsafe { PostMessageW(Some(HWND(hwnd as _)), WM_TERMINATE, WPARAM(0), LPARAM(0)) }.ok();
    });
    unsafe {
        let mut message = MSG::default();
        while GetMessageW(&mut message, None, 0, 0).as_bool() {
            DispatchMessageW(&message);
        }
    }
    Ok(())
}

This is compiled into a separate binary containing a main() that calls run() on Windows and is a no-op otherwise. It is also immediately recognizable from the Windows API naming conventions.
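A rough sketch of what that cfg-gated entry point might look like (illustrative only, not the actual Zed main):

// The helper's main() does real work on Windows and nothing anywhere else.
#[cfg(target_os = "windows")]
fn main() -> anyhow::Result<()> {
    run() // the run() shown above
}

#[cfg(not(target_os = "windows"))]
fn main() {
    // No-op: on other platforms the running binary can simply be replaced.
}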

Why a separate binary?

That’s the trick to freeing the locked Zed.exe for updating while maintaining flow control.

Once an update is ready, the auto_update_helper is queued for the next startup. It runs perform_update, which loops through a const list of JOBS such as installing new files and removing old ones, including the now-unlocked Zed.exe. Once complete, the helper hands control back to Zed with:

if launch {
    #[allow(clippy::disallowed_methods, reason = "doesn't run in the main binary")]
    let _ = std::process::Command::new(app_dir.join("Zed.exe")).spawn();
}

And they're back to work.
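For a sense of what that table-driven job list might look like, here is a minimal sketch with illustrative job names and paths rather than Zed’s actual list:

use anyhow::Result;
use std::path::Path;

// Each job is a fallible step the helper runs in order. Illustrative, not Zed's code.
type Job = fn(&Path) -> Result<()>;

fn remove_old_binary(app_dir: &Path) -> Result<()> {
    // Safe now: the helper runs while Zed.exe does not, so the file is unlocked.
    std::fs::remove_file(app_dir.join("Zed.exe"))?;
    Ok(())
}

fn install_new_binary(app_dir: &Path) -> Result<()> {
    std::fs::rename(app_dir.join("install/Zed.exe"), app_dir.join("Zed.exe"))?;
    Ok(())
}

const JOBS: &[(&str, Job)] = &[
    ("remove old binary", remove_old_binary),
    ("install new binary", install_new_binary),
];

fn perform_jobs(app_dir: &Path) -> Result<()> {
    for (name, job) in JOBS {
        log::info!("running update job: {name}");
        job(app_dir)?;
    }
    Ok(())
}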

Speedbump

The Zed auto-updater code is robust to errors and generally stays out of the way.

It also empowers the user to ignore broken updates and just continue using the current version without any concern for its staleness.

However, when you’re early in the development of privacy-sensitive legal tech, staleness can be a problem. You can’t leave users exposed to versions containing a security flaw, for example. More likely, glaring bugs could damage the trust you are trying to establish.

Tritium thus adopts what I’ll call the “speedbump” auto-updater approach.

Tritium makes a one-time asynchronous check for updates every time the community user starts the application. If an update is available, Tritium downloads it, exits, and deploys the update right then and there. As with the Zed update, Tritium hands this off to a helper process, which also retains the command-line arguments to pass to the new version on restart. If things go well, the user should be back in action with the freshest version in a few seconds. Nothing more than a small speedbump.
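A hypothetical sketch of that helper hand-off (not Tritium’s actual code): the helper waits for the main process to exit and release its locks, swaps in the staged binary, and relaunches with the original arguments.

use std::path::PathBuf;
use std::{env, fs, process, thread, time::Duration};

// Hypothetical helper, invoked as: updater <app_exe> <staged_update> [original args...]
fn main() -> std::io::Result<()> {
    let mut args = env::args_os().skip(1);
    let app_exe = PathBuf::from(args.next().expect("missing application path"));
    let staged = PathBuf::from(args.next().expect("missing staged update path"));
    let forwarded: Vec<_> = args.collect(); // the user's original command line

    for _ in 0..50 {
        // Deleting the old binary fails while it is still running (on Windows);
        // once the parent exits, swap in the staged copy and relaunch.
        // (On Unix you'd also want to restore the executable bit.)
        if fs::remove_file(&app_exe).is_ok() {
            fs::copy(&staged, &app_exe)?;
            process::Command::new(&app_exe).args(&forwarded).spawn()?;
            return Ok(());
        }
        thread::sleep(Duration::from_millis(200));
    }
    Err(std::io::Error::new(
        std::io::ErrorKind::TimedOut,
        "could not replace the application binary",
    ))
}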

If things go wrong, however, the application is stuck mid-update and the user will have to manually uninstall and reinstall it.

Because it happens on every launch, it is also a potential flow interruption. If things don’t work or take too long, we’re going to have an unhappy user. Lawyers are not patient or generally much interested in buggy technology. They want their word processor and redlining to work, and they don’t really care why.

But putting this UX bottleneck front and center follows a principle outlined in a great book, The Goal. Understanding that users will hit the “speedbump” on every boot ensures that it receives first-class performance attention and is always top of mind in development. It becomes embedded in the user experience. We thus won’t be able to get sloppy and deploy large updates silently in the background or neglect the update experience.

Unlike Zed, Tritium uses the helper binary on all platforms for simplicity. But what about the updater itself?

How do we update that?

async fn do_update() {
    ...

    let tmp_updater = tmp_directory.join(UPDATER_EXE);
    let Ok(current_exe) = std::env::current_exe() else {
        log::error!("Failed to determine current executable path");
        return;
    };
    let Some(working_dir) = current_exe.parent() else {
        log::error!(
            "Failed to determine working directory from current executable path: {}",
            current_exe.display()
        );
        return;
    };

    ...

    let Ok(_) = std::fs::write(&tmp_file, &bytes) else {
        log::error!(
            "Failed to write update file to temporary directory: {}",
            tmp_file.display()
        );
        return;
    };
    let mut updater_exe = working_dir.to_path_buf();
    updater_exe.push(UPDATER_EXE);
    log::info!(
        "Copying updater executable from {} to {}",
        updater_exe.display(),
        tmp_updater.display()
    );
    if let Err(error) = std::fs::copy(&updater_exe, &tmp_updater) {
        log::error!("Failed to copy updater executable: {}", error);
        return;
    }

    ...

    let Ok(_) = std::process::Command::new(tmp_updater)
        .arg(current_exe.to_path_buf().as_os_str())
        .arg(tmp_file.to_path_buf().as_os_str())
        .spawn()
    else {
        log::error!("Failed to spawn updater process.");
        return;
    };
    // do the update in a separate process which will restart the app
    std::process::exit(0);
}

We copy the legacy updater binary to a temporary location, run it from there, and deploy the updated version, which cleans up after itself next time.
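That cleanup could be as simple as this hypothetical sketch (not Tritium’s actual code): on its next run, the freshly deployed updater removes the stale temporary copy left behind by the previous update.

// Hypothetical cleanup step: remove the stale copy of the updater left in the
// temporary directory by the last update.
fn cleanup_stale_updater(tmp_directory: &std::path::Path) {
    let stale = tmp_directory.join(UPDATER_EXE);
    if stale.exists() {
        if let Err(error) = std::fs::remove_file(&stale) {
            log::warn!("Failed to remove stale updater {}: {}", stale.display(), error);
        }
    }
}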

Magic.

Thanks for reading.


[1] This applies to a much more limited extent to enterprises that orchestrate their own updates using more sophisticated tools. Commercial Tritium licensees have access to a customizable installer and version-pinned binaries, as described in the documentation.

[2] The Zed team recently made a mistake and disabled Zed's auto-updates by default, as written about here. I grieved for them. Developers work extremely hard to create software compelling enough to be downloaded and installed; it is absolutely brutal to potentially orphan a large number of users on a legacy version. Their plight partially inspired this post.