Recording object snapshots by (ab)using JavaScript proxies

JavaScript proxies are insanely powerful, and I want to talk about how I (ab)use them to keep track of snapshots of an object within itself, as transparently as possible.

I don’t recall when I first learned about Proxy, but I remember that I disregarded it as yet another JavaScript utility I’d never use. It was only after reading Svelte 5’s source code (I was curious to see how they implemented $state()) that I realised how powerful Proxies are. Make sure to read the questions and answers if you think what I’m doing is “going too far”.

As far as I know, nothing like this has been done before (I created this sometime last year and briefly searched for something then), which is why I’m writing this post to explain the idea behind it. If you know of any previous solutions to this problem, please let me know so I can point to them here.

A brief overview of a Proxy

A Proxy is an object that wraps another and provides its own implementation for operations that are important to an object, such as setting or getting properties. This allows a Proxy to completely redefine the semantics of certain operations.

const obj = new Proxy({}, {
  set(_target, _prop, _value) {
    throw new Error("Can't do that!");
  }
});

obj.hello = "world"; // Throws "Error: Can't do that!".

For this post, the operations I’m interested in redefining are get, which returns properties of an object, and has, which determines whether the object has a property.
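As a rough illustration of those two traps, here is a minimal sketch. The property name virtual and the variable names are made up for this example; the point is only that a handler can make a property readable and visible to the in operator without it ever existing on the wrapped object.

```javascript
// Hypothetical example: `virtual` is never stored on the wrapped object.
const plain = { real: 1 };

const proxied = new Proxy(plain, {
  // `get` runs on every property read.
  get(target, prop, receiver) {
    if (prop === "virtual") {
      return 42;
    }
    return Reflect.get(target, prop, receiver);
  },
  // `has` runs for the `in` operator.
  has(target, prop) {
    if (prop === "virtual") {
      return true;
    }
    return Reflect.has(target, prop);
  },
});

console.log(proxied.real); // Prints "1".
console.log(proxied.virtual); // Prints "42".
console.log("virtual" in proxied); // Prints "true".
console.log(Object.keys(proxied)); // Prints "[ 'real' ]".
```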

Defining a snapshot

Imagine we have the following object:

const obj = {
  hello: "world",
  children: [1],
  properties: {
    visible: true,
    type: "text",
  },
};

And we want to apply an update operation on it:

function update(obj) {
  obj.children.push(2);
  delete obj.properties;
  return obj;
}

update(obj);
// {
//   hello: "world",
//   children: [1, 2],
// };

This looks alright, but what if we want to go back to the initial object? It might be somewhat possible to reverse obj.children.push(2), but it’s impossible to directly reverse delete obj.properties. We’d need to keep track of what we deleted just so we can reverse that operation. Writing code to do this and keep track of all the intermediary structures gets a bit messy, and still requires wasted computation doing and redoing the same operations as you switch through different versions of the object.
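To make the messiness concrete, here is a sketch of what the manual approach could look like for this one update. The updateWithUndo name and the idea of returning an undo function are assumptions made for illustration, not something the rest of this post uses.

```javascript
// Hypothetical manual approach: every update returns a function that undoes it.
function updateWithUndo(obj) {
  // Remember what we are about to delete so we can put it back later.
  const removedProperties = obj.properties;

  obj.children.push(2);
  delete obj.properties;

  return function undo() {
    obj.children.pop();
    obj.properties = removedProperties;
  };
}

const data = {
  hello: "world",
  children: [1],
  properties: { visible: true, type: "text" },
};

const undo = updateWithUndo(data);
console.log(data); // Prints "{ hello: 'world', children: [ 1, 2 ] }".

undo();
console.log(data.children); // Prints "[ 1 ]".
console.log(data.properties); // Prints "{ visible: true, type: 'text' }".
```

Every update operation would need its own hand-written inverse like this one, and jumping between distant versions means replaying them one by one.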

It’d be easier to have snapshots of the object that we can switch to whenever we want. Each snapshot captures the state of the object at some point in time:

const obj = {
  hello: "world",
  children: [1],
  properties: {
    visible: true,
    type: "text",
  },
};

obj.snapshot(1);

update(obj);
// {
//   hello: "world",
//   children: [1, 2],
// };
obj.snapshot(2);

obj.restore(1);
// {
//   hello: "world",
//   children: [1],
//   properties: {
//     visible: true,
//     type: "text",
//   },
// };

obj.restore(2);
// {
//   hello: "world",
//   children: [1, 2],
// };

Why not make everything immutable?

What if instead of mutating an object, we created a new one on every update operation, and then made everything immutable?

This works in some scenarios, but unfortunately sometimes it’s easier to keep the same object references around, so we can’t create new root objects (to use the previous example, it’s fine to create an entirely new children array, but we want obj to always point to the same object instance).

This is often the case in user interfaces and when dealing with components: passing a reference to the whole object (which gets mutated whenever needed) is easier than making sure every component or widget or smaller unit knows about the latest root object.
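A small sketch of that problem, assuming a hypothetical createWidget() that is handed the root object once when it is created:

```javascript
// Hypothetical widget that receives the root object once, at creation time.
function createWidget(root) {
  return {
    describe() {
      return root.children.join(", ");
    },
  };
}

let state = Object.freeze({ hello: "world", children: Object.freeze([1]) });
const widget = createWidget(state);

// An immutable update builds a brand-new root object...
state = Object.freeze({
  ...state,
  children: Object.freeze([...state.children, 2]),
});

console.log(state.children); // Prints "[ 1, 2 ]".
// ...but the widget still holds the old root and never sees the change.
console.log(widget.describe()); // Prints "1".
```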

Can we keep the convenience of a single object reference and still manage snapshots within the object?

A simple Proxy

Based on the example above, we want to give an object two functions to deal with snapshots: snapshot() takes a snapshot and associates it with an id, and restore() restores the object’s data to a snapshot.

The shape of our Proxy looks like this:

function snapshottableProxy(obj) {
  // We'll track "id -> snapshot data" here.
  const snapshots = new Map();

  return new Proxy(obj, {
    get(target, prop, receiver) {
      if (prop === 'snapshot') {
        return function snapshot(id) {
          // Take a snapshot and add it to `snapshots`.
        };
      } else if (prop === 'restore') {
        return function restore(id) {
          // Get the snapshot data from `snapshots` and restore it.
        };
      }

      // For all other properties.
      return Reflect.get(target, prop, receiver);
    }
  });
}

Note that we overrode only the behaviour of getting a property, so that accessing obj.snapshot or obj.restore returns specific functions. This is very sneaky: those aren’t actual properties of the object, and they will always shadow actual properties named snapshot or restore on the object!

const obj = snapshottableProxy({});

console.log(Object.keys(obj)); // Prints "[]".
console.log('restore' in obj); // Prints "false".
console.log(obj.restore); // Prints "[Function: restore]".
obj.restore = "hello";
console.log(obj.restore); // Prints "[Function: restore]".
console.log(Object.keys(obj)); // Prints "[ 'restore' ]".

For the purposes of this post I’ll ignore these details to focus only on the idea of keeping track of snapshots and restoring them. Once we get to the end, I’ll talk about another reason why I do this (that isn’t only to maintain focus).

This is also partially why all the code in this post is JavaScript rather than TypeScript — if you’re trying to do this with TypeScript you’ll have to fight/override the behaviour of the compiler in some parts of the code.

Taking a snapshot

To save snapshot data, we’ll need copies of every value the object has. If we reuse the same references (e.g. for arrays or nested objects), the snapshot’s data will change as the original value does, and we don’t want this. Strings and other primitives are immutable, so sharing them is harmless.

The code below is only for the snapshot() function we return in the proxy.

function snapshot(id) {
  snapshots.set(id, structuredClone(obj));
}

If you’re curious, structuredClone() creates a deep copy of a value using the structured clone algorithm. It’s available in modern browsers, and Node.js also has an implementation.
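A quick sketch of the difference between sharing references and cloning. The variable names are only for illustration, and the global structuredClone() is assumed to be available (Node.js 17 or newer, or a modern browser).

```javascript
const original = { children: [1], properties: { visible: true } };

// A shallow copy shares the nested array and object with the original.
const shallow = { ...original };
// structuredClone() copies the nested values too.
const deep = structuredClone(original);

original.children.push(2);
original.properties.visible = false;

console.log(shallow.children); // Prints "[ 1, 2 ]".
console.log(deep.children); // Prints "[ 1 ]".
console.log(deep.properties.visible); // Prints "true".
```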

Restoring a snapshot

Restoring is a bit trickier because we need to figure out which properties to add, which to update, and which to remove from the current object.

The code below is only for the restore() function we return in the proxy.

function restore(id) {
  const data = snapshots.get(id);

  if (data === undefined) {
    throw new Error("No data for this snapshot.");
  }

  // Removing properties we don't need anymore (own properties only, so inherited names like `toString` don't count).
  for (const propName of Object.keys(obj)) {
    if (!Object.hasOwn(data, propName)) {
      delete obj[propName];
    }
  }

  // Updating and adding properties from the snapshot.
  for (const [propName, value] of Object.entries(data)) {
    obj[propName] = structuredClone(value);
  }
}

The only interesting note to make of this code is about the assignment obj[propName] = structuredClone(value) in the last loop. We need to clone the value we’re restoring to avoid making a reference to the data inside the snapshot itself. If we write that line instead as obj[propName] = value, see what will happen once we mutate an object after restoring it:

const obj = snapshottableProxy({});
obj.children = [1];
obj.snapshot(1);
obj.restore(1); // After we do this, `obj.children` will point to the `children` in the snapshot data itself!

obj.children.push(2); // We're altering the snapshot data!

obj.restore(1); // We'd expect that `obj.children` is `[1]` after we do this, but...
console.log(obj.children); // Prints "[ 1, 2 ]".
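Putting the pieces from the previous sections together, the whole thing could be assembled roughly like this. This is a sketch rather than the exact code I run: it creates snapshot() and restore() once instead of on every property access, and it uses Object.hasOwn() for the membership check so inherited names don’t count. The doc variable and the "a" id are only for the demonstration.

```javascript
function snapshottableProxy(obj) {
  // We track "id -> snapshot data" here.
  const snapshots = new Map();

  function snapshot(id) {
    snapshots.set(id, structuredClone(obj));
  }

  function restore(id) {
    const data = snapshots.get(id);

    if (data === undefined) {
      throw new Error("No data for this snapshot.");
    }

    // Remove properties that don't exist in the snapshot.
    for (const propName of Object.keys(obj)) {
      if (!Object.hasOwn(data, propName)) {
        delete obj[propName];
      }
    }

    // Update and add properties, cloning so the snapshot data stays untouched.
    for (const [propName, value] of Object.entries(data)) {
      obj[propName] = structuredClone(value);
    }
  }

  return new Proxy(obj, {
    get(target, prop, receiver) {
      if (prop === "snapshot") return snapshot;
      if (prop === "restore") return restore;
      return Reflect.get(target, prop, receiver);
    },
  });
}

const doc = snapshottableProxy({ children: [1] });
doc.snapshot("a");

doc.children.push(2);
doc.extra = true;

doc.restore("a");
console.log(doc.children); // Prints "[ 1 ]".
console.log("extra" in doc); // Prints "false".
```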

Using the proxy

That’s really all you need for a simple proxy. Let’s check it out in action:

const obj = snapshottableProxy({
  hello: "world",
  children: [1],
  properties: {
    visible: true,
    type: "text",
  },
});
obj.snapshot(1);

update(obj); // This is the same `update()` function from an earlier example.
// {
//   hello: "world",
//   children: [1, 2],
// };
obj.snapshot(2);

obj.restore(1);
// {
//   hello: 'world',
//   children: [ 1 ],
//   properties: { visible: true, type: 'text' }
// }

obj.properties = {
  visible: false,
  type: "unit",
};
// We can even start a new "branch" of the object data. In this case, we started from version 1 (because of the `restore()` call above), and are saving it as version 3.
obj.snapshot(3);

// The earlier branch hasn't changed!
obj.restore(2);
// {
//   hello: "world",
//   children: [1, 2],
// };

obj.restore(3);
// {
//   hello: 'world',
//   children: [ 1 ],
//   properties: { visible: false, type: 'unit' }
// }

Figuring out whether we have a snapshottable proxy

Some code may care about whether it’s operating on a proxy that knows how to snapshot itself or on a regular object. The cleanest way to do this is to add a marker on the object, but since we’re dealing with proxies:

const SNAPSHOTTABLE_MARKER = Symbol("snapshottable object");

function snapshottableProxy(obj) {
  // ... irrelevant code skipped.

  return new Proxy(obj, {
    // ... irrelevant code skipped.
    has(target, prop) {
      if (prop === SNAPSHOTTABLE_MARKER) {
        return true;
      }

      return Reflect.has(target, prop);
    }
  });
}

And the code to check whether an object is a snapshottable proxy:

const obj = snapshottableProxy({});
console.log(SNAPSHOTTABLE_MARKER in obj); // Prints "true".
console.log(SNAPSHOTTABLE_MARKER in {}); // Prints "false".
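One caveat: the in operator throws a TypeError when its right-hand side is a primitive, so code that receives arbitrary values may want a small guard. The isSnapshottable() helper below is a hypothetical addition, and markedProxy() is a stripped-down stand-in for the proxy in this post that only implements the has trap.

```javascript
const SNAPSHOTTABLE_MARKER = Symbol("snapshottable object");

// Minimal stand-in for the proxy from this post: only the `has` trap is included.
function markedProxy(obj) {
  return new Proxy(obj, {
    has(target, prop) {
      return prop === SNAPSHOTTABLE_MARKER || Reflect.has(target, prop);
    },
  });
}

// Hypothetical helper: rule out primitives and null before using `in`.
function isSnapshottable(value) {
  return (
    typeof value === "object" &&
    value !== null &&
    SNAPSHOTTABLE_MARKER in value
  );
}

console.log(isSnapshottable(markedProxy({}))); // Prints "true".
console.log(isSnapshottable({})); // Prints "false".
console.log(isSnapshottable(null)); // Prints "false".
console.log(isSnapshottable("text")); // Prints "false".
```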

This carries the same note from before: this code only overrides an operation on the object. It doesn’t actually add an extra property on it, just like how we don’t make the snapshot() and restore() functions actual properties on the object. The next section explains why we do this.

Why we keep this behaviour only on the Proxy

By now you probably understand why we don’t turn snapshot(), restore(), or the snapshottable marker into properties of the object itself: if these were actual properties on the object, the snapshot() code would also add them to the snapshot data!

structuredClone() can’t copy functions, and it silently skips symbol-keyed properties, so neither would survive a snapshot. This also means that trying to snapshot objects that have functions will fail:

const obj = snapshottableProxy({});
obj.someOperation = function someOperation() {
  console.log("hello world");
};
obj.snapshot(1);
// Throws "[DataCloneError]: function someOperation() {
//   console.log("hello world");
// } could not be cloned."

It’s important to make sure that the actual object that is being proxied only has properties that can be snapshotted, and that the entire behaviour of the proxy remains inside the proxy handler.
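One way to check this up front is to try cloning the value before wrapping it. The canBeSnapshotted() helper below is a hypothetical sketch, and it pays the cost of a full clone just to get a yes or no answer.

```javascript
// Hypothetical helper: returns true when structuredClone() accepts the value.
function canBeSnapshotted(value) {
  try {
    structuredClone(value);
    return true;
  } catch {
    return false;
  }
}

console.log(canBeSnapshotted({ children: [1], when: new Date(0) })); // Prints "true".
console.log(canBeSnapshotted({ run() {} })); // Prints "false".

// Symbol-keyed properties don't throw; they are left out of the copy.
const marker = Symbol("marker");
const copy = structuredClone({ [marker]: true, kept: 1 });
console.log(Object.getOwnPropertySymbols(copy).length); // Prints "0".
console.log(copy.kept); // Prints "1".
```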

Not-so-simple cases

That’s really all that sits at the core of this “snapshottable object” idea, and everything else is just extra details. I think this is a powerful solution to allow switching between multiple versions of some data, and (unexpectedly) becomes very powerful with Svelte’s reactive $state().

There are some behaviour changes that might be interesting to implement depending on what you want to accomplish. An example is to automatically snapshot the object if any of its properties change. I wanted an explicit snapshot mechanism because I know exactly when I want a snapshot to happen, so keeping track of all other intermediary states doesn’t matter.
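For reference, an automatic variant could be sketched with the set and deleteProperty traps. Everything here (autoSnapshotProxy, the history array) is hypothetical and not part of my implementation. Note that it only sees changes made directly on the proxy, so mutations of nested values go unrecorded.

```javascript
// Hypothetical variant: records a snapshot after every top-level assignment or deletion.
function autoSnapshotProxy(obj) {
  const history = [];

  const proxy = new Proxy(obj, {
    set(target, prop, value, receiver) {
      const ok = Reflect.set(target, prop, value, receiver);
      if (ok) history.push(structuredClone(target));
      return ok;
    },
    deleteProperty(target, prop) {
      const ok = Reflect.deleteProperty(target, prop);
      if (ok) history.push(structuredClone(target));
      return ok;
    },
  });

  return { proxy, history };
}

const { proxy, history } = autoSnapshotProxy({ children: [1] });

proxy.hello = "world"; // Recorded.
proxy.children.push(2); // Not recorded: this mutates the nested array directly.
delete proxy.hello; // Recorded.

console.log(history.length); // Prints "2".
```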

This post was written only to transmit the idea of the “snapshottable object”, but here are some things you might want to think about if you want to improve on it:

Questions and answers

This feels like a hack, you’re introducing unexpected behaviour that other programmers will run into, this will turn into a mess to maintain, stop doing things that do too much magic, etc.

This post is too technical, and obviously I still wanted to get a bit philosophical, so here we are.

Nobody should be just throwing this code into their project, using it and calling it done. In fact, nobody should be just writing any code and calling it done.

As I have written before, code is more than just a way to provide instructions to the machine. It is also a means of communication between you and other people (or you and your future self). It is also a way to capture and share knowledge.

When you write any sort of code that functions as a tool (such as this snapshottable store), you should also be communicating and sharing knowledge about the tool. You should explain why that tool was needed, how it works, when and how to use it, and even when not to use it. None of that happens through code, because at that point you’re not talking to the machine anymore. You’re talking to other humans. This is partially why we have comments in code (and why they also bother me on a primordial level, but I’ll rant about that in a future post).

So when choosing to write (or use) code like this, do everyone a favour and write about those decisions so others can read about them and understand why they were made and why that code is there.

Is this a hack? I don’t think so. When I read “hack”, I understand it as something that is being used for an unintended purpose, that isn’t robust and/or has too many assumptions and will break when anything changes. None of these are true for this solution: it’s using a documented feature of JavaScript and nothing there relies on weird or unexpected behaviour. It’s also as much a mess to maintain as any other code.

This does go into “too much magic” territory, but that’s exactly why the magic should be documented — so it stops being magic. This question also touches a bit on a common opinion that I read about Svelte 5 online — that it’s doing things that are magic and confusing to people. And yes, they’re confusing to people when they don’t actually stop to read and learn the tool (Svelte’s documentation is great at undoing the magic). Is this a bad thing? I don’t know, I kinda like it. It’s selecting for people who can stop, read, and think.

I also read the same kind of criticism about Domain-Specific Languages, and these thoughts also apply to them. DSLs, snapshottable store, Svelte, or whatever — they’re all powerful tools. Just make sure you’re doing the rest of the work to document things properly and it’ll be fine.

Why would you ever want to use something like this?

I’m using it to track multiple versions of data that goes through update operations. The update operations are usually one-way only, thus making it wasteful to compute an arbitrary version at runtime.

In my use case, I may need to go through many versions within a small period of time, so it’s preferable to keep snapshots for all versions rather than recompute things every time.

In a solution with nested snapshottable objects and pointers to previous data when things haven’t changed, the memory usage is about the same as keeping a list of all updates and recomputing the state of the object every time.

Why only write a post about this and not release a package?

Since I first wrote code for this, it has gone through a bunch of changes and is honestly already on the list of “refactor needed”. The idea behind the snapshottable object didn’t come to life as cleanly as I made it look in this post. I still wanted to write a post about it and share the idea, so I’m not releasing the code I have.

I also don’t really want to maintain an external package, but if I get a message from anyone saying that they want to use this for something, I’ll do my best to put it out there.

There’s something wrong with this solution and it’ll break, there’s an error in the code, I want to yell at you, etc.

Please email me and I’ll update this post if needed.