Implicit method proxy

I recently discovered a method to do generic proxying of Ethereum calls. Skip further down for the nitty-gritty details; I’ll start this off with some basics.

The problem

Sometimes, it’s convenient to build contract factories. For example, say you want to implement crowdsourcing, or auctions, or games, or DAOs. In those cases, instead of having one “Mother” auction which keeps track of active auctions (which, in turn, keep track of bidders, offers and items), it makes sense to implement each auction/crowdfund/game/DAO as its own contract.

One problem is that these contracts are quite heavy: creating them may be very expensive, and the creation of several may be limited by block gas limits. To counter this, the library model can be used instead:

  1. An Auction instance A is instantiated
  2. When a new auction is needed, an AuctionLite contract is instantiated instead: B. AuctionLite has the same methods as Auction, but uses callcode/delegatecall to invoke the A instance as a library.

This has a bit of overhead:

  1. The cost of a callcode/delegatecall (pretty small).
  2. The cost of “unboxing” and “boxing” parameters. When a method is entered, the parameters from msg.data are masked according to their type constraints, such as uint or address. The B contract then creates a new msg which, when received by A, goes through the same unboxing once again.
  3. The B contract has a 1-1 mapping of methods; thus it can also become pretty large (expensive to create), scaling linearly with the size of A.
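
To make the library model concrete, here is a minimal sketch of an explicit wrapper. It is written for a recent compiler (0.8.x assumed), not the 2016-era compiler used elsewhere in this post, and the contract bodies, the bid method and the storage members are all hypothetical, chosen only for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical full implementation A: deployed once, used as a library.
contract Auction {
    address target;     // slot 0: unused here, mirrors AuctionLite's layout
    uint256 highestBid; // slot 1

    function bid(uint256 amount) public returns (bool) {
        if (amount <= highestBid) return false;
        highestBid = amount;
        return true;
    }
}

// Hypothetical lightweight instance B: one hand-written wrapper per method.
contract AuctionLite {
    address target;     // slot 0: address of the Auction instance
    uint256 highestBid; // slot 1: same layout as Auction

    constructor(address auction) {
        target = auction;
    }

    function bid(uint256 amount) public returns (bool) {
        // The argument was decoded on entry and is re-encoded here;
        // Auction decodes it once more when it receives the call.
        (bool ok, bytes memory ret) = target.delegatecall(
            abi.encodeWithSignature("bid(uint256)", amount)
        );
        require(ok, "delegatecall failed");
        return abi.decode(ret, (bool));
    }
}
```

Every method added to Auction needs a matching wrapper like this in AuctionLite, which is where the linear growth in point 3 comes from.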

A solution

By going a bit under the hood of Solidity, and using raw EVM assembly, it is possible to create a more transparent proxy mechanism, where the proxy contract does not actually have to explicitly know the method signatures of the target (A).

The trick to pull this off is to use a default function in the proxy contract and just pass msg.data along to the target contract within a callcode or delegatecall (this works for call as well, but that’s a different use case).

contract proxy{
	/*
	  In order for this to work with callcode or delegatecall,
	  the data members need to be identical to those of the
	  target contract. In other words, the "address add" must
	  be the first storage member.
	  For `call` (true proxy), there's no such requirement.
	 */
	 
	address add;
	
	function proxy(address a){
		add = a;
	}
	function (){
	//Adds 185 in gas with no return value.
	//Adds 208 in gas with a 32-byte return value.
		assembly{

			//Mask the remaining gas down to a 32-bit value
			//(note that 0xEFFFFFFF also clears bit 28)
			let g := and(gas,0xEFFFFFFF)
			let o_code := mload(0x40) //Free memory pointer
			//Address also needs to be masked
			//Also, important, storage location must be correct
			// sload(0) is dependent on the order of declaration above
			let addr := and(sload(0),0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF) //Dest address
			//Get call data (method sig & params)
			calldatacopy(o_code, 0, calldatasize)

			//callcode or delegatecall or call
			let retval := call(g
				, addr //address
				, 0 //value
				, o_code //mem in
				, calldatasize //mem_insz
				, o_code //reuse mem
				, 32) //Hardcoded to 32 b return value
				
			// Check return value
			// 0 == it threw, so we do as well by jumping to
			// a bad destination (02)
			jumpi(0x02,iszero(retval))

			// return(p,s) : end execution, return data mem[p..(p+s))
			return(o_code,32)
		}
	}
}

So how would you use this? Simple:

  1. Instantiate a target, A at <address_a>
  2. Instantiate a proxy instance B at <address_b>, with <address_a> as input to the constructor.
  3. Use the A ABI to invoke methods at <address_b> - do not use the B ABI.
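
If the caller is itself a contract, step 3 might look like the sketch below. It assumes a recent compiler (0.8.x); the ITarget interface and the Caller contract are hypothetical, and the toggle signature is a guess based on the playground output further down.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical interface describing the target A; only its ABI matters.
interface ITarget {
    function toggle() external returns (uint256);
}

contract Caller {
    // proxyB is the address of the proxy instance B, not of A.
    function callThroughProxy(address proxyB) external returns (uint256) {
        // The cast only selects which ABI is used to encode the call.
        // B has no toggle() of its own, so its default function runs
        // and forwards the untouched msg.data to A.
        return ITarget(proxyB).toggle();
    }
}
```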

Caveat

The first caveat is overhead: the example above adds ~200 gas to each invocation, which is really minuscule. A second problem is that while method invocation is simple and truly generic, return values from those calls are not. It is simply not possible, from within our default function, with access only to msg.data, to know how much data the target contract intends to return.

So the logic concerning that is not truly generic; it’s either hardcoded, or can be changed between method invocations. Note, though, that if we expect a large return value, the gas cost is a bit higher, but if the target contract does not use the full space (or any at all, for that matter), there are no errors; it still works. So hardcoding the largest expected return value makes for a somewhat generic proxy.

Some experimental results showed that the contract above added 185 gas with no return value, and 208 gas with a 32-byte return value defined.
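
As a rough sketch of the "hardcode the largest expected return value" idea, assuming a recent compiler (0.8.x): the contract name and the 128-byte limit are arbitrary choices for illustration. Unlike the proxy above, it uses a separate output area, so a shorter return value is padded with zeros and not with leftover call data.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract FixedReturnProxy {
    address target; // storage slot 0

    constructor(address a) {
        target = a;
    }

    fallback() external {
        address t = target;
        // Largest return size (in bytes) expected from any target method.
        // 128 is an arbitrary value picked for this sketch.
        uint256 maxReturn = 128;
        assembly {
            // Copy the method signature and parameters as-is.
            let inPtr := mload(0x40)
            calldatacopy(inPtr, 0, calldatasize())

            // Output area right after the input. Nothing has written to
            // this memory during the call, so it still reads as zeros.
            let outPtr := add(inPtr, calldatasize())

            let ok := call(gas(), t, 0, inPtr, calldatasize(), outPtr, maxReturn)
            if iszero(ok) { revert(0, 0) }

            // Always hand back maxReturn bytes, used or not.
            return(outPtr, maxReturn)
        }
    }
}
```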

Thirdly, one problem which is specific to callcode and delegatecall, but not call, is that A and B need to have identical data members. In the example above, the address add resides at storage slot 0. It is fetched by sload(0). This means that A, which is the instantiated full contract, also needs an address at storage slot 0. And, naturally, all data members which A operates on need to exist at identical storage slots in B.
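
A small sketch of what matching layouts look like, assuming a recent compiler (0.8.x). Both contracts, the counter member and the increment and current methods are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical target A. Under delegatecall its code runs against
// the storage of the proxy B.
contract Target {
    address target;  // slot 0: placeholder, keeps the layout aligned with B
    uint256 counter; // slot 1

    function increment() external returns (uint256) {
        counter += 1; // writes slot 1 of whichever storage is active
        return counter;
    }
}

// Hypothetical proxy B with the same members in the same order.
contract LayoutMatchedProxy {
    address target;  // slot 0: address of the Target instance
    uint256 counter; // slot 1: the slot Target.increment() writes to

    constructor(address a) {
        target = a;
    }

    // Reads the value that Target's code stored in B's slot 1.
    function current() external view returns (uint256) {
        return counter;
    }

    fallback() external {
        address t = target;
        assembly {
            let ptr := mload(0x40)
            calldatacopy(ptr, 0, calldatasize())
            let ok := delegatecall(gas(), t, ptr, calldatasize(), ptr, 32)
            if iszero(ok) { revert(0, 0) }
            return(ptr, 32)
        }
    }
}
```

If the two declarations were swapped in only one of the contracts, increment() would overwrite the stored target address instead of the counter.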

Further development

A nice thing about this is that it’s based on the default function, which is only used if no other method signature matches. This means that if you add another method signature, say pointProxyAtAnotherContract(address), you can reprogram the proxy, instead of setting the target only in the constructor. This type of usage could be useful e.g. for updating contracts, where the base contract is in fact a proxy, and whenever the library instance is updated, any new method signatures added to the new version will then be available in the original base contract.
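
A retargetable variant could be sketched roughly as follows, assuming a recent compiler (0.8.x). The owner check is my own addition for the sketch and is not part of the proxy above; without some access control, anyone could repoint the proxy.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract RetargetableProxy {
    address target; // slot 0: current target contract
    address owner;  // slot 1: hypothetical access control

    constructor(address a) {
        target = a;
        owner = msg.sender;
    }

    // Matches its own signature, so it is handled here and never forwarded.
    function pointProxyAtAnotherContract(address a) external {
        require(msg.sender == owner, "not owner");
        target = a;
    }

    // Everything else falls through to the current target.
    fallback() external {
        address t = target;
        assembly {
            let ptr := mload(0x40)
            calldatacopy(ptr, 0, calldatasize())
            // The input memory is reused for output, as in the proxy above.
            let ok := call(gas(), t, 0, ptr, calldatasize(), ptr, 32)
            if iszero(ok) { revert(0, 0) }
            return(ptr, 32)
        }
    }
}
```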

Playground

If you want to experiment with this, you can use this gist, which can be imported directly into the online solidity compiler like this:

After instantiating a ‘complex’ contract, which will wind up on 0x692a70d2e424a56d2c6c27aa97d1a86395877b3a, you paste "0x692a70d2e424a56d2c6c27aa97d1a86395877b3a" as input to the constructor for proxy, and click Create.

Copy the proxy address (should be 0xbbf289d846208c16edc8474705c748aff07732db), and click complex At Address instead of Create. Now, paste 0xbbf289d846208c16edc8474705c748aff07732db - the address of the proxy contract. At this point, the interface “believes” the proxy contract to actually be a complex contract, and you can invoke the complex methods on that contract.

Now you can experiment with the contract. For example, try changing call to callcode or delegatecall (for the latter, you need to remove the value parameter from the call).

Invoking the toggle method, which (if executed correctly) costs more gas every other time, shows the negligible overhead of using this proxy. No proxy:

Result: "0x0000000000000000000000000000000000000000000000000000000000001337"
Transaction cost: 15843 gas. 
Execution cost: 10414 gas.
Decoded: 
uint256: 4919
Result: "0x0000000000000000000000000000000000000000000000000000000000001337"
Transaction cost: 61666 gas. 
Execution cost: 40394 gas.
Decoded: 
uint256: 4919

With proxy:

Result: "0x0000000000000000000000000000000000000000000000000000000000001337"
Transaction cost: 15957 gas. 
Execution cost: 10641 gas.
Decoded: 
uint256: 4919
Result: "0x0000000000000000000000000000000000000000000000000000000000001337"
Transaction cost: 61893 gas. 
Execution cost: 40621 gas.
Decoded: 
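uint256: 4919

In both runs, the execution cost through the proxy is 227 gas higher (10641 vs. 10414, and 40621 vs. 40394).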

Further reading

If you want to learn more about the nitty-gritty of Solidity, and the EVM, I can recommend checking out Andreas Olofsson’s solidity workshop here, specifically this. Also, the official docs.

Edit: 2016-06-16

After writing this post, I was made aware that this trick has already been documented by Ethereum developer Nick Johnson, for use in upgradeable contracts. So credit goes to him for finding it! The usecase that led me to it is a different one, though; instantiating cheap contracts with heavy libraries (using callcode/delegatecall), whereas Nick describes the usecase of updateable contracts (call) more in depth.

2016-06-15
