Topic

This short entry is a longer answer to a Stack Overflow question.
The question is:
Is there a (negative) performance impact due to unused function arguments (in Rust)?
Claim: There is no runtime performance impact. The compiler removes all unused function arguments (in release mode).

Free Functions

To this end, we will consider the following three functions:

fn single_argument(x: u32) -> u32 {
    x + 424242
}
fn too_many_arguments(x: u32, y: u32) -> u32 {
    x + 424242
}
fn enough_arguments(x: u32, y: u32) -> u32 {
    x + y
}

Let's have a look at the generated assembly (playground_1):

playground::single_argument:
	pushq	%rax
	movl	%edi, 4(%rsp)
	addl	$424242, %edi
	setb	%al
	testb	$1, %al
	movl	%edi, (%rsp)
	jne	.LBB0_2
	movl	(%rsp), %eax
	popq	%rcx
	retq

playground::too_many_arguments:
	subq	$24, %rsp
	movl	%edi, 16(%rsp)
	movl	%esi, 20(%rsp)
	addl	$424242, %edi
	setb	%al
	testb	$1, %al
	movl	%edi, 12(%rsp)
	jne	.LBB1_2
	movl	12(%rsp), %eax
	addq	$24, %rsp
	retq

playground::enough_arguments:
	subq	$24, %rsp
	movl	%edi, 16(%rsp)
	movl	%esi, 20(%rsp)
	addl	%esi, %edi
	setb	%al
	testb	$1, %al
	movl	%edi, 12(%rsp)
	jne	.LBB2_2
	movl	12(%rsp), %eax
	addq	$24, %rsp
	retq

Since this is a bit much to digest, here is the same code compiled in release mode:

playground::single_argument:
	leal	424242(%rdi), %eax
	retq

playground::too_many_arguments:
	leal	424242(%rdi), %eax
	retq

playground::enough_arguments:
	leal	(%rdi,%rsi), %eax
	retq

As promised, there is no difference between single_argument & too_many_arguments. I guess (I'm not an assembly expert) that in debug mode too_many_arguments writes its unused argument y from a register onto the stack (the movl %esi, 20(%rsp)).

Last check: We want to verify whether there is any overhead at the call site. Note that the following example forbids inlining and calls each function twice, so that the compiler cannot replace the function calls by their return values.

#[inline(never)]
#[no_mangle]
fn single_argument(x: u32) -> u32 {
    x + 424242
}

#[inline(never)]
#[no_mangle]
fn too_many_arguments(x: u32, y: u32) -> u32 {
    x + 424242
}

#[inline(never)]
#[no_mangle]
fn enough_arguments(x: u32, y: u32) -> u32 {
    x + y
}

fn main() {
    println!("Result: {:?}", too_many_arguments(112, 113));
    println!("Result: {:?}", enough_arguments(114, 115));
    println!("Result: {:?}", single_argument(116));
    println!("Result: {:?}", too_many_arguments(117, 118));
    println!("Result: {:?}", enough_arguments(119, 120));
    println!("Result: {:?}", single_argument(121));
}

And the generated assembly (in release mode):

playground::main:
  /* unimportant */	 	
  movl	$112, %edi
	callq	too_many_arguments
  /* unimportant */	 	
  movl	$114, %edi
	movl	$115, %esi
	callq	enough_arguments
	/* unimportant */	   	
  movl	$116, %edi
	callq	single_argument
	/* unimportant */	 	
  movl	$117, %edi
	callq	too_many_arguments
	/* unimportant */	 
  movl	$119, %edi
	movl	$120, %esi
	callq	enough_arguments
	/* unimportant */	 
  movl	$121, %edi
	callq	single_argument
	/* unimportant */	   

We observe that calls to enough_arguments are preceded by two movl instructions (with the expected numbers), whereas both too_many_arguments & single_argument are preceded by only a single movl.
Hence: In release mode, there is no runtime overhead associated with unused function arguments.

Side remark: In debug mode we see the expected overhead:

playground::main:
  /* unimportant */	 	
  movl	$112, %edi
  movl	$113, %esi
	callq	too_many_arguments
  /* skipped */   
  

Traits

The question which motivated this post is slightly different.

Let’s say we have some trait:

trait DummyTrait {
    fn add(&self, x: u32, y: u32) -> u32;  
}

and we have an implementor that does not use all of its arguments:

struct DummyStruct {
    z: u32,
}

fn new(z: u32) -> DummyStruct {
    DummyStruct { z }
}

impl DummyTrait for DummyStruct {
    fn add(&self, x: u32, _: u32) -> u32 {
        self.z + x + 424242
    }
}

Question: Does this lead to run-time costs?

The expected answer is: Since LLVM does not know about traits, it basically sees only free functions, hence there is no overhead.

But let's look at some more assembly. Here is the example Rust code (again, no inlining and duplicated function calls to avoid optimizations):

trait DummyTrait {
    #[inline(never)]
    fn add(&self, x: u32, y: u32) -> u32 {
        y + 1234567
    }
}

struct DummyStruct {
    z: u32,
}

#[inline(never)]
#[no_mangle]
fn new(z: u32) -> DummyStruct {
    DummyStruct { z }
}

impl DummyTrait for DummyStruct {
    #[inline(never)]
    fn add(&self, x: u32, _: u32) -> u32 {
        self.z + x + 424242
    }
}

fn main() {
    let dummy = new(12345);
    println!("Result: {:?}", dummy.add(110, 111));
    println!("Result: {:?}", dummy.add(112, 113));
}

and the relevant assembly:

playground::main:
  /* unimportant */	 	
	movl	$12345, %edi
	movl	$110, %esi
	callq	<playground::DummyStruct as playground::DummyTrait>::add
  /* unimportant */	 	
	movl	$12345, %edi
	movl	$112, %esi
	callq	<playground::DummyStruct as playground::DummyTrait>::add
  /* unimportant */

Once again, no $111 or $113, so no overhead.

Can we see some overhead? Yes, like this:

trait DummyTrait {
    #[inline(never)]
    fn add(&self, x: u32, y: u32) -> u32 {
        y + 1234567
    }
}

struct DummyStruct {
    z: u32,
}

#[inline(never)]
#[no_mangle]
fn new(z: u32) -> DummyStruct {
    DummyStruct { z }
}

impl DummyTrait for DummyStruct {
    #[inline(never)]
    fn add(&self, x: u32, _: u32) -> u32 {
        self.z + x + 424242
    }
}

#[inline(never)]
#[no_mangle]
fn overhead(d:&dyn DummyTrait, x:u32,y:u32) -> u32 {
    d.add(x,y)
}

fn main() {
    let dummy = new(12345);
    println!("Result: {:?}", overhead(&dummy, 114, 115));
    println!("Result: {:?}", overhead(&dummy, 116, 117));
}

and the relevant assembly:

playground::main:
  /* unimportant */	 	
	movl	$114, %edx
	movl	$115, %ecx
	callq	overhead
  /* unimportant */

But here we call into an unknown trait implementation, using dynamic dispatch. So: what else should the compiler do? [1]

Last topic: default implementations (playground_2)

trait DummyTrait {
    #[inline(never)]
    fn add(&self, x: u32, y: u32) -> u32 {
        y + 1234567
    }
}

struct DummyStruct2 {
    z: u32,
}

#[inline(never)]
#[no_mangle]
fn new2(z: u32) -> DummyStruct2 {
    DummyStruct2 { z }
}

impl DummyTrait for DummyStruct2 {}

fn main() {
    let dummy2 = new2(12345);
    println!("Result: {:?}", dummy2.add(114, 115));
    println!("Result: {:?}", dummy2.add(116, 117));
}

and the relevant assembly:

playground::main:
  /* unimportant */	 	
	movl	$115, %edi
	callq	playground::DummyTrait::add
	/* unimportant */
  movl	$117, %edi
	callq	playground::DummyTrait::add
	/* unimportant */

Once again, we do not see any overhead. This is expected, since we still use static dispatch, so LLVM again sees a free function (only the location of the function's source code changed).

Summary

Let's recall the definition of zero-cost abstraction due to Bjarne Stroustrup:
What you don't use, you don't pay for. And further: What you do use, you couldn't hand code any better.
So we have checked: unused function arguments are a zero-cost abstraction in Rust.

Note that there is some overhead in debug mode.

Moreover, the compiler emits a lint warning about such parameters, which is really helpful. Also, the tooling (i.e., the Rust Playground or godbolt) is really nice to have: it makes it easy to look at the generated assembly.

Addendum

On Reddit it was remarked that types implementing Drop behave differently. If a function argument is such a type (and not a reference to one), then the function must clean up the instance. So even if the argument seems to be unused, it actually is used (in order to call drop on it).

I guess it is fair to say that the compiler leaves some optimization potential on the table here, but I'm unsure whether this leads to real-world performance regressions.

  [1] Well, it could reason that there is only a single implementation of DummyTrait. But this optimization seems unnecessary, since the whole point of dynamic dispatch is to support multiple implementations.