S2S Compilers: Understanding Switch Case Statements

Mar 27, 2025 - 15:36

	This blog is sponsored and supported by Voxgig.

	Motivated and inspired by Decl. In development and soon to be open-sourced.

Table Of Contents
- About Author
- Introduction
- Motivation
- Source Code
- Prerequisites
- Background
- GCC Options
- Headers and Macros
- Switch case VS If-else
- If-else statement
- Switch Case
- Benchmarking the two
- Out-of-Bounds Benchmark
- Turning a switch case into a jump table
- Is there a limit for the values of a switch case?
- Breaking switch case: Non-packed values
- Multiple Jump Tables
- If-else as a jump table
- Implementing Switch Case
- Jump Table with function pointers
  - Implementation
  - Benchmark Result 1
  - Benchmark Result 2
- Computed GOTO: Goto Jump Table with Labels
  - No optimization flags
  - With -O2 optimizations
  - Switch Case with -O1 optimizations
  - No Inline
  - Inlining Computed GOTO
  - Local Label Problem
  - Inlining Switch Case
  - Customizing _switch_go
  - Benchmark
  - Default Case: Can we do without it?
- Conclusion
- References

About Author

My name is Aleksandar Milenkovic. I am a senior software engineer at Voxgig and a self-driven coder and autodidact who creates software and improves its quality at every level.

If you have any feedback or questions, feel free to reach out to me via LinkedIn.

Introduction

Source-to-source (S2S) compilers, also called transpilers, translate the source code of a program into another language (the target language). The target is typically a high-level language that is supported on all major platforms, so compatibility is not a problem.

If you are translating your source code into a language such as C or C++, you need a solid understanding of C/C++. Since these languages have compilers of their own, be it the GNU Compiler Collection or Clang, we also have to dig into the features and behavior of those compilers. That effort pays off once the target codebase grows and developers start reusing the target source as a component rather than referring back to the source-language code. For more context, also check out Wikipedia | Source-to-source compiler | Programming language implementations.

In many cases, we benchmark the target source code generated by our S2S compiler and compare it to plain C/C++. Doing so reveals a lot of improvements we can make to the target source itself. The goal is the quality and performance of the target source.

As the first instalment of this series, we dive into the switch case: how it is used, how it is optimized, how it compares to if-else, and how to implement our own switch case, with the generated assembly and benchmarks to back it all up. We use GCC (GNU Compiler Collection) and cover some of its options in detail.

Even if you are a C/C++ developer with no interest in S2S compilers, the examples and information in this blog should still come in handy.

This blog series is named S2S (Source-to-source) Compilers because all of the code and content is motivated by building an S2S compiler that takes in the source language and translates it into C++. By benchmarking and analyzing the target source (C++), I have realized how many improvements there are to make. More on the motivation in the following section.

Motivation

As a senior software engineer at Voxgig, I have been part of a team building SDKs in various programming languages. For our utilities, this raised a lot of performance concerns: as we ported from one language to another, we tried to keep the implementations as close to each other as possible. That led us into the waters of language-specific implementations, where we pushed to get the most out of each language so that the SDK is as performant as possible for the developer using it in that language. For example, the C++ implementation has more focus on performance since it is harder to optimize code in C++ than in Python, Ruby, or JavaScript.

At the same time, while at Voxgig, I have been building a dynamic programming language that transpiles into C++. I have done a lot of research and benchmarked plenty of C++ code in order to reach the maximum runtime performance of the target language (C++).

All of this work was motivated by that dynamic programming language, called Decl, which is soon to be open-sourced.

To be frank, the reason for choosing C++ as the target language is that it is much easier to build a garbage collector by just implementing a reference counter in a wrapper class. The lifetime of dynamically allocated objects is then determined by C++ destructors, which are invoked implicitly, together with the reference-counting implementation. However, we won't cover any of that in this blog, so stay tuned for the next instalment :).

All of the knowledge that I have acquired turned out to be very useful for me and, most of all, our team.

If you are interested in building SDKs for one or more programming languages, please contact Voxgig.

Again, huge thanks to Voxgig for enabling me to put in the time for this research and writing the blog.

Source Code

The source code for this blog is all in one file in the GitHub repository, where the blog is also hosted. The code contains the main function that runs all the benchmarks in order.

Prerequisites

Background

This blog is intended for C/C++ programmers who have the essential knowledge of assembly. However, if you haven't had any experience of assembly before, read on and see how you get on!

GCC Options

You can use the following gcc command to dissect assembly code:

gcc src/main.cpp -g -o output.s -masm=intel -fverbose-asm -S -Werror

As the Using the GNU Compiler Collection Documentation indicates, we use -fverbose-asm to "put extra commentary information in the generated assembly code to make it more readable".

For example,

#include <stdio.h>


int main() {
  int i = 0;


  printf("%d\n", i);
  printf("%d%d\n", i, i);

  return 0;
}

Generates the following assembly output with comments indicating the line number and its content.

main:
.LFB0:
        .file 1 "main.cpp"
        .loc 1 4 12
        .cfi_startproc  
        endbr64 
        push    rbp     #
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        mov     rbp, rsp        #,
        .cfi_def_cfa_register 6
        sub     rsp, 16 #,
# main.cpp:5:   int i = 0;
        .loc 1 5 7
        mov     DWORD PTR -4[rbp], 0    # i,
# main.cpp:8:   printf("%d\n", i);
        .loc 1 8 9
        mov     eax, DWORD PTR -4[rbp]  # tmp84, i
        mov     esi, eax        #, tmp84
        lea     rax, .LC0[rip]  # tmp85,
        mov     rdi, rax        #, tmp85
        mov     eax, 0  #,
        call    printf@PLT      #
# main.cpp:9:   printf("%d%d\n", i, i);
        .loc 1 9 9
        mov     edx, DWORD PTR -4[rbp]  # tmp86, i
        mov     eax, DWORD PTR -4[rbp]  # tmp87, i
        mov     esi, eax        #, tmp87
        lea     rax, .LC1[rip]  # tmp88,
        mov     rdi, rax        #, tmp88
        mov     eax, 0  #,
        call    printf@PLT      #
# main.cpp:11:   return 0;
        .loc 1 11 10
        mov     eax, 0  # _5,
# main.cpp:12: }
        .loc 1 12 1
        leave
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

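We also use -masm=dialect with intel, which tells GCC to emit the assembly in Intel syntax instead of the default AT&T syntax. For example, the instruction that AT&T syntax writes as movl $0, %eax reads mov eax, 0 in Intel syntax. Note that this option only changes the notation; it does not tune the generated code for Intel processors (that is what -mtune=intel is for). I personally prefer the Intel syntax.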

For the assembly examples, we use the Compiler Explorer, as it demangles identifiers out of the box, which makes the assembly more readable and easier to follow. The Compiler Explorer also lets you follow the code by hovering over a line or scrolling to source, pointing you directly to the corresponding assembly. Anyway, see what works best for you!

We will have two snippets: one introducing the high-level C source code and one showing the generated assembly, so that we can make comparisons and draw conclusions, however subtle in some cases. But don't forget that any assembly instruction can make a difference.

Headers and Macros

For benchmarking, we pretty much just use printf and clock(), wrapped in the following macros.

#include <stdio.h>
#include <time.h>

#define start_time clock_t s_t_a_r_t = clock();
#define end_time printf("[CPU Time Used: %f]\n", (double)(clock() - s_t_a_r_t) / CLOCKS_PER_SEC);
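
As a minimal sketch of how these macros can be used, the snippet below wraps a loop between start_time and end_time. The function sum_up_to and its workload are hypothetical and only serve as an illustration; they are not taken from the blog's repository. The macros are repeated so that the snippet compiles on its own.

```cpp
#include <stdio.h>
#include <time.h>

// Same macros as above: start_time declares a local clock_t,
// end_time prints the CPU time elapsed since start_time.
#define start_time clock_t s_t_a_r_t = clock();
#define end_time printf("[CPU Time Used: %f]\n", (double)(clock() - s_t_a_r_t) / CLOCKS_PER_SEC);

// Hypothetical workload (an assumption for illustration): sums 0..n-1.
// volatile keeps the loop from being removed if optimizations are enabled.
long long sum_up_to(long long n) {
  volatile long long total = 0;

  start_time  // no trailing semicolon needed, the macro already ends with one
  for (long long i = 0; i < n; i++) {
    total = total + i;
  }
  end_time

  return total;
}
```

Because start_time declares a variable, two measurements in the same scope would clash on the name s_t_a_r_t. Keeping each measurement in its own function or block avoids that. Also note that clock() measures processor time used by the program, not wall-clock time.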

Switch case VS If-else

Let's start by comparing the if-else to the switch statement. Note that in most of the examples we don't use any optimization flags, since we want the assembly to map to the source as closely as possible, line by line. Keeping the code "clean" in this way also lets us benchmark our implementations accurately.

For example, if we turn on the -O1 flag, the compiler will, if it finds it appropriate, generate a jump table even for the if-else example once it determines that the values are densely packed. For more information, please refer to Using the GNU Compiler Collection | 3.12 Options That Control Optimization.