Fast string to integer

Problem

strtoul() is the usual way to convert a decimal or hexadecimal string to an integer.
However, when it is called many times in a tight loop, the cost of each call adds up.
The following code calls strtoul() 300,000,000 times and took about 5 seconds.

#include <chrono>
#include <cstdlib>    // strtoul
#include <iostream>

int main()
{
    const char* hexString = "19AF";   // 0x19AF = 6575

    auto start = std::chrono::high_resolution_clock::now();

    for (int i = 0; i < 300000000; i++)
    {
        // strtoul() returns unsigned long; the result is only computed, not used
        unsigned long value = strtoul(hexString, NULL, 16);
        (void)value;   // silence the unused-variable warning
    }

    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> elapsed = end - start;
    std::cout << "time: " << elapsed.count() << " ms" << std::endl;

    return 0;
}
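
Part of the per-call cost is plausibly that strtoul() does more than convert digits: it skips leading whitespace, accepts an optional sign and (for base 16) an optional 0x prefix, reports where parsing stopped, and sets errno on overflow. The sketch below exercises those behaviors. The helper name parseHexChecked is hypothetical and is not part of the original benchmark.

```cpp
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: parse a hex string with strtoul() and report success.
// Returns true only if at least one digit was consumed, the whole string
// was used, and the value did not overflow unsigned long.
bool parseHexChecked(const char* text, unsigned long& out)
{
    char* end = nullptr;
    errno = 0;
    unsigned long value = std::strtoul(text, &end, 16);

    if (end == text) return false;       // no digits were found
    if (*end != '\0') return false;      // trailing non-hex characters
    if (errno == ERANGE) return false;   // value did not fit in unsigned long

    out = value;
    return true;
}
```

For example, parseHexChecked("  0x19af", v) succeeds with v equal to 6575, while parseHexChecked("19AG", v) fails because parsing stops at 'G'.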

Improvement

A lookup table (LUT) can do the conversion faster than strtoul().
Decimal and hexadecimal strings contain only the following characters.

  • ‘0’~‘9’
  • ‘a’~‘f’
  • ‘A’~‘F’

Therefore, we prepare a LUT in advance that maps each of these characters to its integer value, as shown in initLut() in the code below.
The conversion then looks up each character in the LUT and combines the resulting values with shift operators. Each hexadecimal digit occupies 4 bits, so the digits of a 4-character string are shifted by 12, 8, 4, and 0 bits.
The following code took about 0.7 seconds.
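
As a worked example of the combining step, the digits of "19AF" have the values 1, 9, 10, and 15, and (1 << 12) | (9 << 8) | (10 << 4) | 15 = 4096 + 2304 + 160 + 15 = 6575. The small function below only illustrates that arithmetic; the name combineHex4 is hypothetical.

```cpp
// Combine four already-decoded hex digit values (each 0..15) into one
// integer. Each digit occupies 4 bits, so the most significant digit is
// shifted left by 12 bits and the least significant by 0.
constexpr int combineHex4(int d0, int d1, int d2, int d3)
{
    return (d0 << 12) | (d1 << 8) | (d2 << 4) | d3;
}

// '1' -> 1, '9' -> 9, 'A' -> 10, 'F' -> 15
static_assert(combineHex4(1, 9, 10, 15) == 0x19AF, "0x19AF = 6575");
```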

#include <iostream>
#include <chrono>

unsigned char LUT[256];   // one entry per byte value; entries not set in initLut() stay 0

void initLut()
{
    LUT['0'] = 0x0;
    LUT['1'] = 0x1;
    LUT['2'] = 0x2;
    LUT['3'] = 0x3;
    LUT['4'] = 0x4;
    LUT['5'] = 0x5;
    LUT['6'] = 0x6;
    LUT['7'] = 0x7;
    LUT['8'] = 0x8;
    LUT['9'] = 0x9;

    LUT['a'] = 0xa;
    LUT['b'] = 0xb;
    LUT['c'] = 0xc;
    LUT['d'] = 0xd;
    LUT['e'] = 0xe;
    LUT['f'] = 0xf;

    LUT['A'] = 0xa;
    LUT['B'] = 0xb;
    LUT['C'] = 0xc;
    LUT['D'] = 0xd;
    LUT['E'] = 0xe;
    LUT['F'] = 0xf;
}

int main()
{
    initLut();

    const char* hexString = "19AF";   // 0x19AF = 6575

    auto start = std::chrono::high_resolution_clock::now();

    for (int i = 0; i < 300000000; i++)
    {
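        // Cast to unsigned char so the array index can never be negative
        int value =
            (LUT[static_cast<unsigned char>(hexString[0])] << 12) |
            (LUT[static_cast<unsigned char>(hexString[1])] << 8) |
            (LUT[static_cast<unsigned char>(hexString[2])] << 4) |
            LUT[static_cast<unsigned char>(hexString[3])];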
    }

    auto end = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double, std::milli> elapsed = end - start;
    std::cout << "time: " << elapsed.count() << " ms" << std::endl;

    return 0;
}
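
The loop above is hard-coded for exactly four valid hexadecimal digits. As a minimal sketch of how the same LUT idea might be generalized, the code below accepts 1 to 8 digits and rejects invalid characters. It assumes an ASCII-compatible character set, and the names hexLut, kInvalidDigit, initHexLut, and hexToUint32 are hypothetical rather than taken from the original code.

```cpp
#include <cstdint>

// 0xFF marks "not a hex digit"; valid entries hold 0..15.
static const std::uint8_t kInvalidDigit = 0xFF;
static std::uint8_t hexLut[256];

void initHexLut()
{
    for (int i = 0; i < 256; i++) hexLut[i] = kInvalidDigit;
    for (int i = 0; i < 10; i++) hexLut['0' + i] = static_cast<std::uint8_t>(i);
    for (int i = 0; i < 6; i++) {
        hexLut['a' + i] = static_cast<std::uint8_t>(10 + i);
        hexLut['A' + i] = static_cast<std::uint8_t>(10 + i);
    }
}

// Converts 1 to 8 hex digits. Returns false on an empty string, a
// non-hex character, or more than 8 digits. Call initHexLut() first.
bool hexToUint32(const char* text, std::uint32_t& out)
{
    std::uint32_t value = 0;
    int digits = 0;
    for (; text[digits] != '\0'; digits++) {
        if (digits == 8) return false;   // would not fit in 32 bits
        std::uint8_t d = hexLut[static_cast<unsigned char>(text[digits])];
        if (d == kInvalidDigit) return false;
        value = (value << 4) | d;        // make room for 4 more bits
    }
    if (digits == 0) return false;
    out = value;
    return true;
}
```

The validity check and the loop add work per character, so this variant should not be expected to match the timing of the fixed four-digit version.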

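With strtoul() the loop takes about 5 seconds; with the LUT it takes about 0.7 seconds, roughly 1/7 of the original time.
Note that the LUT version shown here assumes a valid hexadecimal string of exactly four characters and performs no error checking.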
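
The shift-based combination applies to hexadecimal only, because each hex digit maps to exactly 4 bits. For decimal strings, a comparable sketch multiplies the running value by 10 instead of shifting, and because '0' to '9' are contiguous, subtracting '0' can stand in for the table lookup. This is a different technique from the LUT above, shown under those assumptions; the name decToUint32 is hypothetical and no timing is claimed for it.

```cpp
#include <cstdint>

// Converts an unsigned decimal string without calling strtoul().
// Returns false on an empty string, a non-digit character, or a value
// that does not fit in 32 bits.
bool decToUint32(const char* text, std::uint32_t& out)
{
    if (*text == '\0') return false;

    std::uint64_t value = 0;
    for (const char* p = text; *p != '\0'; p++) {
        // Characters below '0' wrap around to a large unsigned value,
        // so a single comparison rejects everything outside '0'..'9'.
        unsigned digit = static_cast<unsigned>(static_cast<unsigned char>(*p) - '0');
        if (digit > 9) return false;
        value = value * 10 + digit;
        if (value > 0xFFFFFFFFu) return false;   // overflow check
    }
    out = static_cast<std::uint32_t>(value);
    return true;
}
```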