Benchmark: regex vs split and trim

Script Preparation code:

function generateText(length, containsOtherChars) {
    // Array.from visits every index; map() on a bare Array(length) skips its empty slots.
    return Array.from({ length }, () =>
        Math.random() > 0.5 ? ',' : containsOtherChars && Math.random() > 0.5 ? 'a' : ' ')
        .join("");
}

var shortEmptyText = generateText(100, false);
var shortNonEmptyText = generateText(100, true);

var longEmptyText = generateText(1000 * 1000 * 10, false);
var longNonEmptyText = generateText(1000 * 1000 * 10, true);

var regex = /^[\s,]*$/;

Tests:

Short regex
console.log(!!shortEmptyText.match(regex)?.length); console.log(!!shortNonEmptyText.match(regex)?.length);
Long regex
console.log(!!longEmptyText.match(regex)?.length); console.log(!!longNonEmptyText.match(regex)?.length);
Short split and trim
var sanitizedEmptyText = shortEmptyText.split(",").filter(it => !!it?.trim()).join(""); var sanitizedNonEmptyText = shortNonEmptyText.split(",").filter(it => !!it?.trim()).join(""); console.log(sanitizedEmptyText.length > 0); console.log(sanitizedNonEmptyText.length > 0);
Long split and trim
var sanitizedEmptyText = longEmptyText.split(",").filter(it => !!it?.trim()).join(""); var sanitizedNonEmptyText = longNonEmptyText.split(",").filter(it => !!it?.trim()).join(""); console.log(sanitizedEmptyText.length > 0); console.log(sanitizedNonEmptyText.length > 0);

Latest run results:

Run details: (Test run date: one year ago)

User agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36 Edg/127.0.0.0

Browser/OS: Chrome 127 on Windows

Test name	Executions per second
Short regex	190377.1 Ops/sec
Long regex	190873.9 Ops/sec
Short split and trim	190838.0 Ops/sec
Long split and trim	189239.7 Ops/sec

Autogenerated LLM Summary (model llama3.2:3b, generated one year ago):

LLMs can make mistakes. Check important info.

Let's break down the benchmark and explain what's being tested, the different approaches, their pros and cons, and other considerations.

**Benchmark Definition**

The benchmark is defined by its `Script Preparation Code`, which is meant to generate four text samples that differ in length (100 characters and 10 million characters) and in content: the "empty" samples contain only commas and spaces, while the "non-empty" samples also contain the letter `a`. It also defines a regular expression (`regex`, `/^[\s,]*$/`) that matches only when the entire string consists of whitespace characters (`\s`) and commas. No HTML preparation code is shown, and the recorded run came from a regular desktop browser (Chrome 127 on Windows).
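As a sanity check on the preparation step, the sketch below is an illustration rather than part of the benchmark. It contrasts two ways of building a generated string and shows which inputs the pattern accepts. The literal sample strings are assumptions chosen for the example.

```javascript
// An array created from a length alone has only empty slots,
// and map() skips empty slots, so no characters are produced.
const skipped = Array(5).map(() => "x").join("");
console.log(skipped.length); // 0

// Array.from calls the callback once for every index.
const filled = Array.from({ length: 5 }, () => "x").join("");
console.log(filled); // "xxxxx"

// The benchmark's pattern: the whole string must be whitespace and commas.
const regex = /^[\s,]*$/;
console.log(regex.test(" , ,")); // true
console.log(regex.test(" ,a,")); // false
console.log(regex.test(""));     // true (the empty string also matches)
```

It is worth logging the length of each generated sample before trusting any timing, since an unexpectedly empty input makes every test case measure the same trivial work.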

**Test Cases**

The benchmark consists of four test cases:

1. **Short regex**: Checks whether the `shortEmptyText` and `shortNonEmptyText` samples each match `regex` and logs the boolean results.
2. **Long regex**: Does the same for the `longEmptyText` and `longNonEmptyText` samples.
3. **Short split and trim**: Splits the short text on commas, discards the pieces that are empty after `trim`, joins the rest, and logs whether anything remains.
4. **Long split and trim**: Applies the same `split`, `filter`, and `join` chain to the `longEmptyText` and `longNonEmptyText` samples.

**Approaches**

There are two approaches being tested:

1. **Regex approach**: Uses the regular expression to test whether the whole string consists only of whitespace and commas.
2. **Split and trim approach**: Splits the text into substrings using a comma (`,`) as the delimiter, filters out substrings that are empty or contain only whitespace (checked with `trim`), and joins the remaining substrings back together. A non-empty result means the text contained other characters.
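The two approaches can be sketched as a pair of predicates. The function names `isBlankByRegex` and `isBlankBySplit` are hypothetical wrappers introduced here. Their bodies mirror the expressions in the test cases, except that the split version returns `length === 0` (the benchmark logs the inverse, `length > 0`) so that both answer the same question.

```javascript
const regex = /^[\s,]*$/;

// True when the text holds nothing but whitespace and commas.
function isBlankByRegex(text) {
    return !!text.match(regex)?.length;
}

// Same question, answered by splitting on commas and dropping blank pieces.
function isBlankBySplit(text) {
    const sanitized = text.split(",").filter(it => !!it?.trim()).join("");
    return sanitized.length === 0;
}

// Assumed sample inputs for illustration.
for (const sample of ["", " , ,, ", ",a, ", "a"]) {
    console.log(JSON.stringify(sample), isBlankByRegex(sample), isBlankBySplit(sample));
}
```

For these four samples both predicates agree: the first two are blank and the last two are not.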

**Pros and Cons**

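1. **Regex approach**
	* Pros: Checks the whole string in a single pass without allocating intermediate arrays or strings, and can typically stop at the first character that is neither whitespace nor a comma.
	* Cons: The pattern is harder to read than plain string methods, and `match` builds a result array even though only a boolean is needed.
2. **Split and trim approach**
	* Pros: Uses only simple, familiar string and array methods.
	* Cons: Always processes the entire input and allocates an array of substrings plus a joined string, which increases memory use on large inputs.

In the recorded run, all four test cases landed within about 1% of each other (roughly 189,000 to 191,000 Ops/sec), so the results themselves do not separate the two approaches.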
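To compare the two approaches outside the benchmark page, a rough timing harness like the one below can help. It is a minimal sketch: `timeIt` is a hypothetical helper, the 100,000-character input is an assumption, and the global `performance.now()` is assumed to be available (current browsers and recent Node.js versions). The absolute numbers depend heavily on the machine and engine.

```javascript
const regex = /^[\s,]*$/;

// Hypothetical helper: average wall-clock time per call over a few runs.
function timeIt(label, fn, runs = 20) {
    const start = performance.now();
    let last;
    for (let i = 0; i < runs; i++) last = fn();
    return { label, ms: (performance.now() - start) / runs, last };
}

// Deterministic input: 100,000 characters of commas and spaces only.
const blankText = ", ".repeat(50000);

const viaRegex = timeIt("regex", () => !!blankText.match(regex)?.length);
const viaSplit = timeIt("split and trim", () =>
    blankText.split(",").filter(it => !!it?.trim()).join("").length === 0);

for (const r of [viaRegex, viaSplit]) {
    console.log(r.label, r.ms.toFixed(3), "ms per call, blank =", r.last);
}
```

Keeping `console.log` outside the timed function avoids measuring logging cost instead of the string work.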

**Other Considerations**

1. **Library usage**: The benchmark uses no external libraries. `match`, `split`, `filter`, `join`, and `trim` are all built into the JavaScript standard library; `trim` removes whitespace from both ends of a string.
2. **Special JS feature or syntax**: The tests use optional chaining (`?.`), an ES2020 feature. In `text.match(regex)?.length`, it guards against `match` returning `null` when the string does not match.
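The role of `?.` in the regex tests can be seen in isolation. This is a small illustration with assumed sample strings:

```javascript
const regex = /^[\s,]*$/;

const hit = " , ".match(regex);  // an array holding the matched text
const miss = "a,b".match(regex); // null, because the string does not match

console.log(!!hit?.length);  // true
console.log(!!miss?.length); // false: null?.length is undefined instead of throwing
```

Without the optional chaining, `miss.length` would throw a `TypeError` on the non-matching sample.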

**Alternatives**

Other alternatives for testing similar workloads could include:

1. Calling `regex.test(text)` instead of `text.match(regex)`, which returns a boolean directly without building a match array.
2. Implementing a custom loop that scans the string and returns as soon as it finds a character that is neither whitespace nor a comma.
3. Using a benchmarking framework that provides built-in support for testing regex patterns and string manipulation operations.

These alternatives may trade readability, memory use, and speed differently from the approaches tested here; only measurement can show how they actually compare.
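The first two alternatives might look like the sketch below. `isBlankByTest` and `isBlankByLoop` are hypothetical names, and neither function appears in the benchmark itself.

```javascript
const regex = /^[\s,]*$/;

// Alternative 1: RegExp.prototype.test returns a boolean directly.
function isBlankByTest(text) {
    return regex.test(text);
}

// Alternative 2: manual scan that exits on the first disqualifying character.
function isBlankByLoop(text) {
    for (const ch of text) {
        // A single character trims to "" exactly when it is whitespace.
        if (ch !== "," && ch.trim() !== "") return false;
    }
    return true;
}

console.log(isBlankByTest(" , ,"), isBlankByLoop(" , ,")); // true true
console.log(isBlankByTest(" ,a,"), isBlankByLoop(" ,a,")); // false false
```

Because `regex` has no `g` or `y` flag, repeated `test` calls do not carry state between inputs.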

Related benchmarks:

regex vs split and trim (version: 0)

Comparing performance of: Short regex vs Long regex vs Short split and trim vs Long split and trim

Created: one year ago by: Guest