MeasureThat.net
Diacritics removal (version: 0)
Comparing performance of: Old vs New
Created: 5 years ago by: Guest
Script Preparation code:
const replaceMap = {
  œ: 'oe',
  æ: 'ae',
}
const replacementsRegex = new RegExp(Object.keys(replaceMap).join('|'), 'g')

window.removeDiacritics = (value) => {
  return value
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .replace(replacementsRegex, (char) => {
      return replaceMap[char] || char
    })
}

window.oldRemoveDiacritics = function(s) {
  if (!s) return s;
  let r = s;
  r = r.replace(/[àáâãäå]/g, 'a');
  r = r.replace(/æ/g, 'ae');
  r = r.replace(/ç/g, 'c');
  r = r.replace(/[èéêë]/g, 'e');
  r = r.replace(/[ìíîï]/g, 'i');
  r = r.replace(/ñ/g, 'n');
  r = r.replace(/[òóôõö]/g, 'o');
  r = r.replace(/œ/g, 'oe');
  r = r.replace(/[ùúûü]/g, 'u');
  r = r.replace(/[ýÿ]/g, 'y');
  r = r.replace(/\\W/g, '');
  // To prevent some weird behaviors with MacOS diacritics
  r = _(r).map(function(char) {
    return String.fromCharCode(char.charCodeAt(0));
  });
  return r.join('');
};
Tests:
Old
oldRemoveDiacritics(` Dji pou magnî do vêre, çoula m' freut nén må - æsope - robot œuf Lorem, ipsum dolor sit amet consectetur adipisicing elit. Reiciendis voluptatem dolores molestiae possimus laudantium consequuntur placeat earum neque maxime modi quasi fugiat quae inventore, illo in corporis corrupti esse a! `)
New
removeDiacritics(` Dji pou magnî do vêre, çoula m' freut nén må - æsope - robot œuf Lorem, ipsum dolor sit amet consectetur adipisicing elit. Reiciendis voluptatem dolores molestiae possimus laudantium consequuntur placeat earum neque maxime modi quasi fugiat quae inventore, illo in corporis corrupti esse a! `)
Latest run results:
Run details (test run date: one year ago):
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36
Browser/OS: Chrome 131 on Mac OS X 10.15.7
Test name    Executions per second
Old          26697.3 Ops/sec
New          359406.4 Ops/sec
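To put the two figures in proportion, the short sketch below divides one by the other and converts each to an approximate time per call. The variable names are illustrative, and the numbers are copied from this run only, so the ratio may differ on other browsers or hardware.

```javascript
// Executions per second, copied from the latest run results above.
const opsPerSec = { Old: 26697.3, New: 359406.4 };

// Relative speed of New compared with Old in this run.
const speedup = opsPerSec.New / opsPerSec.Old;

// Approximate time per call in microseconds (1e6 µs per second / ops per second).
const microsPerCall = {
  Old: 1e6 / opsPerSec.Old,
  New: 1e6 / opsPerSec.New,
};

console.log(speedup.toFixed(1) + 'x');            // "13.5x"
console.log(microsPerCall.Old.toFixed(2) + ' µs'); // "37.46 µs"
console.log(microsPerCall.New.toFixed(2) + ' µs'); // "2.78 µs"
```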
Autogenerated LLM Summary (model llama3.2:3b, generated one year ago):
This benchmark compares two JavaScript approaches to removing diacritics from text.

**Benchmark Definition**

The benchmark tests two functions defined in the preparation code:

1. **`removeDiacritics` function**: Calls `String.prototype.normalize('NFD')` (canonical decomposition, defined in Unicode Standard Annex #15) to split each accented character into a base character followed by combining marks, deletes the combining marks in the range U+0300 to U+036F, and then maps the ligatures `œ` and `æ` to `oe` and `ae` through `replaceMap`.
2. **`oldRemoveDiacritics` function**: Runs a fixed chain of regular-expression replacements covering a hard-coded set of lowercase accented Latin letters (for example `[àáâãäå]` to `a`), plus `æ` and `œ`.

**Comparison Options**

1. **`removeDiacritics` function**
   * Pros:
     + Handles any character that NFD decomposes into a base character plus combining marks in U+0300 to U+036F, including uppercase letters, without a per-letter table
     + Relies on Unicode normalization, a standardized mechanism built into the language
     + Faster in the recorded results
   * Cons:
     + Characters with no canonical decomposition (such as `ø` or `ł`) pass through unchanged unless they are added to `replaceMap`
2. **`oldRemoveDiacritics` function**
   * Pros:
     + Every replacement is explicit and easy to audit
   * Cons:
     + Handles only the characters it lists
     + Slower in the recorded results: it makes eleven `replace` passes over the string and then rebuilds it character by character

**Library**

`removeDiacritics` needs no library: it uses the built-in `String.prototype.normalize()` method and regular expressions. `oldRemoveDiacritics` calls `_(r).map(...)`, which assumes a Lodash- or Underscore-style `_` global that the script preparation code shown here does not define.

**Special JS Feature/Syntax**

The code uses ES2015 features: arrow functions, `let` and `const`, template literals for the multi-line test input, and `String.prototype.normalize()`.

**Benchmark Preparation Code**

The preparation code defines the `replaceMap` object and the `replacementsRegex` regular expression once, outside the measured code, and attaches both functions to `window`.

**Individual Test Cases**

There are two test cases:

1. **`Old`**: Tests the performance of the `oldRemoveDiacritics` function with a sample text that contains various diacritic marks.
2. **`New`**: Tests the performance of the `removeDiacritics` function with the same sample text.

**Benchmark Results**

The run described in this summary recorded the following execution rates:

* **`New` ( `removeDiacritics` )**: 471214.4375 executions/second
* **`Old` ( `oldRemoveDiacritics` )**: 16742.8671875 executions/second

In that run `New` was about 28 times faster than `Old`.

**Other Alternatives**

If the `removeDiacritics` function is not suitable, other alternatives could include:

1. Using a third-party library for Unicode normalization or diacritic removal.
2. Implementing a custom diacritic removal algorithm that uses a combination of regular expressions and string manipulation techniques.

The choice of approach depends on the specific requirements of your project, such as performance, character coverage, and maintainability.
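The normalization step that `removeDiacritics` relies on is easier to follow in isolation. The sketch below mirrors the benchmark's approach under illustrative names (`stripDiacritics`, `ligatures`, and `ligatureRegex` are not part of the benchmark) and shows which inputs the technique does and does not change.

```javascript
// Illustrative re-statement of the normalize-then-strip technique.
// The names here are hypothetical; the benchmark itself uses removeDiacritics/replaceMap.
const ligatures = { œ: 'oe', æ: 'ae' };
const ligatureRegex = new RegExp(Object.keys(ligatures).join('|'), 'g');

function stripDiacritics(value) {
  return value
    .normalize('NFD')                 // U+00E9 (é) becomes 'e' followed by U+0301
    .replace(/[\u0300-\u036f]/g, '')  // delete combining marks in that range
    .replace(ligatureRegex, (ch) => ligatures[ch]); // ligatures have no decomposition
}

// A precomposed character turns into two code units after NFD.
console.log('\u00e9'.normalize('NFD').length);        // 2

console.log(stripDiacritics('vêre, çoula, nén må'));  // "vere, coula, nen ma"
console.log(stripDiacritics('æsope, œuf'));           // "aesope, oeuf"

// Letters without a canonical decomposition are left untouched.
console.log(stripDiacritics('ø ł'));                  // "ø ł"
```

Handling letters such as `ø` would mean adding them to the lookup map, which is the same mechanism the benchmark uses for `œ` and `æ`.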
Related benchmarks:
Diacritics removal (+ lowercase2)
Accent Mark Removal
Yeahaaa
replaceAll vs regex replace vs neu parser