strchr.com comments
http://www.strchr.com
Perfectionistic and minimalistic programming.

Frank Earnest on Two-stage tables for storing Unicode character properties
Fri, 28 Feb 2020 13:31:29 +0700
http://www.strchr.com/multi-stage_tables#comment_788

I'm not interested in doing artificial benchmarks on a whim; I just wanted to point out the insanity of this statement:

"The table will be smaller, but you will have a branch (if statement) in your code, so it will be slower, because unpredictable branches are expensive on modern processors."

A branch misprediction on "modern processors" costs about 16 cycles. In some situations that's a big deal -- but treating branches like some kind of unknown quantity, to be avoided at all costs, is a very weird phenomenon.

Especially when you're comparing it to something that adds tons of extra cache pressure, which won't even register in synthetic benchmarks but could easily destroy performance when you actually throw it into a real application.

Peter Kankowski on Two-stage tables for storing Unicode character properties
Tue, 18 Feb 2020 14:02:49 +0700
http://www.strchr.com/multi-stage_tables#comment_787

Frank, please feel free to do a benchmark.

Frank Earnest on Two-stage tables for storing Unicode character properties
Tue, 18 Feb 2020 03:55:17 +0700
http://www.strchr.com/multi-stage_tables#comment_786

That was supposed to be "branch [mis]prediction bad".

Frank Earnest on Two-stage tables for storing Unicode character properties
Tue, 18 Feb 2020 03:50:39 +0700
http://www.strchr.com/multi-stage_tables#comment_785

> It's a noticeable difference for modern processors with branch prediction.

Most of the branches in a binary search loop *are* predictable though. It's a backwards branch forming a loop that's only going to break once -- either when it's exhausted the search or found a match. Saying something to the effect of "branch prediction bad" is not particularly insightful.

Also, a cacheline fill is an order of magnitude more expensive than a mispredicted branch. Unpredictable branches are certainly bad for performance, but it's gotten to the point where saying "hurrr branches bad" is just an overused meme without any sane quantification or analysis of what you're replacing it with.

Peter Kankowski on x86 Machine Code Statistics
Thu, 13 Feb 2020 04:12:26 +0700
http://www.strchr.com/x86_machine_code_statistics#comment_784

Hello Jay, I plan to update these statistics and other articles, but unfortunately I'm quite busy with other work now, so it will not happen soon.

Under Linux, you could try strace to collect the API call statistics.
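
For readers following the two-stage tables thread above, here is a minimal sketch of the two lookup strategies being argued about: a binary search over sorted ranges (small data, branchy loop) and a two-stage table (branch-light lookup, more memory touched). It is not the article's code. The names `Range`, `prop_bsearch`, `build_tables`, `prop_table` and `lookups_agree`, the 128-entry block size, the toy limit of U+0800, and the property values (1 = uppercase, 2 = lowercase, 0 = other) are all assumptions made for illustration, and the range list is a tiny hand-picked sample rather than real Unicode data files.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed toy parameters: 128 code points per block, code points below U+0800. */
#define BLOCK_BITS 7u
#define BLOCK_SIZE (1u << BLOCK_BITS)
#define MAX_CP     0x800u
#define NUM_BLOCKS (MAX_CP / BLOCK_SIZE)

typedef struct { uint32_t first, last; uint8_t prop; } Range;

/* Hypothetical sample data, sorted by code point: 1 = uppercase, 2 = lowercase. */
static const Range ranges[] = {
    {0x0041, 0x005A, 1}, {0x0061, 0x007A, 2},   /* basic Latin letters   */
    {0x00C0, 0x00D6, 1}, {0x00E0, 0x00F6, 2},   /* part of Latin-1       */
    {0x0391, 0x03A1, 1}, {0x03B1, 0x03C9, 2},   /* part of Greek         */
    {0x0410, 0x042F, 1}, {0x0430, 0x044F, 2},   /* basic Cyrillic        */
};

/* Strategy 1: binary search over the sorted ranges. */
uint8_t prop_bsearch(uint32_t cp)
{
    size_t lo = 0, hi = sizeof ranges / sizeof ranges[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < ranges[mid].first)      hi = mid;
        else if (cp > ranges[mid].last)  lo = mid + 1;
        else                             return ranges[mid].prop;
    }
    return 0; /* not in any range */
}

/* Strategy 2: two-stage table. stage1 maps a block number to an index
   into stage2; identical blocks are stored only once. */
static uint8_t  stage1[NUM_BLOCKS];
static uint8_t  stage2[NUM_BLOCKS][BLOCK_SIZE];
static unsigned num_unique; /* number of distinct blocks stored in stage2 */

void build_tables(void)
{
    for (uint32_t b = 0; b < NUM_BLOCKS; b++) {
        uint8_t block[BLOCK_SIZE];
        for (uint32_t i = 0; i < BLOCK_SIZE; i++)
            block[i] = prop_bsearch(b * BLOCK_SIZE + i);

        /* Reuse an already stored block with the same contents, if any. */
        unsigned j = 0;
        while (j < num_unique && memcmp(stage2[j], block, BLOCK_SIZE) != 0)
            j++;
        if (j == num_unique) {
            assert(num_unique < NUM_BLOCKS);
            memcpy(stage2[num_unique++], block, BLOCK_SIZE);
        }
        stage1[b] = (uint8_t)j;
    }
}

/* One range check (normally well predicted) plus two dependent loads. */
uint8_t prop_table(uint32_t cp)
{
    if (cp >= MAX_CP)
        return 0;
    return stage2[stage1[cp >> BLOCK_BITS]][cp & (BLOCK_SIZE - 1u)];
}

/* Returns 1 if both strategies give the same answer for every code point
   in the toy range (and slightly beyond it). */
int lookups_agree(void)
{
    for (uint32_t cp = 0; cp < MAX_CP + 16u; cp++)
        if (prop_bsearch(cp) != prop_table(cp))
            return 0;
    return 1;
}
```

Call `build_tables()` once before the first `prop_table()` lookup. The sketch only shows that the two approaches return the same answers; which one is faster depends on the input distribution and on how much of the table stays in cache, which is the point under dispute in the comments and something only a measurement inside a real application can settle.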