Is performance reduced when executing loops whose uop count is not a multiple of processor width?

submitted by /u/ketralnis