Help with the __builtin_prefetch function and its timing
Hello there. I need to know how to get the timing right when using GCC's __builtin_prefetch() function: how many instructions before the data is actually used should I issue the prefetch call?
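To make the question concrete, here is a minimal sketch of the kind of loop I have in mind. The function name sum_with_prefetch and the PREFETCH_DISTANCE macro are placeholders I made up for illustration, and the default of 8 elements is an arbitrary starting value, not a recommendation. The distance is expressed in array elements ahead of the current one, which is the knob I am trying to tune.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead of the current
 * one to prefetch. Override at build time, e.g. -DPREFETCH_DISTANCE=16. */
#ifndef PREFETCH_DISTANCE
#define PREFETCH_DISTANCE 8
#endif

/* Sums an int array while prefetching the element that will be needed
 * PREFETCH_DISTANCE iterations later. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        /* Stay inside the array when computing the prefetch address.
         * Second argument 0 = prefetch for reading; third argument 3 =
         * high temporal locality. Both must be compile-time constants. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);

        total += data[i];
    }

    return total;
}
```

My plan would be to rebuild with different PREFETCH_DISTANCE values and compare the results, but I would like to know if there is a better way to pick the distance than trial and error.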
I will be measuring the L1 cache hit rate with Valgrind's Cachegrind, simulating a 1 KB L1 data cache.
In case you are wondering what the point of all this is, it's a university project.