Query: cord2
OS: ultrix
Section: 1
cord2(1) General Commands Manual cord2(1)

Name
cord2 - rearranges basic blocks in an executable file to facilitate better cache mapping

Syntax
cord2 [-v] [-o outfile] [-c cachewords] [-d] [-b bridge_limit] [-n] [-A addersfile] [[-C countsfile] ...] obj

Description
The cord2 command extracts basic blocks from a program and deposits them in a new area in the text, making jumps to and from that area as necessary. By separating the basic blocks, you can reduce instruction cache miss rates.

The cord2 command takes the output of a pixie profiling run as input (see pixie(1)). The executable object file is named by the obj argument. The cord2 command requires only one addersfile; if none is specified with -A, it creates the filename by appending .Bbaddrs to the obj filename. Multiple counts files from many runs can be specified with multiple -C arguments. If none are specified, cord2 creates the counts filename by appending .Counts to the obj name.

Multiple counts files are added together into an internal counts array of C double-type elements. Each counts array element holds the density of a block, that is, cycles/byte. If you specify -n, the counts are normalized so that each counts array entry is cycles/totalcycles. When one counts file is specified, the default is to favor small blocks; -n negates that. When many counts files are specified, -n also keeps any one counts file from being favored merely because its totalcycles exceeds the totalcycles of another counts file.

The cord2 command determines which basic blocks to insert by sorting the counts array and collecting the blocks with the highest counts that can fit into the new area. The cord2 command may skip over huge blocks that do not fit at the end of the new area. Once the blocks are determined, they are inserted into the new area, and their original locations are modified to jump to the new area.
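As an illustration of the counts handling and block selection described above, the following Python sketch models the behavior. It is not cord2's actual implementation: the names Block, merge_counts, and select_blocks are hypothetical, per-block cycle counts are assumed to be already extracted from the counts files, blocks are assumed to be measured in 4-byte instruction words, and the space taken by the added jumps is ignored.

```python
from dataclasses import dataclass


@dataclass
class Block:
    index: int       # position of the block in the original text (hypothetical field)
    size_words: int  # block length in instruction words, assumed 4 bytes each


def merge_counts(blocks, runs, normalize=False):
    """Add the counts of several profiling runs into one score per block.

    runs[r][i] is the cycle count of block i in run r (assumed input shape).
    normalize=False models the default: density, cycles per byte.
    normalize=True models -n: each run's cycles divided by that run's total.
    """
    scores = [0.0] * len(blocks)
    for cycles in runs:
        total = float(sum(cycles))
        for i, c in enumerate(cycles):
            if normalize:
                scores[i] += c / total if total else 0.0
            else:
                scores[i] += c / (blocks[i].size_words * 4)
    return scores


def select_blocks(blocks, scores, cachewords):
    """Take blocks in descending score order, skipping any that no longer fit."""
    order = sorted(range(len(blocks)), key=lambda i: scores[i], reverse=True)
    chosen = []
    free = cachewords
    for i in order:
        if blocks[i].size_words <= free:
            chosen.append(i)
            free -= blocks[i].size_words
    return chosen
```

In this model, the default scoring divides by block size, so small blocks rank higher and a long run dominates the sum. With normalize=True each run contributes in proportion to its own total, which mirrors the stated purpose of -n.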
At the end of each block in the new area, a jump is added back to the original block's subsequent or fall-through location and, if necessary, to the branch or jump target. Both entering and exiting the new area are optimized to take advantage of other blocks in the new area and jump delay slots.

Often, a block in the new area has one or more fall-through blocks that are small, hardly ever used, and not in the new area. If the block following these fall-through blocks is in the new area, the fall-through blocks are called bridge blocks. It may be more costly to generate jumps to and from bridge blocks than to simply copy them. The cord2 command allows you to specify that bridge blocks be added to the new area if they total less than bridge_limit instructions between two new-area blocks. You can specify the bridge_limit with -b; the default is zero. Bridge blocks can bump blocks out of the new area that might normally fit into it.

Because the cord2 command works from profile output, the resulting binary is data dependent. In other words, it may perform well only on the same input data that generated the profile information, and may perform worse than the original binary on other data. Furthermore, if the hot areas in the cache do not fit well into one cachepage, performance can degrade.

Options
The cord2 command also accepts these options:

-d  Fill the delay slots with nops only when adding jumps to and from the new area.

-v  Print verbose information. This includes statistics about the cord2 process.

-v -v  Print all of the -v information, but include detailed disassemblies of the code moved, changed, and generated by cord2.

-c cachewords  Specify the number of words in the cache of the machine on which you want to execute. This is actually the size of the new area. The name may be a misnomer, as you can specify a size other than your machine's cache size; however, the cache size is probably the correct number.

-o outfile  Specify the output file. If it is not specified, the default is a.out.cord2.

Restrictions
The cord2 command adds the new area to the end of text, so any program using the etext symbol may not work.

See Also
pixie(1), cord(1)

RISC cord2(1)
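The bridge-block rule from the Description can be sketched in Python as follows. This is an illustration under assumptions, not cord2's code: find_bridges is a hypothetical name, fall-through order is assumed to match original text order, and "less than" is taken as a strict comparison, so the default bridge_limit of zero copies nothing.

```python
def find_bridges(sizes, chosen, bridge_limit=0):
    """Return the indices of bridge blocks to copy into the new area.

    sizes[i] is the instruction count of block i in original text order.
    chosen holds the indices of blocks already selected for the new area.
    A run of unselected blocks lying between two selected blocks is copied
    when its total size is strictly less than bridge_limit (assumed reading).
    """
    bridges = []
    picked = sorted(chosen)
    for left, right in zip(picked, picked[1:]):
        gap = list(range(left + 1, right))
        if gap and sum(sizes[i] for i in gap) < bridge_limit:
            bridges.extend(gap)
    return bridges
```

A real implementation would also have to re-check that the chosen blocks plus the bridges still fit in the new area, since the text notes that bridge blocks can bump other blocks out.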