3 | | <h1 id="LinuxOSCodeProfiling"><b style="color:#000;background:#ffcc99">Linux OS Code Profiling</b></h1> |
4 | | <p> |
5 | | There are several options for code <b style="color:#000;background:#66ffff">profiling</b> on the <b style="color:#000;background:#ffff66">Linux</b> OS. The kernel itself has a <b style="color:#000;background:#66ffff">profiling</b> API which can be enabled: |
6 | | </p> |
7 | | <ul><li>CONFIG_<b style="color:#000;background:#66ffff">PROFILING</b> - General <b style="color:#000;background:#66ffff">profiling</b> |
8 | | </li><li>CONFIG_OPROFILE - OProfile system <b style="color:#000;background:#66ffff">profiling</b> (capable of <b style="color:#000;background:#66ffff">profiling</b> the whole system including kernel, kernel modules, libraries, and applications) |
9 | | </li></ul><p> |
10 | | OProfile was the <b style="color:#000;background:#66ffff">profiling</b> tool of choice for <b style="color:#000;background:#ffff66">Linux</b> developers for nearly 10 years. A few years back, kernel developers defined and implemented a new formal kernel API for accessing performance monitor counters (PMCs), hardware elements found in most modern CPUs, to address the needs of performance tools. Prior to this new API, OProfile used a special OProfile-specific kernel module while other tools relied on patches (perfctr, perfmon). |
11 | | </p> |
12 | | <p> |
13 | | The developers of the new <b style="color:#000;background:#66ffff">profiling</b> API also developed an example tool, called 'perf', that uses the new API. The perf tool has since matured greatly. OProfile is strictly a <b style="color:#000;background:#66ffff">profiling</b> tool. |
14 | | </p> |
15 | | <p> |
| 3 | = Linux OS Code Profiling = |
| 4 | There are several options for code profiling on the Linux OS. The kernel itself has a profiling API which can be enabled: |
| 5 | * CONFIG_PROFILING - General profiling |
| 6 | * CONFIG_OPROFILE - OProfile system profiling (capable of profiling the whole system including kernel, kernel modules, libraries, and applications) |
| 7 | |
| 8 | OProfile was the profiling tool of choice for Linux developers for nearly 10 years. A few years back, kernel developers defined and implemented a new formal kernel API for accessing performance monitor counters (PMCs), hardware elements found in most modern CPUs, to address the needs of performance tools. Prior to this new API, OProfile used a special OProfile-specific kernel module while other tools relied on patches (perfctr, perfmon). |
| 9 | |
| 10 | The developers of the new profiling API also developed an example tool, called 'perf', that uses the new API. The perf tool has since matured greatly. OProfile is strictly a profiling tool. |
| 11 | |
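Before using either facility it can help to confirm which of the options above the running kernel was built with. The following is a minimal sketch, not part of the original page: the function name `kernel_option_state` is hypothetical, and it assumes the kernel config is available either as `/proc/config.gz` or as `/boot/config-$(uname -r)` (neither is guaranteed on every system). A config file can also be passed explicitly.

```shell
# kernel_option_state OPTION [CONFIG_FILE]
# Prints y (built in), m (module), or "unset" for the given option.
# Hypothetical helper: the fallback locations below are assumptions
# and may not exist on a given system.
kernel_option_state() {
    opt=$1
    cfg=${2:-}
    if [ -n "$cfg" ]; then
        state=$(sed -n "s/^${opt}=//p" "$cfg")
    elif [ -r /proc/config.gz ]; then
        state=$(zcat /proc/config.gz | sed -n "s/^${opt}=//p")
    else
        state=$(sed -n "s/^${opt}=//p" "/boot/config-$(uname -r)" 2>/dev/null)
    fi
    echo "${state:-unset}"
}

# Example:
#   kernel_option_state CONFIG_PROFILING
#   kernel_option_state CONFIG_OPROFILE /path/to/linux/.config
```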
23 | | </p> |
24 | | <ul><li><a class="ext-link" href="http://rhaas.blogspot.co.uk/2012/06/perf-good-bad-ugly.html"><span class="icon"></span>http://rhaas.blogspot.co.uk/2012/06/perf-good-bad-ugly.html</a> |
25 | | </li><li><a class="ext-link" href="http://homepages.cwi.nl/~aeb/linux/profile.html"><span class="icon"></span>http://homepages.cwi.nl/~aeb/<b style="color:#000;background:#ffff66">linux</b>/profile.html</a> |
26 | | </li></ul><h2 id="BasicKernelProfilingCONFIG_PROFILINGandreadprofile">Basic Kernel <b style="color:#000;background:#66ffff">Profiling</b> (CONFIG_<b style="color:#000;background:#66ffff">PROFILING</b> and readprofile)</h2> |
27 | | <p> |
28 | | There are several facilities to see where the kernel spends its resources. A simple one, built in with CONFIG_<b style="color:#000;background:#66ffff">PROFILING</b>, stores the current EIP (instruction pointer) at each clock tick. |
29 | | </p> |
30 | | <p> |
31 | | To use this, ensure the kernel is built with CONFIG_<b style="color:#000;background:#66ffff">PROFILING</b> and either boot the kernel with the command line option <strong>profile=2</strong> or enable it at runtime with <strong>echo 2 > /sys/kernel/<b style="color:#000;background:#66ffff">profiling</b></strong>. |
32 | | </p> |
33 | | <p> |
34 | | This will cause a file /proc/profile to be created. The number provided (2 in the example above) is the number of positions EIP is shifted right when <b style="color:#000;background:#66ffff">profiling</b>. So a large number gives a coarse profile. The counters are reset by writing to /proc/profile. |
35 | | </p> |
36 | | <p> |
| 18 | * http://rhaas.blogspot.co.uk/2012/06/perf-good-bad-ugly.html |
| 19 | * http://homepages.cwi.nl/~aeb/linux/profile.html |
| 20 | |
| 21 | == Basic Kernel Profiling (CONFIG_PROFILING and readprofile) == |
| 22 | There are several facilities to see where the kernel spends its resources. A simple one, built in with CONFIG_PROFILING, stores the current EIP (instruction pointer) at each clock tick. |
| 23 | |
| 24 | To use this, ensure the kernel is built with CONFIG_PROFILING and either boot the kernel with the command line option profile=2 or enable it at runtime with echo 2 > /sys/kernel/profiling. |
| 25 | |
| 26 | This will cause a file /proc/profile to be created. The number provided (2 in the example above) is the number of positions EIP is shifted right when profiling. So a large number gives a coarse profile. The counters are reset by writing to /proc/profile. |
| 27 | |
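To make the shift value concrete: with a shift of N, each counter covers 2^N bytes of kernel text, and a sampled address is mapped to a counter by subtracting the start of kernel text and shifting right by N. The sketch below only illustrates that arithmetic; the function names and the example addresses are hypothetical and not taken from any particular kernel.

```shell
# profile_granularity SHIFT
# Bytes of kernel text covered by one counter.
profile_granularity() {
    echo $(( 1 << $1 ))
}

# profile_bucket PC TEXT_START SHIFT
# Index of the counter that a sampled instruction pointer lands in.
# PC and TEXT_START are illustrative values supplied by the caller.
profile_bucket() {
    echo $(( ($1 - $2) >> $3 ))
}

# Example with made-up addresses:
#   profile_granularity 2                      # 4 bytes per counter
#   profile_bucket 0xc0008123 0xc0008000 2     # offset 0x123 -> counter 72
#   profile_bucket 0xc0008123 0xc0008000 8     # coarser: counter 1
```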
41 | | </p> |
42 | | <ol><li>boot kernel compiled with CONFIG_<b style="color:#000;background:#66ffff">PROFILING</b> |
43 | | </li><li>enable profiling, either by placing <strong>profile=2</strong> on the kernel command line or dynamically with: |
44 | | <pre class="wiki">echo 2 > /sys/kernel/<b style="color:#000;background:#66ffff">profiling</b> # enable <b style="color:#000;background:#66ffff">profiling</b> |
45 | | </pre></li><li>(optional) clear counters |
46 | | <pre class="wiki">echo > /proc/profile # reset counters |
47 | | </pre></li><li>do some activity you wish to profile |
48 | | </li><li>use <strong>readprofile</strong> to interpret the results: |
49 | | <pre class="wiki">readprofile -m System.map | sort -nr | head -2 |
| 31 | 1. boot kernel compiled with CONFIG_PROFILING |
| 32 | 2. enable profiling, either by placing {{{profile=2}}} on the kernel command line or dynamically with: |
| 33 | {{{#!bash |
| 34 | echo 2 > /sys/kernel/profiling # enable profiling |
| 35 | }}} |
| 36 | 3. (optional) clear counters |
| 37 | {{{#!bash |
| 38 | echo > /proc/profile # reset counters |
| 39 | }}} |
| 40 | 4. do some activity you wish to profile |
| 41 | 5. use readprofile to interpret the results: |
| 42 | {{{#!bash |
| 43 | readprofile -m System.map | sort -nr | head -2 |
56 | | </p> |
57 | | <ul><li><a class="ext-link" href="http://lxr.missinglinkelectronics.com/linux/Documentation/basic_profiling.txt"><span class="icon"></span>http://lxr.missinglinkelectronics.com/<b style="color:#000;background:#ffff66">linux</b>/Documentation/basic_<b style="color:#000;background:#66ffff">profiling</b>.txt</a> |
58 | | </li><li><a class="ext-link" href="http://homepages.cwi.nl/~aeb/linux/profile.html"><span class="icon"></span>http://homepages.cwi.nl/~aeb/<b style="color:#000;background:#ffff66">linux</b>/profile.html</a> |
59 | | </li><li>See <a class="ext-link" href="http://lxr.missinglinkelectronics.com/linux/kernel/profile.c"><span class="icon"></span>kernel/profile.c</a> and <a class="ext-link" href="http://lxr.missinglinkelectronics.com/linux/fs/proc/proc_misc.c"><span class="icon"></span>fs/proc/proc_misc.c</a> and <a class="ext-link" href="http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=man&fname=/usr/share/catman/man1/readprofile.1.html"><span class="icon"></span>readprofile(1)</a>. |
60 | | </li></ul><h2 id="OProfile">OProfile</h2> |
61 | | <p> |
62 | | OProfile provides a <b style="color:#000;background:#66ffff">profiler</b>, post-processing tools for analyzing profile data, and an event counter. |
63 | | </p> |
64 | | <p> |
65 | | The tool used is called <strong>operf</strong>. Some processors are not supported by the underlying perf_events kernel API and are therefore not supported by operf. If you see <strong>Your kernel's Performance Events Subsystem does not support your processor type</strong>, fall back to the legacy mode using opcontrol. |
66 | | </p> |
67 | | <p> |
| 51 | * http://lxr.missinglinkelectronics.com/linux/Documentation/basic_profiling.txt |
| 52 | * http://homepages.cwi.nl/~aeb/linux/profile.html |
| 53 | * See [http://lxr.missinglinkelectronics.com/linux/kernel/profile.c kernel/profile.c] and [http://lxr.missinglinkelectronics.com/linux/fs/proc/proc_misc.c fs/proc/proc_misc.c] and [http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=man&fname=/usr/share/catman/man1/readprofile.1.html readprofile(1)]. |
| 54 | |
| 55 | == OProfile == |
| 56 | |
| 57 | OProfile provides a profiler, post-processing tools for analyzing profile data, and an event counter. |
| 58 | |
| 59 | The tool used is called {{{operf}}}. Some processors are not supported by the underlying perf_events kernel API and are therefore not supported by operf. If you see **Your kernel's Performance Events Subsystem does not support your processor type**, fall back to the legacy mode using opcontrol. |
| 60 | |
90 | | </p> |
91 | | <ul><li>oprofile kernel module (requires CONFIG_<b style="color:#000;background:#66ffff">PROFILING</b>=y and CONFIG_OPROFILE=m) |
92 | | </li><li>opcontrol - used to set up <b style="color:#000;background:#66ffff">profiling</b> (needs the vmlinux file) |
93 | | </li><li>oprofiled - the daemon (controlled via opcontrol) |
94 | | </li><li>opreport - report on collected samples |
95 | | </li></ul><p> |
| 78 | * oprofile kernel module (requires CONFIG_PROFILING=y and CONFIG_OPROFILE=m) |
| 79 | * opcontrol - used to set up profiling (needs the vmlinux file) |
| 80 | * oprofiled - the daemon (controlled via opcontrol) |
| 81 | * opreport - report on collected samples |
97 | | </p> |
98 | | <ul><li>--session-dir specifies the location to store samples. It defaults to /var/lib/oprofile and you can use this (with both opcontrol and opreport) to use samples from alternate locations |
99 | | </li><li>--separate specifies how to separate samples. By default they are all stored in a single file (none), but you can choose to store by: |
100 | | <ul><li>none - no profile separation (default) |
101 | | </li><li>lib - per-application profiles for libraries |
102 | | </li><li>kernel - per-application profiles for the kernel and kernel modules |
103 | | </li><li>thread - profiles for each thread and each task |
104 | | </li><li>cpu - profiles for each CPU |
105 | | </li><li>all - all of the above |
106 | | </li></ul></li><li>Using <strong>profile specification parameters</strong> you can choose how to sample and report data: |
107 | | <ul><li>cpu:0 - report just cpu0 (assuming data was collected separately (see above)) |
108 | | </li></ul></li><li>--vmlinux=file (both for opcontrol and opreport) specifies the vmlinux kernel image required for decoding kernel symbols |
109 | | </li><li>--setup will store the following list of parameters in /root/.oprofile/daemonrc to be used as default settings for opcontrol and opreport. Alternatively you can specify setup options to each program as needed |
110 | | </li></ul><p> |
| 83 | * --session-dir specifies the location to store samples. It defaults to /var/lib/oprofile and you can use this (with both opcontrol and opreport) to use samples from alternate locations |
| 84 | * --separate specifies how to separate samples. By default they are all stored in a single file (none), but you can choose to store by: |
| 85 | - none - no profile separation (default) |
| 86 | - lib - per-application profiles for libraries |
| 87 | - kernel - per-application profiles for the kernel and kernel modules |
| 88 | - thread - profiles for each thread and each task |
| 89 | - cpu - profiles for each CPU |
| 90 | - all - all of the above |
| 91 | * Using profile specification parameters you can choose how to sample and report data: |
| 92 | - cpu:0 - report just cpu0 (assuming data was collected separately (see above)) |
| 93 | * --vmlinux=file (both for opcontrol and opreport) specifies the vmlinux kernel image required for decoding kernel symbols |
| 94 | * --setup will store the following list of parameters in /root/.oprofile/daemonrc to be used as default settings for opcontrol and opreport. Alternatively you can specify setup options to each program as needed |
| 95 | |
112 | | </p> |
113 | | <ol><li>copy your current kernel's vmlinux to /tmp |
114 | | </li><li>(optional) set up the configuration for vmlinux symbol decoding, a specific session location, and separating events by CPU: |
115 | | <pre class="wiki">opcontrol --setup --vmlinux=/tmp/vmlinux --session-dir=/tmp/session1 --separate=cpu |
116 | | </pre></li><li>start capturing events: |
117 | | <pre class="wiki">opcontrol --start |
118 | | </pre><ul><li>you can force a flush of collected events via <strong>opcontrol --dump</strong> at any time |
119 | | </li><li>you can clear out currently collected events via <strong>opcontrol --reset</strong> at any time |
120 | | </li></ul></li><li>stop capturing events (and flush data): |
121 | | <pre class="wiki">opcontrol --shutdown |
122 | | </pre></li><li>report events: |
123 | | <pre class="wiki">opreport --vmlinux=/tmp/vmlinux --session-dir=/tmp/session1 |
124 | | </pre><ul><li>if events were captured separately per CPU (as shown above), you can show the data for just cpu0 via <strong>opreport cpu:0</strong> |
125 | | </li><li>Note that opreport doesn't make use of the conf file generated by opcontrol --setup |
126 | | </li></ul></li></ol><p> |
| 97 | 1. copy your current kernel's vmlinux to /tmp |
| 98 | 2. (optional) set up the configuration for vmlinux symbol decoding, a specific session location, and separating events by CPU: |
| 99 | {{{#!bash |
| 100 | opcontrol --setup --vmlinux=/tmp/vmlinux --session-dir=/tmp/session1 --separate=cpu |
| 101 | }}} |
| 102 | 3. start capturing events: |
| 103 | {{{#!bash |
| 104 | opcontrol --start |
| 105 | }}} |
| 106 | * you can force a flush of collected events via opcontrol --dump at any time |
| 107 | * you can clear out currently collected events via opcontrol --reset at any time |
| 108 | 4. stop capturing events (and flush data): |
| 109 | {{{#!bash |
| 110 | opcontrol --shutdown |
| 111 | }}} |
| 112 | 5. report events: |
| 113 | {{{#!bash |
| 114 | opreport --vmlinux=/tmp/vmlinux --session-dir=/tmp/session1 |
| 115 | }}} |
| 116 | * if events were captured separately per CPU (as shown above), you can show the data for just cpu0 via opreport cpu:0 |
| 117 | * Note that opreport doesn't make use of the conf file generated by opcontrol --setup |
| 118 | |
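The legacy-mode steps above can be strung together into a small wrapper that profiles a single command. This is a sketch only, using just the opcontrol and opreport options shown on this page; the function name `oprofile_run` and the `OPCONTROL` / `OPREPORT` override variables are hypothetical and exist so the sequence can be dry-run (for example with `OPCONTROL=echo`).

```shell
# Hypothetical overrides so the command sequence can be dry-run.
OPCONTROL=${OPCONTROL:-opcontrol}
OPREPORT=${OPREPORT:-opreport}

# oprofile_run VMLINUX SESSION_DIR COMMAND [ARGS...]
# Runs COMMAND between "opcontrol --start" and "opcontrol --shutdown",
# then prints the report for that session.
oprofile_run() {
    vmlinux=$1
    session=$2
    shift 2
    "$OPCONTROL" --setup --vmlinux="$vmlinux" --session-dir="$session" || return 1
    "$OPCONTROL" --start || return 1
    "$@"                          # the workload being profiled
    status=$?
    "$OPCONTROL" --shutdown       # stop capturing and flush data
    # opreport does not read the opcontrol --setup config, so pass options again
    "$OPREPORT" --vmlinux="$vmlinux" --session-dir="$session"
    return $status
}

# Example (requires root and the oprofile kernel module):
#   oprofile_run /tmp/vmlinux /tmp/session1 dd if=/dev/zero of=/dev/null count=100000
```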
132 | | </p> |
133 | | <ul><li><a class="ext-link" href="http://oprofile.sourceforge.net/doc/controlling-daemon.html"><span class="icon"></span>http://oprofile.sourceforge.net/doc/controlling-daemon.html</a> |
134 | | </li><li><a class="ext-link" href="http://oprofile.sourceforge.net/doc/getting-started-with-legacy.html"><span class="icon"></span>http://oprofile.sourceforge.net/doc/getting-started-with-legacy.html</a> |
135 | | </li></ul><h2 id="Perf">Perf</h2> |
136 | | <p> |
137 | | In general, the <strong>perf</strong> tool is considered easier to install and run than OProfile. |
138 | | </p> |
139 | | <p> |
| 123 | * http://oprofile.sourceforge.net/doc/controlling-daemon.html |
| 124 | * http://oprofile.sourceforge.net/doc/getting-started-with-legacy.html |
| 125 | |
| 126 | |
| 127 | == Perf == |
| 128 | In general, the {{{perf}}} tool is considered easier to install and run than OProfile. |
| 129 | |
141 | | </p> |
142 | | <ol><li>(optional) copy your current kernel's vmlinux to /tmp |
143 | | </li><li>capture 120 seconds worth of <b style="color:#000;background:#66ffff">profiling</b> data |
144 | | <pre class="wiki">perf record -p $(pidof program) sleep 120 |
145 | | </pre></li><li>report data (using kernel symbols): |
146 | | <pre class="wiki">perf report -k /tmp/vmlinux |
147 | | </pre><ul><li>the -k is optional and adds kernel symbol decoding |
148 | | </li></ul></li></ol><p> |
| 131 | 1. (optional) copy your current kernel's vmlinux to /tmp |
| 132 | 2. capture 120 seconds worth of profiling data |
| 133 | {{{#!bash |
| 134 | perf record -p $(pidof program) sleep 120 |
| 135 | }}} |
| 136 | 3. report data (using kernel symbols): |
| 137 | {{{#!bash |
| 138 | perf report -k /tmp/vmlinux |
| 139 | }}} |
| 140 | * the -k is optional and adds kernel symbol decoding |
| 141 | |
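The perf steps above can be wrapped the same way. The sketch below assumes the target process is already running and that its PID is known; the function name `perf_profile_pid` and the `PERF` override variable are hypothetical (the override allows a dry run with `PERF=echo`). It writes to `perf.data` in the current directory, which is also perf's default.

```shell
# Hypothetical override so the command sequence can be dry-run.
PERF=${PERF:-perf}

# perf_profile_pid PID SECONDS [VMLINUX]
# Samples the given process for SECONDS seconds, then opens the report.
# If VMLINUX is given it is passed via -k for kernel symbol decoding.
perf_profile_pid() {
    pid=$1
    secs=$2
    vmlinux=${3:-}
    # "sleep" only bounds the recording time; the samples come from PID
    "$PERF" record -p "$pid" -o perf.data -- sleep "$secs" || return 1
    if [ -n "$vmlinux" ]; then
        "$PERF" report -i perf.data -k "$vmlinux"
    else
        "$PERF" report -i perf.data
    fi
}

# Example:
#   perf_profile_pid "$(pidof program)" 120 /tmp/vmlinux
```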
150 | | </p> |
151 | | <ul><li><a class="ext-link" href="https://perf.wiki.kernel.org/index.php/Tutorial"><span class="icon"></span>https://perf.wiki.kernel.org/index.php/Tutorial</a> |
152 | | </li></ul><h2 id="OpenWrt"><a class="wiki" href="/wiki/OpenWrt">OpenWrt</a></h2> |
153 | | <p> |
154 | | <a class="wiki" href="/wiki/OpenWrt">OpenWrt</a> has support for both OProfile and perf. Because perf depends on glibc (or at least is configured that way), we recommend OProfile when using <a class="wiki" href="/wiki/OpenWrt">OpenWrt</a>. |
155 | | </p> |
156 | | <p> |
157 | | To enable OProfile on <a class="wiki" href="/wiki/OpenWrt">OpenWrt</a>, run make menuconfig and enable: |
158 | | </p> |
159 | | <ul><li>Global build Settings -> Compile the kernel with <b style="color:#000;background:#66ffff">profiling</b> enabled |
160 | | </li><li>Development -> oprofile |
161 | | </li><li>Development -> oprofile-utils |
162 | | <ul><li>Note that package/devel/oprofile/Makefile may need +librt added to DEPENDS |
163 | | </li></ul></li></ul><p> |
| 143 | * https://perf.wiki.kernel.org/index.php/Tutorial |
| 144 | |
| 145 | |
| 146 | = OpenWrt = |
| 147 | OpenWrt has support for both OProfile and perf. Because perf depends on glibc (or at least is configured that way), we recommend OProfile when using OpenWrt. |
| 148 | |
| 149 | To enable OProfile on OpenWrt, run make menuconfig and enable: |
| 150 | * Global build Settings -> Compile the kernel with profiling enabled |
| 151 | * Development -> oprofile |
| 152 | * Development -> oprofile-utils |
| 153 | - Note that package/devel/oprofile/Makefile may need +librt added to DEPENDS |
| 154 | |
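After saving the menuconfig selections, the resulting `.config` can be checked from the shell. This is a sketch under an explicit assumption: the symbol names `CONFIG_KERNEL_PROFILING`, `CONFIG_PACKAGE_oprofile` and `CONFIG_PACKAGE_oprofile-utils` are what these menu entries are assumed to map to and should be verified against your OpenWrt tree. The function name is hypothetical.

```shell
# openwrt_missing_profiling_opts CONFIG_FILE
# Prints each expected symbol that is not set to y in CONFIG_FILE.
# The symbol names are assumptions; verify them against your OpenWrt tree.
openwrt_missing_profiling_opts() {
    cfg=$1
    for sym in CONFIG_KERNEL_PROFILING CONFIG_PACKAGE_oprofile CONFIG_PACKAGE_oprofile-utils; do
        grep -q "^${sym}=y\$" "$cfg" || echo "$sym"
    done
}

# Example (run from the top of the OpenWrt build tree):
#   openwrt_missing_profiling_opts .config
```

No output means all three assumed symbols are enabled.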
174 | | </p> |
175 | | <ul><li><a class="ext-link" href="http://false.ekta.is/2012/11/cpu-profiling-applications-on-openwrt-with-perf-or-oprofile/"><span class="icon"></span><b style="color:#000;background:#66ffff">Profiling</b> on OpenWrt with perf or OProfile</a> |
176 | | </li></ul></div> |
177 | | |
178 | | <div class="trac-modifiedby"> |
179 | | <span><a href="/wiki/linux/profiling?action=diff&version=3" title="Version 3 by tharvey: added note about cns3xxx timer based profiling limitations">Last modified</a> <a class="timeline" href="/timeline?from=2015-04-07T16%3A03%3A47-07%3A00&precision=second" title="See timeline at 04/07/15 16:03:47">2 years ago</a></span> |
180 | | <span class="trac-print">Last modified on 04/07/15 16:03:47</span> |
181 | | </div> |
182 | | |
183 | | |
184 | | </div> |
185 | | |
186 | | |
187 | | </div> |
196 | | </div> |
203 | | }}} |
| 165 | * [http://false.ekta.is/2012/11/cpu-profiling-applications-on-openwrt-with-perf-or-oprofile/ Profiling on OpenWrt with perf or OProfile] |