  1. <?xml version="1.0" encoding="UTF-8"?>
  2. <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  3. "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" [
  4. <!ENTITY mdash "&#x2014;" >
  5. <!ENTITY % version SYSTEM "version.ent">
  6. %version;
  7. ]>
  8. <!--
  9. - Copyright (C) 2012 Internet Systems Consortium, Inc. ("ISC")
  10. -
  11. - Permission to use, copy, modify, and/or distribute this software for any
  12. - purpose with or without fee is hereby granted, provided that the above
  13. - copyright notice and this permission notice appear in all copies.
  14. -
  15. - THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
  16. - REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
  17. - AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT,
  18. - INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
  19. - LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
  20. - OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
  21. - PERFORMANCE OF THIS SOFTWARE.
  22. -->
  23. <?xml-stylesheet href="bind10-guide.css" type="text/css"?>
  24. <book>
  25. <bookinfo>
  26. <title>DHCP Performance Guide</title>
  27. <!-- <subtitle>Various aspects of DHCP Performance in BIND 10</subtitle> -->
  28. <copyright>
  29. <year>2012</year>
  30. <holder>Internet Systems Consortium, Inc. ("ISC")</holder>
  31. </copyright>
  32. <author>
  33. <firstname>Tomasz</firstname>
  34. <surname>Mrugalski</surname>
  35. </author>
  36. <author>
  37. <firstname>Marcin</firstname>
  38. <surname>Siodelski</surname>
  39. </author>
  40. <abstract>
  41. <para>BIND 10 is a framework that features Domain Name System
  42. (DNS) and Dynamic Host Configuration Protocol (DHCP)
  43. software with development managed by Internet Systems Consortium (ISC).
  44. This document describes various aspects of DHCP performance,
  45. measurements and tuning. It covers BIND 10 DHCP (codename Kea),
  46. existing ISC DHCP4 software, perfdhcp (a DHCP performance
  47. measurement tool) and other related topics.</para>
  48. </abstract>
  49. <releaseinfo>This is a companion document for BIND 10 version
  50. &__VERSION__;.</releaseinfo>
  51. </bookinfo>
  52. <preface>
  53. <title>Preface</title>
  54. <section id="acknowledgements">
  55. <title>Acknowledgements</title>
  56. <para>ISC would like to acknowledge generous support for
  57. BIND 10 development of DHCPv4 and DHCPv6 components provided
  58. by <ulink url="http://www.comcast.com/">Comcast</ulink>.</para>
  59. </section>
  60. </preface>
  61. <chapter id="intro">
  62. <title>Introduction</title>
  63. <para>
  64. This document is in the early stages of development. It is
  65. expected to grow significantly in the near future. It will
  66. cover topics like database backend performance measurements,
  67. tools, and the pros and cons of various optimization techniques.
  68. </para>
  69. </chapter>
  70. <chapter id="dhcp4">
  71. <title>ISC DHCP 4.x</title>
  72. <para>
  73. TODO: Write something about ISC DHCP4 here.
  74. </para>
  75. </chapter>
  76. <chapter id="kea">
  77. <title>Kea</title>
  78. <para>
  79. </para>
  80. <section>
  81. <title>Backend performance evaluation</title>
  82. <para>
  83. Kea will support several different database backends, using
  84. both popular databases (like MySQL or SQLite) and
  85. custom-developed solutions (such as an in-memory database).
  86. To aid in the choice of backend, the BIND 10
  87. source code features a set of performance microbenchmarks.
  88. Written in C/C++, these are small tools that simulate expected
  89. DHCP server behaviour and evaluate the performance of the
  90. databases under consideration. As implemented, the benchmarks do not really
  91. simulate DHCP operation, but rather use a set of primitives
  92. that can be used by a real server. For this reason, they are called
  93. micro-benchmarks.
  94. </para>
  95. <para>Although there are many operations and data types that
  96. a server could store in a database, the most frequently used data
  97. type is lease information. Although the information held for IPv4
  98. and IPv6 leases differs slightly, it is expected that the performance
  99. differences will be minimal between IPv4 and IPv6 lease operations.
  100. Therefore each test uses the lease4 table (in which IPv4 leases are stored)
  101. for performance measurements.
  102. </para>
  103. <para>All benchmarks are implemented as single-threaded applications
  104. that use a single database connection.</para>
  105. <para>
  106. These benchmarks are stored in the tests/tools/dhcp-ubench directory of the
  107. BIND 10 source tree. This directory contains simplified prototypes for
  108. the various database back-ends that are planned or considered as a
  109. possibility for BIND 10 DHCP. These benchmarks are expected to evolve into
  110. useful tools that will allow users to measure performance in their
  111. specific environment.
  112. </para>
  113. <para>
  114. Currently the following benchmarks are implemented:
  115. <itemizedlist>
  116. <listitem><para>In memory + flat file</para></listitem>
  117. <listitem><para>SQLite</para></listitem>
  118. <listitem><para>MySQL</para></listitem>
  119. </itemizedlist>
  120. </para>
  121. <para>
  122. As the benchmarks require additional (sometimes heavy) dependencies, they
  123. are not built by default. In fact, their build system is completely
  124. separate from that of the rest of BIND 10. It will eventually be merged
  125. with the main BIND 10 build system.
  126. </para>
  127. <para>
  128. All benchmarks will follow the same pattern:
  129. <orderedlist>
  130. <listitem><para>Prepare operation (connect to a database, create a file etc.)</para></listitem>
  131. <listitem><para>Measure timestamp 0</para></listitem>
  132. <listitem><para>Commit new lease4 record (repeated N times)</para></listitem>
  133. <listitem><para>Measure timestamp 1</para></listitem>
  134. <listitem><para>Search for random lease4 record (repeated N times)</para></listitem>
  135. <listitem><para>Measure timestamp 2</para></listitem>
  136. <listitem><para>Update existing lease4 record (repeated N times)</para></listitem>
  137. <listitem><para>Measure timestamp 3</para></listitem>
  138. <listitem><para>Delete existing lease4 record (repeated N times)</para></listitem>
  139. <listitem><para>Measure timestamp 4</para></listitem>
  140. <listitem><para>Print out statistics, based on N and measured timestamps.</para></listitem>
  141. </orderedlist>
  142. Although this approach does not attempt to simulate actual DHCP server
  143. operation, which involves a mix of all these steps, it answers
  144. questions about basic database strengths and weak points. In particular,
  145. it can show the impact of specific database optimizations, such as
  146. changing the engine or optimizing for writes or reads.
  147. </para>
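<para>The pattern above can be sketched in a few lines of Python. This is only
an illustration, not the actual benchmark code: a plain dictionary stands in
for the lease4 table, and the function name, addresses and lease fields are
all hypothetical.</para>
<programlisting><![CDATA[
```python
import random
import time


def run_ubench(num=100):
    """Time create/search/update/delete phases against an in-memory dict."""
    leases = {}                                   # stand-in for the lease4 table
    addrs = [0xC0000200 + i for i in range(num)]  # hypothetical IPv4 addresses
    stamps = [time.perf_counter()]                # timestamp 0

    for addr in addrs:                            # create N leases
        leases[addr] = {"hwaddr": "00:11:22:33:44:55", "valid_lft": 3600}
    stamps.append(time.perf_counter())            # timestamp 1

    for _ in range(num):                          # search for N random leases
        _ = leases[random.choice(addrs)]
    stamps.append(time.perf_counter())            # timestamp 2

    for addr in addrs:                            # update N leases
        leases[addr]["valid_lft"] = 7200
    stamps.append(time.perf_counter())            # timestamp 3

    for addr in addrs:                            # delete N leases
        del leases[addr]
    stamps.append(time.perf_counter())            # timestamp 4

    names = ["create", "search", "update", "delete"]
    # Elapsed seconds per phase, taken from consecutive timestamps.
    elapsed = {n: stamps[i + 1] - stamps[i] for i, n in enumerate(names)}
    return elapsed, leases


if __name__ == "__main__":
    elapsed, _ = run_ubench(1000)
    for name, secs in elapsed.items():
        print("%-6s %.6f s" % (name, secs))
```
]]></programlisting>
<para>Because every phase runs the same number of iterations, the elapsed
times of different backends can be compared phase by phase.</para>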
  148. <para>
  149. The framework attempts to do the same amount of work for every
  150. backend, thus allowing a fair comparison between them.
  151. </para>
  152. </section>
  153. <section id="mysql-backend">
  154. <title>MySQL backend</title>
  155. <para>The MySQL backend requires the MySQL client development libraries. It uses
  156. the mysql_config tool (similar to pkg-config) to discover required
  157. compilation and linking options. To install required packages on Ubuntu,
  158. use the following command:
  159. <screen>$ <userinput>sudo apt-get install mysql-client mysql-server libmysqlclient-dev</userinput></screen>
  160. Make sure that the MySQL server is running and that your setup
  161. includes a user who is able to modify the database being used.</para>
  162. <para>Before running tests, you need to initialize your database. You can
  163. use the mysql.schema script for that purpose.</para>
  164. <para><emphasis>WARNING: This script will drop any existing
  165. Kea database. Do not run it on your production server.</emphasis></para>
  166. <para>Assuming your
  167. MySQL user is "kea", you can initialize your test database by:
  168. <screen>$ <userinput>mysql -u kea -p &lt; mysql.schema</userinput></screen>
  169. </para>
  170. <para>After the database is initialized, you are ready to run the test:
  171. <screen>$ <userinput>./mysql_ubench</userinput></screen>
  172. or
  173. <screen>$ <userinput>./mysql_ubench &gt; results-mysql.txt</userinput></screen>
  174. Redirecting output to a file is important, because for each operation
  175. there is a single character printed to show progress. If you have a slow
  176. terminal, this may considerably affect test performance. On the other hand,
  177. printing something after each operation is required as poor database settings
  178. may slow down operations to around 20 per second. (The observant user is expected
  179. to note that the initial dots are printed too slowly and abort the test.)</para>
  180. <para>Currently all default parameters are hardcoded. Default values can be
  181. overridden using command line switches. Although all benchmarks take
  182. the same list of parameters, some of them are specific to a given backend.
  183. To get a list of supported parameters, run the benchmark with the "-h" option:
  184. <screen>$ <userinput>./mysql_ubench -h</userinput></screen>
  185. </para>
  186. <para>Synchronous operation requires the database backend to
  187. physically store changes on disk before proceeding. This
  188. property ensures that no data is lost in case of a server
  189. failure. Unfortunately, it slows operation
  190. considerably. Asynchronous mode allows the database to write data at
  191. a later time (usually controlled by the database engine or the OS
  192. disk buffering mechanism).</para>
  193. <section>
  194. <title>MySQL tweaks</title>
  195. <para>To modify the default mysql_ubench parameters, command line
  196. switches can be used. The currently supported switches are
  197. (default values specified in brackets):
  198. <orderedlist>
  199. <listitem><para>-f name - name of the database ("kea")</para></listitem>
  200. <listitem><para>-m hostname - name of the database host ("localhost")</para></listitem>
  201. <listitem><para>-u user - MySQL username ("root")</para></listitem>
  202. <listitem><para>-p password - MySQL password ("secret")</para></listitem>
  203. <listitem><para>-n num - number of iterations (100)</para></listitem>
  204. <listitem><para>-s yes|no - should the operations be performed in a synchronous (yes)
  205. or asynchronous (no) manner (yes)</para></listitem>
  206. <listitem><para>-v yes|no - verbose mode. Should the test print out progress? (yes)</para></listitem>
  207. <listitem><para>-c yes|no - precompiled statements. Should the SQL statements be precompiled? (yes)</para></listitem>
  208. </orderedlist>
  209. </para>
  210. <para>One parameter that has a huge impact on performance is the choice of backend engine.
  211. You can get a list of the engines supported by your MySQL installation by using
  212. <screen>&gt; <userinput>show engines;</userinput></screen>
  213. in your mysql client. Two notable engines are MyISAM and InnoDB. mysql_ubench
  214. uses MyISAM for synchronous mode and InnoDB for asynchronous mode. Use
  215. '-s yes|no' to choose whether you want synchronous or asynchronous operations.</para>
  216. <para>Another factor that affects performance is the use of precompiled statements.
  217. In the basic approach, the actual SQL query is passed as a text string that is
  218. then parsed by the database engine. The alternative is a so-called precompiled
  219. statement. In this approach the SQL query is compiled once and specific values are
  220. bound to it. In the next iteration the query remains the same; only the bound values
  221. change (e.g. searching for a different address). Usage of basic or precompiled
  222. statements is controlled with '-c no|yes'.</para>
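<para>The difference between the two approaches can be illustrated with a
short sketch. It does not use the MySQL C API that mysql_ubench is built on.
Instead it uses the sqlite3 module from the Python standard library, which
keeps recently compiled statements in a cache, so a query whose text does not
change can be reused while only the bound value differs. The table layout and
function names are hypothetical and much simpler than the real schema.</para>
<programlisting><![CDATA[
```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, much-simplified lease table (not the real schema).
conn.execute("CREATE TABLE lease4 (addr INTEGER PRIMARY KEY, lft INTEGER)")
conn.executemany("INSERT INTO lease4 VALUES (?, ?)",
                 [(a, 3600) for a in range(1, 101)])


def search_basic(addr):
    # Basic approach: a new SQL string is built, and therefore parsed,
    # for every address that is looked up.
    sql = "SELECT lft FROM lease4 WHERE addr = %d" % int(addr)
    return conn.execute(sql).fetchone()


def search_bound(addr):
    # Bound approach: the SQL text never changes; only the bound value
    # does, so an already compiled statement can be reused.
    return conn.execute("SELECT lft FROM lease4 WHERE addr = ?",
                        (addr,)).fetchone()


if __name__ == "__main__":
    print(search_basic(5), search_bound(5), search_bound(1000))
```
]]></programlisting>
<para>Both functions return the same rows; they differ only in how much
parsing work the database has to repeat for each call.</para>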
  223. </section>
  224. </section>
  225. <section id="sqlite-ubench">
  226. <title>SQLite-ubench</title>
  227. <para>The SQLite backend requires both the sqlite3 development and run-time packages. Their
  228. names may vary from system to system, but on Ubuntu 12.04 they are called
  229. sqlite3 libsqlite3-dev. To install them, use the following command:
  230. <screen>&gt; <userinput>sudo apt-get install sqlite3 libsqlite3-dev</userinput></screen>
  231. Before running the test the database has to be created. Use the following command for that:
  232. <screen>&gt; <userinput>cat sqlite.schema | sqlite3 sqlite.db</userinput></screen>
  233. A new database called sqlite.db will be created. That is the default name used
  234. by the sqlite_ubench test. If you prefer another name, make sure you update
  235. sqlite_ubench.cc accordingly.</para>
  236. <para>Once the database is created, you can run tests:
  237. <screen>&gt; <userinput>./sqlite_ubench</userinput></screen>
  238. or
  239. <screen>&gt; <userinput>./sqlite_ubench &gt; results-sqlite.txt</userinput></screen>
  240. </para>
  241. <section id="sqlite-tweaks">
  242. <title>SQLite tweaks</title>
  243. <para>To modify default sqlite_ubench parameters, command line
  244. switches can be used. The currently supported switches are
  245. (default values specified in brackets):
  246. <orderedlist>
  247. <listitem><para>-f filename - name of the database file ("sqlite.db")</para></listitem>
  248. <listitem><para>-n num - number of iterations (100)</para></listitem>
  249. <listitem><para>-s yes|no - should the operations be performed in a synchronous (yes)
  250. or asynchronous (no) manner (yes)</para></listitem>
  251. <listitem><para>-v yes|no - verbose mode. Should the test print out progress? (yes)</para></listitem>
  252. <listitem><para>-c yes|no - precompiled statements. Should the SQL statements be precompiled? (yes)</para></listitem>
  253. </orderedlist>
  254. </para>
  255. <para>SQLite can run in asynchronous or synchronous mode. This
  256. mode is controlled by the "synchronous" parameter. It is set
  257. using the SQLite command:</para>
  258. <para><command>PRAGMA synchronous = ON|OFF</command></para>
  259. <para>Another tweakable feature is the journal mode. It can be
  260. set to one of several modes of operation. Its value can be
  261. modified in SQLite_uBenchmark::connect(). See
  262. http://www.sqlite.org/pragma.html#pragma_journal_mode for
  263. a detailed explanation.</para>
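<para>As an illustration of these two settings, the following sketch applies
them through the sqlite3 module from the Python standard library and reads
them back. It is not taken from sqlite_ubench; the function names and the
chosen values are assumptions made for the example.</para>
<programlisting><![CDATA[
```python
import os
import sqlite3
import tempfile

SYNC_LEVELS = ("OFF", "NORMAL", "FULL")
JOURNAL_MODES = ("DELETE", "TRUNCATE", "PERSIST", "MEMORY", "WAL", "OFF")


def open_tuned(path, synchronous="OFF", journal_mode="MEMORY"):
    """Open a database file and apply the two PRAGMAs discussed above."""
    # PRAGMA arguments cannot be bound as parameters, so restrict them
    # to known keywords before formatting them into the statement.
    if synchronous not in SYNC_LEVELS or journal_mode not in JOURNAL_MODES:
        raise ValueError("unsupported PRAGMA value")
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA synchronous = %s" % synchronous)
    conn.execute("PRAGMA journal_mode = %s" % journal_mode)
    return conn


def current_settings(conn):
    # synchronous is reported as an integer (0 = OFF, 1 = NORMAL, 2 = FULL);
    # journal_mode is reported as a lower-case string.
    sync = conn.execute("PRAGMA synchronous").fetchone()[0]
    mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    return sync, mode


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "sqlite.db")
    connection = open_tuned(db_path)
    print(current_settings(connection))
    connection.close()
```
]]></programlisting>
<para>Both settings apply to a single connection, so they have to be set
again each time the database is opened.</para>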
  264. <para>sqlite_ubench supports precompiled statements. Use
  265. '-c no|yes' to select which should be used: a basic SQL query (no) or
  266. a precompiled statement (yes).</para>
  267. </section>
  268. </section>
  269. <section id="memfile-ubench">
  270. <title>memfile-ubench</title> <para>The memfile backend is a
  271. custom backend that somewhat mimics the operation of ISC DHCP4. It
  272. implements in-memory storage using standard C++ and boost
  273. mechanisms (std::map and boost::shared_ptr&lt;&gt;). All
  274. database changes are also written to a lease file, which is
  275. strictly write-only: new data is only ever appended. This approach
  276. takes advantage of the fact that appending to a file is faster than
  277. modifying it in the middle (which often requires moving all data
  278. after the modified point, effectively rewriting large parts of
  279. the file, not just the changed fragment).</para>
  280. <para>There are no preparatory steps required for the memfile benchmark.
  281. The only requirement is the ability to create and write the specified lease
  282. file (dhcpd.leases in the current directory). The tests can be run
  283. as follows:
  284. <screen>&gt; <userinput>./memfile_ubench</userinput></screen>
  285. or
  286. <screen>&gt; <userinput>./memfile_ubench &gt; results-memfile.txt</userinput></screen>
  287. </para>
  288. <section id="memfile-tweaks">
  289. <title>memfile tweaks</title>
  290. <para>To modify default memfile_ubench parameters, command line
  291. switches can be used. Currently supported switches are
  292. (default values specified in brackets):
  293. <orderedlist>
  294. <listitem><para>-f filename - name of the database file ("dhcpd.leases")</para></listitem>
  295. <listitem><para>-n num - number of iterations (100)</para></listitem>
  296. <listitem><para>-s yes|no - should the operations be performed in a synchronous (yes)
  297. or asynchronous (no) manner (yes)</para></listitem>
  298. <listitem><para>-v yes|no - verbose mode. Should the test print out progress? (yes)</para></listitem>
  299. </orderedlist>
  300. </para>
  301. <para>memfile can run in asynchronous or synchronous mode. This
  302. mode is controlled by the sync parameter (-s). In synchronous mode it uses
  303. fflush() and fsync() to make sure that
  304. data is not left in buffers but is physically stored on disk.</para>
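<para>The effect of the two modes can be sketched as follows. This is a
Python analogue of the idea, not the memfile_ubench code: the flush() call
plays the role of fflush() and os.fsync() the role of fsync(). The function
name and the record format are hypothetical.</para>
<programlisting><![CDATA[
```python
import os
import tempfile


def append_lease(fileobj, record, sync=True):
    """Append one lease record; optionally force it to physical storage."""
    fileobj.write(record + "\n")
    if sync:
        fileobj.flush()             # push the user-space buffer to the OS
        os.fsync(fileobj.fileno())  # ask the OS to write its cache to disk


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "dhcpd.leases")
    with open(path, "a") as lease_file:   # "a" opens the file for appending
        append_lease(lease_file, "lease 192.0.2.1 lft 3600")
        append_lease(lease_file, "lease 192.0.2.1 lft 7200", sync=False)
    with open(path) as lease_file:
        print(lease_file.read(), end="")
```
]]></programlisting>
<para>In asynchronous mode the record may stay in a buffer for a while, which
is faster but leaves a window in which a crash can lose recent changes.</para>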
  305. </section>
  306. </section>
  307. <section>
  308. <title>Basic performance measurements</title>
  309. <para>This section contains sample results for backend performance measurements,
  310. taken using microbenchmarks. Tests were conducted on a reasonably powerful machine:
  311. <screen>
  312. CPU: Quad-core Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (8 logical cores)
  313. HDD: 1.5TB Seagate Barracuda ST31500341AS 7200rpm, ext4 partition
  314. OS: Ubuntu 12.04, running kernel 3.2.0-26-generic SMP x86_64
  315. compiler: g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  316. MySQL version: 5.5.24
  317. SQLite version: 3.7.9 (source ID 2011-11-01 00:52:41 c7c6050ef060877ebe77b41d959e9df13f8c9b5e)</screen>
  318. </para>
  319. <para>The benchmarks were run without using precompiled statements.
  320. The code was compiled with the -O0 flag (no code optimizations).
  321. Each run was executed once.</para>
  322. <para>Two series of measurements were made, synchronous and
  323. asynchronous. As those modes offer radically different
  324. performance, synchronous mode was run for one
  325. thousand repetitions and asynchronous mode for
  326. one hundred thousand repetitions.</para>
  327. <!-- raw results sync -->
  328. <table><title>Synchronous results (basic)</title>
  329. <tgroup cols='7' align='center' colsep='1' rowsep='1'>
  330. <colspec colname='Backend'/>
  331. <colspec colname='Num' />
  332. <colspec colname='Create'/>
  333. <colspec colname='Search'/>
  334. <colspec colname='Update'/>
  335. <colspec colname='Delete'/>
  336. <colspec colname='Average'/>
  337. <thead>
  338. <row>
  339. <entry>Backend</entry>
  340. <entry>Operations</entry>
  341. <entry>Create [s]</entry>
  342. <entry>Search [s]</entry>
  343. <entry>Update [s]</entry>
  344. <entry>Delete [s]</entry>
  345. <entry>Average [s]</entry>
  346. </row>
  347. </thead>
  348. <tbody>
  349. <row>
  350. <entry>MySQL</entry>
  351. <entry>1,000</entry>
  352. <entry>31.604</entry>
  353. <entry> 0.117</entry>
  354. <entry>27.964</entry>
  355. <entry>27.695</entry>
  356. <entry>21.845</entry>
  357. </row>
  358. <row>
  359. <entry>SQLite</entry>
  360. <entry>1,000</entry>
  361. <entry>61.421</entry>
  362. <entry> 0.033</entry>
  363. <entry>59.477</entry>
  364. <entry>56.034</entry>
  365. <entry>44.241</entry>
  366. </row>
  367. <row>
  368. <entry>memfile</entry>
  369. <entry>1,000</entry>
  370. <entry>38.224</entry>
  371. <entry> 0.001</entry>
  372. <entry>38.041</entry>
  373. <entry>38.017</entry>
  374. <entry>28.571</entry>
  375. </row>
  376. </tbody>
  377. </tgroup>
  378. </table>
  379. <para>The following results were measured in asynchronous mode.
  380. Each backend was run with one hundred thousand repetitions.</para>
  381. <!-- raw results async -->
  382. <table><title>Asynchronous results (basic)</title>
  383. <tgroup cols='7' align='center' colsep='1' rowsep='1'>
  384. <colspec colname='Backend'/>
  385. <colspec colname='Num' />
  386. <colspec colname='Create'/>
  387. <colspec colname='Search'/>
  388. <colspec colname='Update'/>
  389. <colspec colname='Delete'/>
  390. <colspec colname='Average'/>
  391. <thead>
  392. <row>
  393. <entry>Backend</entry>
  394. <entry>Operations</entry>
  395. <entry>Create [s]</entry>
  396. <entry>Search [s]</entry>
  397. <entry>Update [s]</entry>
  398. <entry>Delete [s]</entry>
  399. <entry>Average [s]</entry>
  400. </row>
  401. </thead>
  402. <tbody>
  403. <row>
  404. <entry>MySQL</entry>
  405. <entry>100,000</entry>
  406. <entry>10.585</entry>
  407. <entry>10.386</entry>
  408. <entry>10.062</entry>
  409. <entry> 8.890</entry>
  410. <entry> 9.981</entry>
  411. </row>
  412. <row>
  413. <entry>SQLite</entry>
  414. <entry>100,000</entry>
  415. <entry> 3.710</entry>
  416. <entry> 3.159</entry>
  417. <entry> 2.865</entry>
  418. <entry> 2.439</entry>
  419. <entry> 3.044</entry>
  420. </row>
  421. <row>
  422. <entry>memfile</entry>
  423. <entry>100,000</entry>
  424. <entry> 1.300</entry>
  425. <entry> 0.039</entry>
  426. <entry> 1.307</entry>
  427. <entry> 1.278</entry>
  428. <entry> 0.981</entry>
  429. </row>
  430. </tbody>
  431. </tgroup>
  432. </table>
  433. <para>The presented performance results can be converted into operations per second metrics.
  434. It should be noted that due to large differences between various operations (sometimes
  435. over three orders of magnitude), it is difficult to create a simple, readable chart with
  436. that data.</para>
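<para>The conversion itself is simple arithmetic and can be sketched as
follows. The function names are hypothetical. Note that the "Average" column
in the table appears to be the mean of the four per-operation rates, not the
number of operations divided by the mean time, so it is dominated by the
fastest operation.</para>
<programlisting><![CDATA[
```python
def ops_per_second(num_ops, seconds):
    """Convert one timed phase into an operations-per-second rate."""
    return num_ops / seconds


def summarize(num_ops, create, search, update, delete):
    """Return the four per-operation rates followed by their mean."""
    rates = [ops_per_second(num_ops, t)
             for t in (create, search, update, delete)]
    # Mean of the four rates, matching how the "Average" column behaves.
    return rates + [sum(rates) / len(rates)]


if __name__ == "__main__":
    # Asynchronous MySQL row from the raw results: 100,000 operations.
    for value in summarize(100000, 10.585, 10.386, 10.062, 8.890):
        print("%.2f" % value)
```
]]></programlisting>
<para>The values computed this way differ slightly from the published ones,
because the published times are rounded to three decimal places.</para>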
  437. <table id="tbl-basic-perf-results"><title>Estimated basic performance</title>
  438. <tgroup cols='6' align='center' colsep='1' rowsep='1'>
  439. <colspec colname='Backend'/>
  440. <colspec colname='Create'/>
  441. <colspec colname='Search'/>
  442. <colspec colname='Update'/>
  443. <colspec colname='Delete'/>
  444. <colspec colname='Average'/>
  445. <thead>
  446. <row>
  447. <entry>Backend</entry>
  448. <entry>Create [oper/s]</entry>
  449. <entry>Search [oper/s]</entry>
  450. <entry>Update [oper/s]</entry>
  451. <entry>Delete [oper/s]</entry>
  452. <entry>Average [oper/s]</entry>
  453. </row>
  454. </thead>
  455. <tbody>
  456. <row>
  457. <entry>MySQL (async)</entry>
  458. <entry>9447.47</entry>
  459. <entry>9627.97</entry>
  460. <entry>9938.00</entry>
  461. <entry>11248.34</entry>
  462. <entry>10065.45</entry>
  463. </row>
  464. <row>
  465. <entry>SQLite (async)</entry>
  466. <entry>26951.59</entry>
  467. <entry>31654.29</entry>
  468. <entry>34899.70</entry>
  469. <entry>40993.59</entry>
  470. <entry>33624.79</entry>
  471. </row>
  472. <row>
  473. <entry>memfile (async)</entry>
  474. <entry>76944.27</entry>
  475. <entry>2542588.35</entry>
  476. <entry>76504.54</entry>
  477. <entry>78269.25</entry>
  478. <entry>693576.60</entry>
  479. </row>
  480. <row>
  481. <entry>MySQL (sync)</entry>
  482. <entry>31.64</entry>
  483. <entry>8575.45</entry>
  484. <entry>35.76</entry>
  485. <entry>36.11</entry>
  486. <entry>2169.74</entry>
  487. </row>
  488. <row>
  489. <entry>SQLite (sync)</entry>
  490. <entry>16.28</entry>
  491. <entry>20045.37</entry>
  492. <entry>16.81</entry>
  493. <entry>17.85</entry>
  494. <entry>7524.08</entry>
  495. </row>
  496. <row>
  497. <entry>memfile (sync)</entry>
  498. <entry>26.16</entry>
  499. <entry>1223990.21</entry>
  500. <entry>26.29</entry>
  501. <entry>26.30</entry>
  502. <entry>306017.24</entry>
  503. </row>
  504. </tbody>
  505. </tgroup>
  506. </table>
  507. <!-- that is obsolete approach that is going to be removed in docbook 5.0.
  508. Its only advantage is that it actually works with docbook2pdf -->
  509. <!--
  510. <figure>
  511. <title>Graphical representation of the basic performance
  512. results presented in table <xref linkend="tbl-basic-perf-results" />.</title>
  513. <graphic scale="100" fileref="performance-results-graph1.png" />
  514. </figure>-->
  515. <!-- this works great for HTML export, but is silently ignored by docbook2pdf
  516. and docbook2ps tools. -->
  517. <mediaobject>
  518. <imageobject>
  519. <imagedata fileref="performance-results-graph1.png" format="PNG" />
  520. </imageobject>
  521. <caption>
  522. <para>Graphical representation of the basic performance results
  523. presented in table <xref linkend="tbl-basic-perf-results" />.</para>
  524. </caption>
  525. </mediaobject>
  526. </section>
  527. <section>
  528. <title>Optimized performance measurements</title>
  529. <para>This section contains sample results for backend performance measurements,
  530. taken using microbenchmarks. Tests were conducted on a reasonably powerful machine:
  531. <screen>
  532. CPU: Quad-core Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (8 logical cores)
  533. HDD: 1.5TB Seagate Barracuda ST31500341AS 7200rpm, ext4 partition
  534. OS: Ubuntu 12.04, running kernel 3.2.0-26-generic SMP x86_64
  535. compiler: g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
  536. MySQL version: 5.5.24
  537. SQLite version: 3.7.9 (source ID 2011-11-01 00:52:41 c7c6050ef060877ebe77b41d959e9df13f8c9b5e)</screen>
  538. </para>
  539. <para>The benchmarks were run with precompiled statements enabled.
  540. The code was compiled with the -Ofast flag (optimize compilation for speed).
  541. Each run was repeated three times and measured values were averaged.</para>
  542. <para>Again the benchmarks were run in two series, synchronous and
  543. asynchronous. As those modes offer radically different
  544. performance, synchronous mode was run for one
  545. thousand repetitions and asynchronous mode for
  546. one hundred thousand repetitions.</para>
  547. <!-- raw results sync -->
  548. <table><title>Synchronous results (optimized)</title>
  549. <tgroup cols='7' align='center' colsep='1' rowsep='1'>
  550. <colspec colname='Backend'/>
  551. <colspec colname='Num' />
  552. <colspec colname='Create'/>
  553. <colspec colname='Search'/>
  554. <colspec colname='Update'/>
  555. <colspec colname='Delete'/>
  556. <colspec colname='Average'/>
  557. <thead>
  558. <row>
  559. <entry>Backend</entry>
  560. <entry>Operations</entry>
  561. <entry>Create [s]</entry>
  562. <entry>Search [s]</entry>
  563. <entry>Update [s]</entry>
  564. <entry>Delete [s]</entry>
  565. <entry>Average [s]</entry>
  566. </row>
  567. </thead>
  568. <tbody>
  569. <row>
  570. <entry>MySQL</entry>
  571. <entry>1,000</entry>
  572. <entry>27.887</entry>
  573. <entry> 0.106</entry>
  574. <entry>28.223</entry>
  575. <entry>27.696</entry>
  576. <entry>20.978</entry>
  577. </row>
  578. <row>
  579. <entry>SQLite</entry>
  580. <entry>1,000</entry>
  581. <entry>61.299</entry>
  582. <entry> 0.015</entry>
  583. <entry>59.648</entry>
  584. <entry>61.098</entry>
  585. <entry>45.626</entry>
  586. </row>
  587. <row>
  588. <entry>memfile</entry>
  589. <entry>1,000</entry>
  590. <entry>39.564</entry>
  591. <entry> 0.001</entry>
  592. <entry>39.543</entry>
  593. <entry>39.326</entry>
  594. <entry>29.608</entry>
  595. </row>
  596. </tbody>
  597. </tgroup>
  598. </table>
  599. <para>The following results were measured in asynchronous mode.
  600. Each backend was run with one hundred thousand repetitions.</para>
  601. <!-- raw results async -->
  602. <table><title>Asynchronous results (optimized)</title>
  603. <tgroup cols='7' align='center' colsep='1' rowsep='1'>
  604. <colspec colname='Backend'/>
  605. <colspec colname='Num' />
  606. <colspec colname='Create'/>
  607. <colspec colname='Search'/>
  608. <colspec colname='Update'/>
  609. <colspec colname='Delete'/>
  610. <colspec colname='Average'/>
  611. <thead>
  612. <row>
  613. <entry>Backend</entry>
  614. <entry>Operations</entry>
  615. <entry>Create [s]</entry>
  616. <entry>Search [s]</entry>
  617. <entry>Update [s]</entry>
  618. <entry>Delete [s]</entry>
  619. <entry>Average [s]</entry>
  620. </row>
  621. </thead>
  622. <tbody>
  623. <row>
  624. <entry>MySQL</entry>
  625. <entry>100,000</entry>
  626. <entry>8.507</entry>
  627. <entry>9.698</entry>
  628. <entry>7.785</entry>
  629. <entry>8.326</entry>
  630. <entry>8.579</entry>
  631. </row>
  632. <row>
  633. <entry>SQLite</entry>
  634. <entry>100,000</entry>
  635. <entry> 1.562</entry>
  636. <entry> 0.949</entry>
  637. <entry> 1.513</entry>
  638. <entry> 1.502</entry>
  639. <entry> 1.382</entry>
  640. </row>
  641. <row>
  642. <entry>memfile</entry>
  643. <entry>100,000</entry>
  644. <entry>1.302</entry>
  645. <entry>0.038</entry>
  646. <entry>1.306</entry>
  647. <entry>1.263</entry>
  648. <entry>0.977</entry>
  649. </row>
  650. </tbody>
  651. </tgroup>
  652. </table>
  653. <para>The presented performance results can be converted into operations per second metrics.
  654. It should be noted that due to large differences between various operations (sometimes
  655. over three orders of magnitude), it is difficult to create a simple, readable chart with
  656. the data.</para>
  657. <table id="tbl-optim-perf-results"><title>Estimated optimized performance</title>
  658. <tgroup cols='6' align='center' colsep='1' rowsep='1'>
  659. <colspec colname='Backend'/>
  660. <colspec colname='Create'/>
  661. <colspec colname='Search'/>
  662. <colspec colname='Update'/>
  663. <colspec colname='Delete'/>
  664. <colspec colname='Average'/>
  665. <thead>
  666. <row>
  667. <entry>Backend</entry>
  668. <entry>Create [oper/s]</entry>
  669. <entry>Search [oper/s]</entry>
  670. <entry>Update [oper/s]</entry>
  671. <entry>Delete [oper/s]</entry>
  672. <entry>Average [oper/s]</entry>
  673. </row>
  674. </thead>
  675. <tbody>
  676. <row>
  677. <entry>MySQL (async)</entry>
  678. <entry>11754.84</entry>
  679. <entry>10311.34</entry>
  680. <entry>12845.35</entry>
  681. <entry>12010.24</entry>
  682. <entry>11730.44</entry>
  683. </row>
  684. <row>
  685. <entry>SQLite (async)</entry>
  686. <entry>64005.90</entry>
  687. <entry>105391.29</entry>
  688. <entry>66075.51</entry>
  689. <entry>66566.43</entry>
  690. <entry>75509.78</entry>
  691. </row>
  692. <row>
  693. <entry>memfile (async)</entry>
  694. <entry>76832.16</entry>
  695. <entry>2636018.56</entry>
  696. <entry>76542.50</entry>
  697. <entry>79188.81</entry>
  698. <entry>717145.51</entry>
  699. </row>
  700. <row>
  701. <entry>MySQL (sync)</entry>
  702. <entry>35.86</entry>
  703. <entry>9461.10</entry>
  704. <entry>35.43</entry>
  705. <entry>36.11</entry>
  706. <entry>2392.12</entry>
  707. </row>
  708. <row>
  709. <entry>SQLite (sync)</entry>
  710. <entry>16.31</entry>
  711. <entry>67036.11</entry>
  712. <entry>16.76</entry>
  713. <entry>16.37</entry>
  714. <entry>16771.39</entry>
  715. </row>
  716. <row>
  717. <entry>memfile (sync)</entry>
  718. <entry>25.28</entry>
  719. <entry>3460207.61</entry>
  720. <entry>25.29</entry>
  721. <entry>25.43</entry>
  722. <entry>865070.90</entry>
  723. </row>
  724. </tbody>
  725. </tgroup>
  726. </table>
  727. <mediaobject>
  728. <imageobject>
  729. <imagedata fileref="performance-results-graph2.png" format="PNG"/>
  730. </imageobject>
  731. <textobject>
  732. <phrase>Optimized performance measurements</phrase>
  733. </textobject>
  734. <caption>
  735. <para>Graphical representation of the optimized performance
  736. results presented in table <xref linkend="tbl-optim-perf-results"
  737. />.</para>
  738. </caption>
  739. </mediaobject>
  740. </section>
  741. <section>
  742. <title>Conclusions</title>
  743. <para>
  744. The improvement gained by introducing support for precompiled
  745. statements in MySQL is somewhat disappointing: between 6 and
  746. 29%. On the other hand, the improvement in SQLite is
  747. surprisingly high: the efficiency is more than doubled.
  748. </para>
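<para>These figures can be checked against the asynchronous rows of the two
estimated-performance tables. The sketch below simply divides the optimized
rates by the basic ones; the names are hypothetical. Keep in mind that the
optimized runs differ from the basic ones in both the compilation flags and
the use of precompiled statements, so the ratios reflect both changes
together.</para>
<programlisting><![CDATA[
```python
# Asynchronous rates in oper/s (create, search, update, delete), copied
# from the basic and optimized estimated-performance tables.
BASIC = {
    "MySQL": (9447.47, 9627.97, 9938.00, 11248.34),
    "SQLite": (26951.59, 31654.29, 34899.70, 40993.59),
}
OPTIMIZED = {
    "MySQL": (11754.84, 10311.34, 12845.35, 12010.24),
    "SQLite": (64005.90, 105391.29, 66075.51, 66566.43),
}


def speedups(backend):
    """Return optimized/basic ratios for each of the four operations."""
    return [o / b for o, b in zip(OPTIMIZED[backend], BASIC[backend])]


if __name__ == "__main__":
    for backend in ("MySQL", "SQLite"):
        ratios = speedups(backend)
        print(backend, " ".join("%.2f" % r for r in ratios))
```
]]></programlisting>
<para>A ratio of 1.00 means no change, and 2.00 means the operation rate
doubled.</para>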
  749. <para>
  750. Compiled statements do not have any measurable impact on
  751. synchronous operations. That is as expected, because the major
  752. bottleneck is the disk performance.
  753. </para>
  754. <para>
  755. Compilation flags yield surprisingly high improvements for C++
  756. STL code. In some operations the memfile backend is almost
  757. twice as fast.
  758. </para>
  759. <para>
  760. If synchronous operation is required, the current performance
  761. results are likely to be deemed inadequate. The limiting
  762. factor here is disk access time. Even migrating to a
  763. high-performance 15,000 rpm disk is expected to only roughly double
  764. the number of leases per second, compared to the current results.
  765. The reason is that to write a file to disk, at least two disk
  766. sector writes
  767. are required: the new content and the i-node modification of the
  768. file. The easiest way to boost synchronous performance is to
  769. switch to SSDs. Memory-backed RAM disks are also a viable
  770. solution. However, care should be taken to properly engineer
  771. a backup strategy for such RAM disks.
  772. </para>
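<para>
The "roughly double" estimate can be illustrated with a back-of-envelope
calculation. The sketch below is not part of the benchmark: it assumes a
7,200 rpm baseline disk (the guide does not state the speed of the disk
used) and assumes that each sector write has to wait for one full platter
rotation. The function name is hypothetical.
</para>

```python
def max_sync_writes_per_second(rpm, sector_writes_per_lease=2):
    """Rough upper bound on synchronous lease writes per second.

    Assumption for illustration only: every sector write has to wait
    for one full platter rotation, and a lease needs two sector writes
    (new content plus i-node modification, as described in the text).
    """
    rotations_per_second = rpm / 60.0
    return rotations_per_second / sector_writes_per_lease


baseline = max_sync_writes_per_second(7200)   # assumed baseline disk
faster = max_sync_writes_per_second(15000)    # disk named in the text
print(baseline, faster, faster / baseline)
```

<para>
Under these assumptions the ratio depends only on the rotational speeds
(15,000 / 7,200, a little over 2), which is consistent with the estimate
above. Real disks, file systems and caches will behave differently.
</para>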
  773. <para>
  774. While the custom-made backend (memfile) provides the best
  775. performance, it carries over all the limitations existing in
  776. the ISC DHCP4 code: there are no external tools to query or
  777. change the database, maintenance requires deep knowledge, etc.
  778. Proper database backends such as MySQL and SQLite do not share
  779. those flaws. Both offer third-party
  780. tools for administrative tasks, and both are well documented and
  781. maintained. However, SQLite support for concurrent access is
  782. limiting in certain cases. Since all three investigated
  783. backends more than meet expected performance results, it is
  784. recommended to use MySQL as a first concrete database backend.
  785. Should this choice be rejected for any reason, the second
  786. recommended choice is SQLite.
  787. </para>
  788. <para>
  789. It should be emphasized that the obtained measurements indicate
  790. only database performance and they cannot be directly
  791. translated to expected leases per second or queries per second
  792. performance by an actual server. The DHCP server must do much
  793. more than just query the database to properly process a client's
  794. message. The provided results should be considered as only rough
  795. estimates. They can also be used for relative comparisons
  796. between backends.
  797. </para>
  798. </section>
  799. <section>
  800. <title>Possible further optimizations</title>
  801. <para>
  802. For basic measurements the code was compiled with -g -O0
  803. flags. For optimized measurements the benchmarking code was
  804. compiled with -Ofast (optimize for speed). In both cases, the
  805. same backend (MySQL or SQLite) library was used. It may be
  806. useful to recompile the libraries (or the whole server in case
  807. of MySQL) with -Ofast.
  808. </para>
  809. <para>
  810. There are many MySQL parameters that various sources recommend
  811. to improve performance. They were not investigated further.
  812. </para>
  813. <para>
  814. Currently all operations are conducted one at a
  815. time. Each operation is treated as a separate
  816. transaction. Grouping N operations together will potentially
  817. bring an almost N-fold increase in synchronous operation throughput. Such a
  818. feature is present in ISC DHCP4 and is called cache-threshold.
  819. Extending this benchmark in this regard should be
  820. considered. Grouping affects only write operations (insert,
  821. update and delete). Read operations (search) are expected to
  822. be barely affected.
  823. </para>
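<para>
The grouping idea can be sketched with Python's standard sqlite3 module.
This is an illustration only, not the benchmark code: the lease4 table,
its columns and the insert_leases helper are hypothetical, and an
in-memory database is used, so the example shows how the number of
commits shrinks rather than measuring any disk speed-up.
</para>

```python
import sqlite3


def insert_leases(conn, leases, batch_size):
    """Insert leases, committing once per batch_size rows.

    Returns the number of commits (transactions) issued.
    """
    commits = 0
    cur = conn.cursor()
    for start in range(0, len(leases), batch_size):
        batch = leases[start:start + batch_size]
        cur.executemany(
            "INSERT INTO lease4 (addr, hwaddr) VALUES (?, ?)", batch)
        conn.commit()  # with synchronous storage, each commit costs disk syncs
        commits += 1
    return commits


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lease4 (addr INTEGER PRIMARY KEY, hwaddr TEXT)")
leases = [(i, "00:0c:01:02:03:%02x" % (i % 256)) for i in range(1000)]

one_by_one = insert_leases(conn, leases[:500], batch_size=1)
grouped = insert_leases(conn, leases[500:], batch_size=100)
print(one_by_one, grouped)
```

<para>
With a batch size of 100, the same number of rows is written with one
hundredth of the commits. On synchronous storage the commit is the
expensive step, which is where the near N-fold gain would be expected
to come from.
</para>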
  824. <para>
  825. A multi-threaded or multi-process benchmark may be considered in
  826. the future. It may be somewhat difficult as only some backends
  827. support concurrent access.
  828. </para>
  829. </section>
  830. </chapter>
  831. <chapter id="perfdhcp">
  832. <title>perfdhcp</title>
  833. <section>
  834. <title>Purpose</title>
  835. <para>
  836. Evaluation of the performance of a DHCP server requires that it
  837. be tested under varying traffic loads. perfdhcp is a testing
  838. tool with the capability to create traffic loads
  839. and generate statistics from the results. Additional features,
  840. such as the ability to send customised DHCP packets, allow it to
  841. be used in a wide range of functional testing.
  842. </para>
  843. </section>
  844. <section id="perfdhcp-key-features">
  845. <title>Key features</title>
  846. <para>
  847. perfdhcp has a number of command line switches to
  848. control DHCP message exchanges. Currently they fall into
  849. the following categories:
  850. <itemizedlist>
  851. <listitem>
  852. <para>
  853. Rate control - control how many DHCP exchanges
  854. are initiated within a period of time. The tool can also simulate
  855. best effort conditions by attempting to initiate as many DHCP
  856. packet exchanges as possible within a unit of time.
  857. </para>
  858. </listitem>
  859. <listitem>
  860. <para>
  861. Test exit specifiers - control the conditions for test
  862. completion, including the number of initiated exchanges,
  863. the test period or the maximum number of dropped packets.
  864. </para>
  865. </listitem>
  866. <listitem>
  867. <para>
  868. Packet templates - specify files containing packet templates that
  869. are used by perfdhcp to create custom DHCP messages. The tool
  870. allows the specification of a number of values indicating
  871. offsets of values within a packet that are set by the tool.
  872. </para>
  873. </listitem>
  874. <listitem>
  875. <para>
  876. Reporting - for each test produce a set of performance data
  877. including the achieved packet exchange rate (server performance).
  878. There are a number of diagnostic selectors available that
  879. enable periodic (intermediate) reporting, printing of packet timestamps,
  880. and the listing of detailed information about internal perfdhcp
  881. states (for debugging).
  882. </para>
  883. </listitem>
  884. <listitem>
  885. <para>
  886. Different modes of operation - specify the DHCP protocol used
  887. (v4 or v6), two-way or four-way exchanges, and use of the
  888. Rapid Commit option for DHCPv6.
  889. </para>
  890. </listitem>
  891. <listitem>
  892. <para>
  893. IP layer options - specify the local/remote address, local interface
  894. and local port to be used for communication with the DHCP server.
  895. </para>
  896. </listitem>
  897. </itemizedlist>
  898. </para>
  899. </section>
  900. <section id="perfdhcp-command-line">
  901. <title>Command line options</title>
  902. <para>
  903. The following "help" output from the tool describes the
  904. command line switches. This summary also lists the tool's
  905. possible exit codes as well as describing the
  906. error counters printed when the test is complete:
  907. <screen>$ <userinput>perfdhcp -h</userinput>
  908. <![CDATA[perfdhcp [-hv] [-4|-6] [-r<rate>] [-t<report>] [-R<range>] [-b<base>]
  909. [-n<num-request>] [-p<test-period>] [-d<drop-time>] [-D<max-drop>]
  910. [-l<local-addr|interface>] [-P<preload>] [-a<aggressivity>]
  911. [-L<local-port>] [-s<seed>] [-i] [-B] [-c] [-1]
  912. [-T<template-file>] [-X<xid-offset>] [-O<random-offset]
  913. [-E<time-offset>] [-S<srvid-offset>] [-I<ip-offset>]
  914. [-x<diagnostic-selector>] [-w<wrapped>] [server]
  915. The [server] argument is the name/address of the DHCP server to
  916. contact. For DHCPv4 operation, exchanges are initiated by
  917. transmitting a DHCP DISCOVER to this address.
  918. For DHCPv6 operation, exchanges are initiated by transmitting a DHCP
  919. SOLICIT to this address. In the DHCPv6 case, the special name 'all'
  920. can be used to refer to All_DHCP_Relay_Agents_and_Servers (the
  921. multicast address FF02::1:2), or the special name 'servers' to refer
  922. to All_DHCP_Servers (the multicast address FF05::1:3). The [server]
  923. argument is optional only in the case that -l is used to specify an
  924. interface, in which case [server] defaults to 'all'.
  925. The default is to perform a single 4-way exchange, effectively pinging
  926. the server.
  927. The -r option is used to set up a performance test, without
  928. it exchanges are initiated as fast as possible.
  929. Options:
  930. -1: Take the server-ID option from the first received message.
  931. -4: DHCPv4 operation (default). This is incompatible with the -6 option.
  932. -6: DHCPv6 operation. This is incompatible with the -4 option.
  933. -a<aggressivity>: When the target sending rate is not yet reached,
  934. control how many exchanges are initiated before the next pause.
  935. -b<base>: The base mac, duid, IP, etc, used to simulate different
  936. clients. This can be specified multiple times, each instance is
  937. in the <type>=<value> form, for instance:
  938. (and default) mac=00:0c:01:02:03:04.
  939. -d<drop-time>: Specify the time after which a request is treated as
  940. having been lost. The value is given in seconds and may contain a
  941. fractional component. The default is 1 second.
  942. -E<time-offset>: Offset of the (DHCPv4) secs field / (DHCPv6)
  943. elapsed-time option in the (second/request) template.
  944. The value 0 disables it.
  945. -h: Print this help.
  946. -i: Do only the initial part of an exchange: DO or SA, depending on
  947. whether -6 is given.
  948. -I<ip-offset>: Offset of the (DHCPv4) IP address in the requested-IP
  949. option / (DHCPv6) IA_NA option in the (second/request) template.
  950. -l<local-addr|interface>: For DHCPv4 operation, specify the local
  951. hostname/address to use when communicating with the server. By
  952. default, the interface address through which traffic would
  953. normally be routed to the server is used.
  954. For DHCPv6 operation, specify the name of the network interface
  955. via which exchanges are initiated.
  956. -L<local-port>: Specify the local port to use
  957. (the value 0 means to use the default).
  958. -O<random-offset>: Offset of the last octet to randomize in the template.
  959. -P<preload>: Initiate first <preload> exchanges back to back at startup.
  960. -r<rate>: Initiate <rate> DORA/SARR (or if -i is given, DO/SA)
  961. exchanges per second. A periodic report is generated showing the
  962. number of exchanges which were not completed, as well as the
  963. average response latency. The program continues until
  964. interrupted, at which point a final report is generated.
  965. -R<range>: Specify how many different clients are used. With 1
  966. (the default), all requests seem to come from the same client.
  967. -s<seed>: Specify the seed for randomization, making it repeatable.
  968. -S<srvid-offset>: Offset of the server-ID option in the
  969. (second/request) template.
  970. -T<template-file>: The name of a file containing the template to use
  971. as a stream of hexadecimal digits.
  972. -v: Report the version number of this program.
  973. -w<wrapped>: Command to call with start/stop at the beginning/end of
  974. the program.
  975. -x<diagnostic-selector>: Include extended diagnostics in the output.
  976. <diagnostic-selector> is a string of single-keywords specifying
  977. the operations for which verbose output is desired. The selector
  978. keyletters are:
  979. * 'a': print the decoded command line arguments
  980. * 'e': print the exit reason
  981. * 'i': print rate processing details
  982. * 'r': print randomization details
  983. * 's': print first server-id
  984. * 't': when finished, print timers of all successful exchanges
  985. * 'T': when finished, print templates
  986. -X<xid-offset>: Transaction ID (aka. xid) offset in the template.
  987. DHCPv4 only options:
  988. -B: Force broadcast handling.
  989. DHCPv6 only options:
  990. -c: Add a rapid commit option (exchanges will be SA).
  991. The remaining options are used only in conjunction with -r:
  992. -D<max-drop>: Abort the test if more than <max-drop> requests have
  993. been dropped. Use -D0 to abort if even a single request has been
  994. dropped. If <max-drop> includes the suffix '%', it specifies a
  995. maximum percentage of requests that may be dropped before abort.
  996. In this case, testing of the threshold begins after 10 requests
  997. have been expected to be received.
  998. -n<num-request>: Initiate <num-request> transactions. No report is
  999. generated until all transactions have been initiated/waited-for,
  1000. after which a report is generated and the program terminates.
  1001. -p<test-period>: Send requests for the given test period, which is
  1002. specified in the same manner as -d. This can be used as an
  1003. alternative to -n, or both options can be given, in which case the
  1004. testing is completed when either limit is reached.
  1005. -t<report>: Delay in seconds between two periodic reports.
  1006. Errors:
  1007. - tooshort: received a too short message
  1008. - orphans: received a message which doesn't match an exchange
  1009. (duplicate, late or not related)
  1010. - locallimit: reached to local system limits when sending a message.
  1011. Exit status:
  1012. The exit status is:
  1013. 0 on complete success.
  1014. 1 for a general error.
  1015. 2 if an error is found in the command line arguments.
  1016. 3 if there are no general failures in operation, but one or more
  1017. exchanges are not successfully completed.
  1018. ]]>
  1019. </screen>
  1020. </para>
  1021. </section>
  1022. <section id="starting-perfdhcp">
  1023. <title>Starting perfdhcp</title>
  1024. <para>
  1025. In order to run a performance test, at least two separate systems
  1026. have to be installed: client and server. The first one must have
  1027. perfdhcp installed, and the latter must be running the DHCP server
  1028. (either v4 or v6). If only a single system is available, the client
  1029. and server can be run on virtual machines (running on the same
  1030. physical system), but in this case performance data may be heavily
  1031. impacted by the overhead involved in running the virtual
  1032. machines.
  1033. </para>
  1034. <para>
  1035. Currently, perfdhcp is seen from the server's perspective as a relay agent.
  1036. This simplifies its implementation: specifically, there is no need to
  1037. receive traffic sent to broadcast addresses. However, it does impose
  1038. a requirement that the IPv4
  1039. address has to be set manually on the interface that will be used to
  1040. communicate with the server. For example, if the DHCPv4 server is listening
  1041. on the interface connected to the 172.16.1.0 subnet, the interface on the client
  1042. machine has to have an address assigned from the same subnet, e.g.
  1043. <screen><userinput># ifconfig eth3 172.16.1.2 netmask 255.255.255.0 up</userinput></screen>
  1044. </para>
  1045. <para>
  1046. As DHCP uses low port numbers (67 for DHCPv4 relays and
  1047. 547 for DHCPv6), running perfdhcp with non-root privileges will
  1048. usually result in an error message similar to this:
  1049. <screen><userinput>$ perfdhcp -4 -l eth3 -r 100 all</userinput>
  1050. Error running perfdhcp: Failed to bind socket 3 to 172.16.1.2/port=67
  1051. </screen>
  1052. The '-L' command line switch allows the use of a custom local port.
  1053. However, although the following command line will work:
  1054. <screen><userinput>$ perfdhcp -4 -l eth3 -r 100 -L 10067 all</userinput></screen>
  1055. in the standard configuration no responses will be received
  1056. from the DHCP server because the server responds to the default relay
  1057. port 67. A way to overcome this issue is to run
  1058. perfdhcp as root.
  1059. </para>
  1060. </section>
  1061. <section id="perfdhcp-commandline-examples">
  1062. <title>perfdhcp command line examples</title>
  1063. <para>
  1064. In this section, a number of perfdhcp command line examples
  1065. are presented as a quick start guide for new users. For the
  1066. detailed list of command line options refer to
  1067. <xref linkend="perfdhcp-command-line"/>.
  1068. </para>
  1069. <section id="perfdhcp-basic-usage">
  1070. <title>Example: basic usage</title>
  1071. <para>
  1072. If the server is listening on an interface with IPv4 address 172.16.1.1,
  1073. the simplest perfdhcp command line will look like:
  1074. <screen><userinput># perfdhcp 172.16.1.1</userinput>
  1075. ***Rate statistics***
  1076. Rate: 206.345
  1077. ***Statistics for: DISCOVER-OFFER***
  1078. sent packets: 21641
  1079. received packets: 350
  1080. drops: 21291
  1081. orphans: 0
  1082. min delay: 9.022 ms
  1083. avg delay: 143.100 ms
  1084. max delay: 259.303 ms
  1085. std deviation: 56.074 ms
  1086. collected packets: 30
  1087. ***Statistics for: REQUEST-ACK***
  1088. sent packets: 350
  1089. received packets: 268
  1090. drops: 82
  1091. orphans: 0
  1092. min delay: 3.010 ms
  1093. avg delay: 152.470 ms
  1094. max delay: 258.634 ms
  1095. std deviation: 56.936 ms
  1096. collected packets: 0
  1097. </screen>
  1098. Here, perfdhcp uses remote address 172.16.1.1 as a
  1099. destination address and will use a suitable local interface for
  1100. communication. Since no rate control parameters have been specified,
  1101. it will initiate DHCP exchanges at the maximum possible rate. Due to the server's
  1102. performance limitations, it is likely that many of the packets will be dropped.
  1103. The performance test will continue running until it is
  1104. interrupted by the user (with ^C).
  1105. </para>
  1106. <para>
  1107. The default performance statistics reported by perfdhcp have the
  1108. following meaning:
  1109. <itemizedlist>
  1110. <listitem><para>Rate - number of packet exchanges (packet sent
  1111. to the server and matching response received from the server)
  1112. completed within a second.</para></listitem>
  1113. <listitem><para>sent packets - total number of DHCP packets of
  1114. a specific type sent to the server.</para></listitem>
  1115. <listitem><para>received packets - total number of DHCP packets
  1116. of a specific type received from the server.</para></listitem>
  1117. <listitem><para>drops - number of dropped packets for the
  1118. particular exchange. The number of dropped packets is calculated as
  1119. the difference between the number of sent packets and the number of
  1120. response packets received from the server. In some cases, the
  1121. server will have sent a response but perfdhcp execution ended before
  1122. the response arrived. In such a case this packet will be counted
  1123. as dropped.</para></listitem>
  1124. <listitem><para>orphans - number of packets that have been
  1125. received from the server and did not match any packet sent by
  1126. perfdhcp. This may occur if the received packet has been sent
  1127. to some other host or if the exchange timed out and
  1128. the sent packet was removed from perfdhcp's list of packets
  1129. awaiting a response.</para></listitem>
  1130. <listitem><para>min delay - minimum delay that occurred between
  1131. sending the packet to the server and receiving a response from
  1132. it.</para></listitem>
  1133. <listitem><para>avg delay - average delay between sending the
  1134. packet of the specific type to the server and receiving a response
  1135. from it.</para></listitem>
  1136. <listitem><para>max delay - maximum delay that occurred between
  1137. sending the packet to the server and receiving a response from
  1138. it.</para></listitem>
  1139. <listitem><para>std deviation - standard deviation of the delay
  1140. between sending the packet of a specific type to the server and
  1141. receiving a response from it.</para></listitem>
  1142. <listitem><para>collected packets - number of sent packets that
  1143. were garbage collected. Packets may get garbage collected when
  1144. the waiting time for a server response exceeds the value set with the
  1145. '-d' (drop time) switch,
  1146. <![CDATA[-d<drop-time>]]>.</para></listitem>
  1147. </itemizedlist>
  1148. </para>
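<para>
As a worked example of the drops statistic described above, the following
sketch recomputes the drop counts from the sent and received counters
shown in the sample output. The exchange_drops helper is hypothetical and
is not part of perfdhcp; the drop percentage is an extra figure derived
here for convenience.
</para>

```python
def exchange_drops(sent, received):
    """Return (drops, drop percentage) for one exchange type."""
    drops = sent - received
    percent = 100.0 * drops / sent if sent else 0.0
    return drops, percent


# Counters copied from the sample report in this section.
discover_offer = exchange_drops(sent=21641, received=350)
request_ack = exchange_drops(sent=350, received=268)

print("DISCOVER-OFFER: %d drops (%.1f%%)" % discover_offer)
print("REQUEST-ACK: %d drops (%.1f%%)" % request_ack)
```

<para>
The computed values (21291 and 82) match the drops lines printed by
perfdhcp in the sample report.
</para>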
  1149. <para>
  1150. Note: should multiple interfaces on the system running perfdhcp be
  1151. connected to the same subnet, the interface to be used for the test
  1152. can be specified using either the interface name:
  1153. <screen><userinput># perfdhcp -l eth3</userinput></screen>
  1154. or a local address assigned to it:
  1155. <screen><userinput># perfdhcp -l 172.16.1.2</userinput></screen>
  1156. </para>
  1157. </section>
  1158. <section id="perfdhcp-rate-control">
  1159. <title>Example: rate control</title>
  1160. <para>
  1161. In the examples above, perfdhcp initiates new exchanges at a best
  1162. effort rate. With this setting, many packets are expected to be dropped
  1163. by the server due to performance limitations. In many cases, though, it is
  1164. desirable to verify that the server can handle an expected (reasonable) rate
  1165. without dropping any packets. The following command is an example of such
  1166. a test: it causes perfdhcp to initiate 300 four-way exchanges
  1167. per second, and runs the test for 60 seconds:
  1168. <screen><userinput># perfdhcp -l eth3 -p 60 -r 300</userinput>
  1169. ***Rate statistics***
  1170. Rate: 256.683 exchanges/second, expected rate: 300 exchanges/second
  1171. ***Statistics for: DISCOVER-OFFER***
  1172. sent packets: 17783
  1173. received packets: 15401
  1174. drops: 2382
  1175. orphans: 0
  1176. min delay: 0.109 ms
  1177. avg delay: 75.756 ms
  1178. max delay: 575.614 ms
  1179. std deviation: 60.513 ms
  1180. collected packets: 11
  1181. ***Statistics for: REQUEST-ACK***
  1182. sent packets: 15401
  1183. received packets: 15317
  1184. drops: 84
  1185. orphans: 0
  1186. min delay: 0.536 ms
  1187. avg delay: 72.072 ms
  1188. max delay: 576.749 ms
  1189. std deviation: 58.189 ms
  1190. collected packets: 0
  1191. </screen>
  1192. Note that here, the packet drops for the DISCOVER-OFFER
  1193. exchange have been significantly reduced (when compared with the
  1194. output from the previous example) thanks to the setting of a
  1195. reasonable rate. The non-zero number of packet drops and the achieved
  1196. rate (256/s) indicate that the server's measured performance is lower than 300 leases
  1197. per second. A further rate decrease should eliminate most of the packet
  1198. drops and bring the achieved rate close to the expected rate:
  1199. <screen><userinput># perfdhcp -l eth3 -p 60 -r 100 -R 30</userinput>
  1200. ***Rate statistics***
  1201. Rate: 99.8164 exchanges/second, expected rate: 100 exchanges/second
  1202. ***Statistics for: DISCOVER-OFFER***
  1203. sent packets: 5989
  1204. received packets: 5989
  1205. drops: 0
  1206. orphans: 0
  1207. min delay: 0.023 ms
  1208. avg delay: 2.198 ms
  1209. max delay: 181.760 ms
  1210. std deviation: 9.429 ms
  1211. collected packets: 0
  1212. ***Statistics for: REQUEST-ACK***
  1213. sent packets: 5989
  1214. received packets: 5989
  1215. drops: 0
  1216. orphans: 0
  1217. min delay: 0.473 ms
  1218. avg delay: 2.355 ms
  1219. max delay: 189.658 ms
  1220. std deviation: 5.876 ms
  1221. collected packets: 0
  1222. </screen>
  1223. There are now no packet drops, confirming that the server is able to
  1224. handle a load of 100 leases/second.
  1225. Note that the last parameter (-R 30) configures perfdhcp to simulate
  1226. traffic from 30 distinct clients.
  1227. </para>
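<para>
When such tests are scripted, the achieved and expected rates can be
extracted from the report and compared automatically. The sketch below
assumes the "Rate:" line has exactly the format shown in the sample
output above, which may differ between perfdhcp versions; the
rate_shortfall helper is hypothetical.
</para>

```python
import re

# Pattern modelled on the sample "Rate:" line in this guide (an assumption).
RATE_LINE = re.compile(
    r"Rate:\s*([0-9.]+) exchanges/second, "
    r"expected rate:\s*([0-9.]+) exchanges/second")


def rate_shortfall(report):
    """Return (achieved, expected, achieved/expected) from a report."""
    match = RATE_LINE.search(report)
    if match is None:
        raise ValueError("no rate line with an expected rate found")
    achieved = float(match.group(1))
    expected = float(match.group(2))
    return achieved, expected, achieved / expected


sample = "Rate: 256.683 exchanges/second, expected rate: 300 exchanges/second"
achieved, expected, ratio = rate_shortfall(sample)
print("achieved %.1f of %.0f exchanges/s (%.0f%%)"
      % (achieved, expected, 100 * ratio))
```

<para>
A ratio well below 1, as in the 300 exchanges/second run, suggests that
the requested rate exceeds what the server can sustain.
</para>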
  1228. </section>
  1229. <section id="perfdhcp-templates">
  1230. <title>Example: templates</title>
  1231. <para>
  1232. By default the DHCP messages are formed with default options. With
  1233. template files, it is possible to define a custom packet format.
  1234. </para>
  1235. <para>
  1236. The template file names are specified with the <![CDATA[-T<template-file>]]>
  1237. command line option. This option can be specified zero, one or two times.
  1238. The first occurrence of this option refers to the DISCOVER or SOLICIT message
  1239. and the second occurrence refers to the REQUEST (DHCPv4 or DHCPv6) message.
  1240. If the -T option occurs only once, the DISCOVER or SOLICIT message will be
  1241. created from the template and the REQUEST message will be created
  1242. dynamically (without the template). Note that each template file
  1243. holds data for exactly one DHCP message type. Templates for multiple
  1244. message types must not be combined in a single file.
  1245. The content of template files is encoded as a series of ASCII hexadecimal
  1246. digits (each byte represented by two ASCII chars 00..FF). Data in a
  1247. template file is laid out in network byte order, so it can be used on
  1248. systems with different endianness.
  1249. perfdhcp forms the packet by replacing parts of the message buffer read
  1250. from the file with variable data such as elapsed time, hardware address, DUID
  1251. etc. The offsets where such variable data is placed are specific to the
  1252. template file and have to be specified on the command line. Refer to
  1253. <xref linkend="perfdhcp-command-line"/> to find out how to
  1254. specify offsets for particular options and fields. With the following
  1255. command line the DHCPv6 SOLICIT and REQUEST packets will be formed from
  1256. the solicit.hex and request6.hex templates:
  1257. <screen><userinput># perfdhcp -6 -l eth3 -r 100 -R 20 -T solicit.hex -T request6.hex -O 21 -E 84 -S 22 -I 40 servers</userinput>
  1258. ***Rate statistics***
  1259. Rate: 99.5398 exchanges/second, expected rate: 100 exchanges/second
  1260. ***Statistics for: SOLICIT-ADVERTISE***
  1261. sent packets: 570
  1262. received packets: 569
  1263. drops: 1
  1264. orphans: 0
  1265. min delay: 0.259 ms
  1266. avg delay: 0.912 ms
  1267. max delay: 6.979 ms
  1268. std deviation: 0.709 ms
  1269. collected packets: 0
  1270. ***Statistics for: REQUEST-REPLY***
  1271. sent packets: 569
  1272. received packets: 569
  1273. drops: 0
  1274. orphans: 0
  1275. min delay: 0.084 ms
  1276. avg delay: 0.607 ms
  1277. max delay: 6.490 ms
  1278. std deviation: 0.518 ms
  1279. collected packets: 0
  1280. </screen>
  1281. where the switches have the following meaning:
  1282. <itemizedlist>
  1283. <listitem><para>two occurrences of -O 21 - DUID's last octet
  1284. positions in SOLICIT and REQUEST respectively.</para></listitem>
  1285. <listitem><para>-E 84 - elapsed time option position in the
  1286. REQUEST template</para></listitem>
  1287. <listitem><para>-S 22 - server id position in the REQUEST
  1288. template</para></listitem>
  1289. <listitem><para>-I 40 - IA_NA option position in the REQUEST
  1290. template</para></listitem>
  1291. </itemizedlist>
  1292. </para>
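<para>
The template encoding described earlier in this section (two ASCII
hexadecimal digits per byte) can be produced from raw packet bytes with a
few lines of Python. This is a minimal sketch: the helper names are
hypothetical and the four-byte fragment is made up for illustration; it
is not a complete DHCP message.
</para>

```python
def encode_template(packet):
    """Encode raw packet bytes as a string of ASCII hexadecimal digits."""
    return packet.hex()


def decode_template(text):
    """Decode a template string back to raw bytes.

    Leading and trailing whitespace (such as a final newline) is
    stripped for convenience.
    """
    return bytes.fromhex(text.strip())


# Hypothetical four-byte fragment, not a complete DHCP message.
fragment = bytes([0x01, 0x00, 0xab, 0xff])
template_text = encode_template(fragment)
print(template_text)
```

<para>
Because every byte maps to exactly two characters, the byte at a given
position in the packet occupies two consecutive characters in the
template text.
</para>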
  1293. <para>
  1294. The offsets of options indicate where they begin in the packet.
  1295. The only exception to this rule is the <![CDATA[-O<random-offset>]]>
  1296. option, which specifies the end of the DUID (DHCPv6) or MAC address
  1297. (DHCPv4). Depending on the number of simulated clients
  1298. (see the <![CDATA[-R<random-range>]]> command line option) perfdhcp
  1299. randomizes bytes in the packet buffer starting from this
  1300. position backwards. For the number of simulated clients
  1301. <![CDATA[<=]]> 256 only one octet (at random-offset position)
  1302. will be randomized, for the number of clients <![CDATA[<=]]> 65536
  1303. two octets (at random-offset and random-offset-1)
  1304. will be randomized etc.
  1305. </para>
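<para>
The rule above can be expressed as a small calculation. The sketch below
only mirrors the description in this guide (one octet for up to 256
clients, two for up to 65536, and so on); it is not taken from the
perfdhcp source, and the function names are hypothetical.
</para>

```python
def randomized_octets(clients):
    """Number of trailing octets randomized for a given client count."""
    if 1 > clients:
        raise ValueError("at least one client is required")
    octets = 1
    capacity = 256
    # Each additional octet multiplies the number of distinct values by 256.
    while clients > capacity:
        octets += 1
        capacity *= 256
    return octets


def randomized_positions(random_offset, clients):
    """Offsets that get randomized, counting backwards from random_offset."""
    return [random_offset - i for i in range(randomized_octets(clients))]


print(randomized_positions(21, 20))
print(randomized_positions(21, 1000))
```

<para>
With -O 21 and -R 20, as in the example command line, only the octet at
offset 21 would be randomized; with 1000 simulated clients the octets at
offsets 21 and 20 would both be randomized.
</para>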
  1306. </section>
  1307. </section>
  1308. </chapter>
  1309. </book>