diff --git a/doc/scaling/clixon-startup.png b/doc/scaling/clixon-startup.png
new file mode 100644
index 00000000..5115bb67
Binary files /dev/null and b/doc/scaling/clixon-startup.png differ
diff --git a/doc/scaling/large-lists.md b/doc/scaling/large-lists.md
index 35424dc0..7bb3653b 100644
--- a/doc/scaling/large-lists.md
+++ b/doc/scaling/large-lists.md
@@ -12,10 +12,17 @@ Olof Hagsand, 2019-04-17

 ## 1. Background

-Clixon can handle large configurations. Here, large number of elements
-in a "flat" list is presented. There are other scaling usecases,
-such as large configuratin "depth", large number of requesting
-clients, etc.
+Clixon can handle large configurations. Here, measurements using a
+large number of elements in a simple "flat" list are analysed. This
+includes starting up with a large existing database, initializing an
+empty database with a large number of entries, accessing single
+entries in a large database, etc.
+
+In short, the results show a linear dependency on the number of
+entries. This is OK for startup scenarios, but single-entry (transactional) operations need improvement.
+
+There are other scaling use cases, such as large configuration "depth",
+large number of requesting clients, etc.

 Thanks to [Netgate](www.netgate.com) for supporting this work.

@@ -41,7 +48,7 @@ The basic case is a large list, according to the following Yang specification:
 ```
 where `a` is a unique key and `b` is a payload, useful in replace operations.

-XML lists with `N` elements are generated based on
+With this, XML lists with `N` elements are generated based on
 this configuration, eg for `N=10`:
 ```
 00
@@ -66,6 +73,7 @@ Requests are either made over the _whole_ dataset, or for one specific element.
 Operations of single elements (transactions) are made in a burst of
 random elements, typically 100.

+
 ## 3. Tests

 All details of the setup are in the [test script](../../test/plot_perf.sh).
@@ -76,6 +84,7 @@ All tests measure the "real" time of a command on a lightly loaded
 machine using the Linux command `time(1)`.

 The following tests were made (for each architecture and protocol):
+* Write `N` entries into the startup configuration. The clixon_backend was started with options `-1s startup`.
 * Write `N` entries in one single operation. (With an empty datastore)
 * Read `N` entries in one single operation. (With a datastore of `N` entries)
 * Commit `N` entries (With a candidate of `N` entries and empty running)
@@ -83,7 +92,7 @@ The following tests were made (for each architecture and protocol):
 * Write/Replace 1 entry (In a datastore of `N` entries)
 * Delete 1 entry (In a datastore of `N` entries)

-The tests are made using Netconf and Restconf, except commit which is made only for Netconf.
+The tests are made using Netconf and Restconf, except commit, which is made only for Netconf, and startup, where the protocol is irrelevant.

 ### Architecture and OS

@@ -118,9 +127,12 @@ The tests were made on the following hardware, all running Ubuntu Linux:

 ## 4. Results

-### Access of the whole datastore
-
 This section shows the results of the measurements as defined in [Tests](#tests).
+### Startup
+
+![Startup](clixon-startup.png "Startup")
+
+### Access of the whole datastore

 ![Get config](clixon-get-0.png "Get config")

@@ -196,11 +208,14 @@ system degrades with the size of the lists.
 Examining the profiling of the most demanding Restconf PUT case, most cycles
 are spent on handling writing and copying the existing datastore.

-Concluding, the
+Note that the experiments here contain _very_ simple
+data structures. A more realistic, complex example will require more
+CPU effort. An ad-hoc measurement of a more complex data structure
+took four times as long as the simple Yang model in this work.

 ## 6. Future work

-* Improve access of single list elements to sub-linear performance.
+* Improve access of individual elements to sub-linear performance.
 * CLI access on large lists (not included in this study)

 ## 7. References
diff --git a/test/plot_perf.sh b/test/plot_perf.sh
index e4a95acf..9d394746 100755
--- a/test/plot_perf.sh
+++ b/test/plot_perf.sh
@@ -270,78 +270,142 @@ plot(){
     echo # newline
 }

+# Measure backend startup time with a startup datastore of n entries,
+# iterating n from <from> to <to> in increments of <step>.
+# The backend is started once for each n.
+# args: <from> <step> <to>
+startup(){
+    from=$1
+    step=$2
+    to=$3
+    mode=startup
+
+    if [ $# -ne 3 ]; then
+        err "startup should be called with 3 arguments, got $#"
+    fi
+
+    # gnuplot file
+    gfile=$resdir/startup-$arch
+    new "Create file $gfile"
+    echo -n "" > $gfile
+
+    # Startup db: load with n entries
+    dbfile=$dir/${mode}_db
+    sudo touch $dbfile
+    sudo chmod 666 $dbfile
+    for (( n=$from; n<=$to; n=$n+$step )); do
+        new "startup-$arch $n"
+        new "Generate $n entries to $dbfile"
+        echo -n "" > $dbfile
+        for (( i=0; i<$n; i++ )); do
+            echo -n "$i$i" >> $dbfile
+        done
+        echo "" >> $dbfile
+
+        new "Startup backend once -s $mode -f $cfg -y $fyang"
+        echo -n "$n " >> $gfile
+        { time -p sudo $clixon_backend -F1 -D $DBG -s $mode -f $cfg -y $fyang 2> /dev/null; } 2>&1 | awk '/real/ {print $2}' | tr , . >> $gfile
+
+    done
+    echo # newline
+}
+
 if $run; then
-new "test params: -f $cfg -y $fyang"
-if [ $BE -ne 0 ]; then
-    new "kill old backend"
-    sudo clixon_backend -zf $cfg -y $fyang
-    if [ $? -ne 0 ]; then
-        err
+
+    # Startup test before regular backend/restconf start since we only start
+    # backend a single time
+    startup $step $step $to
+
+    new "test params: -f $cfg -y $fyang"
+    if [ $BE -ne 0 ]; then
+        new "kill old backend"
+        sudo clixon_backend -zf $cfg -y $fyang
+        if [ $? -ne 0 ]; then
+            err
+        fi
+        new "start backend -s init -f $cfg -y $fyang"
+        start_backend -s init -f $cfg -y $fyang
     fi
-    new "start backend -s init -f $cfg -y $fyang"
-    start_backend -s init -f $cfg -y $fyang
-fi

-new "kill old restconf daemon"
-sudo pkill -u www-data -f "/www-data/clixon_restconf"
+    new "kill old restconf daemon"
+    sudo pkill -u www-data -f "/www-data/clixon_restconf"

-new "start restconf daemon"
-start_restconf -f $cfg -y $fyang
+    new "start restconf daemon"
+    start_restconf -f $cfg -y $fyang

-new "waiting"
-sleep $RCWAIT
+    new "waiting"
+    sleep $RCWAIT

-to=$to0
-step=$step0
-reqs=$reqs0
+    to=$to0
+    step=$step0
+    reqs=$reqs0

-# Put all tests
-for proto in netconf restconf; do
-    new "$proto put all entries to candidate (restconf:running)"
-    plot put $proto $step $step $to 0 0 0 # all candidate 0 running 0
-done

-# Get all tests
-for proto in netconf restconf; do
-    new "$proto get all entries from running"
-    plot get $proto $step $step $to 0 n n # start w full datastore
-done
+    # Put all tests
+    for proto in netconf restconf; do
+        new "$proto put all entries to candidate (restconf:running)"
+        plot put $proto $step $step $to 0 0 0 # all candidate 0 running 0
+    done

-# Netconf commit all
-new "Netconf commit all entries from candidate to running"
-plot commit netconf $step $step $to 0 n 0 # candidate full running empty
+    # Get all tests
+    for proto in netconf restconf; do
+        new "$proto get all entries from running"
+        plot get $proto $step $step $to 0 n n # start w full datastore
+    done

-# Transactions get/put/delete
-reqs=$reqs0
-for proto in netconf restconf; do
-    new "$proto get $reqs from full database"
-    plot get $proto $step $step $to $reqs n n
+    # Netconf commit all
+    new "Netconf commit all entries from candidate to running"
+    plot commit netconf $step $step $to 0 n 0 # candidate full running empty

-    new "$proto put $reqs to full database(replace / alter values)"
-    plot put $proto $step $step $to $reqs n n
+    # Transactions get/put/delete
+    reqs=$reqs0
+    for proto in netconf restconf; do
+        new "$proto get $reqs from full database"
+        plot get $proto $step $step $to $reqs n n

-    new "$proto delete $reqs from full database(replace / alter values)"
-    plot delete $proto $step $step $to $reqs n n
-done
+        new "$proto put $reqs to full database(replace / alter values)"
+        plot put $proto $step $step $to $reqs n n

-new "Kill restconf daemon"
-stop_restconf
+        new "$proto delete $reqs from full database(replace / alter values)"
+        plot delete $proto $step $step $to $reqs n n
+    done

-if [ $BE -ne 0 ]; then
-    new "Kill backend"
-    # Check if premature kill
-    pid=`pgrep -u root -f clixon_backend`
-    if [ -z "$pid" ]; then
-        err "backend already dead"
+    new "Kill restconf daemon"
+    stop_restconf
+
+    if [ $BE -ne 0 ]; then
+        new "Kill backend"
+        # Check if premature kill
+        pid=`pgrep -u root -f clixon_backend`
+        if [ -z "$pid" ]; then
+            err "backend already dead"
+        fi
+        # kill backend
+        stop_backend -f $cfg
     fi
-    # kill backend
-    stop_backend -f $cfg
-fi

 fi # if run

 if $plot; then

+# 0. Startup
+gplot=""
+for a in $archs; do
+    gplot="$gplot \"$resdir/startup-$a\" title \"startup-$a\","
+done
+
+gnuplot -persist < $cfg EOF
+if [ $BE -ne 0 ]; then
+    new "kill old backend"
+    sudo clixon_backend -zf $cfg -y $fyang
+    if [ $? -ne 0 ]; then
+        err
+    fi
+fi
 # Try startup mode w startup
 for mode in startup running; do
     file=$dir/${mode}_db
@@ -113,7 +120,6 @@ new "netconf write large config"
 expecteof_file "/usr/bin/time -f %e $clixon_netconf -qf $cfg -y $fyang" "$fconfig" "^]]>]]>$"

 # Here, there are $perfnr entries in candidate
-
 new "netconf write large config again"
 expecteof_file "/usr/bin/time -f %e $clixon_netconf -qf $cfg -y $fyang" "$fconfig" "^]]>]]>$"