Difference between revisions of "SUSE Manager/Scalability-research"

(Created page with "== Scalability research summary == Scalability research on SUSE Manager is an ongoing process, governed by the roadmap defined in [https://github.com/SUSE/susemanager-rfc/blo...")
 
(Scalability research summary)
Line 7: Line 7:
 
Discussion is, as always, very welcome on the [mailto:suse-manager@suse.de SUSE Manager mailing list].
 
  
'''PLEASE REMEMBER THAT ALL DATA IS FOR INTERNAL USE ONLY!'''
 
  
 
=== List of benchmarks ===
 
  
* Salt onboarding smoke test - 20160309<br />
+
* Salt onboarding smoke test - 20160309
** Purpose: make sure onboarding works on 100 minions<br />
+
** Purpose: make sure onboarding works on 100 minions
** Main results: 100 minions can be onboarded in ~16 minutes on a low-end server<br />
+
** Main results: 100 minions can be onboarded in ~16 minutes on a low-end server
* Minion channel switching via API smoke test - 20160310<br />
+
* Minion channel switching via API smoke test - 20160310
** Purpose: make sure channel switching is fast enough on 100 minions<br />
+
** Purpose: make sure channel switching is fast enough on 100 minions
** Main results: channel switching takes well under a second per minion on a low-end server<br />
+
** Main results: channel switching takes well under a second per minion on a low-end server
* Minion package upgrade smoke test, 1000 minions - 20160323<br />
+
* Minion package upgrade smoke test, 1000 minions - 20160323
** Purpose: check that our minimal hardware requirements make sense<br />
+
** Purpose: check that our minimal hardware requirements make sense
 
** Main results:
 
*** one kernel patch took ~52 minutes for 1000 minions, on a low-end server<br />
+
*** one kernel patch took ~52 minutes for 1000 minions, on a low-end server
*** several bugs discovered and fixed, some default settings were changed<br />
+
*** several bugs discovered and fixed, some default settings were changed
*** hardware recommendations and best practices updated in the official documentation<br />
+
*** hardware recommendations and best practices updated in the official documentation
* Smoke tests (onboarding and patching) repeated - 20160519<br />
+
* Smoke tests (onboarding and patching) repeated - 20160519
** Purpose: make sure no functional regressions were introduced in newer versions<br />
+
** Purpose: make sure no functional regressions were introduced in newer versions
** Main results: two bugs fixed, no shipstopper performance regression found<br />
+
** Main results: two bugs fixed, no shipstopper performance regression found
* Oracle RAC - 20160615<br />
+
* Oracle RAC - 20160615
** Purpose: validate the hypothesis (made by observation of architecture) that RAC does not really help with scalability<br />
+
** Purpose: validate the hypothesis (made by observation of architecture) that RAC does not really help with SUSE Manager scalability
 
** Results:
 
*** hypothesis validated: adding hardware to a single server is recommended instead of RAC<br />
+
*** hypothesis validated: adding hardware to a single server is recommended instead of adding nodes to a RAC
*** RAC is more prone to deadlock issues and recovery time after deadlock is longer. Engineering recommends against it for this reason. RAC One Node not tested but should not have this problem (active-passive architecture)<br />
+
*** RAC is more prone to deadlock issues and recovery time after deadlock is longer. Engineering recommends against it for this reason. RAC One Node not tested but should not have this problem (active-passive architecture)
*** indirect comparison with Postgres suggests there is no big performance gap between the two, there are cases in which Postgres handles deadlock avoidance better
+
*** no result so far suggests that we will not be able to match (or even exceed) Oracle performance with Postgres
 +
*** in at least some cases Postgres implements deadlock avoidance better

Revision as of 05:55, 24 June 2016

Scalability research summary

Scalability research on SUSE Manager is an ongoing process, governed by the roadmap defined in RFC 23 and curated mainly by Silvio Moioli.

This page summarises the studies and results so far. Fully detailed data sets and reproducer instructions are always up-to-date on the Scalability research page in the GitHub development wiki.

Discussion is, as always, very welcome on the SUSE Manager mailing list.


List of benchmarks

  • Salt onboarding smoke test - 20160309
    • Purpose: make sure onboarding works on 100 minions
    • Main results: 100 minions can be onboarded in ~16 minutes on a low-end server
  • Minion channel switching via API smoke test - 20160310
    • Purpose: make sure channel switching is fast enough on 100 minions
    • Main results: channel switching takes well under a second per minion on a low-end server
  • Minion package upgrade smoke test, 1000 minions - 20160323
    • Purpose: check that our minimal hardware requirements make sense
    • Main results:
      • one kernel patch took ~52 minutes for 1000 minions, on a low-end server
      • several bugs discovered and fixed, some default settings were changed
      • hardware recommendations and best practices updated in the official documentation
  • Smoke tests (onboarding and patching) repeated - 20160519
    • Purpose: make sure no functional regressions were introduced in newer versions
    • Main results: two bugs fixed, no shipstopper performance regression found
  • Oracle RAC - 20160615
    • Purpose: validate the hypothesis (made by observation of architecture) that RAC does not really help with SUSE Manager scalability
    • Results:
      • hypothesis validated: adding hardware to a single server is recommended instead of adding nodes to a RAC
      • RAC is more prone to deadlocks, and recovery after a deadlock takes longer. Engineering recommends against it for this reason. RAC One Node was not tested but should not have this problem, since it uses an active-passive architecture
      • nothing found so far suggests that Postgres will be unable to match (or even exceed) Oracle performance
      • in at least some cases Postgres implements deadlock avoidance better
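
The timings quoted in the list above can be turned into rough per-minion averages. The following is a minimal sketch and not part of the original research: the function name `seconds_per_minion` is hypothetical, the inputs are the approximate ("~") figures quoted above, and the calculation assumes the total time is simply divided evenly across minions.

```python
def seconds_per_minion(total_minutes: float, minions: int) -> float:
    """Average wall-clock seconds per minion, assuming an even spread of the total."""
    if minions <= 0:
        raise ValueError("minions must be positive")
    return total_minutes * 60.0 / minions


# Approximate figures taken from the benchmark list (both on a low-end server).
benchmarks = {
    "onboarding, 100 minions": (16, 100),
    "kernel patch, 1000 minions": (52, 1000),
}

for name, (minutes, minions) in benchmarks.items():
    print(f"{name}: ~{seconds_per_minion(minutes, minions):.1f} s per minion")
```

These are averages over the whole run, not per-minion latencies. The two rows measure different operations, so they should not be compared directly with each other.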