SUSE Manager/Scalability-research


Scalability research summary

Scalability research on SUSE Manager is an ongoing process, governed by the roadmap defined in RFC 23 and curated mainly by Silvio Moioli.

This page summarises the studies and results so far. Fully detailed data sets and reproducer instructions are always up-to-date on the Scalability research page in the GitHub development wiki.

Discussion is, as always, very welcome on the SUSE Manager mailing list.

PLEASE REMEMBER THAT ALL DATA IS FOR INTERNAL USE ONLY!

List of benchmarks

  • Salt onboarding smoke test - 20160309
    • Purpose: make sure onboarding works on 100 minions
    • Main results: 100 minions can be onboarded in ~16 minutes on a low-end server (see the onboarding sketch after this list)
  • Minion channel switching via API smoke test - 20160310
    • Purpose: make sure channel switching is fast enough on 100 minions
    • Main results: channel switching takes well under a second per minion on a low-end server (see the channel switching sketch after this list)
  • Minion package upgrade smoke test, 1000 minions - 20160323
    • Purpose: check that our minimal hardware requirements make sense
    • Main results:
      • applying one kernel patch to 1000 minions took ~52 minutes on a low-end server (see the patching sketch after this list)
      • several bugs discovered and fixed, some default settings were changed
      • hardware recommendations and best practices updated in the official documentation
  • Smoke tests (onboarding and patching) repeated - 20160519
    • Purpose: make sure no functional regressions were introduced in newer versions
    • Main results: two bugs fixed, no ship-stopping performance regression found
  • Oracle RAC - 20160615
    • Purpose: validate the hypothesis, based on observation of the architecture, that RAC does not significantly help with scalability
    • Results:
      • hypothesis validated: adding hardware to a single server is recommended instead of RAC
      • RAC is more prone to deadlock issues, and recovery time after a deadlock is longer; Engineering recommends against it for this reason (see the deadlock sketch after this list). RAC One Node was not tested, but should not have this problem given its active-passive architecture
      • an indirect comparison with Postgres suggests there is no big performance gap between the two; in some cases Postgres handles deadlock avoidance better
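
For reference, here is a minimal sketch of how an onboarding run like the first one can be timed, assuming it is executed on the Salt master after all minions have submitted their keys; the expected minion count and the polling interval are arbitrary, and only the standard salt-key and salt command line tools are used.

  #!/usr/bin/env python
  # Minimal sketch: accept all pending minion keys, then poll until every
  # minion answers test.ping, and report the total onboarding time.
  import subprocess
  import time

  EXPECTED_MINIONS = 100  # target count from the smoke test

  start = time.time()
  subprocess.check_call(["salt-key", "-A", "-y"])  # accept all pending keys

  while True:
      out = subprocess.check_output(["salt", "--out=txt", "*", "test.ping"])
      if out.count(b"True") >= EXPECTED_MINIONS:
          break
      time.sleep(10)

  print("Onboarding completed in %.1f minutes" % ((time.time() - start) / 60))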
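
The channel switching measurement can be approximated along these lines; the server host, credentials and channel label are placeholders, while auth.login, system.listSystems, system.setBaseChannel and auth.logout are regular SUSE Manager/Spacewalk XML-RPC API calls.

  #!/usr/bin/env python
  # Minimal sketch: switch the base channel of every registered system via
  # the XML-RPC API and report the average time per system.
  import time
  import xmlrpclib  # xmlrpc.client on Python 3

  client = xmlrpclib.Server("https://manager.example.com/rpc/api")
  key = client.auth.login("admin", "password")

  systems = client.system.listSystems(key)
  start = time.time()
  for system in systems:
      client.system.setBaseChannel(key, system["id"], "sles12-sp1-pool-x86_64")
  elapsed = time.time() - start
  print("%.2f seconds per system" % (elapsed / len(systems)))

  client.auth.logout(key)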
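
Scheduling a patch on all registered systems works similarly; the erratum ID below is a placeholder, and system.scheduleApplyErrata is again a regular XML-RPC API call.

  #!/usr/bin/env python
  # Minimal sketch: schedule one patch (erratum) on every registered system,
  # as in the package upgrade smoke test.
  import xmlrpclib  # xmlrpc.client on Python 3

  client = xmlrpclib.Server("https://manager.example.com/rpc/api")
  key = client.auth.login("admin", "password")

  ERRATA_IDS = [1234]  # placeholder ID of the kernel patch to apply

  for system in client.system.listSystems(key):
      client.system.scheduleApplyErrata(key, system["id"], ERRATA_IDS)

  client.auth.logout(key)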
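
Finally, on the deadlock point: the classic reproducer is two transactions locking the same rows in opposite order. The sketch below shows the pattern against Postgres via psycopg2 (connection parameters and table are made up for illustration); it is not the benchmark itself, only an example of the kind of contention that was compared.

  #!/usr/bin/env python
  # Minimal sketch: provoke a deadlock between two transactions and measure
  # how long the database takes to detect it and abort one of them.
  import threading
  import time
  import psycopg2

  DSN = "dbname=test user=test host=localhost"

  conn = psycopg2.connect(DSN)
  cur = conn.cursor()
  cur.execute("DROP TABLE IF EXISTS t")
  cur.execute("CREATE TABLE t (id int PRIMARY KEY, v int)")
  cur.execute("INSERT INTO t VALUES (1, 0), (2, 0)")
  conn.commit()
  conn.close()

  def worker(first, second):
      conn = psycopg2.connect(DSN)
      cur = conn.cursor()
      try:
          cur.execute("UPDATE t SET v = v + 1 WHERE id = %s", (first,))
          time.sleep(1)  # make sure both transactions hold their first lock
          cur.execute("UPDATE t SET v = v + 1 WHERE id = %s", (second,))
          conn.commit()
      except psycopg2.Error as e:
          print("aborted: %s" % e)  # the victim chosen by deadlock detection
          conn.rollback()
      finally:
          conn.close()

  start = time.time()
  a = threading.Thread(target=worker, args=(1, 2))
  b = threading.Thread(target=worker, args=(2, 1))
  a.start(); b.start(); a.join(); b.join()
  print("Deadlock resolved in %.1f seconds" % (time.time() - start))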