
Rewriting a Python Library in Rust

20 Mar 2024

Earlier this month I gave a presentation at the Rust Zürich meetup about how we re-implemented a critical piece of code used in our workflows. In this presentation I walked the audience through the migration of a key component of Project Syn (our Kubernetes configuration management framework) from Python to Rust.

We tackled this project because the CI pipeline runs we needed to roll out changes to our Kubernetes clusters took longer than 15 minutes. Thanks to this rewrite (and some other improvements) we’ve been able to reduce those runs to under 5 minutes.

The related pull request, available on GitHub, was merged five days ago and includes the mandatory documentation describing its functionality.

I’m also happy to report that this talk was picked up by the popular newsletter "This Week in Rust" for its 538th edition! You can find the recording of the talk, courtesy of the Rust Zürich meetup group organizers, on YouTube.

Simon Gerber

Simon Gerber is a DevOps engineer at VSHN.


Benchmarking Kubernetes Storage Solutions

23 Jul 2021

One of the most difficult subjects in the world of Kubernetes is storage. In our day-to-day operations we’ve often had to choose the best storage solution for our customers, but in a changing landscape of requirements and technical offerings, such a choice becomes a major task.

Faced with many options, we decided to benchmark storage solutions under real-life conditions, to generate the data required for a proper decision. In this article we’re going to share with you our methodology, our results, and our final choice.

The chosen storage providers for this evaluation were:

  • Ceph, deployed both via OpenShift Container Storage (OCS) and vanilla Rook.io, with Ceph RBD and CephFS volumes
  • Longhorn
  • Gluster

All of these benchmarks (except Gluster) were run on an OpenShift 4.7 cluster on Exoscale VMs.

We benchmarked Ceph both with unencrypted and encrypted storage for the OSDs (object-storage daemons). We included Gluster in our evaluation for reference and comparison only, as that’s the solution we offered for storage on OpenShift Container Platform 3.x. We never intended to use Gluster as the storage engine for our new Kubernetes storage cluster product.

Methodology

We first created a custom Python script driving kubestr, which in turn orchestrates Fio. This script performed ten (10) iterations of each benchmark, each of which included the following operations in an isolated Fio run (a sketch of such a driver script follows the list):

  • Read IOPS
  • Read bandwidth
  • Write IOPS, with different frequencies of calls to fsync:
    • no fsync calls during each benchmark iteration
    • an fsync call after each operation ("fsync=1")
    • an fsync call after every 32 operations ("fsync=32")
    • an fsync call after every 128 operations ("fsync=128")
  • Write bandwidth, with different frequencies of calls to fsync:
    • no fsync calls during each benchmark iteration
    • an fsync call after each operation ("fsync=1")
    • an fsync call after every 32 operations ("fsync=32")
    • an fsync call after every 128 operations ("fsync=128")
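
The driver script itself isn’t reproduced in this post, but the following minimal sketch illustrates the approach. It assumes kubestr’s fio subcommand with its -s (storage class) and -f (fio job file) flags; the storage class name and all identifiers in the sketch are illustrative, not the actual implementation.

import subprocess
import tempfile

# Template mirroring the fio configuration shown below; the placeholders
# (name, bs, op, fsync) correspond to the callouts in that listing.
FIO_TEMPLATE = """[global]
randrepeat=0
verify=0
ioengine=libaio
direct=1
gtod_reduce=1
[job]
name={name}
bs={bs}
iodepth=64
size=2G
readwrite={op}
time_based
ramp_time=5s
runtime=30s
fsync={fsync}
"""

# (operation, measurement) -> (fio I/O pattern, blocksize)
BENCHMARKS = {
    ("read", "iops"): ("randread", "4k"),
    ("read", "bw"): ("randread", "128k"),
    ("write", "iops"): ("randwrite", "4k"),
    ("write", "bw"): ("randwrite", "128k"),
}

# Number of writes batched between fsync calls; 0 disables fsync entirely.
FSYNC_BATCHES = [0, 1, 32, 128]


def run_benchmarks(storageclass: str, iterations: int = 10) -> None:
    for (operation, measurement), (pattern, bs) in BENCHMARKS.items():
        # fsync has no influence on read benchmarks, so only vary it for writes.
        batches = FSYNC_BATCHES if operation == "write" else [0]
        for fsync in batches:
            config = FIO_TEMPLATE.format(
                name=f"{operation}_{measurement}", bs=bs, op=pattern, fsync=fsync
            )
            with tempfile.NamedTemporaryFile("w", suffix=".fio") as jobfile:
                jobfile.write(config)
                jobfile.flush()
                for _ in range(iterations):
                    # Each invocation provisions a fresh PVC on the given
                    # storage class and runs fio against it in its own pod.
                    subprocess.run(
                        ["kubestr", "fio", "-s", storageclass, "-f", jobfile.name],
                        check=True,
                    )


run_benchmarks("ocs-storagecluster-ceph-rbd")  # example storage class name

Driving kubestr like this keeps each Fio run isolated: every invocation provisions a fresh volume, so successive iterations can’t benefit from each other’s caches.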

This is the Fio configuration used for benchmarking:

[global]
randrepeat=0
verify=0
ioengine=libaio
direct=1
gtod_reduce=1
[job]
name=JOB_NAME     (1)
bs=BLOCKSIZE      (2)
iodepth=64
size=2G
readwrite=OP      (3)
time_based
ramp_time=5s
runtime=30s
fsync=X           (4)
  1. We generate a descriptive fio job name based on the benchmark we’re executing. The job name is generated by taking the operation ("read" or "write") and the measurement ("bw" or "iops") and concatenating them as "OP_MEASUREMENT", for example "read_iops".
  2. The blocksize for each operation executed by fio. We use a blocksize of 4K (4 KiB) for IOPS benchmarks and 128K (128 KiB) for bandwidth benchmarks.
  3. The I/O pattern which fio uses for the benchmark: randread for read benchmarks, randwrite for write benchmarks.
  4. The number of operations to batch between fsync calls. This parameter has no influence on read benchmarks.
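
To make the substitutions concrete: for the write IOPS benchmark with an fsync call after every 32 operations, the template above resolves to the following job file (derived directly from the callouts, shown here for illustration):

[global]
randrepeat=0
verify=0
ioengine=libaio
direct=1
gtod_reduce=1
[job]
name=write_iops
bs=4k
iodepth=64
size=2G
readwrite=randwrite
time_based
ramp_time=5s
runtime=30s
fsync=32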

If writing to a file, issue an fsync(2) (or its equivalent) of the dirty data for every number of blocks given. For example, if you give 32 as a parameter, fio will sync the file after every 32 writes issued. If fio is using non-buffered I/O, we may not sync the file. The exception is the sg I/O engine, which synchronizes the disk cache anyway. Defaults to 0, which means fio does not periodically issue and wait for a sync to complete. Also see end_fsync and fsync_on_close.

Fio documentation

Results

The following graph, taken from the full dataset for our benchmark (available for download and study), shows the type of comparison performed across all considered solutions.

Figure 1. Read IOPS (higher is better)

The table below gives an overview of the data gathered during our evaluation (each column shows mean ± standard deviation):

Storage solution                        | read IOPS          | read bandwidth (MB/s) | write IOPS, no fsync | write bandwidth (MB/s), no fsync | write IOPS, fsync=1 | write bandwidth (MB/s), fsync=1
OCS/Rook.io Ceph RBD (unencrypted OSDs) | 42344.21 ± 885.52  | 1585 ± 32.655         | 9549.14 ± 371.11     | 503.208 ± 12.544                 | 305.18 ± 15.65      | 35.591 ± 1.349
OCS/Rook.io CephFS (unencrypted OSDs)   | 44465.21 ± 1657.91 | 1594 ± 82.522         | 9978.00 ± 456.97     | 512.788 ± 8.049                  | 8808.47 ± 357.87    | 452.086 ± 10.154
OCS/Rook.io Ceph RBD (encrypted OSDs)   | 36303.06 ± 2254.87 | 1425 ± 59.720         | 6292.75 ± 424.91     | 310.520 ± 63.047                 | 225.00 ± 12.11      | 22.804 ± 1.031
OCS/Rook.io CephFS (encrypted OSDs)     | 36343.35 ± 1234.93 | 1405 ± 92.868         | 6020.49 ± 251.16     | 278.486 ± 49.101                 | 5004.28 ± 152.01    | 291.729 ± 17.367
Longhorn (unencrypted backing disk)     | 11298.36 ± 664.99  | 295.458 ± 25.458      | 5975.43 ± 697.14     | 111.197 ± 10.322                 | 391.57 ± 26.11      | 29.993 ± 1.544
Gluster                                 | 22957.87 ± 345.40  | 976.511 ± 45.268      | 2630.89 ± 69.21      | 531.88 ± 48.22                   | 133.563 ± 11.455    | 43.549 ± 1.656

Unencrypted Rook/OCS numbers are from OCS, encrypted Rook/OCS numbers from vanilla Rook.

Conclusion

After careful evaluation of the results shown above, we chose Rook to implement our APPUiO Managed Storage Cluster product. Rook allows us to have a single product for all the Kubernetes distributions we’re offering at VSHN.

We have released the scripts on GitHub for everyone to verify our results, and published even more data in our Products documentation. Feel free to check the data and run these tests on your own infrastructure; and of course, pull requests are more than welcome.

Simon Gerber

Simon Gerber is a DevOps engineer at VSHN.


A warm welcome, Simon!

10 Jan 2019

After finishing my Master’s degree I had the opportunity to take on a PhD on memory management in operating systems. This was a topic that had already fascinated me for some time during my Master’s studies, and over the last six years I played a major role in the research and development of the Barrelfish research operating system. In this role I conceptually reworked and reimplemented large parts of Barrelfish’s memory management. After completing the PhD last summer, however, I’d had enough of research for the time being and decided to look for a "hands-on" DevOps position.

At VSHN I have found a great opportunity to live out the DevOps side of me, and with it to put to use the experience I gathered maintaining and modernizing Barrelfish’s internal build and test infrastructure. Here I hope to apply my experience and perspectives from research, loosely following the motto of my thesis ("good enough is not always good enough"), to infrastructure services such as OpenShift, and, after the rather theory-heavy years at ETH, to gain practical experience.

Outside of working hours I can often be found with my nose in a fantasy or science fiction novel. Beyond that, I can still be spotted in a computer game now and then, or tinkering with my personal programming projects.

Simon Gerber

Simon Gerber is a DevOps engineer at VSHN.
