Thread by @cevianNY, The team at @MetricsVictoria recently published some benchmarks comparing their product vs. [...]

The team at @MetricsVictoria recently published some benchmarks comparing their product vs. Promscale: https://valyala.medium.com/promscale-vs-victoriametrics-resource-usage-on-production-workload-91c8e3786c03

These benchmarks, like others that Victoria Metrics has done in the past, are not completely honest. 1/

At a high level, here is what they could have done to produce a more objective and fair comparison: 2/

When calculating the compression ratio, analyze compressed data. Strangely, they decided to calculate a compression ratio on our uncompressed data.

(Promscale keeps a small fixed amount of uncompressed data for performance; at higher volumes this amount is minimal.) 3/

I can guarantee that our uncompressed data is compressed by exactly 0%. 4/
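To make the point about the fixed uncompressed portion concrete, here is a rough sketch of how a constant-size uncompressed tail affects the overall ratio as data volume grows. The 1 GiB tail and the 10x ratio on compressed chunks are made-up numbers chosen for illustration, not measurements of Promscale or any other system.

```python
def compression_ratio(raw_bytes: float, stored_bytes: float) -> float:
    """Logical (raw) size divided by on-disk size; 1.0 means no compression."""
    return raw_bytes / stored_bytes


def blended_ratio(total_raw: int, uncompressed_tail: int, chunk_ratio: float) -> float:
    """Overall ratio when a fixed tail of recent data is kept uncompressed.

    total_raw:         logical size of all stored data, in bytes
    uncompressed_tail: fixed amount kept uncompressed, in bytes (hypothetical)
    chunk_ratio:       ratio achieved on the compressed part (hypothetical)
    """
    tail = min(uncompressed_tail, total_raw)
    compressed_raw = total_raw - tail
    on_disk = tail + compressed_raw / chunk_ratio
    return compression_ratio(total_raw, on_disk)


GIB = 1024 ** 3

# Assumed values for illustration only: 1 GiB tail, 10x on compressed chunks.
for total_gib in (1, 10, 100, 1000):
    ratio = blended_ratio(total_gib * GIB, 1 * GIB, 10.0)
    print(f"{total_gib:>5} GiB raw -> overall ratio {ratio:.2f}")
```

Under these assumptions, measuring only the uncompressed tail gives a ratio of 1, while the overall ratio approaches the chunk ratio as volume grows.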

Similarly, admit that their own compression is lossy, while Promscale’s is lossless. There is a big difference between lossy and lossless compression. Sometimes lossy compression is okay (e.g., JPG vs. PNG image formats), as long as you know what you are getting. 5/
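For readers unfamiliar with the distinction, here is a generic sketch of what lossless and lossy mean for numeric samples. It is not the algorithm used by either product: the lossless path uses zlib on 64-bit floats, and the lossy path simply narrows values to 32-bit floats as a stand-in for any precision-reducing scheme. The sample values are invented.

```python
import struct
import zlib


def roundtrip_lossless(values: list[float]) -> list[float]:
    """Compress 64-bit floats with zlib and restore them bit-for-bit."""
    fmt = f"<{len(values)}d"
    raw = struct.pack(fmt, *values)
    restored = zlib.decompress(zlib.compress(raw))
    return list(struct.unpack(fmt, restored))


def roundtrip_lossy(values: list[float]) -> list[float]:
    """Store values as 32-bit floats: half the bytes, but precision is lost."""
    fmt = f"<{len(values)}f"
    narrowed = struct.pack(fmt, *values)
    return list(struct.unpack(fmt, narrowed))


# Hypothetical metric samples.
samples = [0.1 + i * 0.001 for i in range(1000)]

exact = roundtrip_lossless(samples)
approx = roundtrip_lossy(samples)
max_error = max(abs(a - b) for a, b in zip(samples, approx))

print("lossless round trip identical:", exact == samples)
print("lossy round trip identical:   ", approx == samples)
print("largest lossy error:          ", max_error)
```

The lossy values are close to the originals, which may be fine for dashboards, but they are no longer the values that were written.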

Compare not just IOPS but also durability. Promscale by default requires higher IOPS, but in return offers strong guarantees that you won’t lose data. We fsync all commits to WAL by default, and return an acknowledgement to Prometheus’s write-request only after commit. 6/

This means that when Promscale tells Prometheus it stored a thing, it actually stored that thing.

Promscale also allows users to change this default, in case one desires lower IOPS with lower durability. 7/
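The durability tradeoff described above can be sketched as follows. This is a toy append-only log, not Promscale or PostgreSQL code; the function names and the record format are hypothetical. The only difference between the two functions is whether fsync runs before the acknowledgement is returned.

```python
import os
import tempfile


def durable_append(path: str, record: bytes) -> bool:
    """Append a record and fsync before acknowledging (more IOPS, durable)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record + b"\n")
        os.fsync(fd)  # block until the OS reports the data is on stable storage
    finally:
        os.close(fd)
    return True  # acknowledgement is sent only after the fsync


def fast_append(path: str, record: bytes) -> bool:
    """Append a record without fsync (fewer IOPS, weaker guarantee)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, record + b"\n")  # data may still sit in the OS page cache
    finally:
        os.close(fd)
    return True  # acknowledged, but a crash right now could lose the record


with tempfile.TemporaryDirectory() as tmp:
    wal_path = os.path.join(tmp, "wal.log")
    durable_append(wal_path, b"cpu_usage 0.42")  # hypothetical sample
    fast_append(wal_path, b"cpu_usage 0.43")
    with open(wal_path, "rb") as f:
        print(f.read())
```

Both calls return the same acknowledgement, which is exactly why a benchmark should state which mode each system was running in.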

VM by contrast doesn’t have a WAL, and as far as I know there is no solid guarantee of durability even after some time. VM maybe stores the thing. It’s the UDP of databases.

Again, perhaps an okay tradeoff (e.g., usage of UDP vs. TCP), but one that should be made clear to users. 8/

Another example: VM claimed to measure how much memory we needed; what they actually measure is how much memory they configured. PostgreSQL is designed to use all the memory you configure it with as a page cache for queries. 9/

So, VM verified that our DB’s memory configuration variable did exactly the right thing. They even used nice Grafana graphs. Thanks! 10/

But seriously, why wouldn’t you want your database to use the memory you give it for caching? An interesting analysis would be the minimum amount of memory needed to support the workload, but that’s not what VM did (even though they probably thought they did). 11/

VM is also not transparent about the qualitative differences: Promscale supports both SQL and 100% standard-compliant PromQL. VM only supports a broken version of PromQL, which a third party found featured <60% compatibility with PromQL: https://promlabs.com/blog/2020/11/26/an-update-on-promql-compatibility-across-vendors 12/

In addition, Promscale (but not VM) allows you to perform complex analysis, joining metric data with other relational data, using standard SQL visualization and AI/ML tools, supporting enterprise-grade security and permissions, etc. 13/
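As a rough idea of what joining metric data with other relational data looks like, here is a generic SQL join run against an in-memory SQLite database. The table names, columns, and values are hypothetical and do not reflect Promscale's actual schema; the point is only that metrics and business data can be combined in one query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one metric table and one ordinary relational table.
conn.executescript(
    """
    CREATE TABLE cpu_usage (host TEXT, ts INTEGER, value REAL);
    CREATE TABLE hosts (host TEXT PRIMARY KEY, team TEXT);

    INSERT INTO cpu_usage VALUES
        ('web-1', 1, 0.50),
        ('web-1', 2, 0.70),
        ('db-1',  1, 0.20);
    INSERT INTO hosts VALUES
        ('web-1', 'frontend'),
        ('db-1',  'storage');
    """
)

# Average CPU usage per owning team: metric samples joined with inventory data.
rows = conn.execute(
    """
    SELECT h.team, AVG(c.value) AS avg_cpu
    FROM cpu_usage AS c
    JOIN hosts AS h ON h.host = c.host
    GROUP BY h.team
    ORDER BY h.team
    """
).fetchall()

for team, avg_cpu in rows:
    print(f"{team}: {avg_cpu:.2f}")

conn.close()
```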

Benchmarking, done right, informs the reader of the tradeoffs various systems make (and how best to choose what's right for your system). Unfortunately, by hiding the true tradeoffs and design decisions made by these two systems, these benchmarks fail to inform the reader. 14/

The real shame here is that I think VM is actually a good piece of technology, with some trade-offs that some may find reasonable. But that gets hidden by this kind of questionable marketing (and the general spamming its creators tend to do on other people’s forums). 15/

VictoriaMetrics tech /is/ pretty cool: the most impressive part is their careful optimization of Go and the resulting low resource usage. I can definitely see use cases for it in resource-constrained environments. 16/

One thing we appreciate about the @PrometheusIO community is that everyone is helpful. E.g., Thanos and Cortex share code. Some of the authors of Thanos and Cortex also continue to provide helpful feedback to us (including helping us choose our name: https://github.com/timescale/promscale/issues/243). 17/

The @PrometheusIO community deserves honest benchmarking. Let’s do better! 18/18

Wow, this blew up more than I anticipated. If people have suggestions about better benchmarking, please leave a comment on this Promscale issue: https://github.com/timescale/promscale/issues/391 (Pointers to real-world Prometheus datasets would be especially useful.)
