Sounds like you want distributed tracing, for example Jaeger or Dynatrace. PS: ideally use a vendor-neutral protocol like OpenTracing, which has a lot of integrations.
Why averages suck and percentiles are great
I work in the observability space, and it's a very common misconception. People don't really know what they don't know.
Questions:
You could check out Dynatrace. Pricing varies with your servers (they charge by the RAM of the server): https://www.dynatrace.com/pricing/
>"It makes sense to me that Dynatrace, Appdynamics and whatever are legacy because they're so old."
I disagree. These companies have been leaders in the APM industry for years and still are. As someone with a few years' experience working with Dynatrace, I can tell you that they ship a new release each bi-weekly sprint, and there is ALWAYS some kind of new early-access feature available to customers. Sure, it's expensive, that's a fact; however, you simply cannot beat the level of OOTB automation and scalability it provides.
Gartner runs annual competitive reports comparing the different vendors. You can download a free copy of this report through Dynatrace's site:
https://www.dynatrace.com/gartner-magic-quadrant-for-application-performance-monitoring/
Let's use an example from my home state. Here we have public universities with tuition ranging from ~$5,000/AY to ~$30,000/AY. Most schools are at the lower end of that range. A few are in the upper tier. Most students go to the cheaper schools. Far fewer go to the more expensive schools. If we were to average the tuition for my state, the few outlier schools would pull the average up, so it would suggest the average student is paying more for tuition than they actually are. Remove those outlier schools from the list and the average better reflects what the typical student pays.
Because after all we're looking to make public tuition free for the average student. If you want to go to a gold plated public institution, you can choose to pay more. But we don't have to make *every* program at *every* public institution free. That's not what we are talking about when we say "tuition-free college". If you want to talk about dishonesty, not acknowledging that right there is dishonest.
> Instead, he used the median because it was a significantly lower number. Which is the nature of a median when you have a floor with no ceiling. It's going to be skewed towards the bottom.
Another way of saying this is the average skews the central tendency higher. Take a look at the second graph of this article: https://www.dynatrace.com/news/blog/why-averages-suck-and-percentiles-are-great/
This graph is a good model for the distribution of public school tuition in the US.
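To make the skew concrete, here is a minimal sketch in Python. The tuition figures are hypothetical, invented only to match the shape described above (many cheap schools, a couple of expensive ones); they are not real data for any state, and the $20,000 outlier cutoff is an arbitrary assumption.

```python
from statistics import mean, median

# Hypothetical tuition per academic year (USD), invented for illustration:
# eight schools near the low end and two expensive outliers.
tuition = [5_000, 5_500, 6_000, 6_000, 6_500, 7_000, 7_500, 8_000, 28_000, 30_000]

avg = mean(tuition)    # pulled upward by the two outliers
mid = median(tuition)  # stays with the bulk of the schools

# Assumed cutoff: treat anything at or above $20,000 as an outlier.
avg_without_outliers = mean([t for t in tuition if t < 20_000])

print(f"mean:                  ${avg:,.0f}")
print(f"median:                ${mid:,.0f}")
print(f"mean without outliers: ${avg_without_outliers:,.0f}")
```

With these made-up numbers the mean comes out well above what eight of the ten schools charge, while the median and the outlier-free mean sit close to each other.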
I think you're missing the point of what I'm saying here.
Averages cannot properly represent bimodal data.
Whether or not the 1 ratings are legitimate, it is apparent that those who dislike the movie rate it to a much greater extreme than those who like it.
If >70% of people liked the film, and the remaining ~30% hold more extreme opinions, providing only an average does not represent the data.
This can work in the opposite direction too.
This article explains the issue very thoroughly.
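A quick sketch of the bimodal problem, in Python. The ratings are hypothetical (a 1-10 scale, 70 viewers rating 8 and 30 rating 1), chosen only to mirror the 70/30 split mentioned above, not taken from any real film.

```python
from collections import Counter
from statistics import mean

# Hypothetical ratings on a 1-10 scale, invented for illustration:
# 70 viewers who liked the film and 30 who gave it the lowest score.
ratings = [8] * 70 + [1] * 30

avg = mean(ratings)
histogram = Counter(ratings)

# How many viewers actually gave a score near the average (5 or 6)?
near_avg = sum(1 for r in ratings if r in (5, 6))

print(f"average rating: {avg:.1f}")
for score in sorted(histogram):
    print(f"{score:>2}: {'#' * histogram[score]}")
print(f"viewers who rated it 5 or 6: {near_avg}")
```

The average lands between the two peaks, at a score nobody in this sample actually gave, which is why the histogram says more than the single number.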
Averages and standard deviations based on them are not robust statistical tools to use on a sample of a population of data whose distribution you do not know or that has a dynamic distribution that changes case by case, depending on other unknown variables.
A distribution is just the shape of the curve you get from a histogram of the measured values across their range.
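As a rough illustration of why percentiles hold up better when you don't know the distribution, here is a small Python sketch. The response times are hypothetical (95 fast requests and 5 slow ones), and `percentile` is a helper written for this example using the simple nearest-rank definition; other tools may interpolate and give slightly different values.

```python
import math
from statistics import mean

def percentile(values, p):
    """Nearest-rank percentile: the smallest value such that at least
    p percent of the samples are less than or equal to it."""
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical response times in milliseconds, invented for illustration.
response_ms = [100] * 95 + [2000] * 5

avg = mean(response_ms)
p50 = percentile(response_ms, 50)
p99 = percentile(response_ms, 99)

print(f"mean: {avg} ms, p50: {p50} ms, p99: {p99} ms")
```

Here the mean describes neither the typical request (p50) nor the slow tail (p99), whereas the two percentiles together describe both without assuming anything about the shape of the curve.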
This blog post gives some reasons why averages suck.
There is no point in taking the first of the subvec: because it's a vector, you can call nth on it in (effectively) constant time. I highly recommend learning about the different types of sequences in Clojure. You're running into GC overhead issues because your sequence is so big. Try reading up on Java's automatic garbage collection: https://www.dynatrace.com/resources/ebooks/javabook/how-garbage-collection-works/