Wow, genuinely correct use of 'lag'.
So Bedrock uses UDP rather than TCP for its communication; my assumption would be that either the router is doing something silly with UDP traffic, or something similar is happening on your host system. Maybe bufferbloat?
This is going to be a pain to figure out, but for now I can suggest two things:
I recently set up netdata on a couple of servers in my homelab to test it out. I haven't spent much time with it yet, but just from a scroll through it has CPU usage breakdowns per process, application, systemd service, etc. This may be what you are looking for.
I use a NetData dashboard for this. https://www.netdata.cloud/ https://github.com/netdata/netdata
Docker settings:

```
docker run -d --name=netdata \
  -p 19999:19999 \
  -v netdataconfig:/etc/netdata \
  -v netdatalib:/var/lib/netdata \
  -v netdatacache:/var/cache/netdata \
  -v /etc/passwd:/host/etc/passwd:ro \
  -v /etc/group:/host/etc/group:ro \
  -v /proc:/host/proc:ro \
  -v /sys:/host/sys:ro \
  -v /etc/os-release:/host/etc/os-release:ro \
  --restart unless-stopped \
  --cap-add SYS_PTRACE \
  --security-opt apparmor=unconfined \
  netdata/netdata
```
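Once the container is up, the dashboard should be reachable at http://localhost:19999, which is what the -p 19999:19999 mapping exposes.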
I use netdata to monitor physical hosts. It has tons of metrics, and I know you don't want tons of metrics, but I think it's still worth consideration:
> I feel like I am using it more than it can really handle
What makes you think this? Check your RAM usage, CPU load, and what your 4 cores are doing. Always measure before making changes. Check out something like Netdata; when you find your server feeling slow, go to Netdata and see what was going on.
If you're doing multiple software transcodes, then that's probably your problem right there. It appears you can hardware transcode with a Ryzen APU; see if that's an option. Otherwise, consider whether you can transcode ahead of time instead of on the fly.
> I know most like to use Intel, but I've always liked AMD.
Ryzen is more than fine. It's a great alternative to the used Xeon gear people are buying, and easily runs circles around most of it. Xeon has the benefit of being widely used in the enterprise, which means you get a constant supply of cheap used gear on eBay. That also means you can get into motherboards with IPMI, or a dozen SATA/SAS ports, or whatever. Ryzen is pretty limited in that respect, outside of ASRock Rack and I think one or two ASUS WS motherboards, which are more expensive.
My guess is you'd probably benefit from stepping up to a Ryzen 7. It's a simple upgrade if you find you're running out of cores, and AM4 has a ton of upgrade options: you can switch to a Ryzen 7 2700, or to a Ryzen 5 3600X, or go totally insane and get a Ryzen 9. It's a wonderful platform.
Netdata comes to mind; it handles collection, storage & display all in one, and for many use-cases "just works" out-of-the-box. It's not really designed for long-term metrics in the default configuration, though, what with one second granularity and no support for multiple levels of granularity.
According to https://www.netdata.cloud/blog/the-reality-of-netdatas-long-term-metrics-storage-database/ (Oct 2020) you totally can do long-term metrics (a year) while keeping 1s granularity. However, I've got my server set to one week retention at one second, and I sometimes (once every 5 months or so) get buffer overrun errors i.e. data is generated faster than it can be saved to disk. Maybe my I/O just sucks...
Given my experience, I'd recommend Netdata anyway if only for the short-term (48h) analysis you can do; if you want long-term, use the exporting features to throw the same metrics into Prometheus/Grafana/etc.
Hey u/Anycast,
Odysseas from Netdata here. We just released a blog post about the different options to secure the dashboard of a Netdata Agent
https://www.netdata.cloud/blog/netdata-agent-dashboard/
If you have any questions, just shoot them over here!
Monitoring-wise I like https://checkmk.com (there is an open source version and a paid version).
For Linux there is also https://www.netdata.cloud/, but it's mostly Linux today (you can do macOS somewhat, but not really Windows yet).
I guess for "the rest" we'd need a list of the "enterprise applications" you're interested in. You can certainly Google as well, if you don't just want to know what sysadmins on reddit might be using.
Prometheus is a tool capable of scraping an endpoint and storing the data for querying later. This can take many forms, be it usage metrics or system stats - really anything. Prometheus provides some basic dashboarding capabilities, but I recommend pairing it with Grafana to create beautiful dashboards of your data.
The above raises 2 points though:
I'm not really sure of a tool which provides metrics in a Prometheus format that would show you the stats on your NAS; I don't know what NAS you have, so I'm assuming it is just a server. You could write something custom which provides an HTTP endpoint that Prometheus can poll (see the sketch below), but this is a lot of work for little reward. On the other hand, you'd get fine-grained control over what data you have to visualize.
I'd recommend checking out netdata (https://www.netdata.cloud); run it in a docker container and it provides pretty solid monitoring of your system plus alerting out of the box. Saves you from having to re-invent the wheel, unless you really want to learn how to build wheels.
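If you do want the custom-endpoint route, here's a minimal sketch using the official prometheus_client library (pip install prometheus-client). The metric name and the /data mount point are made-up examples, not anything your NAS actually exposes:

```python
# Minimal custom-exporter sketch: serves /metrics for Prometheus to scrape.
# Assumes `pip install prometheus-client`; nas_data_free_bytes and /data
# are hypothetical placeholders.
import shutil
import time

from prometheus_client import Gauge, start_http_server

free_bytes = Gauge("nas_data_free_bytes", "Free space on the /data mount")

if __name__ == "__main__":
    start_http_server(9101)  # Prometheus then scrapes http://<host>:9101/metrics
    while True:
        free_bytes.set(shutil.disk_usage("/data").free)
        time.sleep(15)
```

Point a scrape job at port 9101 and Prometheus will store the gauge at every scrape interval; Grafana can then graph it.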
Sounds like pressure on some system resource, likely memory, disk, or CPU. Using a system monitor after you start up to track the core system resources would be a good place to start. I generally find netdata a good tool for this, as it gives you a lot of high-fidelity metrics for your system and graphs them over time.
First of all I used a profiler (pgbadger and netdata) to figure out where the lags were coming from. I then tried the usual stuff (increasing shared_buffers, max_wal_size, min_wal_size from their ultra low defaults), but the biggest performance gain came from moving the database from eMMC to a mechanical hard drive :-D
It turns out that while eMMC is fast on paper, there are some fairly absurd costs involved in erasing and write commits (several seconds) especially when low-end chips are involved.
Collect logs from everything: Windows journals, hardware (mainly SMART; this does feel like a RAID on a virtualization host dying), and throw in some netdata collectors for the machines. If there are none, then focus on collecting network data.
Even if you introduced some changes, there's a good chance it's coinciding with something else previously undiscovered.
Is netdata not suitable? https://www.netdata.cloud/agent/
It seems that it can run locally, and you can install the agent on various devices. Not sure if it still requires a connection to the netdata cloud. (This is /r/selfhosted, after all.)
My vote is Netdata as well. It's super quick and easy to install, it'll monitor more than just Docker if you care about that, has alerting and notifications, and it's simple to have it export all of its metrics to Prometheus if you wanted to do that. It can also read any Prometheus exports you have set up from other services.
Yes, it's out of memory.
I set the thread count to the number of physical cores of my CPU. I have a 6C/12T CPU and only 16 GB of RAM, so:
-r 6 -u 256 -v 128
I get the best results using those values, even better than assigning 12 threads. Monitor your memory usage during plotting; I am using NetData.
Well, typically if you want to monitor a machine you need to monitor load, memory, network speed, interface status, etc.
Rather than using Python to poll all of that, consider switching to eBPF. It will have far less overhead and be event-driven rather than having to poll all the time.
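To make the eBPF idea concrete, here's a minimal sketch using the BCC Python bindings (https://github.com/iovisor/bcc). It assumes bcc is installed and you're running as root; the sched_switch tracepoint is just one example event source:

```python
# Sketch: count context switches per CPU with eBPF instead of polling /proc.
# Requires the bcc package and root privileges; purely illustrative.
from time import sleep

from bcc import BPF

program = """
BPF_HASH(counts, u32, u64);

TRACEPOINT_PROBE(sched, sched_switch) {
    u32 cpu = bpf_get_smp_processor_id();
    u64 zero = 0, *val = counts.lookup_or_try_init(&cpu, &zero);
    if (val) { (*val)++; }
    return 0;
}
"""

b = BPF(text=program)
while True:
    sleep(1)
    for cpu, count in b["counts"].items():
        print(f"cpu {cpu.value}: {count.value} context switches/s")
    b["counts"].clear()
```

The kernel increments the counter in place; userspace only wakes up once a second to read the map, which is where the lower overhead comes from.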
Good on you for finding an itch and wanting to scratch it. That's the spirit of Linux welling up in you. Props.
Netdata is good for my use cases, but you have to choose between looking at an interface on each system, or using their cloud-based console to view them all in one pane of glass. Depends on how you feel about that. https://www.netdata.cloud/
I already had netdata, it's super lightweight and will give a stupid amount of info if you wanna know what's going on on your Pi as well as docker.
Then I got to know Grafana and found a good dashboard that I customised for my use. Good ones I found are: this and that.
So now, I use Netdata + InfluxDB + Grafana.
You are in luck: we just released (yesterday) a new feature that does just that, called "Composite Charts". Read more about it in our blog: https://www.netdata.cloud/blog/bringing-rich-and-real-time-infrastructure-monitoring-to-netdata-cloud/ Share your experience in our forums: https://community.netdata.cloud/topic/163/composite-charts-overview-pane-are-here
Hey,
Odysseas from Netdata here! We have already shipped an anomaly-detection-like feature, called metric correlations! We are quite proud of it, so any feedback is more than welcome :)
Regarding the netdata agent, I would say that it's much, much more than a simple node-exporter. It can auto-detect literally all the available data sources, it creates beautiful visualizations automatically, and its health alarms need zero configuration.
Netdata cloud will remain free to some extent, but we can't share more at this point. We are slowly adding features to it, as it becomes more mature from a product perspective.
OP if you have any question about Netdata, I would be happy to reply to a fellow startup in the races :)
NetData is an easy-to-deploy app that will help pinpoint network/system issues in real time.
It’s not great at history though, since it's primarily used for real-time rather than historical data.
+1, likely not the network ;)
Initially when you mentioned server-side resource monitoring my reaction was "No, I don't want to compete with the amazing Netdata".
But the more I was thinking about it I realized that some people may not want extensive monitoring on their server and basic CPU, memory, temperature monitoring can be easily baked in after some kind of authentication is implemented. In fact, the server-side script for pushing that data can just be a simple bash script that uses curl to push that data to the service. Info gathering is available in Linux without any special software. Definitely adding that in the todo list.
Thanks a lot for the feedback and your kind words. :)
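If it helps, here's a rough sketch of that push idea. The comment above suggests bash + curl; this is the same thing in stdlib Python, and the endpoint URL is a made-up placeholder:

```python
# Hypothetical sketch: push basic load/memory stats gathered straight from
# /proc to a dashboard endpoint. The URL below is a placeholder, not a real API.
import json
import time
import urllib.request

def load1() -> float:
    with open("/proc/loadavg") as f:
        return float(f.read().split()[0])

def mem_available_kb() -> int:
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1])
    return 0

while True:
    payload = json.dumps({
        "load1": load1(),
        "mem_available_kb": mem_available_kb(),
        "ts": int(time.time()),
    }).encode()
    req = urllib.request.Request(
        "https://dash.example/api/push",  # placeholder endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)
    time.sleep(60)
```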
Netdata is another option. It needs zero configuration: you can just run a one-line install command, and it starts monitoring the whole system with default settings. It can monitor almost anything. It's free, but beware that it collects some usage data, which you can opt out of; check their website. It is the easiest option I could find so far. netdata
Netdata is pretty robust for real time monitoring. Not quite sure how you would get it to show your Jitsi connections but someone here may know. The rest of what you are looking for, and a whole lot more, is straight outta the box.
I suggest you install the NetData plugin for FreeNAS, which is in the community section. It will give you a full system monitoring dashboard for realtime analysis of resource usage from which you can determine bottlenecks.
Here's more info:
I really like Netdata for this task: https://www.netdata.cloud/
Fire-and-forget Docker container: https://docs.netdata.cloud/packaging/docker/
No need for an external DB (If you don't want one)
Written in C (lightweight)
Pluggable architecture: https://docs.netdata.cloud/collectors/
EDIT: Well, that's what I get for composing a message and walking away. Thanks gvim!
Just heard about Netdata about an hour ago in another subreddit, and was really impressed with their live demo page. Regarding long-term storage, they have your traditional options here, but they also rolled out a new DB engine a few months ago for long-term storage, which you can read more about here.
First I've heard of this, but after looking at their live demo page I have to agree the level of detail this provides is pretty insane. Also, a quick search on YouTube returned a tutorial by Lawrence Systems / PC Pickup, which you're probably familiar with if you use pfsense at all. His 10-minute video shows just how simple it is to get it up and running using curl (not docker, sorry).
One thing I did note on their Github page, though, is that starting with v1.12, Netdata collects anonymous usage information by default and sends it to Google Analytics, so if this bothers you, make sure to read up on how to opt out.
I was actually just googling around again and found this: https://www.netdata.cloud/
This might not give me the granularity I want, but it might not be bad. My import will likely be the only thing running on the VM. This would also give me various other insights. It also has plugins for different metrics.
Actually 12 months of a Digital Ocean Droplet is $60 per year.
Whereas the cheapest option for RDS is a db.t3.micro which at its cheapest is $106 a year.
Granted, as with every "should you go with X or Y" question, the answer is: it depends. For my needs so far, the droplet has more than enough performance. However, there are of course cases where it makes sense to go with RDS.
Once CapRover is set up, you can use a 1-click installer to set up a PostgreSQL database, so it's pretty easy actually to deploy.
As for monitoring, CapRover uses NetData, which you can install with 1 click; it's the most impressive monitoring tool I've ever seen.
On DO you can do weekly automatic system-wide snapshots of your VPS for $1 per droplet per month, with the last 4 being saved. But you can also do it manually.
On CapRover you can also make backups of the system with 1-click. In fact, if you are tech-savvy you can add a shell script to your app (which is actually a docker container), which can make automatic backups.
It is more work, but if you have more time than money then why not ¯\_(ツ)_/¯
This is a reasonable start. Two of my favorites (projects I'm involved with) are CheckMK (https://checkmk.com/), which is our primary monitoring/alerting system for long-term monitoring, and Netdata (https://www.netdata.cloud/), which is mostly about real-time monitoring but has actually gained steady ground lately in adding long-term data recording and alerting. Netdata shows great promise and is very usable today, but its main strength is in real-time (right now) monitoring. And it doesn't handle "everything under the sun" like CheckMK does.
While the free version of CheckMK does have Nagios under it, it actually makes Nagios easy to manage, and it supports auto-discovery, push/pull, SNMP, snmptrapping, syslog.... I have found very little that it can't monitor, and I've written custom monitors for very specific things (our own apps). We chose CheckMK over the others (research we did in 2016) because it has the robust alerting features we needed and... well... it can pretty much do whatever you need to get done, and makes it easy to monitor things you thought would be impossible to monitor.
If you need a more scalable platform, the commercial version of CheckMK replaces Nagios underneath with a high speed engine (anywhere from 4-10x faster) and replaces RRD based pnp4nagios with their own graphing engine. Free CheckMK is very very good. The paid version takes good even further.
I contribute to both CheckMK and Netdata.
for hw monitoring I use netdata. You can install it from their repo, or I think a docker image is also available.
for cpu temp you have to install and configure lm-sensors, for hdd temperature hddtemp. both of them are installable from the default debian repos: `apt install lm-sensors hddtemp`
to set up lm-sensors, run `sensors-detect`. hddtemp asks questions during install, no other command needed for setup.
If you want to use built-in omv tools, you can see the current hdd temperature under Storage->S.M.A.R.T->Devices. I don't know a way to show cpu temp on the omv webui.
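if you'd rather read the temperatures yourself (for a script or alerting), the kernel exposes them under /sys/class/hwmon once lm-sensors has things configured. a rough python sketch; chip names and paths vary per machine:

```python
# rough sketch: print every temperature the kernel exposes via hwmon.
# values are in millidegrees C; sensor names and counts vary per machine.
import glob

for name_file in glob.glob("/sys/class/hwmon/hwmon*/name"):
    hwmon_dir = name_file.rsplit("/", 1)[0]
    with open(name_file) as f:
        chip = f.read().strip()
    for temp_file in sorted(glob.glob(hwmon_dir + "/temp*_input")):
        with open(temp_file) as f:
            millideg = int(f.read())
        label = temp_file.rsplit("/", 1)[1]
        print(f"{chip} {label}: {millideg / 1000:.1f} C")
```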
On the simple side, there's `/sys/class/net/<INTERFACE>/statistics/` and, in that folder, `tx_bytes` and `rx_bytes`.
If you want current transfer rate vs. total transmitted at a given point, `sar` (part of `sysstat`) can collect and report that information.
If you want something more fancy-pants there's netdata: https://www.netdata.cloud/
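For a one-off, you can also turn those counters into a rate yourself by sampling them twice. A minimal sketch (the interface name `eth0` is just an assumption):

```python
# Sample the /sys/class/net counters once a second and print transfer rates.
# "eth0" is a placeholder; check `ls /sys/class/net` for your interface names.
import time

def read_counter(iface: str, name: str) -> int:
    with open(f"/sys/class/net/{iface}/statistics/{name}") as f:
        return int(f.read())

iface = "eth0"
prev_rx = read_counter(iface, "rx_bytes")
prev_tx = read_counter(iface, "tx_bytes")
while True:
    time.sleep(1)
    rx, tx = read_counter(iface, "rx_bytes"), read_counter(iface, "tx_bytes")
    print(f"rx {(rx - prev_rx) / 1024:.1f} KiB/s  tx {(tx - prev_tx) / 1024:.1f} KiB/s")
    prev_rx, prev_tx = rx, tx
```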
key management? one central server I admin from generally, plus my laptop keys go into authorized_keys, copy that everywhere. that's all i need at home.
monitoring? netdata, use it at home and at work.
> 1) Create a bridge on eth1 and allow traffic only from a VLAN, ie. an existing isolated IoT/Guest VLAN
For a typical home system it probably won't really matter that much, but separating management traffic from VM traffic is generally a good idea. It also means that you can make changes on the "VM interface" (eth1) without as much worry about locking yourself out because of a bad rule on eth0.
Tools like Proxmox can manage hypervisor level firewalls for VMs. Proxmox also supports OpenVZ containers, but not Docker directly (although there are people running Docker inside OpenVZ, apparently...).
Two other points: iptables isn't going to have much of an impact, even on your streaming server (assuming you have a reasonably sane set of rules); you could look at `--set-mark` on traffic to/from your VMs to help with your iptables rules.
You should worry about congestion if you start dropping packets or see other network problems (high latency, etc.). There are lots of tools that can help track this, although NetData will probably be good enough in this case.