Check out Prometheus and the Influx suite - combine either with Grafana for a very flexible data storage and visualization solution that fits into the container/Kubernetes world pretty well.
This was my first proper foray into Grafana/Telegraf and InfluxDB. I used this dashboard but changed it up to match requirements.
My next dashboard will incorporate Plex and Pfsense stats also.
You can always roll your own using something like https://grafana.com (there are other dashboard builders), but most CI tools have their own dashboards (Travis, Jenkins, GitLab, etc). Another common tactic is to update folks via Slack or similar and/or use embedded status images in a wiki or project web page (e.g., the green [build|passing] icon you often see on GitHub). I myself have used all of the above, depending on how much info my peeps needed to see.
I know a lot of places that do this. I know one company that had a wall of TVs with different dashboards just to impress clients and visitors. At my last job we had a TV showing server status.
Many people I know who like making custom dashboards use Grafana.
If you're a big PowerShell user you can look at PowerShell Universal Dashboard. I haven't gotten around to playing with it, but I learned about it at a presentation and it looks like it can do some cool things.
May I ask why you went with CollectD instead of Telegraf? You already have the repository setup since it's the same as InfluxDB's. It would also probably be a good idea to set up the repository for Grafana to enable easy updates instead of manually installing the dpkg. https://grafana.com/docs/grafana/latest/installation/debian/
Exactly. CloudWatch replaced almost all of our logging and monitoring needs. We looked into 3rd parties like Datadog, but found CloudWatch can do almost all of what they offer.
Side note, I would highly recommend using Grafana as a front end for CloudWatch. It offers a sweet CloudWatch data source that allows you to easily create dashboards from any CloudWatch metric - even custom ones.
Quick snapshot from our ECS cluster of workers all pulling metrics from CloudWatch. http://i.imgur.com/lwfRW5p.png
You can check out Grafana 8. They've integrated Alertmanager and Cortex as alerting rule sources alongside Grafana's own alerts.
https://plugins.jenkins.io/prometheus will give you an endpoint that prometheus can scrape. Then you can put grafana in front of prometheus and generate some really cool dashboards. https://grafana.com/dashboards/306 is a nice one.
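For reference, the scrape side is just a normal Prometheus job. A minimal sketch, assuming Jenkins is reachable at jenkins:8080 and the plugin is left on its default /prometheus path:

```yaml
# prometheus.yml (fragment) - scrape the Jenkins Prometheus plugin
scrape_configs:
  - job_name: jenkins
    metrics_path: /prometheus      # the plugin's default endpoint
    static_configs:
      - targets: ['jenkins:8080']  # host:port is a placeholder
```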
Check out Grafana. I'm sure if you searched hard enough you could find a pre-made template and then tweak it to your liking. Here's one for example
Curious to know what/how you implement/monitor. I'd be interested in trying this out on my box, too.
P.S. Make sure you're only exposing Docker containers to the internet and NOT your whole server. Unraid wasn't built or designed to be completely exposed to the internet.
> We highly, highly recommend not exposing your server to the internet or placing it in the DMZ of your router unless you know what you are doing and are following strong security protocols.

> No matter how locked down you think you have your server, it is never advisable to place it in the DMZ on your network. By doing so, you are essentially forwarding every port on your public IP address to your server directly, allowing all locally accessible services to be remotely accessible as well. Regardless of how "locked down" you think you actually have the server, placing it in the DMZ exposes it to unnecessary risks.
You have to set up Loki properly. Grafana does not store any data; Loki does. You need to feed that log file to Loki using promtail.
See: https://grafana.com/docs/loki/latest/getting-started/get-logs-into-loki/
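To sketch what that looks like (the file paths, labels, and Loki URL here are placeholders), a minimal promtail config tailing one log file would be something like:

```yaml
# promtail-config.yaml - minimal sketch for shipping one log file to Loki
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml    # where promtail remembers how far it has read

clients:
  - url: http://loki:3100/loki/api/v1/push   # your Loki instance

scrape_configs:
  - job_name: myapp
    static_configs:
      - targets: [localhost]
        labels:
          job: myapp                      # label you'll query on in Grafana
          __path__: /var/log/myapp/*.log  # the log file(s) to tail
```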
So I run Grafana and Influx in containers; I have them built through docker-compose.
Here is what the compose file looks like:

```
grafana:
  image: grafana/grafana
  container_name: grafana
  ports:
    - 3000:3000
  restart: unless-stopped

influxdb:
  image: influxdb
  container_name: influxdb
  volumes:
    - ./influxdb:/var/lib/influxdb
  ports:
    - 8086:8086
  restart: unless-stopped
```
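One thing worth noting about that compose file: Grafana keeps its dashboards, users, etc. in /var/lib/grafana, so if you want them to survive the container being recreated you'd add a volume for it too, something like:

```yaml
grafana:
  image: grafana/grafana
  container_name: grafana
  volumes:
    - ./grafana:/var/lib/grafana   # persist dashboards/users across recreates
  ports:
    - 3000:3000
  restart: unless-stopped
```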
I am using this dashboard: https://grafana.com/grafana/dashboards/10095
Install the telegraf package, then input the basics I did.
Check enable
TeleGraf Output: InfluxDB
InfluxDB Server: <synology nas ip:8086>
InfluxDB Database: pfsense
InfluxDB Username: root
InfluxDB Password: <whateveryousetyourpassword>
This should get you 99% of the way there; when you use the dashboard above it just asks for your WAN interface on setup and the rest is set up for you.
In portainer you can add the compose info under stacks
as just a single copy/pasta, I believe. I do not use Portainer much anymore, but that should work. If not, just build out each part with the corresponding pieces in a Portainer image build like you normally do.
This. Right now, it's early days, and just about every "platform" is either trying to lock you in now, or will realize that they can't capture the market, so they will start milking their existing customers.
Your "Version 1" can be as simple as "all the devices simply POST to a webserver that stuffs the data into a database." Then iterate:
When writing a system like this, everything seems easy until you realize you made a big ball of mud.
The way to avoid problems is to make each part as modular as possible (so it's easy to replace), and eliminate coupling between components. (e.g. if your data analysis part is embedded in your webserver, it will be hard to iterate. 'data analysis' should be de-coupled from 'data collection'.) That's why having a pubsub/queue system really helps.
You probably don't want to spend your time building the dashboard page itself. There are a number of open-source tools built for exactly this, e.g. Grafana, [Dashing](http://dashing.io), Dashbuilder, and many more. They will solve problems for you that you don't even realize need solving yet.
Using one of these tools will get you up and running much faster than trying to build it yourself. You will be able to spend your time making useful visualizations of the data you are getting from the test campaigns instead of figuring out how to deal with storing data over time, user accounts, etc.
Now, that said: if your company has multiple verification teams and doesn't already have good insight into what's happening, something seems amiss. I can't for the life of me imagine running even a single test group without something like TestRail. Test case management and release verification is not a new problem. There is an entire industry of products designed for exactly what your management wants.
You may win a lot of points if you just help the company evaluate and set up the tool they should have been using in the first place.
Grafana Cloud is hosted Prometheus. You can use the Grafana Agent or use Prometheus Agent mode.
And, yea, I don't use those. We've been using a SaaS service at work for a while, and we're finally getting rid of it in favor of running Prometheus+Thanos+Grafana ourselves. For the price of our SaaS, my salary is paid multiple times over. It's just plain cheaper to run it yourself as soon as you get past trivial scale.
Sorry to hear this; our intention with AGPL is that it's totally fine to use the projects, even expose them on the internet - as long as you release any changes you make. If you don't make any changes, you have nothing to worry about.
We published a FAQ on the blog which goes into more detail https://grafana.com/blog/2021/04/20/qa-with-our-ceo-on-relicensing/
As far as I know, you don't. That's just the way Grafana works. Since most of the logic for that kind of thing is done in the javascript UI code, the server just acts as a dumb proxy.
What you're asking about is an Enterprise Feature. You need to pay a license.
It is Grafana (https://grafana.com/). And it runs against Zabbix as the collector.
It is good but not great. I would love the ability to roll several statuses into one. For example, the large sections of green squares are showing my SAN drive status; I would rather have just 1 square representing all of them that changes if any one of the sub-squares needs attention, instead of taking up so much space for 24 squares. Same with servers/switches.
If you go to the /r/Grafana subreddit you can see some of the dashboards I have loaded.
Grafana Labs! Remote-first (anywhere in the world). Come work on Grafana, Prometheus, Loki, Tempo and many other soon-to-be-announced projects.
We're basically converging on the Grafana stack. Grafana as the front-end to Prometheus, Loki, etc.
You can pay for their all-in-one cloud service. Or you can run it yourself.
No, just enable the external API for it and use the Azure plugin: https://grafana.com/plugins/grafana-azure-monitor-datasource/installation
I actually learned about Grafana through AWS guides and implemented it with that. All the AWS guides are amazing, but the Azure documentation is lacking. Maybe I should start a blog.
Have you looked at Grafana? We use it for all our monitoring dashboards. It looks really nice and can be customized. You may need to do a bit of work to get data into a supported data source, but that all depends on what you're using now. Most of our data is being stored in InfluxDB. Grafana then uses the different databases for our dashboards. InfluxDB uses HTTP for its API so it can be trivial to make a small tool that gets data from something and puts it into InfluxDB.
Personally I wouldn't recommend doing it this way. Promtail has a syslog listener. You should have syslog-ng send the logs to Promtail, and then have Promtail forward them to Loki
https://grafana.com/docs/loki/latest/clients/promtail/scraping/#syslog-ng-output-configuration
https://grafana.com/docs/loki/latest/clients/promtail/configuration/#syslog
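For what it's worth, the Promtail side of that is a syslog scrape config; a minimal sketch (the listen port and labels here are up to you):

```yaml
# promtail config (fragment) - point syslog-ng at this port
scrape_configs:
  - job_name: syslog
    syslog:
      listen_address: 0.0.0.0:1514   # syslog-ng sends here (IETF RFC5424 framing)
      labels:
        job: syslog
    relabel_configs:
      - source_labels: [__syslog_message_hostname]
        target_label: host            # keep the sending host as a Loki label
```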
Grafana panels are made with Javascript so if you could find someone who's made the tachometer in Javascript you might be able to adapt it into ReactJS and a Grafana panel plugin.
Edit: A quick google showed at least one similar project.
Prometheus is not involved; it's just mentioned in the documentation because the service discovery mechanism is the same, i.e. Prometheus has scrape configs, as does Promtail.
FYI, Grafana is running webinars that go through all these questions.
https://grafana.com/docs/loki/latest/clients/promtail/
https://grafana.com/go/webinar/opinionated-observability-stack-prometheus-loki-tempo/?pg=videos&plcmt=featured-1
Hi hi! Cortex author here :-)
I recommend these two talks for comparing the two projects: an old one with Bartek + me and a more recent one with Bartek + Marco:
(links to write up on our blog, but feel free to just watch the youtube)
In your particular case you call out multiple tenants, and I'd argue (with all my biases) that this is something Cortex might do slightly better at than Thanos - it's baked in from the start and some of the isolation primitives (QoS on the query path, per-tenant limits, shuffle sharding) are super cool. Thanos docs are better and it has a bigger community of end users though - so you'll probably find that easier to get started with.
It's marginal though - and the two systems are way more similar than you might think: both use the Prometheus TSDB, both use the PromQL engine, both even use the same code for query optimisation! And with the Thanos receiver, both do remote write now.
Let me know if you have any questions.
Don't pre-optimize for k8s when you don't need it now; when the time comes, moving the containers should be OK since the images are compatible.
How did you monitor things before Docker? Do the same here. See this for example.
Same as the first point; also take a look at some load balancers like Traefik and auto-discovery tools, and how easily they integrate with Docker.
It's been years since I ran Zabbix, but if it helps at all, I contributed a Prometheus endpoint to Uptime Kuma, so you can pull the data from U-K into Zabbix using that format.
I use U-K to monitor my web assets, then pull the data via the Prom exporter into Prometheus so I can graph it in Grafana.
More details are available on the U-K wiki and I've uploaded a dashboard for Grafana as well.
Rather than have a different dashboard for each environment, you can use Grafana variables to select which environment you want to see on a single dashboard.
Then in your Prometheus query, you use the Grafana variable in the query string, like foo_metric{env="$env"}.
Here's an example, but it selects different nodes: https://grafana.demo.do.prometheus.io/d/DP0Yo9PWk/use-method-node
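For completeness: the $env variable in that example would typically be a query-type dashboard variable; against a Prometheus datasource the query would be something like `label_values(foo_metric, env)` (the metric and label names are just the placeholders from above).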
MSSQL comes with SQL Server Reporting Services built in. You can really do a lot in a short amount of time with Report Builder, just leveraging the wizard.
If you want more dashboarding / visualization capabilities that are more interactive in nature, you could use Grafana.
Here it says:
>Where the section name is the text within the brackets. Everything should be uppercase, . and - should be replaced by _.
So in your case it should be: GF_PLUGIN_MARCUSOLSSON_CSV_DATASOURCE_ALLOW_LOCAL_MODE=true
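If you're running Grafana in Docker, that just goes in the container's environment; a sketch using compose (image/tag as you already have it):

```yaml
grafana:
  image: grafana/grafana
  environment:
    # section name uppercased, '.' and '-' replaced by '_', per the docs above
    - GF_PLUGIN_MARCUSOLSSON_CSV_DATASOURCE_ALLOW_LOCAL_MODE=true
```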
Hello, I have been monitoring a similar setup with the stable version on an RPi 3B+, and for metrics I use the Prometheus exporter. It was a lot easier to set up following this: https://grafana.com/grafana/dashboards/11147 Hope it helps.
There are a few moving parts that I run, all in Docker:
(Sorry I can't reply fast, "reddit noob", it keeps rate limiting me)
The Graph panel (among others) has an option that alters how it deals with null values, as outlined here.
The one you’re looking for is called ‘Null Value’, and you’d need to change it to either “null” or “null as zero”, as it’s likely set to “connected” right now.
Grafana is pretty, and I'm a visual person. It really depends on what your use-case is. In one install, I have six 55" screens set up, all showing different grafana dashboards. You can immediately see status for a lot of data.
If you're using Google Sheets, you can post directly from an esp to google without IFTTT. I might have to write this up sometime.
Very useful tips, mrpink57! I now have Grafana and Influxdb running in docker. I am able to use this dashboard as a starting point for my pfsense. I'll keep piling on different systems now that I have a working Grafana instance. Thank you all!
I don't think you quite understand the power of Grafana; once you get it going and become familiar with it, you will realize how powerful it is. I just upgraded my server's CPU and Grafana works beautifully for keeping an eye on the temperature (alerts are also set up to send me a text message and email if it gets too high).
Here's a sample:
edit - by the way, take a look at this. It's more powerful than you'd think ;-)
Telegraf has a built-in plugin for APC UPS that will read from apcupsd through an HTTP connection. This is the dashboard that I copied in and edited (fixing up the series names to make the graphs work, enabling the time picker so I can adjust the graphs how I want them to be, hardcoding for $, etc.).
I'm not complaining, just trying to be helpful.
You should use variables instead of hardcoding things. It helps if you are going to add more machines and also nice when sharing it (not everyone has named their pfsense instance pfsense-master-home.home). It's a lot more work changing every graph after you build a big dashboard so it is better to do it from the start.
For the hostname you can use `SHOW TAG VALUES FROM system WITH KEY=host` as the query, and then just use `/^host$/` as the regex for the hostname.
You can read more about it here and I'd suggest to download a template someone else made to look at examples. You can also have a look at this for more variables.
Grafana ... Grafana .... and Grafana
https://grafana.com/dashboards
Using it on a Raspberry Pi with InfluxDB in the back. Works absolutely flawlessly and is highly customizable.
You may be interested in a Grafana plugin I made. It's a panel that displays GPS points on an embedded map.
I mention this because someone else posted a screenshot in an issue comment on the project showing them using it to track their Tesla like you're doing.
For some reason I am happy with pfSense boxes running Snort (with auto blocking) and pfBlockerNG. I am running pfBlockerNG Beta in production. Currently I am looking into Grafana as I would like to build a security dashboard. Yes, this is another approach and it isn't the same but it fits my requirements.
You might want to look into metrics-based monitoring solutions like Prometheus or the TICK Stack.
They can collect your errors as metrics and then you can query them for display on Grafana.
They can also alert you with lists of errors, so rather than looking at a dashboard, you just get an email summary of the errors.
For proxmox this is the information I have found so far: https://grafana.com/dashboards/1147 I have not been able to get it working but it seems others have. I have other data in graphite/carbon that works in Grafana, but the proxmox data is not visible in graphite.
I am not sure what the generic name would be, but you might consider a discrete value 'ribbon' chart. An example can be seen in the following Grafana plugin:
Managing everything with one tool sounds like a bad idea. You'll end up stuck with the typical enterprise crap software that might do one thing well, and everything else badly. And then you're stuck with it, because you can't modularly replace one piece due to sunk cost fallacy.
Use the right tool, for the right job.
I've heard good things about netbox as an IPAM.
I work on Prometheus for monitoring, and recommend Grafana for visualization/dashboards.
To give an example of Prometheus and Grafana using the Node Exporter, this is a dashboard you can just import: https://grafana.com/dashboards/1860 There are loads of other dashboards there you can just import for lots of different exporters/metrics, just search under datasource: prometheus
Not really directly what you are looking for but here's what we do:
We run Zabbix agents in active mode on each exchange server, this allows them to report back to the main server over the internet. We applied templates from here, they are for Exchange 2010 but can be adapted to Exchange 2013/2016 without too much grief.
We then have Zabbix pushing alerts; here is an example of a Slack script.
This will achieve what you want, with the added benefit that all metrics will be stored, and if you really want to be fancy they can be graphed using Grafana to make pretty dashboards.
edit: note that the templates cover performance related metrics, Zabbix by default will detect and monitor windows services. We have some custom triggers to alert if the live RPC connections drop off too quickly and also monitor the speed of OWA through Zabbix Web scenarios. You can do some ridiculous things once you get into it.
Netflow is the only thing on that list we don't have a good aggregator for. The problem is that netflows are events, which are a lot more complicated to "monitor" than standard metrics.
You'll also probably want Grafana to drive your dashboards.
Edit: (formatting)
Live like animated? Check out grafana maybe.
Or do all your coding in Python and use Streamlit to get a little web interface going. Then you'd be in full control of how often the data updates.
First, figure out what you really want:
Using Loki, you will still be using Grafana for your frontend. LogQL is inspired by PromQL.
If your logs are not JSON, you can use regex to extract fields as you can with Elasticsearch/Logstash.
Helm chart makes it really easy.
Hi! Here's the doc for the time range controls: https://grafana.com/docs/grafana/latest/dashboards/time-range-controls/
Query options are explained here: https://grafana.com/docs/grafana/latest/panels/queries/#query-options
Hope this helps :)
Consider Grafana Loki. Because it indexes considerably less of the ingested data than ELK and other similar products, it's operationally much cheaper to run and maintain.
(Which also does mean some search operations that are cheap with an ELK stack are expensive or impossible with Loki, but with the 2.0 release, it's become quite powerful)
The Grafana docs have a page on how to configure it, including the database that stores the config.
If you’re running it in a container, you’re going to want to do it via environment variables.
What are your scaling issues? Is it just the sheer number of timeseries? If so, I've heard good things about VictoriaMetrics (which claims to be PromQL compatible, but as always, there's edge cases).
Otherwise, both Thanos and Cortex are viable Prometheus scaling frameworks used in production by multiple companies.
There's always a subjective decision that needs to be made. Should you rush through and get something up and running? Or should you focus on building robust, highly reliable systems that consists of individual components that can each be scaled and developed individually?
My style is to throw a violent amount of action at getting a prototype out, to prove whatever idea you have; I greatly believe in the power of momentum. Inspiration is short-lived, so when it appears, act on it. That being said, what I do before writing a single line of code is architect both the prototype and the robust version of the system. That means diagrams, researching dependencies, infrastructure needed, understanding tradeoffs, etc. Then I have a clear project plan and begin coding.
If your prototype works, you can begin architecting v2 of the system, where you follow best practices, build components individually, separate responsibilities, etc.
btw you also need a monitoring system. Check out https://grafana.com/.
Loki is made specifically for logs and searching through them and plugs into grafana. Prometheus would be useful to them as well for seeing things like resource usage, errors, long running queries, etc.
That sounds like you want to rewrite the timestamp in Promtail. That will make the timestamps match the in-game progression, not time of collection. Though it might be displayed as starting from 1970.
Also, here are the docs for metric queries.
Loki has a Helm chart, but Grafana added the Tanka/ksonnet stuff, because you need to scale the Loki executable / docker image into different roles (ingester, distributor, querier)
https://grafana.com/docs/loki/latest/operations/scalability/
It would be great to have that in the Helm chart, as it would simplify a lot of things for companies which use Helm but not ksonnet. Not sure how easy this is to solve though.
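For anyone curious, the "different roles" part is just the same binary started with a different -target; a rough compose-style sketch (config path and image tag assumed):

```yaml
# same grafana/loki image, split into roles via the -target flag
loki-distributor:
  image: grafana/loki
  command: -config.file=/etc/loki/config.yaml -target=distributor

loki-ingester:
  image: grafana/loki
  command: -config.file=/etc/loki/config.yaml -target=ingester

loki-querier:
  image: grafana/loki
  command: -config.file=/etc/loki/config.yaml -target=querier
```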
You can export Grafana dashboards to JSON, so it becomes just a matter of storing those exported dashboards in your infra repo and importing via a cli tool. Same with all the other tools, just export as YAML and you're good to go.
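An alternative to importing via a CLI tool, if it helps: Grafana can also load dashboards from disk with file-based provisioning, so checking the exported JSON into your infra repo and mounting it is enough. A sketch (names and paths are placeholders):

```yaml
# /etc/grafana/provisioning/dashboards/dashboards.yaml
apiVersion: 1
providers:
  - name: infra-dashboards
    folder: Infra                          # Grafana folder to file them under
    type: file
    options:
      path: /var/lib/grafana/dashboards    # directory of exported dashboard JSON
```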
This is documented here under Step 9.
You specify the Grafana role for a group in the manifest using the "value" property.
I have to agree with you.
Grafana + Loki and this dashboard https://grafana.com/grafana/dashboards/12559 will do what OP asks for and will do it for free
I've been using this forked version and haven't noticed any problems with it. I know that's not helpful as far as translating things to a Prometheus configuration, but there you go. I'm using grafana-server 7.1.1 and influx 1.8.3.
Hi! I’d start with the docs here: https://docs.influxdata.com/chronograf/v1.9/introduction/getting-started/
And for Grafana, check out this guide: https://grafana.com/docs/grafana/latest/getting-started/getting-started-influxdb/
There’s a good slack community and forums linked to on the Influx docs - stop by and say hello if you need a hand!
> Is it possible to get all 10k lines from 1 stream ?
In the grafana UI, no. But you can use the logcli tool to do a larger query and just up the limits.
https://grafana.com/docs/loki/latest/getting-started/logcli/
PUID and PGID are not how you define the user for this Grafana image. You use `user: 1000:1000`. Documentation here.
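In compose terms that looks like the following (UID/GID and volume path are whatever matches your host):

```yaml
grafana:
  image: grafana/grafana
  user: "1000:1000"               # instead of PUID/PGID env vars
  volumes:
    - ./grafana:/var/lib/grafana  # must be owned by that UID on the host
```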
What do the logs say when the grafana container crashes? The logs will say why it's not starting up. They almost always do.
I can't upload to Grafana.com, as the original dashboard I imported and built on top of is too old a version and missing stuff in the JSON file,
but I have uploaded and shared the json file here for anyone to look at
The home dashboard is editable and copyable, or you can use any other dashboard you want as your home dashboard, your team's home dashboard or organisation's home dashboard!
Loki does multi-tenancy:
https://grafana.com/docs/loki/latest/operations/multi-tenancy/
And authn and authz are entirely up to you, so you need something in front of Loki doing both. There's this project that does a multitenant reverse proxy:
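To sketch the moving parts (the tenant name and URL are placeholders): multi-tenancy is switched on in the Loki config, and each client then identifies its tenant via the X-Scope-OrgID header, which promtail exposes as tenant_id:

```yaml
# loki config (fragment)
auth_enabled: true   # Loki now expects an X-Scope-OrgID header on every request
```

```yaml
# promtail config (fragment)
clients:
  - url: http://loki:3100/loki/api/v1/push
    tenant_id: team-a   # sent as the X-Scope-OrgID header
```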
In Zabbix, go to Dashboards, create a new one, and create the graphs you're looking for with the items you're fetching from your devices. For Grafana, go download the Zabbix data source (https://grafana.com/grafana/plugins/alexanderzobnin-zabbix-app/) and graph away in much the same way.
I'm also still fairly new to Zabbix, I found the Zabbix dashboards are nice for things you can put on a graph, but showing things like gauges for CPU and memory usage (Or a simple traffic light for up/down) seems to work much much better in Grafana.
I would switch to openwrt, and use this guide instead:
Also, you could use netdata, scrape it with Prometheus, and use remote write to send to the cloud.
You don’t really say why AGPL is a problem for you, and I’d encourage you to reconsider - IMO all unmodified usage of Grafana should be fine.
But if you work for an organisation where they’ve implemented a blanket ban on AGPL (due to all the fud about the license), we also make a free-as-in-beer version of our Grafana enterprise product available:
> for those that don’t intend to modify the code, simply use our Enterprise download. This is a free-to-use, proprietary-licensed, compiled binary that matches the features of the AGPL version
https://grafana.com/blog/2021/04/20/qa-with-our-ceo-on-relicensing/
Maybe that’s what you need?
If I'm understanding correctly, it sounds like you are looking for event monitoring for that case. Maybe something like Loki would work better for that:
https://grafana.com/oss/loki/
Hey there!
You should actually be able to do this kind of thing via a Loki query, as long as you are running a Loki version >= 2.0.
We have documentation over here on metric queries with LogQL.
There's also some blog posts that talk a bit more about this feature. See the "Graph" subsection of this post and also have a look at this other post.
That said, parsing metrics out of log lines is usually something people do when the thing they want to monitor doesn't already have easy support for metrics extraction to a time-series database, such as supporting Prometheus, for example.
So, while you can do it this way, if there's existing support for a regular metrics solution like /u/oh-y said already, that might be an easier way to go about things in this specific case.
Hope this helps some :)
I personally prefer prometheus as a data source
Basically Grafana doesn't actually store any metric data - it is a visualisation tool instead. So you need to decide first which data source you would like to go for. Influx is completely fine depending on what you want to do
Check this out https://grafana.com/docs/grafana/latest/datasources/
It has a list of the ones you can use.
Isn't Grafana for metrics?
Database visualization would be a totally different thing.
edit: TIL stuff like https://grafana.com/grafana/plugins/grafana-mongodb-datasource exists. Interesting.
This might be overkill for what you want to do, but a quick Google search found a pre-built Grafana dashboard that pulls from a Prometheus data store populated by a data exporter (links below). I run several services locally and am planning to set up Prometheus and Grafana when I get some time. I’ve previously used Grafana with InfluxDB and love the ability to create dashboards from various application data using one central interface.
AdGuard exporter: https://github.com/ebrianne/adguard-exporter Grafana Dashboard: https://grafana.com/grafana/dashboards/13330
Personally I'm a big fan of Loki; it's a lesser-known log aggregator, but it does its job really well for me. You might also want to pair it up with Prometheus in case you need a powerful metrics database.
Nooot really, slash not sure. Looks like https://computingforgeeks.com/monitor-linux-server-with-netdata-and-grafana/ is in my visited links, and I looked around the published grafana dashboards to pull stuff from, like https://grafana.com/grafana/dashboards/10922 and https://grafana.com/grafana/dashboards/2701. I think I ended up going with influxdb over graphite or something else, but I don't remember. All the netdata instances I set up were one-and-done, pfSense has it all configured if you just install it, as does freenas. You just have to add the database backend.
Yep - https://grafana.com/grafana/plugins/jasonlashua-prtg-datasource
Although I initially used the prtg api via my own script and stuffed results into a mysql database, which grafana then read from.
I don't think that's possible. The information on why it is not possible is a little bit spread out between two places:
If you look in Loki's best practices, there's a recommendation not to do that. It's a bit counter-intuitive, but it comes from Loki being a very different tool compared to Elasticsearch: Elasticsearch is optimized for having great indexing capabilities, whereas Loki is optimized to reduce data storage costs and sift through large amounts of unindexed data quickly. Because of these design differences, having too many indexes can end up hurting Loki's performance.
Instead, you'll want to parse the JSON at query time using LogQL (something like {job="app"} | json | level="error", where the label and field names are just placeholders).
Has anyone tried this? How well does it scale? I tried this but we have so many hosts and VMs it would take about 4 minutes to render any screen. Even with that being the only thing on the Grafana machine with 16 vCPUs and 128GB RAM, it was so slow Grafana would try to refresh the screen before it could draw it the first time.
No, but if you put the password in a file you can configure Grafana to read it from that file instead of from the configuration, see https://grafana.com/docs/grafana/latest/administration/configuration/#variable-expansion
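If I remember right, the official Docker image also understands a double-underscore __FILE suffix on the GF_ env vars, which reads the value from a file (handy with Docker secrets). A sketch with placeholder paths:

```yaml
grafana:
  image: grafana/grafana
  environment:
    # Grafana reads the actual password from this file at startup
    - GF_SECURITY_ADMIN_PASSWORD__FILE=/run/secrets/grafana_admin_password
  volumes:
    - ./admin_password.txt:/run/secrets/grafana_admin_password:ro
```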
You shouldn't have to change that part of the config; just paste it as is. By entering bogus information (the IP address) you broke the config.
> Some Objects do not have instances to select from at all. Here only one option is valid if you want data back, and that is to specify Instances = ["------"].
https://github.com/influxdata/telegraf/tree/master/plugins/inputs/win_perf_counters#instances
By the way, there is an existing dashboard for Hyper-V: https://grafana.com/grafana/dashboards/2618
You should add the shown config to the config supplied with Telegraf (it also 'wants' the other metrics.)
P.S. GitHub repo to make things easier for those who haven't seen it yet: https://github.com/flant/grafana-statusmap (Official Grafana plugins directory offers previous releases ATM.)
Well, you said it in your post: you're not going to be able to install addons. So you can install each of these, but they won't have any preconfigured integrations. Because you're not using addons, you won't get Ingress, which means you're also going to need a reverse proxy to access these securely.
Follow the grafana documentation for installing on docker.
https://grafana.com/docs/grafana/latest/installation/docker/
The VSCode server is poorly documented in my opinion, you’re gonna have to find the documentation on how to install it and then also install the plugins or extensions or whatever they are called that come natively in the addon.
I’d recommend the linuxserver.io letsencrypt docker image for your reverse proxy.
Based on you asking this question, you really ought to go with a home assistant install. This gives you a fully supported home assistant setup with addons and snapshots.
Follow these instructions to get a clean install of home assistant on your proxmox box. You’ll have supervisor, addons, and it’s going to be no different than using the official pi image, it’ll just be a proxmox vm.
Yeah, I use this one as it has pretty much everything that I want on there.
https://grafana.com/grafana/dashboards/11074
But it's not exactly "easy" to get at what I want all the time, because it is a bit cluttered.
Funny you post this. I just set up a test instance last night with this dashboard. Wish it could also bring over the DHCP clients.
care to share yours? :)
> stats with omv, seeing my nodes for my cluster and their satas on temps, cpu usages, etc.
You may have some luck with grafana (https://grafana.com/) for that type of information, but it won't serve as the
>"hub" for them
that you'd want.
I think the line they are pushing is the "no one else in the MSP reporting and dashboarding space comes close" line. They haven't really had a challenger that directly talks to, and understands, the MSP space. I'm finding in the MSP space I'm playing in now (we are an ITOM specialist firm using Orion as our go-to tooling, with SNOW as our SD tooling) that we've completely outgrown BrightGauge, and they'll never keep up.
I'm also finding that with PowerBI and others, they don't have a single vision connected to the Systems Integrations and Service Providers that makes having a reporting and dashboarding solution easy to leverage but also scale (and sell back out to customers).
https://grafana.com/ is kind of cool but still probably needs a bit of capability in house to connect things.
I think the person in the MSP space that makes "everything connected to everything" out of the box, with a commercial model that MSPs can jump at, will become very rich overnight.
It's a big and overwhelming task - all data available everywhere, from all systems, sliceable and diceable, all the time.
I guess it speaks to Splunk's and Tableau's success.
There's a big gap between what BrightGauge does and what data scientists and visualisation experts are capable of producing.
Prometheus can do this pretty easily. You can use Grafana's hosted cloud service to provide the centralized view of all customers.
At each site, you need a small Prometheus polling server. It's very efficient compared to the other options you listed. It all depends on how much data you're looking to gather.
For SNMP, you'll need the snmp_exporter to act as a bridge.
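The wiring for the snmp_exporter bridge looks roughly like this on the Prometheus side (device address, exporter address, and module name are placeholders; if_mib is one of the stock modules):

```yaml
# prometheus.yml (fragment) - poll SNMP devices through snmp_exporter
scrape_configs:
  - job_name: snmp
    static_configs:
      - targets: ['192.168.1.1']         # the SNMP device(s) to poll
    metrics_path: /snmp
    params:
      module: [if_mib]                   # which snmp_exporter module to use
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target     # pass the device as ?target=
      - source_labels: [__param_target]
        target_label: instance           # keep the device as the instance label
      - target_label: __address__
        replacement: snmp-exporter:9116  # actually scrape the exporter itself
```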
Grafana could be the platform you want to look into.
Or browse from n1trux’s awesome-sysadmin directory: list of Monitoring services and list of Metric and Metric Collection services
https://grafana.com/dashboards/10048
This is pretty much everything you can do (at least that I figured out) from just the external stats that proxmox produces.
You just fill Prometheus with SNMP information from your devices. There are sample dashboards on the Grafana Labs page. You can download a JSON-formatted dashboard and load it into your Grafana; one or two slight changes and it is ready.
Sample dashboard: https://grafana.com/dashboards/1124
We use Grafana with Influx as well. A script runs to fetch drops on hardware and sends the data to the Influx database, then we just visualize it in Grafana.
I can also share our grafana dashboard in json, once I connect to office network.
The biggest advantage of ELK and the like is the NoSQL DB (Elasticsearch), so without putting the logs there you lose that.
Graylog is better suited out of the box for IT logs but still wants all the logs in there.
You are better off seeing if there is a Grafana or other frontend that can look at your Azure DB, or changing your mind on where you put your logs.
Looks like there is:
https://grafana.com/plugins/grafana-azure-monitor-datasource
I recommend InfluxDB/Grafana for the monitoring backend and whatever you like as the client, such as Telegraf, bash/powershell scripts, etc. For instance, I use flea to send whatever metrics to a StatsD server (implemented via the Telegraf daemon).
Influx is great and fast, and in 2.0 it will have both push and pull methods (technically it can do that now via Telegraf). I don't like Kibana much; Grafana is 100X better for me, more intuitive, and way better looking.
I use EK (without the L) to send logs to a central place; I don't use Beats, nor do I like them. Actually, anything not shell is a showstopper for me: as soon as you need some metric that isn't covered, you don't have time to go deep into Go or whatever to make a plugin when you already know shell. Regarding authentication, I use Elastic without any. FYI, Grafana Labs are adding logging capabilities (Loki), and it has great access management OTB.
I haven't set it up yet so I can't speak to its stats, but the ntopng package can be a data source in Grafana - it has a plugin. It's on my list of things to set up...
I've been browsing around on the Grafana site and downloading pre-made dashboards to use as inspiration to build my own. You can re-use/copy the queries already built into those dashboards, or use the pre-made dashboards and simply remove stuff you don't need. That's what I've been doing to slowly build up one screen that shows me everything at a glance. I'm not quite there yet, since my biggest challenge has been getting remote data from various devices and third-party applications, like Pi-hole running on an RPi, into Grafana. I have Grafana in Docker on a different server. Here is one Pi-hole dashboard on their site that someone built using pihole_export as the datasource (which you already have running).