If you want something simple, pleasant and privacy-friendly, I really like GoAccess, a FOSS program that parses your server logs. At its simplest you just run a single command telling it where the logs are and it generates a fancy HTML report like this. No need to install anything on your website or depend on a third party. And it can't be blocked either, so it's as accurate as you can get.
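A minimal invocation might look like this (a hedged sketch; `access.log` and `report.html` are placeholder paths, and `COMBINED` assumes the stock combined log format):

```shell
# Parse a combined-format access log and write a standalone HTML report.
goaccess access.log --log-format=COMBINED -o report.html
```

Opening `report.html` in a browser gives you the dashboard, with no JavaScript snippet on the site itself.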
Alternatively, you can parse the logs yourself fairly easily with a regex. A sample for nginx (combined-format) logs that counts unique visitors:

```ruby
require "set"

# Collect every unique IP that visited our site.
visitors = Set.new
File.foreach(log_file) do |line|
  match = line.match(/^([^ ]+) [^ ]+ [^ ]+ \[.*?\] "[^"]*" [^ ]+ /)
  if match
    visitors.add match[1]
  else
    raise "Log line doesn't match regex: #{line}"
  end
end
puts visitors.count
```
If you don’t need a complex logging system for all your apps and only need to handle web server logs, try https://goaccess.io. I can’t speak for high-volume usage, but for an average news site it’s a suitably simple log-based analysis tool.
I use https://goaccess.io/ a lot. You'll need to understand YOUR data and how to pipe it exactly the way you want it, but once you do, GoAccess can be an invaluable tool to peek into your data for answers to various questions (who, what, where, when). The "why" is usually bots and script kiddies.
Like everyone else says, GoAccess is solid. It's my front-door analytics.
Taking it a bit further from just opening terminal windows, you could use a logs visualizer like http://logstalgia.io/ or https://goaccess.io/ if a more visual representation would be helpful.
As much as I like reading logfiles, they usually only "click" with me when they're things I'm already familiar with. When I'm trying to see something new, it helps to have the pretty graphs.
You have multiple different questions here. Monitoring web traffic can be a case of analysing logs e.g. with goaccess.
Monitoring traffic more generally without focusing on a specific application is a completely different problem.
If you want turnkey reporting, including geolocation, try GoAccess.
Note that to truly self-host geolocation, you'd need to buy a license or use the (old, inaccurate) GeoLite2 downloadable dataset; I'm not aware of any self-hosted GeoIP solution that is both free and accurate/up-to-date.
You do need a domain. You also need hosting. Once that's set up, you can get a free SSL certificate from Let's Encrypt. You may find this article useful. You will also probably find a tool such as pm2 to be very useful if you are running a server using Node. It's easy to set up and use. Once that is dealt with, you may want to set up some logging. A tool such as GoAccess is probably going to be helpful for that.
Good luck!
How much access do you have to your web server? Can you SSH to it? If you can, and you’re happy with using the terminal, then GoAccess ( https://goaccess.io/ ) can produce stats by analysing your server logs. It also has an option to filter out traffic from crawlers. You could see whether its filtered results are much different from GA's.
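The crawler filtering mentioned above is a single flag; a hedged sketch, assuming your logs live at the usual nginx path:

```shell
# --ignore-crawlers drops known bots and spiders from the report.
goaccess /var/log/nginx/access.log --log-format=COMBINED \
  --ignore-crawlers -o report.html
```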
Personally I use Plausible Analytics which is fantastic and open source. But if you want to use it for free, you’ll need to self-host, which takes a bit of setting up.
Your log files will be in the /var/log directory. Depending on your web server technology, you would then see an nginx or apache2 directory. These directories will contain the access logs for the site. You will then be able to analyse them using GoAccess.
Download your logs from S3 to an EC2 instance, then analyze them locally with GoAccess (open source); it supports CloudFront and ELB log formats (among others).
GoAccess can produce HTML reports or CLI UI :)
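For CloudFront, that might look like the following sketch (the bucket name and prefix are hypothetical; CloudFront delivers its logs gzipped):

```shell
# Pull the gzipped CloudFront logs down, then feed them to GoAccess on stdin.
aws s3 sync s3://my-logs-bucket/cloudfront/ ./logs/
zcat ./logs/*.gz | goaccess --log-format=CLOUDFRONT -o report.html -
```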
Good lord, AyrA_ch, your reply is a treasure trove of info. Going to bookmark this for future reference.
I have a GoAccess log analyzing / stats panel set up for this client, but I don't think they've ever used it. It's the only thing I can think of that they'd ever notice, and I find it highly unlikely they'll even see that.
It's a new business, and they're so wrapped up in running it that I'm a little shocked they even set up Cloudflare to begin with. They mostly just hired me to make their web presence and they get in touch when they need something added to the website.
I'm going to leave it as-is unless any issues come up, but, as I mentioned, I'm bookmarking this for future reference.
Thank you thank you!
Pretty much any log analysis tool will do what you want, though they may need some format tweaking to understand however you have your nginx logs formatted, e.g.:
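With GoAccess, that tweaking is done via a `--log-format` string describing the layout. A sketch for the stock nginx combined format, with specifier names taken from the man page:

```shell
# %h host, %d/%t date/time, %r request line, %s status, %b bytes sent,
# %R referer, %u user-agent; %^ means "ignore this token".
goaccess access.log \
  --log-format='%h %^ %^ [%d:%t %^] "%r" %s %b "%R" "%u"' \
  --date-format='%d/%b/%Y' \
  --time-format='%T' \
  -o report.html
```

For the default format you can just pass `--log-format=COMBINED` instead; the explicit string only earns its keep once you've customized nginx's `log_format`.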
Ok, I thought there were more out there, but I'm sure you can find something that meets your needs. It's been ages since I've looked at log analytics though, last time I did it was before Google turned Urchin into Google Analytics.
Worst comes to worst, you can do manual analytics using awk/grep/whatever to grab the request URI log field and do whatever tabulation you want.
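A quick sketch of that manual approach (the sample log lines are made up; in the combined format, field 7 is the request URI):

```shell
# Build a tiny sample access log to tabulate against.
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 512
5.6.7.8 - - [10/Oct/2024:13:55:37 +0000] "GET /about HTTP/1.1" 200 256
1.2.3.4 - - [10/Oct/2024:13:55:38 +0000] "GET /index.html HTTP/1.1" 200 512
EOF

# Grab the request URI (field 7) and count hits per page, busiest first.
awk '{ print $7 }' /tmp/sample_access.log | sort | uniq -c | sort -rn
```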
I really like GoAccess; it gives me all I need. It's an executable that parses log files, so you don't have to host anything and don't have to include any scripts. It still gives you a very good idea of where your visitors come from, what browsers and operating systems they use, and which pages they look at, at which time and date. It can also generate a dashboard with all this information, no matter where your logfile is. I usually pipe nginx logs from my server and run GoAccess locally, but that's up to you.
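That piping workflow can be as simple as reading the remote log over SSH and handing it to a local GoAccess on stdin (the host and log path here are placeholders):

```shell
# Stream the remote access log and build the report locally;
# the trailing "-" tells goaccess to read from stdin.
ssh user@myserver 'cat /var/log/nginx/access.log' \
  | goaccess --log-format=COMBINED -o report.html -
```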
I think you set it up like every other container.
As you can see in the documentation (https://goaccess.io/download#docker), you need to mount some directories. So make sure the logs from the other container are written somewhere, and mount that path into the GoAccess container under /srv/logs.
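A sketch of what that might look like, assuming the official allinurl/goaccess image (host paths are placeholders):

```shell
# Mount the shared log directory read-only at /srv/logs and an output
# directory at /srv/report, then point goaccess at the mounted log.
docker run --rm \
  -v /path/to/shared/logs:/srv/logs:ro \
  -v /path/to/output:/srv/report \
  allinurl/goaccess \
  /srv/logs/access.log --log-format=COMBINED -o /srv/report/index.html
```

If the image's entrypoint isn't `goaccess` on your version, prefix the trailing arguments with `goaccess`.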
Yeah, it might work with a couple of things combined. Use a cron job that runs every 5 minutes, and put together some string that would let you narrow the time frame.
Go here: https://goaccess.io/man
Scroll down to "DIFFERENT OUTPUTS" and see the example with CSV. Combined with some of the time options above, it might work. I've never tried it myself.
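Putting those two pieces together might look like this (paths and schedule are illustrative; GoAccess picks the output format from the `.csv` extension):

```shell
# Regenerate a CSV snapshot of the stats from the current access log.
goaccess /var/log/nginx/access.log --log-format=COMBINED \
  -o /var/www/stats/report.csv
```

And the every-5-minutes part as a crontab entry:

```
*/5 * * * * goaccess /var/log/nginx/access.log --log-format=COMBINED -o /var/www/stats/report.csv
```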
If you are using a web server in front of your app, like an nginx reverse proxy, then it should be saving logs and you can use https://goaccess.io/ to visualize those logs.
I would have to know more about your setup to give further recommendations.
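If you go that route, GoAccess can also keep an HTML report continuously updated from the proxy's log; a sketch assuming a stock nginx setup:

```shell
# --real-time-html pushes live updates to the report over a WebSocket.
goaccess /var/log/nginx/access.log --log-format=COMBINED \
  -o /var/www/html/report.html --real-time-html
```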
I just started using goaccess on a project. Switched away from google analytics because it’s more than I needed. Doesn’t hurt that ditching google analytics for a locally hosted solution is kinder to your visitors!
I find goaccess to be pretty good! No complaints yet.
Take a look at this: https://goaccess.io/
> While the terminal output is the default output, it has the capability to generate a complete real-time HTML report (great for analytics, monitoring and data visualization), as well as a JSON, and CSV report.
Terminal view/stats fully available (default actually) plus some nice HTML for when you do want a report or something like that.
The features page shows some of the terminal dashboards available.
Quoting from the man page:
> Virtual Hosts: This panel will display all the different virtual hosts parsed from the access log. This panel is displayed if %v is used within the log-format string.
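For the common "combined plus virtual host" layout, GoAccess ships a predefined format that already includes %v, so a sketch can be as short as:

```shell
# VCOMBINED is the predefined virtual-host + combined log format,
# which enables the Virtual Hosts panel.
goaccess access.log --log-format=VCOMBINED -o report.html
```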
Tableau can only really handle delimited text files, so if the file you'd be providing the end user isn't already parsed into a schema, it'd be the wrong tool.
Is something like this more of what you're in the market for?