I wrote earlier about alternatives to Google Analytics that are more focused on privacy. I have been using GA on this site for a while, but it feels weird to collect so much data about users, so I wanted to try something new.
Of all the options I considered, I found GoAccess to be the most enticing. It doesn't rely on any scripts at all—it just analyses logs from whatever web server you use. Basically, if your web server has served a page 1,000 times, it'll tell you that you've served it 1,000 times.
This blog is hosted on Ghost, so in this case, we're talking about Nginx.
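To make the idea of counting straight from the logs concrete, here is a minimal shell sketch. The three log lines are made-up sample data in the combined log format, and the awk one-liner only illustrates the principle; it is not how GoAccess itself is implemented.

```shell
# Made-up sample entries in combined log format (not real traffic).
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [10/Oct/2018:13:55:36 +0000] "GET /about/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
203.0.113.9 - - [10/Oct/2018:13:56:01 +0000] "GET /about/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
198.51.100.7 - - [10/Oct/2018:13:57:12 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"
EOF

# Field 7 (split on whitespace) is the requested path.
# Count how often each path appears, most requested first.
hits=$(awk '{ count[$7]++ } END { for (p in count) print count[p], p }' "$log" | sort -rn)
echo "$hits"

rm -f "$log"
```

Each line in the log is one request the server answered, so the count per path is simply the number of times that page was served.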
How to install GoAccess
To install GoAccess, you need access to your own server, like a droplet on DigitalOcean. If you don't have one, I highly recommend switching to a self-hosted option. It's cheap, it's easy to use, and it's basically infinitely scalable!
On a Debian or Ubuntu server, enter these four commands:
echo "deb http://deb.goaccess.io/ $(lsb_release -cs) main" | sudo tee -a /etc/apt/sources.list.d/goaccess.list
wget -O - https://deb.goaccess.io/gnugpg.key | sudo apt-key add -
sudo apt-get update
sudo apt-get install goaccess
You now have GoAccess installed! Now you can try analysing your data.
Try it out on the command line
Go to where your log file is located. On a standard DigitalOcean Ghost droplet, this will be in /var/log/nginx. Type in goaccess access.log.
You'll be asked to choose a log format. Choose the first one, which should be the NCSA Combined Log Format!
You'll get an example of what to expect:
Try out the HTML version
Go to your web server's document root and generate a static HTML report there.
# cd /var/www/html
# goaccess /var/log/nginx/access.log -o report.html --log-format=COMBINED
Now you can go to your browser and type in, for example, hooshmand.net/report.html (no, it's not there now).
You'll get a nice interactive HTML report like the following:
This is an interactive HTML5 page displaying the data parsed from the static log file.
Bear in mind that this is only a log file of traffic for one day. It'd be good to have data covering a longer period, but that would require custom configuration.
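One possible way to cover a longer period, sketched under assumptions: web server logs are usually rotated into files such as access.log.1 or access.log.2.gz, and zcat -f reads compressed and plain files alike, so the combined stream can be piped into GoAccess. The temporary directory and sample lines below are made up so the sketch runs anywhere; on a real server you would point at /var/log/nginx instead, and the rotated file names depend on your logrotate setup.

```shell
# Hypothetical stand-in for /var/log/nginx, filled with made-up log lines.
dir=$(mktemp -d)
printf '%s\n' '203.0.113.5 - - [11/Oct/2018:09:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"' > "$dir/access.log"
printf '%s\n' '203.0.113.9 - - [10/Oct/2018:09:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"' > "$dir/access.log.1"
gzip "$dir/access.log.1"   # rotated logs are often compressed

# zcat -f decompresses .gz files and passes plain files through unchanged.
total=$(zcat -f "$dir"/access.log* | wc -l | tr -d ' ')
echo "combined lines: $total"

# If GoAccess is installed, feed it the same stream; "-" means read from stdin.
if command -v goaccess >/dev/null 2>&1; then
  zcat -f "$dir"/access.log* | goaccess - -o "$dir/report.html" --log-format=COMBINED
fi

rm -rf "$dir"
```

How far back this reaches depends on how many rotated files your server keeps.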
Analysis
This is more traffic than I think is actually occurring. It'd be nice to imagine that I'm getting 1,000 unique visitors a day, but I think that just isn't true. I believe the traffic is coming from spiders.
How to check this? Well, I could exclude spiders, but I could also check to see if the shape of the traffic mirrors that from other sources, like Google Analytics.
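A rough way to test the spider theory from the command line, as a sketch: pull out the user agent of each request and count the ones that look like crawlers. The sample lines and the bot|crawl|spider pattern are assumptions for illustration, and real crawlers don't always identify themselves, so treat the result as a lower bound.

```shell
# Made-up sample entries in combined log format (not real traffic).
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [10/Oct/2018:13:55:36 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (X11; Linux x86_64)"
66.249.66.1 - - [10/Oct/2018:13:56:01 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
157.55.39.1 - - [10/Oct/2018:13:57:12 +0000] "GET /about/ HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
EOF

total=$(wc -l < "$log" | tr -d ' ')

# Splitting on double quotes makes field 6 the user agent string.
# Count user agents that match a simple, case-insensitive crawler pattern.
bots=$(awk -F'"' '{ print $6 }' "$log" | grep -ciE 'bot|crawl|spider')

echo "$bots of $total requests look like crawlers"
rm -f "$log"
```

GoAccess also has an --ignore-crawlers option that can be added to the goaccess command to leave known crawlers out of the report.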
Note: The above screenshots were taken in 2018, when this site had far fewer visitors.
Conclusion
While it's nice to have a privacy-focused alternative to Google Analytics active, it's not convenient enough for me to use as a total replacement for GA.
I tried Matomo (previously referred to as Piwik) later, and while I think it's worth it, I didn't want to pay for something this early in this site's evolution.