Rybbit - Open source Google Analytics replacement
from Goldflag@lemmy.world to selfhosted@lemmy.world on 22 Nov 02:35
https://lemmy.world/post/39137192

Hi guys, I’ve been working on a self-hostable web analytics platform since the start of this year after being frustrated with Google Analytics and Plausible.

I’ve packed a bunch of cool web analytics features into Rybbit, but I’ve tried very hard to keep the interface simple to use.

https://github.com/rybbit-io/rybbit

Check it out!

#selfhosted

Goldflag@lemmy.world on 22 Nov 02:48 next collapse

A few more screenshots in case you don’t want to leave the site <img alt="" src="https://lemmy.world/pictrs/image/842fccc2-7ebf-4979-bcae-de1b51a6bce9.png"> <img alt="" src="https://lemmy.world/pictrs/image/7a1612ed-6819-43ec-a479-c71f27223058.png"> <img alt="" src="https://lemmy.world/pictrs/image/0ce7c094-c7d4-4864-b7f3-52ca968191c6.png"> <img alt="" src="https://lemmy.world/pictrs/image/4665890b-fe21-4891-a1ee-010e0118c221.png"> <img alt="" src="https://lemmy.world/pictrs/image/c861d306-b53c-44d2-ab5e-3dee7158eb68.png">

MakingWork@lemmy.ca on 22 Nov 04:20 next collapse

I have no idea how to use this, but this is amazing!

bear@lemmy.blahaj.zone on 22 Nov 08:32 next collapse

What should we try with the live demo? Neat stuff, is this a long-term project?

helix@feddit.org on 22 Nov 13:20 collapse

You’re awesome. Thanks!

lung@lemmy.world on 22 Nov 02:54 next collapse

Wow holy crap, great work - the world badly needs this. I’m assuming the mechanism is the same: you inject a JS script into your site. I’m also very interested in pure server-side solutions for analytics, but they can’t hit all the features you did in a generic way afaik

Goldflag@lemmy.world on 22 Nov 03:00 next collapse

Yea, we use a client-side script like almost everyone else. The major difference is that we don’t use cookies so you can avoid a lot of the cookie banner/GDPR nonsense.

Rybbit definitely isn’t the first open source cookieless web analytics platform (Plausible and Umami are the two other big ones), but it’s probably the most “all-in-one” of all these alternatives.
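For readers wondering how a tool can count visitors without cookies: one common approach among cookieless analytics tools is to hash the request's IP address and user agent together with a salt that rotates daily, so the resulting ID cannot be linked across days. The sketch below illustrates that general idea only. The thread does not confirm that Rybbit works this way, and the function names and the in-memory salt store are assumptions made for illustration.

```python
import hashlib
import secrets
from datetime import date

# Hypothetical in-memory salt store: one random salt per calendar day.
# A real deployment would persist the current salt and delete old ones.
_daily_salts: dict[date, bytes] = {}


def daily_salt(day: date) -> bytes:
    """Return the random salt for `day`, creating it on first use."""
    if day not in _daily_salts:
        _daily_salts[day] = secrets.token_bytes(16)
    return _daily_salts[day]


def visitor_id(site: str, ip: str, user_agent: str, day: date) -> str:
    """Derive an anonymous visitor ID that is stable within one day only."""
    payload = "|".join((site, ip, user_agent)).encode("utf-8")
    return hashlib.sha256(daily_salt(day) + payload).hexdigest()
```

Because the salt changes every day, the same browser gets a different ID tomorrow, which is why long-term "returning user" counts are harder to get right with this kind of scheme.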

criss_cross@lemmy.world on 22 Nov 11:56 collapse

Do you use fingerprinting instead? Or what’s the mechanism you use?

x00z@lemmy.world on 22 Nov 13:09 collapse

GoAccess uses your server side access log.
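To make the server-side approach concrete, here is a minimal sketch of counting page hits from an access log in the "combined" format that nginx and Apache can write. It is not how GoAccess itself is implemented; the regular expression, the function name, and the sample lines are all made up for illustration.

```python
import re
from collections import Counter

# One line of the "combined" access log format:
# ip ident user [time] "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def count_page_hits(lines):
    """Count successful (2xx) GET requests per path; skip unparseable lines."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match["method"] == "GET" and match["status"].startswith("2"):
            hits[match["path"]] += 1
    return hits


# Made-up sample data.
sample = [
    '203.0.113.7 - - [22/Nov/2024:13:09:00 +0000] "GET /pricing HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [22/Nov/2024:13:09:04 +0000] "GET /pricing HTTP/1.1" 200 5123 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [22/Nov/2024:13:09:05 +0000] "POST /api/login HTTP/1.1" 200 87 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [22/Nov/2024:13:09:09 +0000] "GET /missing HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
]
print(count_page_hits(sample))
```

Logs like these carry no screen size, scroll depth, or custom events, which is the trade-off against a client-side script.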

mesamunefire@piefed.social on 22 Nov 03:04 next collapse

I would love for this to work on yunohost.

artyom@piefed.social on 22 Nov 04:17 collapse
solrize@lemmy.ml on 22 Nov 03:21 next collapse

Aren’t there tons of these already? Piwik has been around for quite a while, plus there are others mentioned in the comments.

_cryptagion@anarchist.nexus on 22 Nov 04:20 next collapse

variety is the spice of life.

quick_snail@feddit.nl on 22 Nov 20:50 collapse

Matomo*

lIlIlIlIlIlIl@lemmy.world on 22 Nov 03:47 next collapse

How would this compare to something like PostHog?

Goldflag@lemmy.world on 22 Nov 04:07 collapse

Posthog makes it almost impossible to actually self-host since they try to push you onto the cloud as much as possible. They say that the self-hosted version only works well up until 100k events … which is insane since their cloud free tier is 1 million events. It’s actually the reason why I built Rybbit. I tried to self-host posthog on my server but it ran the CPU up to 100% on 8 cores and didn’t even work.

Ok posthog rant done.

The other main difference is that Posthog has like 10+ different products all in one. Their web analytics is good, but it’s just kind of bland (imo) because it’s not their main focus.

jarhead@pie.jarofmilk.cloud on 22 Nov 03:52 next collapse

That’s fuckin baller!!

Goldflag@lemmy.world on 22 Nov 03:55 collapse

🐸

otter@lemmy.ca on 22 Nov 04:10 next collapse

You mentioned being frustrated at Plausible. What did you not like about it?

I haven’t tried Plausible, but it seemed popular

Goldflag@lemmy.world on 22 Nov 04:42 collapse

it didn’t have enough features, especially since the community version is heavily nerfed (it’s missing even funnels)

baatliwala@lemmy.world on 22 Nov 05:04 next collapse

This looks… Great? Nice work

osprior@lemmy.world on 22 Nov 05:34 next collapse

Question: is the self-hosted version less featured than the paid hosted version?

This looks amazing btw.

Goldflag@lemmy.world on 22 Nov 05:38 next collapse

Only very slightly so. One of the reasons I created Rybbit is because platforms like plausible and fathom have much inferior self-hosted versions (very limited featureset and basically never updated). We have a comparison here

osprior@lemmy.world on 22 Nov 05:55 next collapse

That’s excellent and very clear, thank you for the explanation.

spacelord@sh.itjust.works on 22 Nov 11:25 collapse

@Goldflag

I appreciate the intent behind Rybbit, but I have to respectfully disagree with the “only very slightly so” characterization. Looking at your official comparison table, the self-hosted version is missing:

  • Pages View
  • Web Vitals
  • Email reports
  • Google Search Console integration
  • VPN/Crawler/ASN tracking
  • Google/GitHub OAuth
  • Email support

That’s 7 significant features—which seems more than “very slightly” different.

More importantly, this raises AGPL compliance questions. Under AGPLv3 Section 13, if users interact with modified AGPL software over a network (your cloud version), you’re required to make the complete corresponding source code available to those users. If these cloud-only features are integrated into the same AGPL-licensed codebase, withholding them from the public repo while running them as a network service appears to conflict with the license terms.

There are really only two compliant scenarios here:

  1. These features exist in the public repo but are just marketed as “cloud-only” (in which case the comparison table’s misleading)
  2. These features are truly separate proprietary code that interfaces with Rybbit without being part of the AGPL-licensed work (which would require careful architectural separation)

If it’s neither—if these are AGPL-covered features running in your cloud service but withheld from the repo—that’s exactly the “loophole” the AGPL was designed to close. The irony is that you criticized Plausible and Fathom for having “much inferior self-hosted versions,” yet this appears to be a similar approach.

Could you clarify the licensing status of these cloud-only features? Are they in the public repo but disabled by default, or are they proprietary additions that don’t derive from the AGPL codebase?

ripcord@lemmy.world on 22 Nov 13:24 next collapse

Thank you for your service.

Goldflag@lemmy.world on 22 Nov 14:02 next collapse

Everything is in the repo and cloud features are just toggled off in the self-hosted build.
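As a rough picture of what "toggled off in the self-hosted build" can look like in general, here is a sketch of environment-based feature gating. Everything in it is hypothetical: the `CLOUD` variable, the feature keys, and the function name are invented for illustration and are not taken from the Rybbit codebase or docs.

```python
import os

# Hypothetical names: CLOUD and these feature keys are illustrative only.
CLOUD_ONLY_FEATURES = {"web_vitals", "email_reports"}


def feature_enabled(name: str, env=os.environ) -> bool:
    """A feature is on unless it is cloud-only and CLOUD is not set to 'true'."""
    if name not in CLOUD_ONLY_FEATURES:
        return True
    return env.get("CLOUD", "").lower() == "true"
```

With gating of this kind the code ships in the repo either way; whether a given feature runs depends only on configuration.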

spacelord@sh.itjust.works on 22 Nov 14:24 collapse

@Goldflag,

Thanks for clarifying! Good to hear everything’s in the repo and that it’s truly AGPL compliant.

Since as self-hosters we already carry the burden of maintenance, updates, security, and infrastructure costs that cloud users don’t, would you consider documenting how to enable the cloud features in self-hosted setups?

I see the docs cover basic environment variables, but not for Pages View, Web Vitals, or VPN/ASN tracking. Even if some features need extra config (SMTP, OAuth creds), having that documented would help those of us willing to do the work.

That would truly differentiate Rybbit from Plausible/Fathom—not just code parity, but empowering self-hosters with full feature access.

custard_swollower@lemmy.world on 23 Nov 09:13 next collapse

AGPL means they are licensing it to you, they are not bound by the license because they are the copyright owners.

spacelord@sh.itjust.works on 23 Nov 12:56 collapse

That’s misleading. While copyright owners aren’t bound by their own license, AGPL Section 13 requires that when they run AGPL software as a network service, they must make the complete source available to users.

The AGPL was specifically designed to close the “SaaS loophole.” Being the copyright owner doesn’t exempt you from AGPL’s network service requirements if you’re distributing under that license.

GameGod@lemmy.ca on 25 Nov 13:24 collapse

This is flat out wrong. If you’re the copyright owner, you’re not licensing the code to yourself. The AGPL is the license under which they’re making the open source version available to YOU. The version they run themselves is proprietary.

ITGuyLevi@programming.dev on 25 Nov 18:46 collapse

OAuth is one thing I hate to see locked behind a paywall; it’s one thing for the pretty, management-geared stuff (dashboards and charts) to be a paid feature, but not security.

EarMaster@lemmy.world on 24 Nov 10:13 collapse

The free self hosted version is heavily limited. I will stick with Plausible which may be simpler but also doesn’t want to push me into a subscription.

starkzarn@infosec.pub on 22 Nov 05:49 next collapse

Glad to see you post this here. I’ve been experimenting with selfhosted analytics for a while now and have attempted your project here a couple times. The thing that kills me is the Clickhouse requirement. It makes it impossible to host on a lightweight VPS. Like why should my analytics platform require so much more compute than my simple static site? Am I missing something?

Goldflag@lemmy.world on 22 Nov 05:56 collapse

Clickhouse definitely takes a lot of resources! There’s unfortunately no way around that, though in my experience it runs fine on the cheapest Hetzner instances which are like $3-4 a month for 2GB of RAM. How lightweight is your VPS?

And yea, you don’t need clickhouse for a simple static site. I chose clickhouse because Postgres and MySQL don’t scale well at my volume: the main site I personally use Rybbit for sends around 20 million events a month.
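For a sense of scale, the 20 million events a month figure can be turned into an average rate and a yearly row count. This is plain arithmetic on the number quoted above, assuming a 30-day month and one stored row per event (both assumptions).

```python
EVENTS_PER_MONTH = 20_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month

average_rate = EVENTS_PER_MONTH / SECONDS_PER_MONTH
rows_per_year = EVENTS_PER_MONTH * 12  # assumes one stored row per event

print(f"{average_rate:.1f} events/second on average")
print(f"{rows_per_year:,} rows after one year")
```

The average write rate is modest, so the load presumably comes more from running aggregate queries over hundreds of millions of accumulated rows, which is the kind of workload a column-oriented store like ClickHouse targets.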

It pains me to plug my competitors, but check out Umami or Goatcounter if you want a platform that uses postgres.

starkzarn@infosec.pub on 22 Nov 15:30 collapse

Hey thanks so much for the engagement. I was trying to run it on a VPS that cost $35/year. 2GiB of RAM wasn’t quite enough to make it work for me, granted that was with the webserver and ancillary supporting services.

I’ll find an opportunity to test it out though, as rybbit looks great. I appreciate the mention on the other FOSS products, that’s a good look for you. I have plenty of experience with umami already. Cheers!

houjou@jlai.lu on 22 Nov 08:52 next collapse

it looks beautiful!! do you plan on making the wcv available for the self hosted version in the future?

parmesancrabs@sh.itjust.works on 22 Nov 10:39 next collapse

Always a fan of alternate options, this looks quite tidy! I had a few thoughts / queries. Not at my system right now but I will test it out later.

I noticed in the screenshots you have a “users” page - but with a cookieless tracking system I would have assumed it wouldn’t be reliable to identify a long term user past individual sessions? Are you doing some hefty finger printing?

Your features table has a few statements that might need adjusting. For example, GA4’s segmentation sequencing / filtering can be quite complex; I’d argue it’s not limited and is potentially more advanced than Rybbit’s (not tested yet). It also has a user exploration feature.

Do you have any plans for a drag and drop style report creation, so that I could create reports with any dimensions / metrics and filter accordingly? I think that would bring a lot of flexibility to the platform for an individual’s bespoke needs.

cupcakezealot@piefed.blahaj.zone on 22 Nov 11:27 next collapse

hmm interesting im using matomo but im not liking how its increasingly becoming bloated and subscription based

helix@feddit.org on 22 Nov 13:20 collapse

You can try goaccess.io or plausible.io as well. Rybbit is very cool though!

NarrativeBear@lemmy.world on 22 Nov 16:36 next collapse

How can I run this on unraid, and can I point it at multiple domains and sub-domains?

MangoPenguin@lemmy.blahaj.zone on 22 Nov 19:58 collapse

It’s a docker compose deployment so should just work on any system with docker installed. Copy the docker compose file and env file if it has one, and run ‘docker compose up -d’ in that directory.

It can collect analytics from multiple places.

quick_snail@feddit.nl on 22 Nov 20:49 next collapse

Docker is a security risk. Is it possible to install securely?

partofthevoice@lemmy.zip on 22 Nov 21:18 next collapse

Docker is a security risk? … excuse me, what? Can’t you just, idunno, secure the environment that docker runs in? Use rootless images? Use immutable images?

And, are you asking for something that runs on bare metal? Couldn’t you just install the ISO that the dockerfile uses, then convert the dockerfile logic to an sh script?

LordKitsuna@lemmy.world on 22 Nov 22:24 next collapse

In its default state I think that’s fair. For example, Docker bypasses most firewalls because its own iptables rules are processed before the firewall’s. So if you don’t either use 127.0.0.1:port:port (many compose files offered by projects do not do this) or add specialized iptables rules to fix that, you can end up directly exposing services without meaning to or even realizing.

And yeah privilege escalation etc. There are solutions like what you mentioned but it can be a lot of work to set all that up so most people won’t
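The loopback-binding point can be checked mechanically. The sketch below takes port mappings in Compose short syntax (`[IP:][HOST_PORT:]CONTAINER_PORT[/PROTOCOL]`) and reports whether each one is bound to a loopback address only. The function names are made up for illustration, and it deliberately ignores Compose's long (mapping-style) port syntax.

```python
import ipaddress


def host_ip(port_spec: str):
    """Return the host IP of a short-syntax port mapping, or None if none is given."""
    spec = port_spec.split("/")[0]  # drop a trailing "/tcp" or "/udp"
    if spec.startswith("["):        # bracketed IPv6 host, e.g. [::1]:8080:80
        return spec[1:spec.index("]")]
    parts = spec.split(":")
    return parts[0] if len(parts) == 3 else None


def is_loopback_only(port_spec: str) -> bool:
    """True if the mapping is reachable from the host itself only."""
    ip = host_ip(port_spec)
    if ip is None:
        # No host IP given: the port is published on all interfaces.
        return False
    return ipaddress.ip_address(ip).is_loopback


for spec in ["3000:3000", "127.0.0.1:3000:3000", "[::1]:8080:80/tcp", "0.0.0.0:80:80"]:
    print(f"{spec:24} loopback only: {is_loopback_only(spec)}")
```

A mapping that fails this check is fine when the service is meant to be public; the risk described above is the case where it was not meant to be.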

quick_snail@feddit.nl on 22 Nov 23:10 collapse

Docker pull is insecure

It’s the download that’s not verified

Appoxo@lemmy.dbzer0.com on 22 Nov 23:13 next collapse

Download the image manually with something like curl???

quick_snail@feddit.nl on 22 Nov 23:16 collapse

Hahahahahaha good luck.

partofthevoice@lemmy.zip on 23 Nov 00:03 collapse

You can verify the checksum to ensure the contents pulled are exactly the same as what was published. You can also use a private container registry.

How exactly would docker pull be any more insecure than something like pip install? Or, really anything… Let’s go with your preferred alternative, how are you going to get it on your machine in a more secure way than docker provides?

Docker uses TLS with registries, layers and manifests have cryptographic digests, checksums, and you can verify the publisher yourself. Push it into your own registry if you want, or just don’t use latest.

quick_snail@feddit.nl on 23 Nov 00:18 collapse

Yeah, that’s the insecurity I’m talking about.

If you want to know how to implement this properly, look at apt. It’s a known issue in docker; they just haven’t prioritized the fix yet (DCT)

partofthevoice@lemmy.zip on 23 Nov 00:21 collapse

What are you talking about, “yeah that’s the insecurity I’m talking about.”

I didn’t mention an insecurity and neither have you. Would you mind being a little more clear than “Docker pull is insecure?”

Frankly, I was expressing confidence in Docker’s security. It goes without saying though, any user can do insecure things like download from untrusted sources. That’s not Docker’s problem though, it’s the user’s.

Edit: I see now that you added “it’s the download that’s not verified.” Integrity is verified, so I assume you mean authorship (via signing)? I guess you’re saying that, if admin credentials are stolen from a container publisher and the thief force pushes malicious code into the registry under a pre-existing tag—then you would be exposed to that?

Even in that case, though, a digest cannot be overwritten. Tags can. So you’d just pin the digest to avoid this one attack vector?
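The digest-pinning idea comes down to comparing a SHA-256 hash of the content you received against a value you recorded earlier, in the `sha256:<hex>` form that image references such as `name@sha256:...` use. The sketch below shows that comparison on arbitrary bytes; the function name and sample data are made up, and it is not Docker's actual pull code.

```python
import hashlib
import hmac


def matches_pinned_digest(content: bytes, pinned: str) -> bool:
    """Check content against a pinned digest of the form 'sha256:<hex>'."""
    algorithm, _, expected_hex = pinned.partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algorithm!r}")
    actual_hex = hashlib.sha256(content).hexdigest()
    return hmac.compare_digest(actual_hex, expected_hex)


# Made-up stand-in for an image manifest you audited and decided to trust.
manifest = b'{"layers": []}'
pinned = "sha256:" + hashlib.sha256(manifest).hexdigest()

print(matches_pinned_digest(manifest, pinned))                # same bytes as audited
print(matches_pinned_digest(manifest + b" tampered", pinned))  # swapped content
```

This proves the bytes are the ones you pinned, not who published them, which is exactly the integrity-versus-authorship distinction being argued in this subthread.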

quick_snail@feddit.nl on 23 Nov 00:30 collapse

Checksums are not for security. You need signatures. I’m not making claims that aren’t clearly documented.

partofthevoice@lemmy.zip on 23 Nov 00:39 collapse

You’re talking about authorship. Sure. But if you verify the container yourself as secure and pin the digest, what’s the issue?

quick_snail@feddit.nl on 23 Nov 00:48 collapse

What you just described cannot be done. You can’t verify it, because it’s not signed.

partofthevoice@lemmy.zip on 23 Nov 01:05 next collapse

You’re making big claims on security here, like “cannot be done,” and each time you do I feel like we’re talking past each other a bit. I never claimed you can verify that the person who pushed the container had access to a private key file. I claimed you can verify the security of a container, specifically by auditing it and reviewing the publisher’s online presence. Best practices. Don’t upgrade right away, and pin digests to those which can be trusted.

When you pin a digest, you’re not going to get a container some malicious agent force pushed after the fact. You pinned the download to an immutable digest, so hot-swapping the container is out the window. What, as I understand, you’re concerned with is the scenario that (1) a malicious actor compromised the registry login beforehand, (2) you pinned the digest afterward, and (3) the attack is unnoticed by you and everyone else.

I’m trying to figure out under what conditions this would actually occur, and thus justify the claim that docker pull is insecure. In a work setting, I only see this being an issue if the process to test/upgrade existing ones is already an insecure process. Can you help me understand why I should believe that, even with best practices in place, Docker’s own insecurities are unacceptable? Docker is used everywhere and I’m reluctant to believe everyone just doesn’t care about an unmanageable attack vector.

quick_snail@feddit.nl on 23 Nov 01:55 collapse

Dude, just search the github for “docker content trust” and you can read all the issues. I’m not making big claims that aren’t known already by the devs

partofthevoice@lemmy.zip on 23 Nov 02:48 collapse

Again we’re talking past each other. I’m sure those results are available and I’m aware docker doesn’t verify signatures automatically, but I’m asking how that necessarily makes docker insecure in spite of best practices being implemented. It’s about pinning yourself to trusted digests and having a verification process (like time) before updates. Why would you need authorship verification in that case? If there’s a good answer to that, I’d consider alternatives too. I’m just saying I don’t think it’s inherently insecure over this, and at face value It boils back down to the classic: don’t download untrusted software.

partofthevoice@lemmy.zip on 23 Nov 03:21 collapse

I was curious and, yeah, it seems like docker hub not requiring signature means many popular publishers don’t bother to sign. But that’s not to say it can’t be done. For example: github.com/sigstore/cosign

Today, cosign has been tested and works against […] Docker Hub

yessikg@fedia.io on 22 Nov 22:43 collapse

I imagine you can use Podman instead

quick_snail@feddit.nl on 22 Nov 23:14 collapse

I think that has the same problems, no? Or does podman do signature verification on all the layers it downloads from the container registry?

yessikg@fedia.io on 23 Nov 13:26 collapse

Podman runs rootless by default

quick_snail@feddit.nl on 23 Nov 14:52 collapse

You didn’t read what I wrote. The security problem is how it downloads layers. It doesn’t verify them.

yessikg@fedia.io on 23 Nov 15:28 collapse

Ah, I don't know enough about that to give you an answer

quick_snail@feddit.nl on 22 Nov 20:50 next collapse

What are the advantages over awstats?

danhab99@programming.dev on 22 Nov 21:32 next collapse

The same advantages as all free and open source solutions: it’s free and open source. That means how much it’s going to cost your business is directly under your control. You can make a decision on how you acquire hardware based on your business’s needs. If you want to add or change features you can decide how to do that based on the deals you have with your programmers (like pick the developer you have with the best skills and the lowest cost), and then you get to control how much it costs you and how reliable the result is going to be.

If you feel like the support you get from customer service from Amazon or Google or Microsoft is reliable enough and you don’t need more reliability then go ahead and stick with paid products. But if you already have a team of really expensive and talented engineers you might as well let them solve problems with free and open source equipment.

Goldflag@lemmy.world on 22 Nov 22:32 collapse

I think he’s referring to en.wikipedia.org/wiki/AWStats

quick_snail@feddit.nl on 22 Nov 23:12 collapse

Yes. It predates aws lol

Goldflag@lemmy.world on 22 Nov 22:32 collapse

From what I know, awstats gets analytics from server-side logs while Rybbit uses a client-side script. So not really an apples-to-apples comparison.

rekabis@lemmy.ca on 23 Nov 01:33 next collapse

OpenBSD does not have a docker engine. Can this be installed without docker?

Vex_Detrause@lemmy.ca on 23 Nov 04:34 next collapse

Is it “reebit”? OR “Raybit” ?

Goldflag@lemmy.world on 23 Nov 04:45 collapse

it’s “ribbit”

Fmstrat@lemmy.world on 23 Nov 12:23 next collapse

I’ve been using Plausible for a long time, will definitely be checking this out.

Fmstrat@lemmy.world on 23 Nov 12:26 next collapse

Are there any plans for a data migration feature from Plausible?

Goldflag@lemmy.world on 23 Nov 15:09 next collapse

We have data migration plans in the works but it doesn’t appear that a Plausible migration is possible

farcaller@fstab.sh on 25 Nov 09:40 collapse

I don’t think that’s plausible.

vegyk0z6@lemmy.ml on 23 Nov 14:19 next collapse

This looks great. I’m interested in building similar dashboards but for a different use case. Are you using a particular typescript framework for this?

Goldflag@lemmy.world on 23 Nov 15:09 collapse

Next.js, TailwindCSS, shadcn. the usual stuff

moseschrute@lemmy.world on 23 Nov 17:28 collapse

I know modern tools get a lot of hate on Lemmy, but Tailwind and shadcn have been amazing to work with. Next.js has been a little bumpy the last few years, but if you know what you’re doing, you can deliver a great UX with React. I’ve been enjoying Vite + React for anything that doesn’t need SSR.

EarMaster@lemmy.world on 24 Nov 10:14 collapse

Just a word of warning for everyone: The free self hosted version is heavily limited. I will stick with Plausible which may be simpler but also doesn’t want to push me into a subscription.

Goldflag@lemmy.world on 24 Nov 15:25 collapse

as opposed to plausible community edition which is even more limited and is only updated a couple times a year?

EarMaster@lemmy.world on 24 Nov 16:09 collapse

The community edition allows me to have multiple sites, multiple users and is way easier to set up. If I ever need additional features like funnels I would need a subscription for both - Plausible is less expensive.