Does anyone self host Kiwix (offline wikipedia)?
from SuspciousCarrot78@lemmy.world to selfhosted@lemmy.world on 28 Apr 17:48
https://lemmy.world/post/46173167

[deleted by user]

#selfhosted

comrade_twisty@feddit.org on 28 Apr 17:54

I do, on my TrueNAS in a Docker container. I have about 1TB of ZIM files hosted, including pre-LLM copies of the German, English and French Wikipedia as well as the last two current versions in these languages.

Additionally, I have Project Gutenberg books in German and English as well as lots of random technical, medical, survival, etc. stuff that I came across. A lot of that is trash, but sorting is too time-consuming and my NAS has 48TB, so who cares…

[deleted] on 28 Apr 18:01

.

comrade_twisty@feddit.org on 28 Apr 18:04

Yep exactly. Also you can have other people (friends/family) have access via VPN, Tailscale, etc.

IcedRaktajino@startrek.website on 28 Apr 18:03

Yep, and I love it.

I’ve got a little Banana Pi M4 Zero (Pi Zero form factor but much more powerful and with 4 GB RAM) loaded up with, among other useful tools, Kiwix and the full Wikipedia dump. I just refreshed it with the 2026-02 full dump, so I’m caught up for the year. I’ve also got a lot of other offline docs loaded up (React, Bun, and the devdocs for several libraries I use), and it’s nice to have local copies of those instead of googling every time.

Surprisingly, the full ~130 GB Wikipedia dump works fine on a regular Pi Zero 2 with 512 MB RAM. I don’t know how ZIM works but it does work very very well.

[deleted] on 28 Apr 18:11

.

IcedRaktajino@startrek.website on 28 Apr 18:17

> 130GB for the entire thing? And the Pi doesn’t choke on indexing / searching it?

That was my thought. I knew it couldn’t hold it in RAM but thought it would be doing crazy IO and limited by being on SD, but it seems to not be a problem. Like I said, I don’t know how ZIM does it, but it does it well. Must have some kind of index that lets it fast travel to the correct blocks or something. I dunno lol.
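
For anyone curious about the "fast travel" idea, here is a minimal sketch of how a sorted index plus compressed blocks lets a reader fetch one article without holding the archive in RAM. As I understand it, real ZIM files work along similar lines (sorted directory entries pointing into compressed clusters), but the layout, `build_archive`, and `lookup` below are made up for illustration and are not the actual ZIM format or the Kiwix API.

```python
import bisect
import io
import struct
import zlib


def build_archive(articles, cluster_size=2):
    """Pack {title: text} into compressed clusters plus a sorted title index.

    Toy layout only (an assumption for illustration, not the ZIM spec).
    """
    keys = sorted(articles)
    locations = []  # parallel to keys: (cluster_offset, cluster_length, position)
    blob = io.BytesIO()
    for start in range(0, len(keys), cluster_size):
        group = keys[start:start + cluster_size]
        # Length-prefix each article so it can be found after decompression.
        raw = b"".join(
            struct.pack("<I", len(articles[t].encode())) + articles[t].encode()
            for t in group
        )
        packed = zlib.compress(raw)
        offset = blob.tell()
        blob.write(packed)
        for position in range(len(group)):
            locations.append((offset, len(packed), position))
    return blob.getvalue(), keys, locations


def lookup(data, keys, locations, title):
    """Binary-search the titles, then decompress only the cluster that is needed."""
    i = bisect.bisect_left(keys, title)
    if i == len(keys) or keys[i] != title:
        return None
    offset, length, position = locations[i]
    raw = zlib.decompress(data[offset:offset + length])
    cursor = 0
    for _ in range(position):  # skip the earlier articles in this cluster
        (size,) = struct.unpack_from("<I", raw, cursor)
        cursor += 4 + size
    (size,) = struct.unpack_from("<I", raw, cursor)
    return raw[cursor + 4:cursor + 4 + size].decode()
```

With the data on disk instead of in a `bytes` object, the same lookup becomes a couple of seeks and one small decompression, which would explain why a 512 MB board copes.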

> how capable is the search engine (I assume it has one?)

Yep, it has search. It’s…okay but kind of primitive. It’s not slow, and if you’re searching for something that’s fairly unique (as far as keywords go), it does well. But if you’re searching something like an acronym where it shows up as a regular word in other entries, it’s a lot more hit or miss.

clif@lemmy.world on 29 Apr 11:50

Similar setup here. Orange Pi Zero that starts the Kiwix server at boot and switches the WiFi to AP mode. Just plug it in, connect to the Kiwix WiFi, access kiwix.local via phone browser, and shazam.
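
A rough sketch of the "start the server at boot" half, in case it helps someone: a small launcher that builds a `kiwix-serve` command line from whatever ZIM files are present. The `/srv/zim` path, the port, and the helper names are assumptions for illustration. The AP-mode switch and the kiwix.local name resolution are separate pieces and not shown here.

```python
import subprocess
from pathlib import Path


def kiwix_serve_command(zim_paths, port=8080):
    """Build the argv list for kiwix-serve from an iterable of file paths.

    Only paths ending in .zim are kept; the result is sorted so the
    command is the same on every boot.
    """
    zims = sorted(str(p) for p in zim_paths if Path(p).suffix == ".zim")
    if not zims:
        raise ValueError("no .zim files to serve")
    return ["kiwix-serve", f"--port={port}", *zims]


def main(zim_dir="/srv/zim"):
    """Serve every ZIM file found in zim_dir (hypothetical default path)."""
    command = kiwix_serve_command(Path(zim_dir).iterdir())
    # Blocks until kiwix-serve exits; raises if it exits with an error.
    subprocess.run(command, check=True)
```

Calling `main()` from a boot-time service would do the rest. Serving on port 80 so no port is needed in the URL usually requires extra privileges.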

IcedRaktajino@startrek.website on 29 Apr 11:55

Nice! Those Allwinner boards are a little tricky to get going and have some quirks, but the price is great for the extra horsepower you get. Granted, I use the latest Armbian since the manufacturer’s images are all quite old.

clif@lemmy.world on 29 Apr 15:35

When I saw that the default configured repos were hosted by Huawei I did a double take, then installed Armbian too :D

surfrock66@lemmy.world on 28 Apr 18:31

Yes, and I actually use it to train a local LLM so I’m not hammering the internet. I have a ton of storage and like to keep my kids in the sandbox, so we have Wikipedia, Project Gutenberg, Khan Academy, and a bunch of others, all hosted behind an Apache reverse proxy that uses mellon, so there’s LDAP auth.

[deleted] on 28 Apr 18:40

.

surfrock66@lemmy.world on 28 Apr 18:49

I also try to participate in some of the farms, running zimit and mwoffliner to help make more archives. Feels like I’m helping.

domi@lemmy.secnd.me on 29 Apr 10:11

Do you actually train the LLM or use RAG? I have been looking for a local LLM + Wikipedia RAG solution for a while now.

For now I just have kiwix-serve + searxng doing a simple search but the Kiwix search is…questionable.
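
To make the train-vs-RAG distinction concrete: RAG leaves the model weights alone and pastes retrieved text into the prompt at question time. Below is a minimal sketch of that flow. The word-overlap scoring is a deliberately crude stand-in for a real retriever (embeddings, or a search backend), and `tokenize`, `retrieve`, and `build_prompt` are hypothetical helper names, not part of Kiwix or any LLM library.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, passages, k=2):
    """Rank passages by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        passages,
        key=lambda passage: len(question_words & tokenize(passage)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Assemble the text that would be sent to the local LLM."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In a real setup the `passages` would come from article text pulled out of the ZIM files, and the prompt would go to whatever local model is running.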

[deleted] on 29 Apr 12:29

.

surfrock66@lemmy.world on 29 Apr 14:49

So this is actively in progress, and right now I’m having trouble getting my Tesla P4s working in my Proxmox environment. The P4 is supported for vGPU out of the box, allegedly, but the installer I used forces a kernel version pin, which isn’t making me happy:

github.com/anomixer/proxmox-vgpu-installer/…/16

So at this time, I’m just connecting APIs.

[deleted] on 29 Apr 20:09

.

shadybraden@programming.dev on 28 Apr 18:42

Yup! Here’s my setup:

github.com/shadybraden/compose/…/compose.yaml

skip0110@lemmy.zip on 28 Apr 19:47

Yes, I self-host the English Wikipedia dump, as well as a few cooking sites and topic-specific Stack Exchange dumps available in ZIM format.

My goals are:

  • reduce dependence on public internet. In the event of an outage or restriction I’d like some books and other content I can use to entertain myself
  • locally preserve a snapshot of information before it is possibly diluted by LLM edits

Archer@lemmy.world on 28 Apr 20:11

Is there an actual download link? They want $20 for the Raspberry Pi image

vegetaaaaaaa@lemmy.world on 28 Apr 21:33

panda_abyss@lemmy.ca on 28 Apr 21:22

Yes, it’s helpful

shems@piefed.social on 28 Apr 20:38

I switched to an N150 some time ago, but I previously had it running perfectly on a Pi 4 with only 2GB of RAM. There’s actually a lot more content available than just Wikipedia! You can even archive your own websites using https://zimit.kiwix.org

It’s fun, and Kiwix is impressively lightweight: it uses less than 50 MB of RAM, even with an article loaded.

https://imgur.com/a/DmmqJdh

vegetaaaaaaa@lemmy.world on 28 Apr 21:31

Yes. This is my Ansible role that deploys it.

Silent9218@lemmy.zip on 28 Apr 22:52

I’ve got Kiwix at home and on my iPhone. It rocks.

Decronym@lemmy.decronym.xyz on 29 Apr 12:00

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

Fewer Letters   More Letters
AP              WiFi Access Point
HTTP            Hypertext Transfer Protocol, the Web
NAS             Network-Attached Storage
VPN             Virtual Private Network

4 acronyms in this thread; the most compressed thread commented on today has 6 acronyms.

[Thread #261 for this comm, first seen 29th Apr 2026, 12:00]

DisgruntledGorillaGang@reddthat.com on 29 Apr 15:08

Yes, I host Wikipedia, Wiktionary, and a few other resources. Very convenient, and the full Wikipedia is only like 100 gigs.

bgrayburn@lemmy.world on 29 Apr 20:55

Check out Internet-in-a-Box. It’s easy to set up and update, and you can select data sources, which include Wikipedia among many others. internet-in-a-box.org

sonalder@lemmy.ml on 29 Apr 21:15

The full Wikipedia is saved on my mobile phone thanks to Kiwix