Wikipedia4epub is a command-line application that creates an offline ebook from Wikipedia articles.
It does not provide a complete offline Wikipedia, only the selected articles:
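articles found in your Firefox profile (wiki4e-mkepub-firefox)
a starting article together with its child articles (wiki4e-mkepub-subtree)

See the example ebook built from the EPUB article using wiki4e-mkepub-subtree.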
A simple diagram of how the wiki4e-mkepub-firefox command works:
The easiest way to install it is from HackageDB; GHC 6.12.1 or newer is required.
$ cabal install wikipedia4epub
Please be aware that this is still alpha-quality software.
The following commands are available:
wiki4e-mkepub-firefox [<name of ebook>]
wiki4e-mkepub-cache [<name of ebook>]
wiki4e-mkepub-subtree http://en.wikipedia.org/<article>

$ wiki4e-mkepub-firefox Wikipedia123
Please close your Firefox if you see this message longer than 5 seconds...
Going to connect on Firefox SQLite DB: /home/dixie/.mozilla/eclipse/places.sqlite
Going to connect on Firefox SQLite DB: /home/dixie/.mozilla/firefox/2zox86mc.default/places.sqlite
# STAGE 1/4 - Download Articles...
# STAGE 2/4 - Sanitize Articles...
# STAGE 3/4 - Download Images...
# STAGE 4/4 - Constructing EPUB...
Wikipedia123.epub constructed.
Done.
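
The log above shows the command opening Firefox's places.sqlite database. How wiki4e-mkepub-firefox picks articles from it is not described here, so the following is only a minimal Python sketch of one way to read Wikipedia article URLs from such a database. It assumes the standard Firefox moz_places table with its url column; the function name wikipedia_urls is hypothetical and not part of wikipedia4epub.

```python
import sqlite3
from pathlib import Path


def wikipedia_urls(places_db):
    """Return the distinct English Wikipedia article URLs stored in a places.sqlite file."""
    # Open read-only so the browser profile is never modified.
    uri = f"file:{Path(places_db).as_posix()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT DISTINCT url FROM moz_places "
            "WHERE url LIKE 'http%://en.wikipedia.org/wiki/%' "
            "ORDER BY url"
        ).fetchall()
    finally:
        conn.close()
    # Each row is a 1-tuple; unpack it into a plain list of strings.
    return [url for (url,) in rows]
```

Pointing the function at a profile database such as the places.sqlite path printed in the log returns a sorted list of URLs. Firefox can keep the database locked while it is running, which matches the tool's request to close the browser first.
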
$ wiki4e-mkepub-subtree http://en.wikipedia.org/wiki/EPUB
# STAGE 1/4 - Fetch starting article: http://en.wikipedia.org/wiki/EPUB
[1/1] Already cached. Skipping download. /home/dixie/.wiki4e/wiki4e_fetch/EPUB
# STAGE 2/4 - Fetch children articles: 113
[1/113] Already cached. Skipping download. /home/dixie/.wiki4e/wiki4e_fetch/EPUB
[2/113] Fetching : http://en.wikipedia.org/wiki/Filename_extension
...
[11/113] Fetching : http://en.wikipedia.org/wiki/DTBook
[12/113] Fetching : http://en.wikipedia.org/wiki/Website
...
[113/113] Already cached. Skipping download. /home/dixie/.wiki4e/wiki4e_fetch/Main_Page
# STAGE 3/4 - Sanitize articles
# STAGE 4/4 - Download images
Count = 352
[1/352] Already cached. Skipping download. /home/dixie/.wiki4e/wiki4e_images/100px-EBookreal.jpg
[2/352] Already cached. Skipping download. /home/dixie/.wiki4e/wiki4e_images/50px-Question_book-new.svg.png
...
Done.
For performance and debugging reasons, every downloaded article and image is cached. The cache has to be cleaned up manually.
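
Manual cleanup amounts to deleting the cache directories. A minimal sketch in Python, assuming the cache lives in the three directories under $HOME/.wiki4e that the tool creates (wiki4e_fetch, wiki4e_sanitize, wiki4e_images); the helper name clear_cache is hypothetical and not part of wikipedia4epub.

```python
import shutil
from pathlib import Path

# Cache directory names as used by the tool under $HOME/.wiki4e.
CACHE_SUBDIRS = ("wiki4e_fetch", "wiki4e_sanitize", "wiki4e_images")


def clear_cache(home=None):
    """Delete the wiki4e cache directories and return the paths that were removed."""
    root = Path(home) if home is not None else Path.home()
    removed = []
    for name in CACHE_SUBDIRS:
        path = root / ".wiki4e" / name
        # Skip directories that were never created.
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Calling clear_cache() with no argument works on the current user's home directory. After that, the next run has to download every article and image again.
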
The following directories are created for caching:
$HOME/.wiki4e/wiki4e_fetch/
$HOME/.wiki4e/wiki4e_sanitize/
$HOME/.wiki4e/wiki4e_images/

The quality of the screenshots is not very good since they were taken with a mobile phone.

Book Listing. First is Wikipedia from Firefox

Table of contents. Each chapter is one article

Article Content, Top

Article Content, Scrolled
Darcs repositories:
To report a bug or ask a question, please write me an email to
EPUB is a standardized, open format for ebooks; basically it is a ZIP archive of XHTML pages and images with some metadata. For details, see the EPUB article on Wikipedia.
It is supported by my ebook reader (PRS-505) and also by my new ebook reader, the HanLin V5.
Calibre, the "full blown" open source software for ebook management, supports EPUB too.
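
Because an EPUB is basically a ZIP archive, its contents can be inspected with ordinary ZIP tooling. A minimal Python sketch using the standard zipfile module; the function name epub_summary is hypothetical, and the only format detail assumed is that an EPUB carries a file named mimetype containing application/epub+zip.

```python
import zipfile


def epub_summary(path):
    """Return (mimetype, file names) for the EPUB at `path`."""
    with zipfile.ZipFile(path) as book:
        names = book.namelist()
        mimetype = None
        # The EPUB container declares its type in a plain-text "mimetype" entry.
        if "mimetype" in names:
            mimetype = book.read("mimetype").decode("ascii").strip()
    return mimetype, names
```

Running it on a generated file such as Wikipedia123.epub gives the declared media type and the list of XHTML pages, images and metadata files packed into the book.
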