Binary Data Files

Here, I will regularly publish precompiled .bin files for zelph that you can load and use directly. These files contain prepared semantic networks, mainly based on Wikidata data, but also on other domains. The focus is on efficiency: compared with JSON files, which can take hours to read, .bin files load in just a few minutes, depending on the hardware.

I plan to upload new .bin files regularly based on current Wikidata dumps (see Wikidata Dumps for transparency), and also to provide files for other data sources in the future.

Available Files

All .bin files are available on Hugging Face.

Currently, I offer the following Wikidata variants:

File	Variant	Nodes	File Size	RAM Usage	Name Entries (`wikidata` / `en`)	Load Time
`wikidata-20260309-all.bin`	Current full Wikidata dump for high-memory systems	983,424,620	82 GiB	223.7 GiB	119,231,266 / 83,261,799	23m 23s
`wikidata-20260309-all-pruned.bin`	Current pruned Wikidata dump optimised for substantially lower memory requirements	74,608,727	5.6 GiB	15.4 GiB	13,610,498 / 6,778,692	58.7s
`wikidata-20171227.bin`	Historic full Wikidata dump from 2017	203,190,311	18 GiB	44.6 GiB	42,187,613 / 27,960,315	3m 20s
`wikidata-20171227-pruned.bin`	Historic pruned Wikidata dump from 2017 with reduced memory requirements	17,407,259	1.4 GiB	3.8 GiB	4,307,749 / 2,324,957	14.0s

The values above reflect observed loading statistics from zelph on my system. Actual loading times and memory usage may vary depending on hardware and build configuration.

Using the Files

To load a .bin file in zelph, start zelph in interactive mode and use the command:

.load /path/to/file.bin

This loads the network directly into memory. Afterwards, you can execute queries, define rules, or start inference (e.g. with .run). For Wikidata-specific work, first load the script wikidata.zph (see Wikidata Integration) and adjust the language:

.import sample_scripts/wikidata.zph
.lang wikidata

Tip: if you work with the full JSON file, zelph automatically creates a .bin cache file during the first import to speed up future runs.

Generation of the Pruned Files

The pruned versions mentioned above were created by systematically pruning (removing) large knowledge domains from the corresponding full Wikidata dumps. The goal was to reduce biological, chemical, astronomical, and geographical domains in order to lower the memory requirement without losing the core data. The process involved loading the data, targeted removal of nodes and facts based on instance (P31) and subclass relationships (P279), and cleanup operations. For details, please refer to the corresponding log files, see https://github.com/acrion/zelph/tree/main/logs