README File content

This README describes the PWKB database, which is presented in the AAAI 2016 paper: pwkb.pdf, pwkb-slides.pdf

PWKB contains part-whole knowledge for three relations (physicalPartOf, memberOf, substanceOf).
The zipped download (90 MB; 220 MB unzipped) contains the physicalPartOf, memberOf, and substanceOf relations: webchild-pwkb-data.


pwkb_p.txt

This file contains the physicalPartOf relation in the following format:
part synset-id    part synset-word    relation    whole synset-id    whole synset-word    isVisual    cardinality    score
e.g.
104357121    sunroof    physicalPartOf    102958343    car    true    1    1.0
can be understood as:
a car has one sunroof, the sunroof is visible, and the assertion has a confidence score of 1.0. The entries are mapped to WordNet synsets; e.g., 102958343 is the synset-id of the WordNet entry for the relevant sense of the word car.
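
The record layout above can be read programmatically. The following is a minimal sketch, not part of the PWKB distribution: it assumes the eight fields are tab-separated (the delimiter is an assumption; adjust sep if the file uses something else), and the names PartOfRecord and parse_physical_part_of are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartOfRecord:
    part_id: str
    part_word: str
    relation: str
    whole_id: str
    whole_word: str
    is_visual: bool
    cardinality: str  # kept as text; it may not always be a number
    score: float


def parse_physical_part_of(line: str, sep: str = "\t") -> PartOfRecord:
    """Parse one pwkb_p.txt line. ASSUMPTION: fields are separated by `sep` (tab)."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    part_id, part_word, relation, whole_id, whole_word, is_visual, cardinality, score = fields
    return PartOfRecord(
        part_id=part_id,
        part_word=part_word,
        relation=relation,
        whole_id=whole_id,
        whole_word=whole_word,
        is_visual=is_visual.strip().lower() == "true",
        cardinality=cardinality,
        score=float(score),
    )


# The example record from this README, with tabs between fields.
record = parse_physical_part_of(
    "104357121\tsunroof\tphysicalPartOf\t102958343\tcar\ttrue\t1\t1.0"
)
print(record.part_word, "is part of", record.whole_word, "with score", record.score)
```

Synset-ids are kept as strings so that they can be matched verbatim against noun.gloss.txt.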

Note that some words are highly ambiguous and their senses can be quite diverse, so it is important to refer to the gloss corresponding to each synset-id.
For example, in the triple [105449268 (blood cell) physicalPartOf 105400860 (A)], A refers to the seventh sense of the noun A, which denotes a blood group.

The details of these synset-ids can be looked up at: noun.gloss.txt
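
When cross-referencing with other WordNet tools, it may help to split a synset-id into a part-of-speech digit and an offset. The sketch below assumes that PWKB ids follow the nine-digit convention of the WordNet Prolog files (first digit 1 = noun, 2 = verb, 3 = adjective, 4 = adverb, followed by an eight-digit synset offset). This README does not state that convention, so treat it as an assumption; split_synset_id is a hypothetical helper.

```python
from typing import Tuple

# ASSUMPTION: ids follow the WordNet Prolog synset_id layout (POS digit + 8-digit offset).
POS_BY_DIGIT = {"1": "n", "2": "v", "3": "a", "4": "r"}


def split_synset_id(synset_id: str) -> Tuple[str, str]:
    """Return (pos, offset) for a nine-digit synset-id such as '102958343'."""
    if len(synset_id) != 9 or not synset_id.isdigit():
        raise ValueError(f"not a nine-digit synset-id: {synset_id!r}")
    pos = POS_BY_DIGIT.get(synset_id[0])
    if pos is None:
        raise ValueError(f"unknown part-of-speech digit in {synset_id!r}")
    return pos, synset_id[1:]


# The synset-id given for "car" in the example above.
pos, offset = split_synset_id("102958343")
print(pos, offset)
```

The gloss in noun.gloss.txt remains the authoritative description of each sense.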


pwkb_m.txt

This file contains the memberOf relation in the following format:
part synset-id    part synset-word    relation    whole synset-id    whole synset-word    cardinality    score
e.g.
110340312    musician    memberOf    108247021    duet    2    1.0
can be understood as:
a duet has 2 member musicians, and the assertion has a confidence score of 1.0. The entries are mapped to WordNet synsets; e.g., 108247021 is the synset-id of the WordNet entry for the relevant sense of the word duet.
The details of these synset-ids can be looked up at: noun.gloss.txt
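
As an illustration of how the memberOf records might be used, the sketch below groups members by their whole and keeps only assertions at or above a score threshold. It assumes seven tab-separated fields per line (the delimiter is an assumption); group_members_by_whole and min_score are hypothetical names, and the sample input is the single example record from this README.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Member = Tuple[str, str, str, float]  # (part_id, part_word, cardinality, score)


def group_members_by_whole(
    lines: Iterable[str], sep: str = "\t", min_score: float = 0.0
) -> Dict[Tuple[str, str], List[Member]]:
    """Group memberOf records by (whole_id, whole_word). ASSUMPTION: tab-separated fields."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split(sep)
        if len(fields) != 7:
            continue  # skip lines that do not match the expected layout
        part_id, part_word, _relation, whole_id, whole_word, cardinality, score = fields
        if float(score) >= min_score:
            groups[(whole_id, whole_word)].append(
                (part_id, part_word, cardinality, float(score))
            )
    return dict(groups)


# With the real data, pass an open file instead: open("pwkb_m.txt", encoding="utf-8")
sample = ["110340312\tmusician\tmemberOf\t108247021\tduet\t2\t1.0\n"]
groups = group_members_by_whole(sample, min_score=0.5)
print(groups)
```

Grouping on the synset-id as well as the word keeps different senses of the same word apart.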


pwkb_s.txt

This file contains the substanceOf relation in the following format:
part synset-id    part synset-word    relation    whole synset-id    whole synset-word    cardinality    score
e.g.
114802450    steel    substanceOf    102863750    boiler    uncountable    0.8
can be understood as:
a boiler is made of steel, the amount of steel is uncountable, and the assertion has a confidence score of 0.8. The entries are mapped to WordNet synsets; e.g., 102863750 is the synset-id of the WordNet entry for the relevant sense of the word boiler.
The details of these synset-ids can be looked up at: noun.gloss.txt
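
Because the cardinality field here can hold the word uncountable rather than a number, a parser has to handle both cases. The sketch below is one possible approach under stated assumptions: fields are tab-separated, and any cardinality other than uncountable is an integer. SubstanceRecord, parse_cardinality, and parse_substance_of are hypothetical names.

```python
from typing import NamedTuple, Optional


class SubstanceRecord(NamedTuple):
    substance_id: str
    substance_word: str
    relation: str
    whole_id: str
    whole_word: str
    cardinality: Optional[int]  # None when the field reads "uncountable"
    score: float


def parse_cardinality(value: str) -> Optional[int]:
    """ASSUMPTION: the field is either 'uncountable' or an integer."""
    value = value.strip()
    return None if value == "uncountable" else int(value)


def parse_substance_of(line: str, sep: str = "\t") -> SubstanceRecord:
    """Parse one pwkb_s.txt line. ASSUMPTION: fields are separated by `sep` (tab)."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    sub_id, sub_word, relation, whole_id, whole_word, cardinality, score = fields
    return SubstanceRecord(
        sub_id, sub_word, relation, whole_id, whole_word,
        parse_cardinality(cardinality), float(score),
    )


# The example record from this README, with tabs between fields.
record = parse_substance_of(
    "114802450\tsteel\tsubstanceOf\t102863750\tboiler\tuncountable\t0.8"
)
print(record)
```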