The Soccer Extractor

The Italian DBpedia team is pleased to announce the release of a new dataset. Once more, we targeted the soccer player articles of the Italian Wikipedia (about 52,000) and extracted semi-structured data from a Wikipedia template that formats tables.

In two words: Table extraction

Acknowledgments: Federico, Marco, and Simone, Marche Polytechnic University

Code

tsiteam’s Bitbucket

Main Library

JSONpedia
repo

Slides

Here

Numbers

Total triples: 1,462,100
DBPO mapping effort needed: 8 properties

Dataset

Download (ttl.gz)

Query

Italian DBpedia endpoint

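# Try this!
# The prefix is declared explicitly so the query also runs where it is not predefined.
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?player ?team
FROM <http://it.dbpedia.org/soccer-extractor/>
WHERE {
    # Each player links to one node per career station...
    ?player dbpedia-owl:careerStation ?o .
    # ...and each career station links to a team.
    ?o dbpedia-owl:team ?team .
}
LIMIT 100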
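
For readers who prefer to run the query from a script, here is a minimal Python sketch using only the standard library. It assumes the endpoint is reachable at http://it.dbpedia.org/sparql and that it accepts a `format` request parameter to select JSON results, as Virtuoso-based endpoints usually do; neither detail is stated in this post. The helper names (`build_request_url`, `parse_pairs`, `fetch_pairs`) are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public SPARQL endpoint of the Italian DBpedia chapter.
ENDPOINT = "http://it.dbpedia.org/sparql"

# Same query as above, with the prefix declared explicitly.
QUERY = """
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
SELECT ?player ?team
FROM <http://it.dbpedia.org/soccer-extractor/>
WHERE {
    ?player dbpedia-owl:careerStation ?o .
    ?o dbpedia-owl:team ?team .
}
LIMIT 100
"""


def build_request_url(endpoint, query):
    """Encode the query as a GET request URL (SPARQL protocol 'query' parameter)."""
    params = urllib.parse.urlencode({
        "query": query,
        # Assumption: the endpoint honours this parameter for JSON output.
        "format": "application/sparql-results+json",
    })
    return endpoint + "?" + params


def parse_pairs(payload):
    """Turn a SPARQL JSON results document into (player, team) tuples."""
    rows = payload["results"]["bindings"]
    return [(row["player"]["value"], row["team"]["value"]) for row in rows]


def fetch_pairs(endpoint=ENDPOINT, query=QUERY):
    """Send the query over HTTP and return the parsed (player, team) pairs."""
    url = build_request_url(endpoint, query)
    with urllib.request.urlopen(url, timeout=30) as response:
        return parse_pairs(json.load(response))
```

Calling `fetch_pairs()` performs the network request; `parse_pairs` can be tried offline on any document in the standard SPARQL JSON results format.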

Stats

DBPO property      Triples
careerStation      267,684
endYear            267,684
nationalTeam           807
numberOfMatches    216,834
startYear          264,537
team               249,232
youthClub              362
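
As a quick sanity check, the short Python snippet below adds up the per-property counts listed above and compares them with the total given under Numbers. The figures are copied from this post. The post does not say which triples make up the difference, so the snippet only reports it.

```python
# Per-property triple counts, copied from the Stats table.
counts = {
    "careerStation": 267_684,
    "endYear": 267_684,
    "nationalTeam": 807,
    "numberOfMatches": 216_834,
    "startYear": 264_537,
    "team": 249_232,
    "youthClub": 362,
}

# Total number of triples, as stated under Numbers.
stated_total = 1_462_100

listed = sum(counts.values())
not_listed = stated_total - listed

print(f"Triples covered by the table: {listed:,}")      # 1,267,140
print(f"Triples not broken down here: {not_listed:,}")  # 194,960
```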
