mediawiki - How to obtain a list of titles of all Wikipedia articles
I'd like to obtain a list of the titles of all Wikipedia articles. I know there are two possible ways to get content from a Wikimedia-powered wiki. One is the API, and the other one is a database dump.
I'd prefer not to download the wiki dump. First, because it's huge, and second, because I'm not experienced with querying databases. The problem with the API, on the other hand, is that I couldn't figure out a way to retrieve only a list of the article titles, and even if it is possible, it would need more than 4 million requests, which would probably get me blocked from any further requests anyway. So my question is: 1. whether there is a way to obtain the titles of all Wikipedia articles via the API, and 2. whether there is a way to combine multiple requests/queries into one. Or do I have to download a Wikipedia dump?
The allpages API module allows you to do exactly that. Its limit (when you set aplimit=max) is 500, so to query all 4.5M articles, you would need about 9000 requests.
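As a rough illustration of the API route, here is a minimal Python sketch that pages through list=allpages by feeding the values in each reply's "continue" object back into the next request. The endpoint URL (English Wikipedia), the User-Agent string, and the function names fetch_json and iter_titles are assumptions made for this example, not something stated above.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"  # assumed endpoint (English Wikipedia)


def fetch_json(params):
    """Send one GET request to the API and return the decoded JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # The User-Agent value is a placeholder; replace it with your own details.
    request = urllib.request.Request(
        url, headers={"User-Agent": "title-lister-example/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_titles(fetch=fetch_json, namespace=0):
    """Yield every page title in a namespace, following API continuation."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,  # 0 is the article namespace
        "aplimit": "max",          # up to 500 titles per request
        "format": "json",
    }
    while True:
        data = fetch(params)
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            break  # no continuation means this was the last batch
        # Merge the continuation values into the next request.
        params = {**params, **data["continue"]}
```

Calling `for title in iter_titles(): print(title)` would then issue one request per batch of up to 500 titles. Passing the fetch function as a parameter is only a convenience so that the paging logic can be tried without network access.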
But a dump is the better choice, because there are many different dumps, including all-titles-in-ns0 which, as its name suggests, contains exactly what you want (59 MB of gzipped text).
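Reading the titles dump does not require any database knowledge. The sketch below assumes the file is a gzipped text file with one title per line, that spaces are stored as underscores, and that the first line may be a page_title header; the function name read_titles and the file name used in the usage note are illustrative assumptions.

```python
import gzip


def read_titles(path):
    """Yield titles from a gzipped all-titles file, one per line."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            title = line.rstrip("\n")
            # Skip blank lines and the header line (assumed to be "page_title").
            if not title or title == "page_title":
                continue
            # Assumption: the dump stores spaces as underscores.
            yield title.replace("_", " ")
```

For example, `sum(1 for _ in read_titles("enwiki-latest-all-titles-in-ns0.gz"))` would count the titles without loading the whole file into memory, since the file is streamed line by line.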