Bug #7
closed
Optimize Wikipedia history fetching
Description
Currently, Wikipedia pageview history has to be requested every day for each stock. This kind of limits the number of stocks there can be. I think it is possible to use the page view dumps from Wikipedia instead, to avoid constantly battering their servers. However, it might overload our host: once decompressed from bz2, the files can reach 3 GB or more, and our host only has around 3 GB of RAM. In theory it would be possible, but with PHP being as awkward as it is, I might have to do some ugly tricks with file cursors to extract the data I need and discard the rest. Maybe extract the file, keep only the data for en.wikipedia to cut down the file size, and then keep only the lines for the articles I am using, shrinking it further.
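A rough sketch of the filtering idea, written in Python only for illustration (the same line-by-line approach should carry over to PHP's stream functions). It reads the compressed dump as a stream, so the full 3 GB file never has to sit in RAM or on disk. The function name `filter_pageviews`, its parameters, and the line layout are all assumptions: each line is taken to be space-separated, with the project in the first field, the article title in the second, and the daily view total in the fifth. Check this against the real dump format before relying on it.

```python
import bz2
from collections import defaultdict


def filter_pageviews(dump_path, wanted_titles, project="en.wikipedia", count_index=4):
    """Sum daily views per wanted title while streaming a bz2 dump.

    Assumed (unverified) line layout, space-separated:
        project title page_id access_type daily_total hourly_counts
    count_index is the position of the daily total in that layout.
    """
    wanted = set(wanted_titles)      # set lookup keeps the per-line check cheap
    totals = defaultdict(int)

    # bz2.open in text mode decompresses incrementally as lines are read,
    # so memory use stays small regardless of the decompressed size.
    with bz2.open(dump_path, "rt", encoding="utf-8", errors="replace") as dump:
        for line in dump:
            fields = line.split()
            if len(fields) <= count_index:
                continue             # malformed or unexpected line
            if fields[0] != project:
                continue             # drop everything that is not en.wikipedia
            title = fields[1]
            if title not in wanted:
                continue             # drop articles we do not track
            try:
                totals[title] += int(fields[count_index])
            except ValueError:
                continue             # count field was not a number
    return dict(totals)


# Hypothetical usage; the file name and titles are placeholders:
# views = filter_pageviews("pageviews-daily.bz2", ["Apple_Inc.", "Tesla,_Inc."])
```

Summing per title means several lines for the same article (for example, one per access type, if the dump splits them that way) collapse into a single daily figure per stock.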