PowerUp
Talend World Bank components

tWorldBank components



Background : The World Bank website (www.worldbank.org/data) publishes a large set of country-related data that anyone can use freely.
A set of indicators is maintained (e.g. GNI per capita, country income levels, energy-related indicators, health indicators, demographics) and made available to third parties via a webservice interface.
No registration is required; in fact, the World Bank encourages the use of this data.
While browsing the data on the website is easy thanks to the provided query and export functionality, integrating it into a Data Warehouse (or other analytical solution) is more complex.

The solution

Talend Data Integration makes it easy to integrate data flows from different sources, so a logical solution was to design specific components that generate data flows from the World Bank webservices.

The data consists of two main sets : dimensional data and facts.
The only "facts" present are the indicator values, which have four basic dimensions : 1) an indicator, 2) time, 3) a geographical territory (country, region, etc.) and 4) a source.
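
As a rough illustration of this fact/dimension split, the sketch below models one indicator value with its four dimension keys. The class name, field names and the sample codes are our own assumptions for illustration; they are not the schema produced by the components.

```java
/** One indicator value (a "fact") together with its four dimension keys. */
final class IndicatorFact {
    final String indicatorCode; // 1) indicator (sample code, for illustration only)
    final String period;        // 2) time, kept as text because granularity can vary
    final String territoryCode; // 3) country, region or other aggregate
    final String sourceId;      // 4) source
    final Double value;         // the measure; null when no value is available

    IndicatorFact(String indicatorCode, String period, String territoryCode,
                  String sourceId, Double value) {
        this.indicatorCode = indicatorCode;
        this.period = period;
        this.territoryCode = territoryCode;
        this.sourceId = sourceId;
        this.value = value;
    }

    /** Composite key built from the four dimensions; it identifies one cell of the fact table. */
    String dimensionKey() {
        return String.join("|", indicatorCode, period, territoryCode, sourceId);
    }

    public static void main(String[] args) {
        // Placeholder value and source id, not real World Bank data.
        IndicatorFact fact = new IndicatorFact("SP.POP.TOTL", "2010", "BRA", "SRC1", 1.0);
        System.out.println(fact.dimensionKey());
    }
}
```

In a warehouse, each of the four keys would normally point to its own dimension table, with the value stored in the fact table.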



Current version

Version : 0.4
Release Date : Aug 3 2011
Status : Beta

Four different components are provided to access this data :

Component : Related Data

tWorldBankLookup : Can be used to retrieve Topics, Regions, Sources, Income Levels and Lending Types.
A property selects the specific lookup to load. The same output schema is used for all the lookups, which is not the case in the WB webservices.

tWorldBankCountry : The Country table provides an interesting set of information. Besides the ISO codes (2 letters and 3 letters), it provides the lending type and current income level, along with the capital city name and its longitude/latitude.
Note that this table contains not only ISO countries but also different types of aggregates, for which some indicator values are also provided. Non-ISO entries can be recognized by the Admin Region being "NA".

tWorldBankIndicator : The indicator component retrieves the list of available indicators, with their topics (in theory the field could contain more than one value, comma separated) and the source.

tWorldBankValue : This component retrieves the values associated with the indicators. Note that different time aggregations may be present and may vary between indicators. Not all countries have all the values, so take extra care when performing aggregations; in general it is advisable to use the pre-aggregated data whenever possible.
Also, the World Bank webservices send sets of null data when values are not available (this cannot be filtered out on the webservice itself). For data warehousing purposes we strongly suggest filtering these out in the output of the component.
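
The two cautions above (null values, and aggregates mixed in with ISO countries) can be sketched in plain Java as follows. This is only a minimal sketch: the ValueRow shape, the method names and the sample codes are hypothetical, the values are placeholders, and in a real job the same conditions would more likely sit in a filter step placed right after the component.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ValueRowFilter {
    /** Hypothetical row shape; the real output schema may use other column names. */
    static final class ValueRow {
        final String territoryCode;
        final Double value;

        ValueRow(String territoryCode, Double value) {
            this.territoryCode = territoryCode;
            this.value = value;
        }
    }

    /** Keeps only the rows that carry an actual value. */
    static List<ValueRow> withoutNulls(List<ValueRow> rows) {
        List<ValueRow> kept = new ArrayList<>();
        for (ValueRow row : rows) {
            if (row.value != null) {
                kept.add(row);
            }
        }
        return kept;
    }

    /** Keeps rows whose territory has a known Admin Region other than "NA". */
    static List<ValueRow> isoCountriesOnly(List<ValueRow> rows,
                                           Map<String, String> adminRegionByCode) {
        List<ValueRow> kept = new ArrayList<>();
        for (ValueRow row : rows) {
            String region = adminRegionByCode.get(row.territoryCode);
            if (region != null && !"NA".equals(region)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Placeholder codes and values, not real World Bank data.
        List<ValueRow> rows = Arrays.asList(
                new ValueRow("BRA", 1.0),
                new ValueRow("BRA", null),
                new ValueRow("WLD", 2.0));
        Map<String, String> adminRegionByCode = new HashMap<>();
        adminRegionByCode.put("BRA", "LCN");
        adminRegionByCode.put("WLD", "NA");

        List<ValueRow> clean = isoCountriesOnly(withoutNulls(rows), adminRegionByCode);
        System.out.println(clean.size() + " row(s) kept");
    }
}
```

The admin region lookup is assumed to have been loaded beforehand from the Country table. Keeping the two filters separate makes it possible to load the pre-aggregated rows into their own table instead of discarding them.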




The properties available in the components correspond to filters that can be applied to the extractions (e.g. it is possible to extract specific indicators for specific countries, regions, etc.; values are comma separated).
These filters are exposed directly by the webservices; you can check the WB website for more details.
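
Since filter values are entered as comma-separated text, it can help to normalize them before use. The helper below is a hypothetical sketch (it is not part of the components or of the bundled library) that trims each entry, drops empty ones and removes duplicates while keeping the original order.

```java
import java.util.ArrayList;
import java.util.List;

final class FilterValues {
    /** Splits a comma-separated property value into a clean, ordered list of entries. */
    static List<String> parse(String property) {
        List<String> values = new ArrayList<>();
        if (property == null) {
            return values; // no filter set
        }
        for (String part : property.split(",")) {
            String entry = part.trim();
            // Skip blanks (e.g. from "A,,B") and entries already seen.
            if (!entry.isEmpty() && !values.contains(entry)) {
                values.add(entry);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Sample country codes, used here only as an illustration.
        System.out.println(parse(" BRA, ITA,,BRA "));
    }
}
```

An empty result can then be treated as "no filter", which keeps the calling code free of special cases.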

Should you need help in using these components or in automating the integration of the World Bank data in your database / data warehouse, feel free to contact us using our feedback module.
The components use a Java library (jar included) that we produced specifically to interface with these webservices. The same library can be freely used outside the components if needed.

Example

A temporary example using the data retrieved with the components can be found here. A (small) selection of indicators is provided, and the data is stored in a MySQL database (about 1.5 million records in the fact table).




Downloads
  • Download components
  • Download sample Job
  • Download Talend Table schemas
  • Download MySQL table creation SQL code



  • License

    THIS SOFTWARE IS PROVIDED BY POWERUP ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL POWERUP BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.