If you are a journalist who does not know how to scrape websites, a tool that is about to launch will let you extract data in a user-friendly way and automatically import it into a spreadsheet.
The free online tool called import.io (pronounced import-eye-oh) will let you extract large amounts of data from a web page into an Excel spreadsheet.
For example, you could go to an estate agent's website, find details on houses for sale, and extract the data to a table, choosing the column headers, such as house price and location, that you want to collect data for.
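For readers curious what this kind of extraction looks like when done by hand, here is a minimal Python sketch. It is not import.io's method, which the company has not described. The HTML snippet, the class names `listing`, `price` and `location`, and the function names are all made up for illustration, and a real estate agent's page would be structured differently.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical listing markup; a real estate agent's page will differ.
SAMPLE_HTML = """
<ul>
  <li class="listing"><span class="price">250,000</span><span class="location">Leeds</span></li>
  <li class="listing"><span class="price">410,000</span><span class="location">Bristol</span></li>
</ul>
"""

# The column headers we want to collect data for.
COLUMNS = ["price", "location"]


class ListingParser(HTMLParser):
    """Collects one row per <li class="listing">, keyed by span class."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # row currently being filled, if any
        self._field = None  # column the current text belongs to, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "listing":
            self._row = {}
        elif tag == "span" and self._row is not None and css_class in COLUMNS:
            self._field = css_class

    def handle_data(self, data):
        if self._field is not None:
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Return a list of dicts, one per listing found in the HTML."""
    parser = ListingParser()
    parser.feed(html)
    return parser.rows


def rows_to_csv(rows):
    """Render the rows as CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Calling `rows_to_csv(extract_rows(SAMPLE_HTML))` produces text with a `price,location` header row followed by one line per house, which can be saved as a `.csv` file and opened in Excel.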
The tool will also allow you to aggregate various sources of data. For example, you could extract data on house prices from 10 different estate agent websites, pulling the information into a single table.
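Pulling several sources into one table mostly comes down to mapping each site's own column names onto a shared set of headers. The Python sketch below shows the idea under stated assumptions: the two sample sources, their column names and the `combine` function are hypothetical, and the sketch does not reflect how import.io itself works.

```python
# Hypothetical rows, as if already extracted from two different agents' sites.
AGENT_A = [{"Price": "250,000", "Town": "Leeds"}]
AGENT_B = [{"asking_price": "410000", "location": "Bristol"}]

# Map each source's own column names onto one shared set of headers.
COLUMN_MAPS = {
    "agent_a": {"Price": "price", "Town": "location"},
    "agent_b": {"asking_price": "price", "location": "location"},
}


def normalise_price(text):
    """Turn a price string such as '250,000' into the integer 250000."""
    return int(text.replace(",", "").strip())


def combine(sources):
    """Merge rows from every source into a single list with shared columns."""
    table = []
    for name, rows in sources.items():
        mapping = COLUMN_MAPS[name]
        for row in rows:
            # Rename the columns we know about and drop the rest.
            record = {mapping[key]: value for key, value in row.items() if key in mapping}
            record["price"] = normalise_price(record["price"])
            record["source"] = name  # remember where each row came from
            table.append(record)
    return table
```

Keeping a `source` column is a design choice rather than a requirement. It makes it possible to trace any figure in the combined table back to the site it came from.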
The idea is to "democratise" data, Sally Hadadi from import.io told Journalism.co.uk. "Big data is extremely messy and hard to get a hold of in a simple, easy manner. import.io aims to solve this problem and make big data available to everyone with a simple, easy-to-use interface. We turn the web into a database, allowing you to extract data from websites into rows and columns, normalising the selected information."
She added: "We want journalists to get the best information possible to encourage and enhance unique, powerful pieces of work and generally make their research much easier."
import.io is currently in private developer testing and is set to launch at some point this year.
The tool will be offered free of charge, with import.io looking to monetise by charging those who pull in high volumes of data.
The London-based team that created import.io first built a tool aimed at banks, which allows for the searching and analysis of online and internal data.
Chris Alexander, a developer from import.io, gave a lightning pitch at last week's Hacks/Hackers London.
Disclaimer: I help organise Hacks/Hackers London monthly meet-ups.