Hirikikas - Web scraping: how to extract structured data from a website - Montera34, Santiago Espinosa de los Monteros Lobato | Tabakalera - Donostia / San Sebastián

Web scraping workshop: how to extract structured data from a website

Web scraping is a technique that uses software tools to extract data or information from a web page. It is used to collect unstructured data and convert it into structured data that can later be processed in databases or spreadsheets. The workshop will take a practical approach to web scraping so that attendees can collect and process useful information for their own projects.
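As a rough illustration of turning unstructured page markup into structured data, the sketch below reads an HTML table and returns one dictionary per row. It uses only the Python standard library; the HTML fragment, its column names and its values are invented for this example and are not workshop material.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, invented for illustration only.
HTML = """
<table>
  <tr><th>Neighbourhood</th><th>Listings</th></tr>
  <tr><td>Centro</td><td>120</td></tr>
  <tr><td>Gros</td><td>85</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Return the table as a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = table_to_records(HTML)
print(records)
```

Each record can then be written to a spreadsheet or database. Cell values stay as strings here; converting them to numbers is left to the later processing step.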

The meeting will launch an ongoing line of work on data and data visualisation, led by the Montera34 group. It follows on from the Maps&Data workshops held at Hirikilabs in 2016 and 2017, one result of which was the Report on the Airbnb effect in Donostia and the Basque Country. This new line of work, made up of meetings and workshops, is intended to feed into the DataCommonsLab, a new open group that will work on data on an ongoing basis and meet periodically at Hirikilabs.



February 6, Tuesday

Introduction: Presentation of the activity, establishment of the context and explanation of the aims of the workshop.

Introduction to scraping: Explanation of how the web works (HTML, JSON, APIs, etc.) and introduction to ways of storing the information obtained.

Scraper development: Explanation and hands-on use of basic scraping tools (Postman, Python, Beautiful Soup, etc.).
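To give an idea of the JSON and storage topics on the first day's agenda, here is a minimal sketch that takes a JSON response body, flattens each nested object into a row and writes the rows as CSV. The response body, its field names and the `flatten` helper are hypothetical and were made up for this example. A real API would first be queried with a tool such as Postman or an HTTP client.

```python
import csv
import io
import json

# Hypothetical API response body, invented for illustration only.
RESPONSE_BODY = """
{"results": [
  {"id": 1, "name": "Flat A", "host": {"name": "Ane"}},
  {"id": 2, "name": "Flat B", "host": {"name": "Jon"}}
]}
"""

FIELDS = ["id", "name", "host_name"]


def flatten(item):
    """Turn one nested JSON object into a flat row (hypothetical schema)."""
    return {
        "id": item["id"],
        "name": item["name"],
        "host_name": item["host"]["name"],
    }


def json_to_csv(body):
    """Parse the JSON text and return the rows as CSV text."""
    rows = [flatten(item) for item in json.loads(body)["results"]]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = json_to_csv(RESPONSE_BODY)
print(csv_text)
```

Writing to an in-memory buffer keeps the sketch self-contained. Replacing it with a file opened with `newline=""` would produce a CSV file that a spreadsheet can open.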


February 7, Wednesday

Scraper development: Continuation of the previous day's session.

Introduction to advanced scraping techniques: JavaScript execution, use of proxies, and other issues that arise during the workshop.
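As one possible illustration of the proxy topic, the sketch below configures an HTTP client from the Python standard library to route its requests through a proxy. The proxy address, the user agent string and the `build_proxy_opener` helper are assumptions made for this example. JavaScript execution is not covered here, since it usually requires a browser automation tool.

```python
import urllib.request

# Hypothetical proxy address and user agent, used only for illustration.
PROXY_URL = "http://127.0.0.1:8080"
USER_AGENT = "workshop-scraper/0.1"


def build_proxy_opener(proxy_url, user_agent):
    """Return an opener that sends HTTP and HTTPS requests via proxy_url."""
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    opener = urllib.request.build_opener(proxy_handler)
    # Identify the scraper on every request made through this opener.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


opener = build_proxy_opener(PROXY_URL, USER_AGENT)

# No request is made here. With a proxy actually listening at PROXY_URL,
# a page could be fetched with:
#     html = opener.open("https://example.org/", timeout=10).read()
```

Keeping the proxy settings in one opener means a scraper can switch proxies by building a new opener, without changing the code that fetches pages.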



Web development, data visualisation, and digital art projects
