Thu, Feb 20, 2025

1 PM – 2:15 PM EST (GMT-5)

Details

Web scraping is the process of extracting text and numeric data from a webpage using statistical software such as R. Join this workshop to discover how R makes it easy to extract the data you need from a webpage and store it in a data frame for further analysis. We recommend that attendees arrive with R (Windows, Mac) and RStudio installed on their laptops, along with a basic understanding of how to open R files and run R code.

Learning Objectives:
- Identify and extract HTML data from a webpage with the inspect feature.
- Use R’s rvest package to extract data and store it in a data frame.
- Demonstrate how R’s RSelenium package allows for advanced data extraction using website interactions such as mouse clicks, filling out forms, etc.
- Understand HTML format, including tags, attributes, etc.
- Distinguish between extracting data through web scraping vs APIs. 
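
As a rough illustration of the rvest objective above, the following is a minimal sketch rather than workshop material. It assumes the rvest package is installed, and the inline HTML snippet, the names, and the `staff` data frame are all hypothetical stand-ins for a real webpage. It shows how tags (`li`, `a`, `span`) and attributes (`class`, `href`) map onto CSS selectors, and how the extracted values can be stored in a data frame.

```r
# Assumes rvest is installed: install.packages("rvest")
library(rvest)

# A small inline HTML document stands in for a real webpage (hypothetical data).
page <- minimal_html('
  <ul id="staff">
    <li class="person"><a href="/people/ada">Ada</a> <span class="age">36</span></li>
    <li class="person"><a href="/people/grace">Grace</a> <span class="age">45</span></li>
  </ul>
')

# Select every <li class="person"> element with a CSS selector.
people <- html_elements(page, "li.person")

# For each person, read the link text, the age text, and the href attribute,
# then combine the three vectors into a data frame.
staff <- data.frame(
  name = html_text2(html_element(people, "a")),
  age  = as.numeric(html_text2(html_element(people, "span.age"))),
  link = html_attr(html_element(people, "a"), "href"),
  stringsAsFactors = FALSE
)

print(staff)
```

For a live page, `minimal_html(...)` would typically be swapped for `read_html()` with the page's URL. The selectors themselves are usually found with the browser's inspect feature.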

Prerequisites:
- Install R (Windows, Mac)
- Install RStudio
- Basics of R (open an R file, run code in RStudio)

Instructor: Jacob Grippin

Hosted By

Cornell Center for Social Sciences