How Web Scraping is Used to Explore Indian Restaurants in Canada?

This blog is the result of working on a real dataset as part of the IBM Data Science Professional Certificate Capstone project, and of getting a feel for how data scientists think in practice. The main goals of the project were to define a business problem, search the web for data, and evaluate several districts in Toronto using Foursquare location data to determine which neighborhood is best for starting a new food business. We will work through the project step by step.

Problem Description

Consider an individual who wishes to launch a new Indian restaurant. He is Indo-Canadian and lives in Toronto, Canada’s most populous city. He is unsure whether opening a restaurant is a wise idea and, if it is, in which neighborhood he should open it for it to be profitable.

Advantages

This project will be useful to a range of people:

  • An entrepreneur who wishes to open a new restaurant in the area.
  • Indians who want to relocate to areas with an abundance of Indian eateries and culture.
  • A data analyst or data scientist who wants to analyze the neighborhoods using statistical and exploratory data analysis.

Data Acquisition

The data comes from three sources, each used for a different purpose:

1. List of postal codes from Canada

Wikipedia lists the postal codes beginning with M, which cover the neighborhoods of Toronto.

Link:

https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M
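
To give a feel for the scraping step, here is a minimal sketch that pulls the rows out of the first HTML table on a page using only the Python standard library. It assumes the table has the three columns described later in this post (Postal Code, Borough, Neighborhood); the live Wikipedia page may be laid out differently, and the names `TableParser` and `parse_postal_table` are made up for this illustration. The sample HTML stands in for the downloaded page.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the cell text of every row in the first <table> of a page."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # cells of the row currently being read
        self._cell = None      # text fragments of the cell currently being read
        self._tables_seen = 0  # rows are only collected while this equals 1

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._tables_seen += 1
        elif self._tables_seen == 1 and tag == "tr":
            self._row = []
        elif self._row is not None and tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_postal_table(html):
    """Return one dict per data row, keyed by the header row of the table."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Sample markup standing in for the real page (assumed three-column layout).
SAMPLE_HTML = """
<table>
<tr><th>Postal Code</th><th>Borough</th><th>Neighborhood</th></tr>
<tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
<tr><td>M5A</td><td>Downtown Toronto</td><td>Harbourfront</td></tr>
</table>
"""

records = parse_postal_table(SAMPLE_HTML)
print(records)
```

In a real run, the HTML string would come from downloading the page (for example with `urllib.request.urlopen`) instead of the inline sample.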

2. Geographical Co-ordinates

This CSV file contains the latitude and longitude of the neighborhoods.

Link for CSV:

http://cocl.us/Geospatial_data

3. Fetching Details of the Venue

We will use the Foursquare API to extract the details and location of each venue, and finally use Folium to plot the results on a map. The Foursquare API documentation is available at:

https://developer.foursquare.com/docs

For every venue, we will fetch the following:

  • Name: The name of the venue.
  • Category: The category type defined by the API.
  • Latitude: The latitude of the venue.
  • Longitude: The longitude of the venue.
  • Likes: The number of users who liked the venue.
  • Rating: The rating of the venue.
  • Tips: The tips provided by users.
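
As a rough sketch of how the first four fields could be obtained, the code below builds a request URL for the legacy Foursquare v2 `venues/explore` endpoint and flattens its JSON response. The endpoint, the parameter names, and the response layout are assumptions based on that older API version and may not match the current Foursquare documentation; the helper names and the sample payload are invented for this illustration. No network call is made here.

```python
from urllib.parse import urlencode

# Assumption: legacy v2 endpoint; check the current Foursquare docs before use.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=1000, limit=100):
    """Build the query URL for venues around one coordinate pair."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,           # API version date expected by the v2 API
        "ll": f"{lat},{lng}",   # latitude,longitude
        "radius": radius,       # search radius in meters
        "limit": limit,         # maximum number of venues returned
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


def parse_explore_response(payload):
    """Flatten the (assumed) explore response into one dict per venue."""
    venues = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item["venue"]
            categories = venue.get("categories") or [{}]
            venues.append({
                "id": venue["id"],
                "name": venue["name"],
                "category": categories[0].get("name"),
                "latitude": venue["location"]["lat"],
                "longitude": venue["location"]["lng"],
            })
    return venues


# Hypothetical payload in the assumed response shape.
sample_payload = {
    "response": {"groups": [{"items": [{"venue": {
        "id": "abc123",
        "name": "Example Curry House",
        "location": {"lat": 43.65, "lng": -79.38},
        "categories": [{"name": "Indian Restaurant"}],
    }}]}]}
}

url = build_explore_url(43.65, -79.38, "YOUR_ID", "YOUR_SECRET")
venues = parse_explore_response(sample_payload)
print(url)
print(venues)
```

Likes, rating, and tips are not part of this listing in the assumed API; they would come from a separate per-venue details request.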

Data Cleaning

Cleaning the Postal Code Data

There will be three columns in the data frame: Postal Code, Borough, and Neighborhood.

Only rows with an assigned borough are processed; rows whose borough is not assigned are ignored.

In a single postal code location, more than one neighborhood may exist. For example, M5A is listed twice in the table on the Wikipedia page, and it contains two neighborhoods: Harbourfront and Regent Park. As seen in row 11 of the preceding table, these two rows will be consolidated into one row, with the neighborhoods separated by a comma.

When a cell has a borough but no neighborhood specified, the neighborhood is the same as the borough.
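
The three cleaning rules above can be sketched in plain Python as follows. The marker string "Not assigned" for missing values is an assumption about how the source table labels them, the function name `clean_postal_rows` is made up for this example, and the last sample row is hypothetical.

```python
def clean_postal_rows(rows, missing="Not assigned"):
    """Apply the three cleaning rules to (postal_code, borough, neighborhood) rows."""
    merged = {}  # postal code -> {"borough": str, "neighborhoods": [str, ...]}
    for code, borough, neighborhood in rows:
        # Rule 1: skip rows without an assigned borough.
        if borough == missing:
            continue
        # Rule 3: a missing neighborhood takes the borough's name.
        if neighborhood == missing:
            neighborhood = borough
        # Rule 2: collect all neighborhoods that share one postal code.
        entry = merged.setdefault(code, {"borough": borough, "neighborhoods": []})
        if neighborhood not in entry["neighborhoods"]:
            entry["neighborhoods"].append(neighborhood)
    return [
        {
            "Postal Code": code,
            "Borough": entry["borough"],
            "Neighborhood": ", ".join(entry["neighborhoods"]),
        }
        for code, entry in merged.items()
    ]


raw_rows = [
    ("M1A", "Not assigned", "Not assigned"),
    ("M5A", "Downtown Toronto", "Harbourfront"),
    ("M5A", "Downtown Toronto", "Regent Park"),
    ("M9Z", "Example Borough", "Not assigned"),  # hypothetical row
]

cleaned = clean_postal_rows(raw_rows)
for row in cleaned:
    print(row)
```

With pandas, the same result is usually reached with a filter followed by a group-by on the postal code; the dictionary version is shown here only because it has no dependencies.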

Adding Geographical Co-ordinates

For this, we will use a CSV file that contains the latitude and longitude of the neighborhoods in Canada.

Link for CSV:

http://cocl.us/Geospatial_data
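
Joining the coordinates onto the cleaned postal code records might look like the sketch below. It assumes the CSV has the columns "Postal Code", "Latitude", and "Longitude"; the real file's headers may differ. The coordinate values in the sample are placeholders, not values taken from the file, and `add_coordinates` is a name invented for this example.

```python
import csv
import io


def add_coordinates(records, csv_text):
    """Attach Latitude/Longitude to each record by matching on the postal code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    coords = {
        row["Postal Code"]: (float(row["Latitude"]), float(row["Longitude"]))
        for row in reader
    }
    enriched = []
    for record in records:
        # Records without a match keep None so the gap stays visible.
        lat, lng = coords.get(record["Postal Code"], (None, None))
        enriched.append({**record, "Latitude": lat, "Longitude": lng})
    return enriched


# Placeholder CSV content with the assumed column names.
SAMPLE_CSV = "Postal Code,Latitude,Longitude\nM5A,43.65,-79.36\n"

postal_records = [
    {"Postal Code": "M5A", "Borough": "Downtown Toronto",
     "Neighborhood": "Harbourfront, Regent Park"},
    {"Postal Code": "M9Z", "Borough": "Example Borough",
     "Neighborhood": "Example Borough"},  # hypothetical, absent from the CSV
]

with_coords = add_coordinates(postal_records, SAMPLE_CSV)
for row in with_coords:
    print(row)
```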

We will now only work with boroughs that are part of Toronto.
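
One simple way to do this filtering, assuming that the boroughs of interest are the ones with "Toronto" in their name (an assumption about the borough labels, not something the dataset guarantees), is sketched here. The function name is illustrative.

```python
def toronto_only(records):
    """Keep records whose borough name contains the word 'Toronto'."""
    return [record for record in records if "Toronto" in record["Borough"]]


sample_records = [
    {"Postal Code": "M5A", "Borough": "Downtown Toronto"},
    {"Postal Code": "M1B", "Borough": "Scarborough"},
]

toronto_records = toronto_only(sample_records)
print(toronto_records)
```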

Indian Restaurants in Toronto

Fetching all the Indian restaurants in Toronto
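
A possible shape for this step is to loop over the neighborhoods, look up the venues around each one, and keep only those in the Indian restaurant category. In the sketch below, `fetch_venues` is a stand-in for whatever function performs the Foursquare request, and the category label "Indian Restaurant" is an assumption about how the API names it. The fake lookup returns hypothetical venues so the example runs offline.

```python
def collect_indian_restaurants(neighborhoods, fetch_venues,
                               category="Indian Restaurant"):
    """Return the venues of the given category found near each neighborhood."""
    found = []
    for hood in neighborhoods:
        for venue in fetch_venues(hood["Latitude"], hood["Longitude"]):
            if venue["category"] == category:
                found.append({
                    "Borough": hood["Borough"],
                    "Neighborhood": hood["Neighborhood"],
                    "id": venue["id"],
                    "name": venue["name"],
                })
    return found


def fake_fetch_venues(lat, lng):
    """Offline stand-in for the API call; the venues are hypothetical."""
    return [
        {"id": "v1", "name": "Example Curry House", "category": "Indian Restaurant"},
        {"id": "v2", "name": "Example Coffee Bar", "category": "Coffee Shop"},
    ]


sample_neighborhoods = [
    {"Borough": "Downtown Toronto", "Neighborhood": "Harbourfront, Regent Park",
     "Latitude": 43.65, "Longitude": -79.36},  # placeholder coordinates
]

indian_restaurants = collect_indian_restaurants(sample_neighborhoods, fake_fetch_venues)
print(indian_restaurants)
```

Passing the lookup function in as a parameter keeps the filtering logic testable without API credentials.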

Exploratory Analysis

Let us analyze the number of Indian restaurants present in each borough.

Let us also analyze the number of Indian restaurants present in each neighborhood.
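
Counting restaurants per borough and per neighborhood comes down to a frequency count over one field. The sketch below uses `collections.Counter`; the restaurant list is hypothetical sample data and `count_by` is a name chosen for this illustration.

```python
from collections import Counter


def count_by(restaurants, key):
    """Count how many restaurants share each value of the given field."""
    return Counter(restaurant[key] for restaurant in restaurants)


# Hypothetical sample data.
sample_restaurants = [
    {"name": "A", "Borough": "Downtown Toronto", "Neighborhood": "Harbourfront, Regent Park"},
    {"name": "B", "Borough": "Downtown Toronto", "Neighborhood": "Example Hood"},
    {"name": "C", "Borough": "East Toronto", "Neighborhood": "Another Hood"},
]

per_borough = count_by(sample_restaurants, "Borough")
per_neighborhood = count_by(sample_restaurants, "Neighborhood")
print(per_borough.most_common())
print(per_neighborhood.most_common())
```

The resulting counts can be fed directly into a bar chart to compare boroughs or neighborhoods.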

Get Ratings, Likes and Tips of the Restaurants using Foursquare API

Extracting the ratings, likes, and tips of the restaurants using the Foursquare API
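
As an illustration, the sketch below reads likes, rating, and tips out of a venue details response. It assumes the legacy Foursquare v2 layout, where a request to `venues/{venue_id}` returns a `response.venue` object with `likes.count`, `rating`, and `tips.count`; this layout may differ in the current API. The payload shown is hypothetical and `parse_venue_details` is an invented helper name.

```python
# Assumption: legacy v2 details endpoint; {venue_id} is filled in per venue.
DETAILS_URL = "https://api.foursquare.com/v2/venues/{venue_id}"


def parse_venue_details(payload):
    """Extract likes, rating, and tip count from the (assumed) details response."""
    venue = payload["response"]["venue"]
    return {
        "id": venue["id"],
        "name": venue["name"],
        "likes": venue.get("likes", {}).get("count", 0),
        "rating": venue.get("rating"),  # stays None if the venue has no rating
        "tips": venue.get("tips", {}).get("count", 0),
    }


# Hypothetical payload in the assumed response shape.
sample_details = {
    "response": {"venue": {
        "id": "v1",
        "name": "Example Curry House",
        "likes": {"count": 12},
        "rating": 8.1,
        "tips": {"count": 4},
    }}
}

details = parse_venue_details(sample_details)
print(DETAILS_URL.format(venue_id="v1"))
print(details)
```

Using `.get` with defaults means a venue with no likes, rating, or tips in the payload does not break the loop over all restaurants.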

The Average Ratings

Getting the average rating of the restaurants in each neighborhood
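
A minimal sketch of this aggregation: group the ratings by neighborhood, average each group, and sort from highest to lowest. The restaurant records are hypothetical, and the function name is chosen for this example. Restaurants without a rating are skipped so they do not distort the average.

```python
from collections import defaultdict
from statistics import mean


def average_rating_by_neighborhood(restaurants):
    """Return (neighborhood, average rating) pairs, best neighborhood first."""
    ratings = defaultdict(list)
    for restaurant in restaurants:
        if restaurant.get("rating") is not None:
            ratings[restaurant["Neighborhood"]].append(restaurant["rating"])
    averages = {hood: mean(values) for hood, values in ratings.items()}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data.
rated_restaurants = [
    {"name": "A", "Neighborhood": "Hood One", "rating": 8.0},
    {"name": "B", "Neighborhood": "Hood One", "rating": 7.0},
    {"name": "C", "Neighborhood": "Hood Two", "rating": 9.0},
    {"name": "D", "Neighborhood": "Hood Two", "rating": None},
]

ranking = average_rating_by_neighborhood(rated_restaurants)
print(ranking)
```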

This gives us the extracted list of the top Indian restaurants.

Know more: https://www.foodspark.io/how-web-scraping-is-used-to-explore-indian-restaurants-in-canada.php
