Create a script that will parse data from Rotten Tomatoes, a movie reviews website. The work you have to do is identical to what we covered in lectures 4 and 5, albeit for a different website. Please read and follow the instructions below very carefully.



Step 1

Your script should begin by defining two variables (after importing libraries, etc.):

movie: a string variable indicating the movie for which reviews will be parsed
pageNum: the number of review pages to parse

For example, to parse the first 3 pages of the Gangs of New York reviews, set movie = 'gangs_of_new_york' and pageNum = 3. Your code should go to the movie’s All Critics reviews page on Rotten Tomatoes and parse the first three pages of reviews. Pagination on Rotten Tomatoes happens by clicking on the “Next” button.
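A minimal sketch of Step 1 in Python follows. The URL pattern is an assumption about how the site lays out its review pages (the assignment does not state it), and `base_url` is a hypothetical name introduced only for illustration.

```python
# Step 1: the two variables the assignment asks for.
movie = "gangs_of_new_york"  # movie identifier as it appears in the site's URL
pageNum = 3                  # number of review pages to parse

# Assumed URL pattern for the All Critics reviews page. Check it in a browser
# before relying on it, since the site layout can change.
base_url = f"https://www.rottentomatoes.com/m/{movie}/reviews"
print(base_url)
```

Keeping both values at the top of the script means a different movie or page count can be tried by editing two lines only.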

Step 2

For each review contained in each of the pages you requested, parse the following information:

The critic. This should be 'NA' if the review doesn't have a critic’s name.
The rating. This should be 'rotten', 'fresh', or 'NA' if the review doesn't have a rating.
The source. This should be 'NA' if the review doesn't have a source.
The text. This should be 'NA' if the review doesn't have text.
The date. This should be 'NA' if the review doesn't have a date.
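The 'NA' rules above can be sketched as a small normalization step applied after the raw values have been pulled from the page. The helper names (`or_na`, `normalize_rating`, `build_review`) and the shape of the `raw` dictionary are assumptions made for this illustration; how the raw values are located in the HTML depends on the page markup and is not shown.

```python
def or_na(value):
    """Return the stripped text, or 'NA' when the field is missing or blank."""
    if value is None:
        return "NA"
    text = str(value).strip()
    return text if text else "NA"


def normalize_rating(value):
    """Map a raw rating label to 'fresh', 'rotten', or 'NA'."""
    label = or_na(value).lower()
    return label if label in ("fresh", "rotten") else "NA"


def build_review(raw):
    """Build the five-field record from a dict of raw, possibly missing, values."""
    return {
        "critic": or_na(raw.get("critic")),
        "rating": normalize_rating(raw.get("rating")),
        "source": or_na(raw.get("source")),
        "text": or_na(raw.get("text")),
        "date": or_na(raw.get("date")),
    }


# Made-up input: only two of the five fields are present.
example = build_review({"critic": "  Jane Doe ", "rating": "Fresh"})
print(example["critic"], example["rating"], example["source"])
```

Centralizing the fallback in one function keeps the five fields consistent, so a missing element never leaks an empty string or `None` into the output.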

Continuing with our Gangs of New York example:

Step 3

After parsing the data, save them in a file called firstname_lastname_movie.txt.

The file should include one line for each review.
The reviews in the file should appear in the same order as they do on the website.
The 5 values that you write for each review should be written in the order listed in step 2.
The 5 values should be separated by a TAB character ('\t').
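One way to write the file described above is sketched here. The function name `save_reviews`, the `FIELDS` list, and the choice to collapse tabs and newlines inside a value (so each review stays on one line) are assumptions of this sketch, not requirements stated in the assignment.

```python
FIELDS = ["critic", "rating", "source", "text", "date"]  # order from step 2


def clean(value):
    """Collapse tabs, newlines, and repeated spaces into single spaces."""
    return " ".join(str(value).split())


def save_reviews(reviews, firstname, lastname, movie):
    """Write one tab-separated line per review and return the file name."""
    filename = f"{firstname}_{lastname}_{movie}.txt"
    with open(filename, "w", encoding="utf-8") as out:
        for review in reviews:  # list order is preserved, so site order is kept
            values = [clean(review.get(key, "NA")) or "NA" for key in FIELDS]
            out.write("\t".join(values) + "\n")
    return filename
```

Calling `save_reviews(reviews, "firstname", "lastname", movie)` with a list of five-field dictionaries produces the file in the current working directory.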

For example, I would save my data to “apostolos_filippas_gangs_of_new_york.txt”. If I had to parse the first three pages of reviews for that movie, my .txt output would look and be named like this (parsed on 10/09/2021).

Expert Solution

Program approach

Take 2 inputs from the user, namely movieName and the number of pages.
Convert movieName to the proper format after reviewing the Rotten Tomatoes website.

The movie name is all lowercase and has _ between words.
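The conversion can be sketched with a regular expression. The function name `to_slug` is hypothetical, and treating every run of letters and digits as a word is an assumption: the identifier the site actually uses for a given movie may differ (for example, it may carry a year suffix), so the result should be checked against the real page address.

```python
import re


def to_slug(movie_name):
    """Lowercase the title and join its words with underscores."""
    words = re.findall(r"[a-z0-9]+", movie_name.lower())
    return "_".join(words)


print(to_slug("Gangs of New York"))  # gangs_of_new_york
```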

Convert the movie name entered by the user to the above format.
Format the link and store it in url.
Write a function that takes 2 parameters, url and pages.
Get all the reviews: loop over the review pages while the number of pages is greater than 0, decrementing it by 1 on each iteration.
Extract the reviews and append them to a list. Return the list.
The extracted data has many fields. Extract the required fields into a dictionary.
Convert the dictionary to a data frame and then write it to a .txt file.
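The page loop in the steps above can be sketched as follows. How a page is actually fetched is left out, because it depends on the site's markup and on the tools used in the lectures; instead the sketch takes a hypothetical `fetch_page` callable that returns the reviews on one page plus a cursor for the next page. `collect_reviews`, `fetch_page`, `fake_fetch`, and the cursor idea are all assumptions of this illustration.

```python
def collect_reviews(url, pages, fetch_page):
    """Gather raw reviews from up to `pages` pages, in site order.

    fetch_page(url, cursor) is supplied by the caller and must return
    (reviews_on_page, next_cursor); next_cursor is None when there is
    no "Next" page.
    """
    collected = []
    cursor = None
    while pages > 0:
        reviews, cursor = fetch_page(url, cursor)
        collected.extend(reviews)
        pages -= 1
        if cursor is None:  # fewer pages exist than were requested
            break
    return collected


def fake_fetch(url, cursor):
    """Stand-in fetcher with two made-up pages, used only to exercise the loop."""
    pages = {None: ([{"critic": "A"}], "p2"), "p2": ([{"critic": "B"}], None)}
    return pages[cursor]


rows = collect_reviews("https://example.com/reviews", 3, fake_fetch)
print(len(rows))  # 2
```

Passing the fetcher in as a parameter keeps the counting logic testable without network access, and the early `break` avoids requesting pages that do not exist.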
