Your final script will scrape a website.
IMPORTANT: You will not follow any links that veer outside of the Virtual Environment.
IMPORTANT: You are allowed to use standard Python libraries and any 3rd-party library.
Your script will generate a report that contains the following information.
1) Unique URLs of all the pages found on the website
2) Unique URL links to images found on the website
3) Extract any phone numbers found on the website
4) Extract all text content from each of the pages and store it in a string variable
5) Extract any Zip Codes
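
Items 1-5 can be sketched with the standard library alone. The block below is a minimal illustration, not a definitive solution: the names `PageParser`, `extract_page`, `crawl`, and the `fetch` callable are hypothetical, the phone and ZIP patterns assume US-style formats (any standalone 5-digit number is treated as a possible ZIP code), and "staying inside the Virtual Environment" is approximated by following only links on the same host as the start URL.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse

# Assumed US-style formats; adjust the patterns to the site being scraped.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-.\s]\d{3}[-.]\d{4}\b")
ZIP_RE = re.compile(r"\b\d{5}(?:-\d{4})?\b")


class PageParser(HTMLParser):
    """Collects link targets, image sources, and visible text from one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links, self.images, self.text_parts = set(), set(), []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def _absolute(self, ref):
        # Resolve relative references and drop any #fragment.
        return urldefrag(urljoin(self.base_url, ref)).url

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a" and attrs.get("href"):
            self.links.add(self._absolute(attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.add(self._absolute(attrs["src"]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def same_site(url, root_url):
    """True when url is on the same host as the start URL."""
    return urlparse(url).netloc == urlparse(root_url).netloc


def extract_page(html, page_url, root_url):
    """Return the report fields (items 1-5) for a single page."""
    parser = PageParser(page_url)
    parser.feed(html)
    text = " ".join(parser.text_parts)
    return {
        "pages": {u for u in parser.links if same_site(u, root_url)},
        "images": parser.images,
        "phones": set(PHONE_RE.findall(text)),
        "zips": set(ZIP_RE.findall(text)),
        "text": text,
    }


def crawl(root_url, fetch):
    """Breadth-first crawl; fetch(url) must return that page's HTML as a str."""
    report = {"pages": set(), "images": set(), "phones": set(),
              "zips": set(), "text": ""}
    queue, seen = deque([root_url]), {root_url}
    while queue:
        url = queue.popleft()
        page = extract_page(fetch(url), url, root_url)
        report["pages"].add(url)
        for key in ("images", "phones", "zips"):
            report[key] |= page[key]
        report["text"] += page["text"] + "\n"
        for link in sorted(page["pages"] - seen):  # off-site links never queued
            seen.add(link)
            queue.append(link)
    return report
```

In a real run, `fetch` might wrap `urllib.request.urlopen` or `requests.get` and would need error handling for missing pages; passing it in as a parameter keeps the crawl logic testable without network access.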
NOTE: For items 6-8 you will use NLTK to process all the text found on the website, using the text content you extracted in item 4 above.
6) A list of all unique vocabulary found on the website
7) A list of all possible verbs
8) A list of all possible nouns
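
Items 6-8 could be approached roughly as follows. This is a hedged sketch: the function names `tag_text` and `summarize` are hypothetical, "verbs" and "nouns" are taken to mean tokens whose Penn Treebank tag starts with `VB` or `NN`, and the vocabulary is assumed to be case-insensitive and alphabetic only. NLTK is imported lazily inside `tag_text`; it must be installed, and its tokenizer and tagger data downloaded with `nltk.download(...)`, where the exact resource names depend on the NLTK version.

```python
def tag_text(text):
    """Tokenize and part-of-speech tag the text collected in item 4."""
    import nltk  # third-party: pip install nltk

    # Requires NLTK's tokenizer and tagger data (see nltk.download).
    return nltk.pos_tag(nltk.word_tokenize(text))


def summarize(tagged):
    """Build the item 6-8 lists from (word, tag) pairs.

    Returns (vocabulary, verbs, nouns), each a sorted list of unique
    lowercase words. Tags follow the Penn Treebank set used by
    nltk.pos_tag: VB* marks verbs and NN* marks nouns.
    """
    # Drop punctuation and numbers, and normalize case (an assumption).
    words = [(word.lower(), tag) for word, tag in tagged if word.isalpha()]
    vocabulary = sorted({word for word, _ in words})
    verbs = sorted({word for word, tag in words if tag.startswith("VB")})
    nouns = sorted({word for word, tag in words if tag.startswith("NN")})
    return vocabulary, verbs, nouns
```

A typical call would be `vocabulary, verbs, nouns = summarize(tag_text(all_text))`, where `all_text` is the string built in item 4. Because automatic tagging is imperfect, the verb and noun lists are best read as candidates, which matches the wording "possible" in items 7 and 8.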