Computer Networking: A Top-Down Approach (7th Edition)
ISBN: 9780133594140
Author: James Kurose, Keith Ross
Publisher: PEARSON
Question
Assume we capture students' web browsing records and store them in two files, one for male students and one for female students, each holding more than 10 billion records. Each record has the format [student ID], [gender], [date], [time], [url], and the records are not sorted. The available main memory is 16 GB. We want to find the URLs visited by both male and female students.
Please use MapReduce to solve this problem. Some important points you must clarify:
1. How to separate data?
2. What is the Map task? What's the format of <key, value> pairs?
3. What is the Reduce task? How do we get the final results?
You may earn extra credit if you can optimize your solution.
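One possible way to approach the three points is sketched below as a small in-memory simulation in Python. It is not the page's expert solution and not a real Hadoop job. It assumes the fields are comma-separated and that the gender field reads "male" or "female"; the function names (map_record, combine, partition, reduce_url, common_urls) and the sample records are made up for illustration. Data is separated into input splits that are mapped independently, the Map task emits <url, gender tag> pairs, a combiner drops duplicate pairs within a split as an optimization, and the Reduce task outputs a URL only when it has seen both tags.

```python
from collections import defaultdict
from zlib import crc32


def map_record(line):
    """Map: parse one record and emit a (url, gender_tag) pair."""
    # Split on the first four commas only, since a URL may contain commas.
    _sid, gender, _date, _time, url = [f.strip() for f in line.split(",", 4)]
    # Assumption: the gender field is spelled "male" or "female".
    tag = "M" if gender.lower().startswith("m") else "F"
    yield url, tag


def combine(pairs):
    """Combiner: drop duplicate (url, tag) pairs within one map split."""
    return set(pairs)


def partition(url, num_reducers):
    """Deterministic partitioner: every copy of a URL reaches one reducer."""
    return crc32(url.encode("utf-8")) % num_reducers


def reduce_url(url, tags):
    """Reduce: emit the URL only if both genders visited it."""
    if {"M", "F"} <= set(tags):
        yield url


def common_urls(splits, num_reducers=4):
    # Shuffle: buckets[r][url] collects the tags routed to reducer r.
    buckets = [defaultdict(set) for _ in range(num_reducers)]
    for split in splits:
        pairs = combine(p for line in split for p in map_record(line))
        for url, tag in pairs:
            buckets[partition(url, num_reducers)][url].add(tag)
    result = []
    for bucket in buckets:
        for url, tags in bucket.items():
            result.extend(reduce_url(url, tags))
    return sorted(result)


# Made-up sample records, one small split per file.
male = [
    "1001, male, 2024-01-05, 09:15:00, http://a.example/x",
    "1002, male, 2024-01-05, 09:20:00, http://b.example/y",
]
female = [
    "2001, female, 2024-01-06, 10:00:00, http://a.example/x",
    "2002, female, 2024-01-06, 10:05:00, http://c.example/z",
]
print(common_urls([male, female]))
```

In a real cluster the shuffle is spilled to disk and sorted by the framework, so no single machine has to hold all 20 billion records within the 16 GB limit. The number of reducers would be chosen so that the distinct URLs assigned to each one are processed comfortably within that limit.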