
Question

I am a newbie to Python. I am working on a CSV file with over a million records. In the data, every Location has a unique ID (SiteID). I want to find and remove any records where the SiteID or Location value is missing, or where the SiteID does not match the Location. (Note: the script should print the line number and the mismatched field values for each such record.)
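import csv

# Column positions are taken from the indexes in the original attempt;
# adjust them to the real layout of the file.
SITE_ID_COL = 4
LOCATION_COL = 17

count = 0
total = 0
expected_id = {}  # Location -> first SiteID seen for it (assumed to be the correct pairing)

# newline='' lets the csv module handle line endings itself
with open(r"air-quality-data-continuous.csv", 'r', newline='') as src, \
        open(r"crop.csv", 'w', newline='') as dst:
    reader = csv.reader(src)  # pass delimiter=';' if the file is semicolon-separated
    writer = csv.writer(dst)
    writer.writerow(next(reader))  # copy the header row

    # iterate over each record; the header is line 1, so data starts at line 2
    for number, row in enumerate(reader, start=2):
        total += 1
        if len(row) <= LOCATION_COL:
            print(f'Incomplete record in line {number}: {row}')
            continue
        site_id, location = row[SITE_ID_COL].strip(), row[LOCATION_COL].strip()
        if site_id in ('', 'NaN') or location in ('', 'NaN'):
            print(f'Missing value in line {number}: SiteID={site_id!r}, Location={location!r}')
            continue
        # setdefault records the first SiteID seen for a Location and returns it afterwards
        if expected_id.setdefault(location, site_id) != site_id:
            print(f'Mismatch in line {number}: SiteID={site_id!r}, Location={location!r}')
            continue
        writer.writerow(row)
        count += 1

print(f'{total + 1} lines in input file')
print(f'{count + 1} lines written to crop.csv')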
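
One possible way to decide which SiteID "belongs" to a Location is to read the file twice: a first pass counts how often each SiteID appears with each Location, and a second pass keeps only the rows that agree with the most frequent pairing. The sketch below is only an illustration under several assumptions: the header names `SiteID` and `Location`, the set of placeholder strings treated as missing, and the function names `is_missing`, `build_site_map` and `filter_records` are all hypothetical and not taken from the actual file.

```python
import csv
from collections import Counter, defaultdict

MISSING = {"", "nan", "none", "null"}  # assumed spellings of an empty value


def is_missing(value):
    """Return True when a field is absent or holds a placeholder such as NaN."""
    return value is None or value.strip().lower() in MISSING


def build_site_map(path, id_col="SiteID", loc_col="Location", delimiter=","):
    """First pass: map each Location to the SiteID it is paired with most often."""
    counts = defaultdict(Counter)
    with open(path, newline="") as src:
        for row in csv.DictReader(src, delimiter=delimiter):
            site_id, location = row.get(id_col), row.get(loc_col)
            if not is_missing(site_id) and not is_missing(location):
                counts[location.strip()][site_id.strip()] += 1
    return {loc: ids.most_common(1)[0][0] for loc, ids in counts.items()}


def filter_records(src_path, dst_path, site_map,
                   id_col="SiteID", loc_col="Location", delimiter=","):
    """Second pass: copy matching rows, report the rest. Returns (kept, dropped)."""
    kept = dropped = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter=delimiter)
        # extrasaction="ignore" skips surplus fields from rows longer than the header
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames,
                                delimiter=delimiter, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            site_id, location = row.get(id_col), row.get(loc_col)
            line_no = reader.line_num  # physical line in the source file
            if is_missing(site_id) or is_missing(location):
                print(f"line {line_no}: missing value "
                      f"(SiteID={site_id!r}, Location={location!r})")
                dropped += 1
            elif site_map[location.strip()] != site_id.strip():
                print(f"line {line_no}: mismatch (SiteID={site_id!r}, "
                      f"Location={location!r}, expected {site_map[location.strip()]!r})")
                dropped += 1
            else:
                writer.writerow(row)
                kept += 1
    return kept, dropped
```

With the file names from the question, this could be called as `site_map = build_site_map("air-quality-data-continuous.csv")` followed by `filter_records("air-quality-data-continuous.csv", "crop.csv", site_map)`. Both passes stream the file row by row, so the million records never have to sit in memory at once. Whether the most frequent pairing is really the correct one depends on the data; a trusted lookup table of SiteID and Location pairs would be a safer source if one is available.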
