Read a text file one line at a time with a for loop

In One big string, we assumed that the input text file was small enough to fit into one big Python string. If it isn’t, you can still read the file, but you have to do it one line at a time, as shown in the following program. Even if the file is small enough to fit into one string, it might still be more convenient to process it one line at a time.

Change Myname to your name. We can loop through lines because lines is iterable.

The input function removes the newline character from the end of every line that it reads in. But the following for loop does not remove the newline from the lines that it reads in. That’s why we print the lines with end = "".
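
The difference is easy to see in a small experiment. The following is only a sketch, not the filereader.py script itself: the temporary file and its two lines of text are assumptions made so that the code runs on any machine.

```python
import os
import tempfile

#Create a small two-line file so the sketch runs anywhere (made-up content).
with tempfile.NamedTemporaryFile("w", suffix = ".txt", delete = False) as f:
    f.write("first line\nsecond line\n")
    filename = f.name

kept = []

with open(filename) as lines:     #lines is iterable
    for line in lines:
        kept.append(line)         #Each line still ends with "\n".
        print(line, end = "")     #So suppress print's own newline.

os.remove(filename)
print(kept)                       #The newlines show up as \n in the list.
```

Printing the list at the end makes the trailing newline of each line visible, which is why the loop prints with end = "".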

filereader.py

The output of the script is a copy of the file README in the demo directory.

This directory contains a collection of demonstration scripts for
various aspects of Python programming.

beer.py        Well-known programming example: Bottles of beer.
eiffel.py      Python advanced magic: A metaclass for Eiffel post/preconditions.
hanoi.py       Well-known programming example: Towers of Hanoi.
life.py        Curses programming: Simple game-of-life.
markov.py      Algorithms: Markov chain simulation.
mcast.py       Network programming: Send and receive UDP multicast packets.
queens.py      Well-known programming example: N-Queens problem.
redemo.py      Regular Expressions: GUI script to test regexes.
rpython.py     Network programming: Small client for remote code execution.
rpythond.py    Network programming: Small server for remote code execution.
sortvisu.py    GUI programming: Visualization of different sort algorithms.
ss1.py         GUI/Application programming: A simple spreadsheet application.
vector.py      Python basics: A vector class with demonstrating special methods.

Things to try

  1. Change
        print(line, end = "")   #The line already ends with a newline.
    
    to the following. The newline character is an example of a whitespace character.
        line = line.rstrip() #Remove the newline from the end of the line.
        print(line)          #Print the line, followed by a newline.
    
  2. Some macOS text files you can input one line at a time:
    1. /usr/share/dict/propernames (1308 lines)
    2. /usr/share/dict/words (235,886 lines)
    3. /Library/Frameworks/Python.framework/Versions/3.8/share/doc/python3.8/examples/Tools/pynche/X/rgb.txt (753 lines)
    4. /var/log/system.log
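  3. Instead of reading lines from a text file, read the lines of text that are output by a command. In the following program, i is the value returned by lines.close(), and i >> 8 is the command’s exit status. Suppose i was 768. In binary, that’s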
    0000001100000000
    The expression
    i >> 8
    is i shifted 8 places to the right:
    0000000000000011,
    which is 3.
    """
    Read and display the lines of text output by a command.
    Also display the command's exit status.
    """
    
    import sys
    import os   #operating system
    
    command = "ls -l /"       #list the root directory
    lines = os.popen(command) #pipe open
    
    for line in lines:
        print(line, end = "")
    
    i = lines.close()
    print()
    
    if not i:
        print("The command returned exit status 0.")
    elif i > 0:
        print(f"The command returned exit status {i >> 8}.")
    else:
        print(f"The command was killed by signal number {-i} before it could return an exit status.")
    
    sys.exit(0)
    
    total 13
    drwxrwxr-x+ 66 root  admin  2112 Aug 26 10:11 Applications
    drwxr-xr-x+ 66 root  wheel  2112 Jan 27  2019 Library
    drwxr-xr-x   2 root  wheel    64 Oct 24  2018 Network
    drwxr-xr-x@  5 root  wheel   160 Sep 21  2018 System
    drwxr-xr-x   6 root  admin   192 Oct 24  2018 Users
    drwxr-xr-x@  5 root  wheel   160 Aug 18 03:34 Volumes
    drwxr-xr-x@ 37 root  wheel  1184 Aug 13 04:43 bin
    drwxrwxr-t   2 root  admin    64 Oct 24  2018 cores
    dr-xr-xr-x   3 root  wheel  4385 Aug 13 04:45 dev
    lrwxr-xr-x@  1 root  wheel    11 Oct 24  2018 etc -> private/etc
    dr-xr-xr-x   2 root  wheel     1 Aug 30 08:18 home
    -rw-r--r--   1 root  wheel   313 Aug 17  2018 installer.failurerequests
    dr-xr-xr-x   2 root  wheel     1 Aug 30 08:18 net
    drwxr-xr-x   3 root  wheel    96 Sep 26  2016 opt
    drwxr-xr-x   6 root  wheel   192 Oct 24  2018 private
    drwxr-xr-x@ 64 root  wheel  2048 Aug 13 04:43 sbin
    lrwxr-xr-x@  1 root  wheel    11 Oct 24  2018 tmp -> private/tmp
    drwxr-xr-x@ 10 root  wheel   320 Oct 24  2018 usr
    lrwxr-xr-x@  1 root  wheel    11 Oct 24  2018 var -> private/var
    
    The command returned exit status 0.
    

    What type of object is lines?

  4. Write a Python program named producer.py that outputs one or more lines of text. Then, in the above program, change the command to
    command = "/Library/Frameworks/Python.framework/Versions/3.8/bin/python3 producer.py"
    
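  5. Read the output of a longer shell command. This one changes to Chrome’s profile directory, selects the .nytimes.com rows from the Cookies database with sqlite3, and pipes the result through od -c.
    """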
    Read and display the lines of text output by a command.
    Also display the command's exit status.
    """
    
    import sys
    import os   #operating system
    
    command = 'cd "$HOME/Library/Application Support/Google/Chrome/Default";' \
        'sqlite3 -cmd \'select * from cookies where host_key == ".nytimes.com";\' Cookies < /dev/null | od -c'
    
    lines = os.popen(command) #pipe open
    
    for line in lines:
        print(line, end = "")
    
    i = lines.close()
    print()
    
    if not i:
        print("The command returned exit status 0.")
    elif i > 0:
        print(f"The command returned exit status {i >> 8}.")
    else:
        print(f"The command was killed by signal number {-i} before it could return an exit status.")
    
    sys.exit(0)
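
The arithmetic in item 3 and the producer idea in item 4 can be tried together. This is a sketch under assumptions: the contents of producer.py, the helper names exit_status and run_producer, and the use of sys.executable to find the Python interpreter are all hypothetical choices, and the shift by 8 applies to POSIX systems such as macOS and Linux.

```python
"""
Run a tiny producer script through os.popen and decode its exit status.
"""

import os
import sys
import tempfile

#Hypothetical producer.py: prints two lines, then exits with status 3.
producer_source = """\
import sys
print("first line")
print("second line")
sys.exit(3)
"""

def exit_status(i):
    """Convert the value returned by close() into an exit status (POSIX)."""
    if i is None:        #close() returns None when the exit status is 0.
        return 0
    return i >> 8        #For example, 768 >> 8 is 3.

def run_producer():
    """Write producer.py to a temporary directory, run it, return (lines, i)."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "producer.py")
        with open(path, "w") as f:
            f.write(producer_source)

        command = f'"{sys.executable}" "{path}"'
        lines = os.popen(command) #pipe open
        received = []

        for line in lines:
            received.append(line.rstrip()) #Remove the newline.

        i = lines.close()
    return received, i

print(format(768, "016b"))        #768 written as 16 binary digits
print(format(768 >> 8, "016b"))   #the same bits shifted 8 places to the right

received, i = run_producer()
print(received)
print(f"The command returned exit status {exit_status(i)}.")
```

Changing the number passed to sys.exit in the producer changes the exit status that the reader reports.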
    

The Apache web server creates an access_log file.

Whenever anyone in the world points their web browser at any page hosted (stored) on the Linux machine oit2.scps.nyu.edu (e.g., the page you’re looking at now, http://oit2.scps.nyu.edu/~meretzkm/python/string/forline.html), the web browser sends a GET request for this page to the web server running on oit2.scps.nyu.edu. The web server then sends the requested page back to the browser, and adds a line to the access log file on oit2.scps.nyu.edu recording this transaction. The access log file is constantly growing because the web pages hosted on oit2.scps.nyu.edu are very popular.

oit2.scps.nyu.edu happens to be a Fedora Linux machine, and the web server running on it happens to be an Apache web server. If you’re interested, here’s the wild goose chase I had to go through to find the access log file on oit2.scps.nyu.edu, starting with the search for the web server configuration file httpd.conf. (The web server is sometimes called the “http dæmon”.) For the % codes in the LogFormat, see Custom Log Formats.

find / -type f -name httpd.conf 2> /dev/null
/etc/httpd/conf/httpd.conf

cd /etc/httpd/conf
grep CustomLog httpd.conf
    CustomLog "logs/access_log" combined

grep combined httpd.conf
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

grep ServerRoot httpd.conf
ServerRoot "/etc/httpd"

ls -l /etc/httpd/logs
lrwxrwxrwx 1 root root 19 May 18  2015 /etc/httpd/logs -> ../../var/log/httpd

grep '/~meretzkm/python/' /var/log/httpd/access_log-20190325 | tail -3
66.108.88.87 - - [24/Aug/2019:10:08:39 -0400] "GET /~meretzkm/python/string/forline.html HTTP/1.1" 200 10964 "http://oit2.scps.nyu.edu/~meretzkm/python/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"
66.108.88.87 - - [24/Aug/2019:10:09:25 -0400] "GET /~meretzkm/python/string/forline.html HTTP/1.1" 200 11034 "http://oit2.scps.nyu.edu/~meretzkm/python/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"
93.158.161.47 - - [24/Aug/2019:10:10:56 -0400] "GET /~meretzkm/python/class/iterable.html HTTP/1.1" 200 21085 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
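
Each % code in the combined LogFormat corresponds to one field of a log line, so a line can be split into named fields with a regular expression. The following is a minimal sketch, not part of access_log.py: the pattern and the field names are my own assumptions, and the pattern does not handle a field that itself contains an escaped double quote.

```python
import re

#One group per % code in the combined LogFormat (field names are assumptions).
pattern = re.compile(
    r'(?P<host>\S+) '            #%h  remote host
    r'(?P<ident>\S+) '           #%l  remote logname
    r'(?P<user>\S+) '            #%u  remote user
    r'\[(?P<time>[^\]]+)\] '     #%t  time the request was received
    r'"(?P<request>[^"]*)" '     #%r  first line of the request
    r'(?P<status>\d{3}) '        #%>s final status
    r'(?P<size>\S+) '            #%b  size of the response in bytes, or -
    r'"(?P<referer>[^"]*)" '     #%{Referer}i
    r'"(?P<agent>[^"]*)"'        #%{User-Agent}i
)

#A line copied from the log excerpt above.
line = ('93.158.161.47 - - [24/Aug/2019:10:10:56 -0400] '
    '"GET /~meretzkm/python/class/iterable.html HTTP/1.1" 200 21085 "-" '
    '"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"')

match = pattern.match(line)
fields = match.groupdict() if match else {}

for name, value in fields.items():
    print(f"{name:8} {value}")
```

Using named groups keeps the pattern readable next to the LogFormat string and lets later code ask for fields["status"] instead of counting columns.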

I ran the following script on oit2.scps.nyu.edu and got thousands of lines of output.

access_log.py

207.46.13.114 - - [25/Mar/2019:10:04:18 -0400] "GET /~meretzkm/python/INFO1-CE9990/ HTTP/1.1" 200 7277 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
210.112.160.112 - - [25/Mar/2019:10:07:03 -0400] "GET /~meretzkm/python/string/forURL.html HTTP/1.1" 200 21351 "http://oit2.scps.nyu.edu/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36"
207.46.13.114 - - [25/Mar/2019:10:31:58 -0400] "GET /~meretzkm/python/list/barchart.html HTTP/1.1" 200 2000 "-" "Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
etc.
93.158.161.47 - - [24/Aug/2019:10:10:56 -0400] "GET /~meretzkm/python/class/iterable.html HTTP/1.1" 200 21085 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
66.108.88.87 - - [24/Aug/2019:10:17:23 -0400] "GET /~meretzkm/python/string/forline.html HTTP/1.1" 200 11013 "http://oit2.scps.nyu.edu/~meretzkm/python/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"
66.108.88.87 - - [24/Aug/2019:10:17:32 -0400] "GET /~meretzkm/python/string/forline.html HTTP/1.1" 200 10942 "http://oit2.scps.nyu.edu/~meretzkm/python/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36"

To get only three lines of output, change the loop to

total = 0
mention = 0

for line in lines:
    total += 1
    if "/~meretzkm/python/" in line:
        mention += 1
        last = line

lines.close()
print(f"{mention} of the {total} lines in {filename} mentioned /~meretzkm/python/.")
if mention > 0:
    print("The most recent line is")
    print(last, end = "")   #The line already ends with a newline.
18750 of the 1004239 lines in /var/log/httpd/access_log-20190325 mentioned /~meretzkm/python/.
The most recent line is
46.229.168.138 - - [24/Aug/2019:10:24:22 -0400] "GET /~meretzkm/python/lutz.html HTTP/1.1" 200 1138 "-" "Mozilla/5.0 (compatible; SemrushBot/6~bl; +http://www.semrush.com/bot.html)"
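
The counting loop relies on lines and filename, which are defined elsewhere in the script. To try the same logic without access to the server, here is a self-contained sketch. The function name count_mentions and the three shortened log lines are made up for illustration; only the loop body follows the code above.

```python
import os
import tempfile

def count_mentions(filename, target):
    """Return (total, mention, last) for the lines of filename containing target."""
    total = 0
    mention = 0
    last = None          #Stays None if no line mentions the target.

    with open(filename) as lines:
        for line in lines:
            total += 1
            if target in line:
                mention += 1
                last = line

    return total, mention, last

#Three made-up, shortened log lines stand in for the real access log.
sample = (
    '192.0.2.1 - - [25/Mar/2019:10:04:18 -0400] "GET /~meretzkm/python/ HTTP/1.1" 200 7277\n'
    '192.0.2.2 - - [25/Mar/2019:10:05:00 -0400] "GET /index.html HTTP/1.1" 200 512\n'
    '192.0.2.3 - - [25/Mar/2019:10:07:03 -0400] "GET /~meretzkm/python/string/ HTTP/1.1" 200 2135\n'
)

with tempfile.NamedTemporaryFile("w", suffix = ".log", delete = False) as f:
    f.write(sample)
    filename = f.name

target = "/~meretzkm/python/"
total, mention, last = count_mentions(filename, target)
os.remove(filename)

print(f"{mention} of the {total} lines mentioned {target}.")
if mention > 0:
    print("The most recent line is")
    print(last, end = "")   #The line already ends with a newline.
```

Because the loop keeps only two counters and the most recent matching line, it uses the same small amount of memory whether the log has three lines or a million.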