Python Write to Text File Line by Line

The Best Practice of Reading Text Files In Python

Combine multiple files into a single stream with richer metadata

Christopher Tao

Reading text files in Python is relatively easy compared with most other programming languages. Usually, we just call the open() function in read or write mode and then loop over the text file line by line.

This is already the best practice, and it can hardly get any easier. However, when we want to read content from multiple files, there is definitely a better way: the fileinput module that is built into Python. It combines the content of multiple files into one stream, which lets us process everything in a single for-loop, and it offers plenty of other benefits.

In this article, I'll demonstrate this module with examples.

0. Without the FileInput Module

Let's have a look at the "ordinary" way of reading multiple text files using the open() function. But before that, we need to create two sample files for demonstration purposes.

    with open('my_file1.txt', mode='w') as f:
        f.write('This is line 1-1\n')
        f.write('This is line 1-2\n')
    with open('my_file2.txt', mode='w') as f:
        f.write('This is line 2-1\n')
        f.write('This is line 2-2\n')

In the above code, we open a file with the mode w, which means "write". Then, we write two lines to the file. Note that we need to add the newline character \n ourselves. Otherwise, the two sentences will be written on a single line.

After that, we should have two text files in the current working directory.
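
To see why the trailing \n matters, here is a minimal sketch that writes the same two sentences with and without it. The temporary directory and the file names no_newline.txt and with_newline.txt are made up for illustration and are not part of the original example.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched (illustrative setup).
with tempfile.TemporaryDirectory() as tmp_dir:
    no_newline = os.path.join(tmp_dir, 'no_newline.txt')      # hypothetical file name
    with_newline = os.path.join(tmp_dir, 'with_newline.txt')  # hypothetical file name

    # write() never appends a line break on its own.
    with open(no_newline, mode='w') as f:
        f.write('This is line 1-1')
        f.write('This is line 1-2')

    # Adding '\n' ourselves ends each line explicitly.
    with open(with_newline, mode='w') as f:
        f.write('This is line 1-1\n')
        f.write('This is line 1-2\n')

    # Read both files back as lists of lines.
    with open(no_newline) as f:
        lines_without = f.readlines()
    with open(with_newline) as f:
        lines_with = f.readlines()

print(len(lines_without))  # both sentences ended up on one line
print(len(lines_with))     # one sentence per line
```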

Now, let's say we want to read from both text files and print the content line by line. Of course, we can still do that using the open() function.

    # Iterate through all files
    for file in ['my_file1.txt', 'my_file2.txt']:
        with open(file, 'r') as f:
            for line in f:
                print(line)

Here we have to use two nested for-loops. The outer loop is for the files, while the inner one is for the lines within each file.
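
One detail worth knowing: each line yielded by the file object still ends with \n, and print() adds a line break of its own, so the output of the nested loop is double-spaced. The sketch below is one possible variation that strips the trailing newline before printing. It recreates the two sample files in a temporary directory, which is an assumption made only to keep the example self-contained.

```python
import os
import tempfile

collected = []
with tempfile.TemporaryDirectory() as tmp_dir:
    # Recreate the two sample files in a throwaway directory (illustrative setup).
    paths = []
    for i in (1, 2):
        path = os.path.join(tmp_dir, f'my_file{i}.txt')
        with open(path, mode='w') as f:
            f.write(f'This is line {i}-1\n')
            f.write(f'This is line {i}-2\n')
        paths.append(path)

    # Outer loop: files. Inner loop: lines within each file.
    for path in paths:
        with open(path, 'r') as f:
            for line in f:
                text = line.rstrip('\n')  # drop the newline kept by the file iterator
                collected.append(text)
                print(text)
```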

1. Using the FileInput Module

Well, nothing prevents us from using the open() function. However, the fileinput module provides a neater way of reading multiple text files as a single stream.

First of all, we need to import the module. It is part of the Python standard library, so we don't need to install anything.

    import fileinput as fi

Then, we can use it for reading from the two files.

    with fi.input(files=['my_file1.txt', 'my_file2.txt']) as f:
        for line in f:
            print(line)

Because the fileinput module is designed for reading from multiple files, we don't need to loop over the file names anymore. Instead, the input() function takes an iterable of file names, such as a list, as its files parameter. Best of all, the lines from both files are accessible in a single for-loop.
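
As a small illustration of the files parameter, the sketch below passes a tuple of names and then a single name as a plain string, collecting the lines into lists instead of printing them. The temporary directory and the variable names all_lines and first_file_lines are assumptions made for this example.

```python
import fileinput as fi
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Recreate the two sample files in a throwaway directory (illustrative setup).
    paths = []
    for i in (1, 2):
        path = os.path.join(tmp_dir, f'my_file{i}.txt')
        with open(path, mode='w') as f:
            f.write(f'This is line {i}-1\n')
            f.write(f'This is line {i}-2\n')
        paths.append(path)

    # A tuple of file names works just like a list.
    with fi.input(files=tuple(paths)) as f:
        all_lines = [line.rstrip('\n') for line in f]

    # A single file name may be passed as a plain string.
    with fi.input(files=paths[0]) as f:
        first_file_lines = [line.rstrip('\n') for line in f]

print(all_lines)
print(first_file_lines)
```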

2. Use the FileInput Module with Glob

Sometimes, it is not practical to type such a list of file names manually. It is quite common to read all the files from a directory. Also, we might be interested in certain types of files only.

In this case, we can use the glob module, another module from the Python standard library, together with the fileinput module.

We can do a simple experiment before that. The os module can help us list all the files in the current working directory, for example with os.listdir().
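
A minimal sketch of such a listing, assuming os.listdir() is the call used. The directory content here (notes.md and data.csv next to the two text files) is invented for illustration, and a temporary directory stands in for the current working directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical directory content: two text files plus two unrelated files.
    for name in ['my_file1.txt', 'my_file2.txt', 'notes.md', 'data.csv']:
        with open(os.path.join(tmp_dir, name), mode='w') as f:
            f.write('placeholder\n')

    # os.listdir() returns every entry name; sorting makes the output stable.
    listing = sorted(os.listdir(tmp_dir))

print(listing)
```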

Such a listing usually contains many files other than the two text files. Because we want to read the text files only, we need to filter the file names. We can use the glob module as follows.

    from glob import glob
    glob('*.txt')

Now, we can pass the result of glob() to the fileinput.input() function as the files argument, so only these two text files will be read.

    with fi.input(files=glob('*.txt')) as f:
        for line in f:
            print(line)
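
Two practical points may be worth a sketch. First, glob() does not guarantee any particular order, so wrapping it in sorted() makes the reading order predictable. Second, when the files argument is empty, fileinput falls back to reading standard input, so it can be safer to check that the pattern matched something. The directory content and names below are made up for illustration.

```python
import fileinput as fi
import os
import tempfile
from glob import glob

lines = []
with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical directory content, created in reverse order on purpose.
    for name in ['my_file2.txt', 'my_file1.txt', 'notes.md']:
        with open(os.path.join(tmp_dir, name), mode='w') as f:
            f.write(f'First line of {name}\n')

    # Build the pattern from the directory path; sorted() fixes the reading order.
    txt_files = sorted(glob(os.path.join(tmp_dir, '*.txt')))

    # Guard against an empty match, which would make fileinput read from stdin.
    if txt_files:
        with fi.input(files=txt_files) as f:
            lines = [line.rstrip('\n') for line in f]

print(lines)
```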

3. Get the Metadata of Files

You may ask: when we read from a stream that combines multiple files, how can we know which file each line comes from?

Indeed, with the open() function and nested loops, such information is easy to get because the current file name is available in the outer loop. However, this is in fact even easier with the fileinput module.

    with fi.input(files=glob('*.txt')) as f:
        for line in f:
            print(f'File Name: {f.filename()} | Line No: {f.lineno()} | {line}')

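In the above code, we use filename() to get the name of the file that the current line comes from, and lineno() to get the line number of the line that has just been read. Note that lineno() is cumulative across all the files; filelineno() returns the line number within the current file.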
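
The difference between the cumulative and the per-file line number is easy to miss, so here is a short sketch that records both for every line. The temporary directory and the records list are assumptions made to keep the example self-contained.

```python
import fileinput as fi
import os
import tempfile

records = []
with tempfile.TemporaryDirectory() as tmp_dir:
    # Recreate the two sample files in a throwaway directory (illustrative setup).
    paths = []
    for i in (1, 2):
        path = os.path.join(tmp_dir, f'my_file{i}.txt')
        with open(path, mode='w') as f:
            f.write(f'This is line {i}-1\n')
            f.write(f'This is line {i}-2\n')
        paths.append(path)

    with fi.input(files=paths) as f:
        for line in f:
            # lineno() keeps counting across files; filelineno() restarts per file.
            records.append(
                (os.path.basename(f.filename()), f.lineno(), f.filelineno())
            )

for record in records:
    print(record)
```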

4. When the Cursor Reaches a New File

Apart from that, there are more functions from the fileinput module that we can make use of. For example, what if we want to do something when we reach a new file?

The function isfirstline() helps us to decide whether we're reading the first line from a new file.

    with fi.input(files=glob('*.txt')) as f:
        for line in f:
            if f.isfirstline():
                print(f'> Start to read {f.filename()}...')
            print(line)

This can be very useful for logging, because it lets us report the current progress.
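
Beyond logging, isfirstline() can also mark the point where per-file bookkeeping should restart. The sketch below counts the lines of each file in a single loop. The file contents and the lines_per_file dictionary are hypothetical and only serve the illustration.

```python
import fileinput as fi
import os
import tempfile

lines_per_file = {}
with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample files with different numbers of lines.
    contents = {'my_file1.txt': 3, 'my_file2.txt': 1}
    paths = []
    for name, count in contents.items():
        path = os.path.join(tmp_dir, name)
        with open(path, mode='w') as f:
            for n in range(1, count + 1):
                f.write(f'Line {n} of {name}\n')
        paths.append(path)

    with fi.input(files=paths) as f:
        for line in f:
            if f.isfirstline():
                # A new file has started: open a fresh counter for it.
                current = os.path.basename(f.filename())
                lines_per_file[current] = 0
            lines_per_file[current] += 1

print(lines_per_file)
```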

5. Jump to the Next File

We can also easily stop reading the current file and jump to the next one. The function nextfile() allows us to do so.

Before we demo this feature, let me rewrite the two sample files.

    with open('my_file1.txt', mode='w') as f:
        f.write('This is line 1-1\n')
        f.write('stop reading\n')
        f.write('This is line 1-2\n')
    with open('my_file2.txt', mode='w') as f:
        f.write('This is line 2-1\n')
        f.write('This is line 2-2\n')

The only difference from the original files is that I added the line stop reading to the first text file. Let's say that we want the fileinput module to stop reading the first file and jump to the second one when it sees this line.

    with fi.input(files=glob('*.txt')) as f:
        for line in f:
            if f.isfirstline():
                print(f'> Start to read {f.filename()}...')
            if line == 'stop reading\n':
                f.nextfile()
            else:
                print(line)

In the above code, another if-condition is added. When the line is stop reading, the reader jumps to the next file. Therefore, the line "1-2" is neither read nor printed.
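
For readers who want to verify this behavior without touching their working directory, here is a self-contained sketch of the same idea. It writes the sample files into a temporary directory and records what happens in an events list; both are assumptions added for illustration.

```python
import fileinput as fi
import os
import tempfile

events = []
with tempfile.TemporaryDirectory() as tmp_dir:
    path1 = os.path.join(tmp_dir, 'my_file1.txt')
    path2 = os.path.join(tmp_dir, 'my_file2.txt')

    with open(path1, mode='w') as f:
        f.write('This is line 1-1\n')
        f.write('stop reading\n')
        f.write('This is line 1-2\n')
    with open(path2, mode='w') as f:
        f.write('This is line 2-1\n')
        f.write('This is line 2-2\n')

    with fi.input(files=[path1, path2]) as f:
        for line in f:
            if f.isfirstline():
                events.append(f'start {os.path.basename(f.filename())}')
            if line == 'stop reading\n':
                # Close the current file; the next iteration reads the next file.
                f.nextfile()
            else:
                events.append(line.rstrip('\n'))

for event in events:
    print(event)
```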

6. Read Compressed Files Without Extracting

Sometimes we have compressed files to read. Usually, we would have to decompress them before reading the content. However, with the fileinput module, we may not need to extract the compressed files first.

Let's create a compressed text file using gzip. This file will be used for demonstration purposes later.

    import gzip
    import shutil
    with open('my_file1.txt', 'rb') as f_in:
        with gzip.open('my_file.gz', 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)

In the above code, we compressed the file my_file1.txt into my_file.gz using gzip. Now, let's see how fileinput can read it without an extra decompression step.

    with fi.input(files='my_file.gz', openhook=fi.hook_compressed) as f:
        for line in f:
            print(line)

By passing the hook function fi.hook_compressed to the openhook parameter, the gzip file is decompressed on the fly.

The hook currently supports gzip and bzip2 only, recognized by the file extensions .gz and .bz2. Unfortunately, other formats are not supported.
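
Files with any other extension are simply opened as regular files, so compressed and plain files can be mixed in one stream. The sketch below shows this with one plain, one gzip, and one bzip2 file. The file names are invented, and the bytes check is a defensive assumption: depending on the Python version, lines from compressed files may arrive as bytes rather than str.

```python
import bz2
import fileinput as fi
import gzip
import os
import tempfile

decoded = []
with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file names; only the extensions matter to hook_compressed.
    plain_path = os.path.join(tmp_dir, 'plain.txt')
    gz_path = os.path.join(tmp_dir, 'packed.gz')
    bz2_path = os.path.join(tmp_dir, 'packed.bz2')

    with open(plain_path, mode='w') as f:
        f.write('plain line\n')
    with gzip.open(gz_path, mode='wt') as f:
        f.write('gzip line\n')
    with bz2.open(bz2_path, mode='wt') as f:
        f.write('bzip2 line\n')

    with fi.input(files=[plain_path, gz_path, bz2_path],
                  openhook=fi.hook_compressed) as f:
        for line in f:
            # Some Python versions yield bytes for compressed files; normalize to str.
            if isinstance(line, bytes):
                line = line.decode()
            decoded.append(line.rstrip('\r\n'))

print(decoded)
```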

Summary

In this article, I have introduced the Python built-in module fileinput and how to use it to read multiple text files. Of course, it will never replace the open() function, but in terms of reading multiple files into a single stream, I believe it is the best practice.

If you are interested in more Python built-in modules that are less used but powerful, please check out these related articles.

Don't Use Python OS Library Any More When Pathlib Can Do

Do You Know Python Has Built-In Array?

How To Make Fewer "Mistakes" In Python

Generate Whatever You Want In Python

Source: https://towardsdatascience.com/the-best-practice-of-reading-text-files-in-python-509b1d4f5a4
