How To Navigate the Filesystem with Python’s Pathlib




 

In Python, using regular strings for filesystem paths can be a pain, especially if you need to perform operations on the path strings. Switching to a different operating system can break your code, too, because path separators differ across platforms. Yes, you can use os.path from the os module to make things easier. But the pathlib module makes all of this much more intuitive.

The pathlib module introduced in Python 3.4 (yeah, it’s been around for a while) allows for an OOP approach that lets you create and work with path objects, and comes with batteries included for common operations such as joining and manipulating paths, resolving paths, and more.

This tutorial will introduce you to working with the file system using the pathlib module. Let’s get started.

 

Working with Path Objects

 

To start using pathlib, you first need to import the Path class:

from pathlib import Path

 

This allows you to instantiate path objects for creating and manipulating file system paths.

 

Creating Path Objects

You can create a Path object by passing in a string representing the path like so:

path = Path('your/path/here')

 

You can create new path objects from existing paths as well. For instance, you can create path objects from your home directory or the current working directory:

home_dir = Path.home()
print(home_dir)

cwd = Path.cwd()
print(cwd)

 

This should give you a similar output:

Output >>>
/home/balapriya
/home/balapriya/project1

 

Suppose you have a base directory and you want to create a path to a file within a subdirectory. Here’s how you can do it:

# import Path from pathlib
from pathlib import Path

# create a base path
base_path = Path("/home/balapriya/Documents")

# create new paths from the base path
subdirectory_path = base_path / "projects" / "project1"
file_path = subdirectory_path / "report.txt"

# Print out the paths
print("Base path:", base_path)
print("Subdirectory path:", subdirectory_path)
print("File path:", file_path)

 

This first creates a path object for the base directory: /home/balapriya/Documents. Remember to replace this base path with a valid filesystem path in your working environment.

It then creates subdirectory_path by joining base_path with the subdirectories projects and project1. Finally, the file_path is created by joining subdirectory_path with the filename report.txt.

As seen, you can use the / operator to append a directory or file name to the current path, creating a new path object. Notice how the overloading of the / operator provides a readable and intuitive way to join paths.

When you run the above code, it’ll output the following paths:

Output >>>
Base path: /home/balapriya/Documents
Subdirectory path: /home/balapriya/Documents/projects/project1
File path: /home/balapriya/Documents/projects/project1/report.txt
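If you prefer a method call over the / operator, joinpath() builds the same path. Here’s a minimal sketch of the equivalence. It uses PurePosixPath (an assumption made only so the output looks the same on every operating system) and the same illustrative base path as above:

```python
from pathlib import PurePosixPath

# PurePosixPath never touches the filesystem, so this runs anywhere;
# the base path is the illustrative one from the example above
base_path = PurePosixPath("/home/balapriya/Documents")

# joining with the / operator
with_operator = base_path / "projects" / "project1" / "report.txt"

# joining with joinpath(), which accepts several segments at once
with_joinpath = base_path.joinpath("projects", "project1", "report.txt")

print(with_operator)
print(with_operator == with_joinpath)
```

Both approaches produce equal path objects, so which one you use is mostly a matter of readability.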

 

Checking Status and Path Types

Once you have a valid path object, you can call simple methods on it to check the status and type of the path.

To check if a path exists, call the exists() method:

path = Path("/home/balapriya/Documents")
print(path.exists())

 

 

If the path exists, exists() returns True; otherwise, it returns False.

You can also check if a path is a file or directory:


print(path.is_file())
print(path.is_dir())
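To see how these three checks behave together, here’s a small self-contained sketch. It doesn’t assume anything about your filesystem: it works inside a temporary directory, and the file names notes.txt and missing.txt are hypothetical:

```python
import tempfile
from pathlib import Path

# work inside a throwaway directory that is cleaned up automatically
with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    file_path = directory / "notes.txt"    # hypothetical file, created below
    missing = directory / "missing.txt"    # hypothetical file, never created

    file_path.write_text("hello")

    # collect (exists, is_file, is_dir) for each path
    checks = {
        "directory": (directory.exists(), directory.is_file(), directory.is_dir()),
        "file": (file_path.exists(), file_path.is_file(), file_path.is_dir()),
        "missing": (missing.exists(), missing.is_file(), missing.is_dir()),
    }

for label, result in checks.items():
    print(label, result)
```

Note that is_file() and is_dir() simply return False for a path that doesn’t exist, so you don’t need to call exists() first.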

 

 

 

Note: A Path object represents a concrete path for your operating system and can access the filesystem. When you only need to manipulate paths without touching the filesystem, such as handling Windows paths on a Unix machine, you can use PurePath (or its flavors, PureWindowsPath and PurePosixPath) instead.
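As a rough illustration of pure paths, the sketch below parses a Windows-style path with PureWindowsPath. The path itself is hypothetical, and because no filesystem access happens, this should run on any operating system:

```python
from pathlib import PureWindowsPath

# hypothetical Windows path; nothing here touches the filesystem
win_path = PureWindowsPath(r"C:\Users\balapriya\project1\main.py")

print(win_path.name)        # final component of the path
print(win_path.parent)      # everything except the final component
print(win_path.drive)       # drive letter, Windows paths only
print(win_path.as_posix())  # same path written with forward slashes
```

Pure paths support joining and parsing, but methods that need the filesystem, such as exists() or iterdir(), are not available on them.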

 

Navigating the Filesystem

 

Navigating the filesystem is pretty straightforward with pathlib. You can iterate over the contents of directories, rename and resolve paths, and more.

You can call the iterdir() method on the path object like so to iterate over all the contents of a directory:

path = Path("/home/balapriya/project1")

# iterating over directory contents

for item in path.iterdir():
    print(item)

 

Here’s the sample output:

Output >>>
/home/balapriya/project1/test.py
/home/balapriya/project1/main.py
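Since iterdir() yields both files and directories, and in no guaranteed order, you’ll often want to filter and sort the results. Here’s a minimal sketch of that idea; it builds a temporary directory with hypothetical entries (main.py, test.py, and a data folder) so it doesn’t depend on your filesystem:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)

    # hypothetical contents, created only for this example
    (project / "main.py").touch()
    (project / "test.py").touch()
    (project / "data").mkdir()

    # iterdir() returns entries in arbitrary order, so sort for stable output
    files = sorted(item.name for item in project.iterdir() if item.is_file())
    dirs = sorted(item.name for item in project.iterdir() if item.is_dir())

print(files)
print(dirs)
```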

 

Renaming Files

You can rename files by calling the rename() method on the path object:


path = Path('old_path')
path.rename('new_path')

 

Here, we rename test.py in the project1 directory to tests.py:

path = Path('/home/balapriya/project1/test.py')
path.rename('/home/balapriya/project1/tests.py')

 

You can now cd into the project1 directory to check if the file has been renamed.
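You can also verify the rename from Python itself. The sketch below is one way to do it, run in a temporary directory so that no real files are affected. It uses with_name() to build the new path in the same directory, which is an illustrative choice and not the only option:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    old_path = Path(tmp) / "test.py"
    old_path.touch()

    # with_name() builds a sibling path without touching the filesystem
    new_path = old_path.with_name("tests.py")
    old_path.rename(new_path)

    # the old path object still points at the old, now missing, name
    old_exists = old_path.exists()
    new_exists = new_path.exists()

print(old_exists, new_exists)
```

Keep in mind that rename() doesn’t update the path object you called it on, so continue working with the new path after renaming.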
 

Deleting Files and Directories

You can also delete files and remove empty directories with the unlink() and rmdir() methods, respectively.

# For files
path.unlink()   

# For empty directories
path.rmdir()  

 

 

Note: In case deleting empty directories got you curious about creating them: yes, you can create directories with mkdir() like so: path.mkdir(parents=True, exist_ok=True). Setting parents=True creates any missing parent directories, and exist_ok=True prevents an error if the directory already exists.
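Here’s a short sketch that puts mkdir(), unlink(), and rmdir() together. Everything happens inside a temporary directory, and the names reports, 2024, and draft.txt are hypothetical:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "reports" / "2024"   # hypothetical directory names

    # parents=True also creates the missing "reports" directory
    nested.mkdir(parents=True, exist_ok=True)
    # calling it again is harmless because of exist_ok=True
    nested.mkdir(parents=True, exist_ok=True)

    draft = nested / "draft.txt"              # hypothetical file name
    draft.write_text("temporary")

    draft.unlink()   # delete the file
    nested.rmdir()   # succeeds only because the directory is now empty

    remaining = [draft.exists(), nested.exists(), nested.parent.exists()]

print(remaining)
```

Notice that rmdir() removes only the directory it’s called on; the parent directory reports is left in place.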

 

Resolving Absolute Paths

Sometimes, it’s easier to work with relative paths and expand to the absolute path when needed. You can do it with the resolve() method, and the syntax is super simple:

absolute_path = relative_path.resolve()

 

Here’s an example:

relative_path = Path('new_project/README.md')
absolute_path = relative_path.resolve()
print(absolute_path)

 

And the output:

Output >>>
/home/balapriya/new_project/README.md
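Besides making a path absolute, resolve() also collapses any ".." components. Here’s a minimal sketch, assuming the relative path is interpreted against the current working directory and that new_project is not a symbolic link (the path is hypothetical and doesn’t need to exist):

```python
from pathlib import Path

# hypothetical relative path containing a ".." component
relative_path = Path("new_project/docs/../README.md")
absolute_path = relative_path.resolve()

print(relative_path.is_absolute())
print(absolute_path.is_absolute())

# the ".." is collapsed and the result is anchored at the working directory
expected = Path.cwd().resolve() / "new_project" / "README.md"
print(absolute_path == expected)
```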

 

File Globbing

 

Globbing is super helpful for finding files matching specific patterns. Let’s take a sample directory:

projectA/
├── projectA1/
│   └── data.csv
└── projectA2/
    ├── script1.py
    ├── script2.py
    ├── file1.txt
    └── file2.txt

 

Here’s the path:

path = Path('/home/balapriya/projectA')

 

Let’s try to find all the text files using glob():

text_files = list(path.glob('*.txt'))
print(text_files)

 

Surprisingly, we don’t get the text files. The list is empty:

Output >>>
[]

 

It’s because these text files are in a subdirectory, and the pattern '*.txt' passed to glob() matches only files directly inside projectA. Enter recursive globbing with rglob().

text_files = list(path.rglob('*.txt'))
print(text_files)

 

The rglob() method performs a recursive search for all text files in the directory and all its subdirectories. So we should get the expected output:

Output >>>
[PosixPath('/home/balapriya/projectA/projectA2/file2.txt'), 
PosixPath('/home/balapriya/projectA/projectA2/file1.txt')]
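If you’d like to try this without setting up the directory by hand, the sketch below recreates the sample projectA tree in a temporary directory and compares a few patterns. It also shows that glob() with a '**/' prefix behaves like rglob():

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # recreate the sample tree from above inside a temporary directory
    root = Path(tmp) / "projectA"
    (root / "projectA1").mkdir(parents=True)
    (root / "projectA2").mkdir()
    (root / "projectA1" / "data.csv").touch()
    for name in ("script1.py", "script2.py", "file1.txt", "file2.txt"):
        (root / "projectA2" / name).touch()

    # results come back in arbitrary order, so sort the names
    top_level = sorted(p.name for p in root.glob("*.txt"))           # projectA only
    one_level = sorted(p.name for p in root.glob("*/*.txt"))         # one level down
    recursive_glob = sorted(p.name for p in root.glob("**/*.txt"))   # any depth
    recursive_rglob = sorted(p.name for p in root.rglob("*.txt"))    # any depth

print(top_level)
print(one_level)
print(recursive_glob == recursive_rglob)
```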

 

And that’s a wrap!

 

Wrapping Up

 

In this tutorial, we’ve explored the pathlib module and how it makes file system navigation and manipulation in Python accessible. We’ve covered enough ground to help you create and work with filesystem paths in Python scripts.

You can find the code used in this tutorial on GitHub. In the next tutorial, we’ll look at interesting practical applications. Until then, keep coding!

 

 

Bala Priya C is a developer and technical writer from India. She likes working at the intersection of math, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she’s working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.


