Given a directory of image files that all have the .jpg extension regardless of their image type, we need to rename their extension based on the “actual” image type using bash.
We can use the file command to view the “actual” image type of each file.
    $ file 1.jpg
    1.jpg: JPEG image data, EXIF standard 2.21
This looks easy enough: we can just extract the extension from the 2nd column of the output, right?
Well, not quite. If a filename contains spaces we cannot rely on “column” position.
    $ file 1\ 2.jpg
    1 2.jpg: GIF image data, version 89a, 500 x 715
Well, how about we use : as the “field delimiter”?
Again, if a filename contained : it would mess that up.
    $ file dir/3\:4.jpg
    dir/3:4.jpg: PNG image, 2000 x 938, 8-bit grayscale, non-interlaced
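To make the problem concrete, here is a small Python sketch of the naive colon split applied to that last output line. The naive_parse helper is hypothetical, written only for this illustration.

```python
def naive_parse(line):
    """Split one line of `file` output on the first colon (the naive approach)."""
    name, _, description = line.partition(':')
    return name, description.strip()


line = 'dir/3:4.jpg: PNG image, 2000 x 938, 8-bit grayscale, non-interlaced'
name, description = naive_parse(line)

print(repr(name))         # the filename is cut off at its own colon
print(repr(description))  # the rest of the filename leaks into the description
```

The filename comes back truncated and the image type is no longer at the start of the description, so nothing downstream can trust either field.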
Generally it is safe to assume that “using bash” just means something that can be run from the command-line, and as parsing the output of the file command looks to be problematic, we will outsource this problem to Python.
Python has the imghdr module which “determines the type of image contained in a file or byte stream”.
It also has os.walk which can process a directory tree recursively.
Here’s what the code looks like.
    import imghdr, os

    start_dir = '.'

    for root, dirs, files in os.walk(start_dir):
        for f in files:  # line 6: f is a bare filename, not a full path
            old = os.path.join(root, f)

            what = imghdr.what(old)  # None if the file is not a recognised image
            if what:
                if what == 'jpeg':  # line 11: we want .jpg, not .jpeg
                    what = 'jpg'
                new = os.path.splitext(f)[0]  # lines 13-14: build the new path
                new = os.path.join(root, new + '.' + what)

                print('---')
                print('old:', old)
                print('new:', new)

                os.rename(old, new)
    print('---')
os.walk() yields 3 things for each directory it visits:
root being the path of the current directory being processed
dirs being a list of the names of the directories contained in the current directory
files being a list of the names of the files contained in the current directory
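To see those three values in action, here is a small self-contained sketch that builds a throwaway directory tree and walks it. The file and directory names are made up for the example.

```python
import os
import tempfile

# Build a throwaway tree: <tmp>/a.jpg and <tmp>/dir/b.jpg (names are made up).
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'dir'))
    for rel in ('a.jpg', os.path.join('dir', 'b.jpg')):
        open(os.path.join(tmp, rel), 'w').close()

    seen = []
    for root, dirs, files in os.walk(tmp):
        # Store root relative to tmp so the result does not depend on the temp path.
        seen.append((os.path.relpath(root, tmp), sorted(dirs), sorted(files)))

for entry in seen:
    print(entry)
```

The top directory is visited first, with dir listed under dirs, and then dir itself is visited with its own files list.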
So we loop through files on line 6 but as those are just the filenames we need to use os.path.join() to give us the full path.
    >>> os.path.join('path/to/dir', 'filename')
    'path/to/dir/filename'
imghdr.what() can take a filepath and it will return None if the file is not an image it recognises.
None is a falsey value in Python, meaning that if what: will only be True for image files.
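As a rough illustration of this return-None-then-test pattern, here is a sketch that does not depend on imghdr (which was removed from the standard library in Python 3.13). The sniff_image_type helper is hypothetical and only knows three file signatures; it is a stand-in for the example, not how imghdr itself is implemented.

```python
def sniff_image_type(header):
    """Guess an image type from the first bytes of a file, or return None."""
    if header.startswith(b'\xff\xd8\xff'):
        return 'jpeg'
    if header.startswith(b'\x89PNG\r\n\x1a\n'):
        return 'png'
    if header[:6] in (b'GIF87a', b'GIF89a'):
        return 'gif'
    return None  # unrecognised data is reported as None


# Made-up file headers: two images and one script.
for header in (b'GIF89a...', b'\x89PNG\r\n\x1a\n...', b'#!/usr/bin/env python'):
    what = sniff_image_type(header)
    if what:  # None is falsey, so the script header is skipped
        print(what)
```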
imghdr.what() will return jpeg for “JPEG data in JFIF or Exif formats” but we don’t want to use jpeg as the file extension.
This is the reason for the check on line 11. As every file here already ends in .jpg, we could instead use continue to skip JPEG files and move to the next loop iteration, i.e.

    if what == 'jpeg':
        continue
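Here is a minimal sketch of how the loop body might look with early continue statements. It assumes, as in our scenario, that every file already has the .jpg extension. The extensions_to_fix function and the detected mapping are hypothetical stand-ins for the real loop and for the detection results.

```python
def extensions_to_fix(detected):
    """Yield (filename, new_ext) for files whose extension needs changing.

    `detected` maps filename -> detected image type (or None), standing in
    for the result of running the image type check on each file.
    """
    for name, what in detected.items():
        if not what:        # not an image: nothing to do
            continue
        if what == 'jpeg':  # already named .jpg in this scenario
            continue
        yield name, what


detected = {'1.jpg': 'jpeg', '1 2.jpg': 'gif', 'rename-images.py': None}
print(list(extensions_to_fix(detected)))
```

This avoids the pointless rename of a .jpg file to the same name, at the cost of relying on the assumption about the existing extensions.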
After getting the new extension we need to generate the new filepath to use in our os.rename() command, which we do on lines 13-14 with os.path.splitext(), which allows us to extract the original filename without the extension.
    >>> os.path.splitext('my.file.name.jpg')
    ('my.file.name', '.jpg')
    >>> os.path.splitext('my.file.name.jpg')[0]
    'my.file.name'
Which we then add the new extension to and use os.path.join() again to give the final full path.
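Putting the path handling together, the steps could be wrapped up roughly like this. The new_path helper is a hypothetical name used only for this sketch.

```python
import os


def new_path(root, filename, what):
    """Return the path `filename` should have for the detected type `what`."""
    ext = 'jpg' if what == 'jpeg' else what  # prefer .jpg over .jpeg
    stem = os.path.splitext(filename)[0]     # drops only the last extension
    return os.path.join(root, stem + '.' + ext)


print(new_path('.', '1 2.jpg', 'gif'))
print(new_path('dir', 'my.file.name.jpg', 'jpeg'))
```

Keeping this in a function makes it easy to check the result for awkward filenames before anything is actually renamed.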
Let’s run the code and check the output.
    $ python rename-images.py
    ---
    old: ./1 2.jpg
    new: ./1 2.gif
    ---
    old: ./1.jpg
    new: ./1.jpg
    ---
    old: ./dir/3:4.jpg
    new: ./dir/3:4.png
    ---
We can use find to see if the files were actually renamed.
    $ find
    .
    ./1 2.gif
    ./1.jpg
    ./dir
    ./dir/3:4.png
    ./rename-images.py