Intro
I have a bird feeder with a Raspberry Pi camera pointed at it. The Raspberry Pi serves a video feed of the bird feeder on my local network. My local network also has a Synology 918+ with an IP camera license. The IP camera configured on the Synology detects motion from birds (and, unfortunately, wind) and saves a short video of each motion event to disk.
The Plan
At first, I will manually watch each motion event video and sort the videos by bird species. I'll do this by placing each video into a folder named after the species. Then, using Python, I can split the videos into individual images and use them to train a machine learning model that automatically identifies the species visiting the bird feeder.
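Because the folder name doubles as the label, a quick tally of videos per folder shows how balanced the hand-sorted data is before any frames are extracted. The following is a minimal sketch, assuming the videos sit at `<root>/<species>/<clip>.mp4`; the helper name `count_videos_by_species` is hypothetical and is not part of the script in this post.

```python
import os
from collections import Counter


def count_videos_by_species(root):
    """Count .mp4 files per folder, treating the folder name as the species label."""
    counts = Counter()
    for folder, _dirs, files in os.walk(root):
        # The label is the name of the folder that directly contains the video
        label = os.path.basename(folder)
        for name in files:
            if os.path.splitext(name)[1].lower() == ".mp4":
                counts[label] += 1
    return counts
```

Calling `count_videos_by_species("ML/Birds")` would return a `Counter` keyed by species name, which makes it easy to spot a species with very few clips.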
I'm planning to base my machine learning model on another machine learning problem described in a book that includes example code. To minimize the amount of new code I have to write, I'll organize the training images the same way the example code does. That is, all images will be saved into the same directory, and a CSV file will record their file locations and category labels. I'll need to write the code to make that happen, though.
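To make the layout concrete, here is a minimal sketch of reading such a CSV back into (image path, species) pairs. It assumes a two-column file with no header row, image path first and label second; the function name `load_dataset` is hypothetical and does not come from the book's example code.

```python
import csv


def load_dataset(csv_path):
    """Return a list of (image_path, species) tuples from a header-less CSV."""
    samples = []
    with open(csv_path, newline="") as handle:
        for row in csv.reader(handle):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            samples.append((row[0], row[1]))
    return samples
```

Keeping the labels in one file rather than in folder names means a frame can be relabeled by editing a row instead of moving an image.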
The Code
The script below traverses the video files organized by species and saves each video frame to its own file. The dataset.csv file referenced by the DATA_LOG variable keeps track of all the file paths and their respective labels.
At this point there is a problem: significant portions of each video contain no birds, because the birds either haven't landed yet or have already flown away. This means some frames are labeled as a particular species when in fact they contain no birds at all. To address this, I'll move any frames that don't contain birds into a folder called "Empty feeder". Later iterations of the script below will traverse this folder so that these frames can be correctly labeled as containing no birds, and so that future machine learning models can also learn to recognize images without birds in them.
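import csv
import os
import time

import cv2
from imutils.video import FileVideoStream

# Directory that receives every extracted frame
DATA_PATH = "ML/Birds/DATA"
# CSV log that maps each saved frame to its species label
DATA_LOG = "bird_ml/dataset.csv"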
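
def get_video_file():
    """Yield the path of every .mp4 file under ML/Birds, skipping the frame output folder."""
    for root, dirs, files in os.walk("ML/Birds"):
        # os.walk lists bare directory names, so prune "DATA" rather than the full DATA_PATH
        try:
            dirs.remove(os.path.basename(DATA_PATH))
        except ValueError:
            pass
        for file in files:
            # splitext copes with names that have no dot or several dots
            file_extension = os.path.splitext(file)[1]
            if file_extension == ".mp4":
                yield os.path.join(root, file)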

def split_and_label(video_file, file_number):
    """Save each frame of video_file as a numbered JPEG and log its label."""
    print("[*] Processing", video_file)
    # The species label is the name of the folder that holds the video
    label = os.path.basename(os.path.dirname(video_file))
    video_stream = FileVideoStream(video_file).start()
    # Allow the buffer some time to fill
    time.sleep(2.0)
    # Open the log once per video instead of once per frame
    with open(DATA_LOG, 'a', newline='') as dataset:
        fieldnames = ['Image Filename', 'Species']
        writer = csv.DictWriter(dataset, fieldnames=fieldnames)
        while video_stream.more():
            frame = video_stream.read()
            if frame is None:
                break
            write_path = os.path.join(DATA_PATH, str(file_number) + ".jpg")
            print("\t[*] Saving file number", file_number)
            cv2.imwrite(write_path, frame)
            file_number += 1
            writer.writerow({'Image Filename': write_path, 'Species': label})
            # Sleep again so we don't overtake the buffer
            time.sleep(0.01)
    video_stream.stop()
    return file_number
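
def get_next_filename():
    """Return the number that follows the last frame logged in DATA_LOG."""
    # Assumption: numbering starts at 0 when the log is missing or empty
    if not os.path.exists(DATA_LOG):
        return 0
    with open(DATA_LOG, 'r', newline='') as dataset:
        rows = [row for row in csv.reader(dataset) if row]
    if not rows:
        return 0
    # The first column holds the image path, e.g. ML/Birds/DATA/41.jpg
    file_name = os.path.basename(rows[-1][0])
    file_number = os.path.splitext(file_name)[0]
    return int(file_number) + 1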

def main():
    next_file_num = get_next_filename()
    for video_file in get_video_file():
        next_file_num = split_and_label(video_file, next_file_num)
        # Delete the original video file so we don't duplicate training data
        os.remove(video_file)


if __name__ == "__main__":
    main()
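
The script above does not yet handle the "Empty feeder" folder. One way the relabeling could work is sketched below, assuming that frames moved into that folder keep their file names and that the CSV has the same two header-less columns the script writes. The function name `relabel_empty_frames` and its parameters are hypothetical.

```python
import csv
import os


def relabel_empty_frames(csv_path, empty_dir, empty_label="Empty feeder"):
    """Point rows for frames found in empty_dir at that folder and relabel them.

    Returns the number of rows that were changed.
    """
    moved = set(os.listdir(empty_dir))
    with open(csv_path, newline="") as handle:
        # Keep only well-formed rows (path, label); blank lines are dropped
        rows = [row for row in csv.reader(handle) if len(row) >= 2]
    changed = 0
    for row in rows:
        name = os.path.basename(row[0])
        if name in moved:
            row[0] = os.path.join(empty_dir, name)
            row[1] = empty_label
            changed += 1
    # Rewrite the whole log with the updated rows
    with open(csv_path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)
    return changed
```

Because the function rewrites the log in place, it is worth keeping a backup copy of dataset.csv before running something like this against real data.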