In a practical deep learning workflow, you write code for everything. You preprocess your data with code, create and train the model with code, and build output visualizations with code: everything is done through code. What if I told you that you could minimize your coding effort with a tool that requires only drag and drop and handles everything behind the scenes? Sounds interesting, right?

Available GUI-based options in the market

There aren’t many tools that provide a GUI-based approach to deep learning. Big platforms like Azure offer such utilities in their Machine Learning module, but they come at a significant cost. Even cloud platforms like…

In this article, I will show you how to detect facial landmarks using the dlib library, how to calculate the EAR (Eye Aspect Ratio), and how to use the EAR to detect drowsiness.
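The EAR itself is just a ratio of distances between the six landmarks of one eye: EAR = (||p2 - p6|| + ||p3 - p5||) / (2 ||p1 - p4||). Here is a minimal sketch in NumPy, assuming the points arrive in the usual p1..p6 order used with the 68-point model (the function name is my own):

```python
import numpy as np

def eye_aspect_ratio(eye):
    """Compute the EAR for one eye.

    `eye` is a sequence of 6 (x, y) points in the order p1..p6:
    p1/p4 are the horizontal corners, p2/p6 and p3/p5 the vertical pairs.
    """
    eye = np.asarray(eye, dtype=float)
    # Vertical distances between the two pairs of eyelid points.
    a = np.linalg.norm(eye[1] - eye[5])  # ||p2 - p6||
    b = np.linalg.norm(eye[2] - eye[4])  # ||p3 - p5||
    # Horizontal distance between the eye corners.
    c = np.linalg.norm(eye[0] - eye[3])  # ||p1 - p4||
    return (a + b) / (2.0 * c)
```

When the eye closes, the vertical distances shrink toward zero while the horizontal distance stays roughly constant, so the EAR drops; drowsiness detection typically flags the EAR staying below a threshold for several consecutive frames.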

Before you begin with the code part of this article, you will have to install the dlib library in Python. There are some prerequisites for installing dlib, and I would recommend checking this article.

What facial landmarks does dlib detect?

The dlib library can be used to detect a face in an image and then find 68 facial landmarks on the detected face.
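The predictor returns these points in a fixed order. As a reference sketch, the index ranges below follow the iBUG 300-W annotation scheme that dlib's pre-trained 68-point model is based on (the dictionary name and grouping labels are my own):

```python
# Index ranges (Python-style half-open) of the 68-point landmark model.
FACIAL_LANDMARKS = {
    "jaw": (0, 17),
    "right_eyebrow": (17, 22),
    "left_eyebrow": (22, 27),
    "nose": (27, 36),
    "right_eye": (36, 42),
    "left_eye": (42, 48),
    "mouth": (48, 68),
}

# Sanity check: the regions together cover all 68 points.
assert sum(end - start for start, end in FACIAL_LANDMARKS.values()) == 68
```

For drowsiness detection, the interesting slices are `right_eye` (points 36-41) and `left_eye` (points 42-47), six points per eye.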

68 facial landmarks that dlib finds in a face

I will not go into the details of how it…

To everyone wondering how this concept can be covered in a single article: things sound complicated only until you have explored them in depth. I will not claim that this article is very easy, but it is built on top of fairly basic computer vision concepts. Please do not presume that this is exactly what Tesla uses in its cars, but it might be something similar.

What are the prerequisites? Some basic knowledge of OpenCV would be good. If not, don’t worry; I will try to explain the OpenCV…

The answers to most computer vision problems lie in finding and analyzing the shapes present in the image, and extracting contours is one such approach. To a beginner, I would explain a contour as “simply a curve joining all the points lying on the boundary of a shape”.

Let's say I have the following image of my hand. The contour of the hand would be the curve represented by the green line. The red dots show the points which, when connected, make up the contour curve.

Contour points (red). Connecting the contour points creates the contour curve (green).

The following would be the contours for some common shapes:

Two popular techniques for counting fingers in an image are training a CNN and using contours with the convex hull. I have tried both of these techniques, and in this section I would like to mention the challenges I faced with each.

Experiments and challenges

  1. CNN Approach: The model achieves good training and validation accuracy, and the final plot also looks good. But when it comes to detection in real-life images, the model fails badly. I tried tuning the hyperparameters, applying data augmentation, transfer learning, learning-rate decay, tweaking the model architecture…

If you wanted to do something with hand gesture recognition in Python, what is the first solution that would pop into your mind: training a CNN, contours, or the convex hull? These sound feasible, but when it comes to actually using them, the detection is not very good and requires special conditions (such as a proper background, or conditions similar to those used while training).

Recently I came across a super cool library called Mediapipe, which makes things much simpler for us. I would suggest you go through its official site to read more about…

A long time back, one of my friends asked me to check out a cool idea: a small display device that “keeps you updated on anything important to you”. Somehow the idea did not succeed in raising funds and was eventually scrapped.

For reference purposes only; there is no intention to create the exact device. The red part is what we are trying to achieve in this article.

The idea

Though it was not possible for me to create the hardware for such an idea (feel free to contact me if you have that level of skill), I tried to build the software part of it. What I tried to create was a centralized hub that shows me the count of unread notifications I have across different…

In the simplest terms, image binarization means converting an image to a black-and-white format.

Original image (c), grayscale image (b), and binary image (a)

Most computer vision programs start by converting the image to a binary format (trust me, image binarization will be among the first steps of almost every computer vision application you write). The simpler the image, the easier it becomes for the computer to process it and extract the underlying features (which may be quite easy for normal human eyes to figure out).

Different ways to perform Image binarization in OpenCV

Multiple factors can decide how the binarization is performed. …

Dhruv Pandey

A machine learning and computer vision enthusiast working as a web developer in Finland.
