Dibakar Saha Talks About his Image Processing Projects in Python.

Do you know OpenCV, machine learning and image processing, yet find it difficult to come up with cool projects?

Today’s guest, Dibakar Saha (a.k.a. EvilPort), talks about his image processing and machine learning projects.

Basically, he is a beginner in Python with experience in Image Processing and a little bit in machine learning.

He has designed very simple classification programs, like spam detection and sentiment analysis, using machine learning in Python.

Using image processing he has also designed a very simple gesture recognition system. He has also designed a gesture-driven keyboard. And presently he is working on an app that he calls NFS Most Wanted 2013 Remote, that can control the cars in the game using your phone’s accelerometer.

In this interview, EvilPort shares his programming experience and gives an insight into how he overcame the difficulty of coming up with amazing projects.

Lest I forget, he also reveals some tips that will help a lot of programmers out there, especially the newbies.

Dibakar Saha Shares his Programming Experience

Photo of Dibakar Saha

Godson: First of all, I would like to sincerely thank you for devoting time for this interview. Kindly tell us about yourself (full name, hobbies, nationality, education, and experience in programming)?

Dibakar Saha: My name is Dibakar Saha aka EvilPort. I am from West Bengal, India. Presently, I am a student of Bachelor of Technology (B. Tech) in Computer Science and Engineering from Bengal College of Engineering and Technology which is affiliated to Maulana Abul Kalam University of Technology.

I like programming a lot. Literally speaking, I have been programming for half of my life. It all started with the language Logic Oriented Graphic Oriented (LOGO) when I was 8 or 9. But back then I did not like programming as much as I like it now.

As of today, I am experienced in programming languages like C, C++, Java, Android, VB.NET, Shell Programming and Python. I also have some experience with HTML, PHP, JavaScript and Intel 8085 microprocessor. As you have already guessed, I am more of a backend developer.

When I am free I spend my time learning new concepts in programming, making some new projects, playing video games, hanging out with friends etc.

Godson: Can you narrate your first programming experience and what got you to start learning to program?

Dibakar Saha:  As I said earlier, my first programming language was LOGO. It was a fun and an educational programming language. But I did not like programming very much back then.

I started to like programming when I was introduced to GW-BASIC (perhaps in 2005 or 2006).

My first program was printing Hello World. On successful execution, I realized that if I could make a computer greet “Hello World”, I could make it do anything.

It was after that incident that I developed a very deep interest in computer programming. I was determined to learn as much about computers as I could. The rest is history.

Godson: What inspired you to venture into machine learning and image processing and drove you to come up with amazing projects?

Dibakar Saha: If you had gone back 5 months from today and asked me,

“What is machine learning?”

I could not have answered it.

My interest in machine learning began when I saw a post in my Facebook news feed that stated,

“Face Detection using Python within 25 lines”.

I opened it and saw how easy it was.

But then a question arose in my head: how did they create the Haar cascade classifier file?

I looked it up and found that the classifiers were created using machine learning and image processing. That is how I found myself reading about the OpenCV library and machine learning.

As for the projects that I make; I am mostly inspired by films and video games.

The virtual keyboard and the gesture recognition are inspired by Jarvis, the AI from the Iron Man films. Spy movies inspired me to create a motion detection and facial recognition program.

Recently, I created an Android App to control cars using the phone’s accelerometer in NFS Most Wanted 2013 for PC. The idea for it came to me when I found my friend’s brother tilting his PS4 controller while playing NFS Rivals as if the tilt helped in turning the car more.

Godson: On your blog, you claimed to have been programming literally half of your life. Can you tell us which language you have worked with and which is your personal best and your reason?

Dibakar Saha:  Like I said earlier, I have learned programming languages like LOGO, GW-BASIC, C, C++, Java, Android, VB.NET, Shell Programming and Python.

I have also worked with HTML, PHP and JavaScript.

My personal favourite language, as of today, is Python.

The big reason for that is the flexibility with data types.

You can assign an integer to a variable in one line, and in the next line the same variable can take a string value. That is a big advantage I found in it over the other languages that I know.
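A short illustration of the rebinding described here; the variable names are chosen only for this example:

```python
# The same name is bound first to an integer and then to a string.
value = 42
first_type = type(value).__name__

value = "forty-two"
second_type = type(value).__name__

print(first_type, second_type)
```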

Then there is the abundance of libraries for Python. There are so many that if you search for a keyword like “natural language processing” or “machine learning” or anything else using pip search, there is a 95% chance that you will find a library for the task.

Godson: What made you start programming so early in your life and how did you learn it?

Dibakar Saha: The simple answer to the question is the school curriculum. This is why I owe much of my programming experience to my school. At first I had difficulty learning anything about programming, but with time the same or similar kinds of tasks felt much easier.

At first, i.e. when I was learning LOGO, I had problems drawing a simple polygon like a triangle or a square. With practice, I could draw much more complex figures like stars, circles, a TV with 2 knobs etc. In short, I practised programming a lot to get to where I am today.

Godson: How do you learn machine learning? Is it through books, YouTube videos or online courses?

Dibakar Saha:  I consider myself a total newbie in machine learning.

As of today, I have created only a few programs using machine learning. I am still learning it.

As for the sources, I learn from PDFs that I find online, research papers and also YouTube videos. Some of the best places to learn machine learning are Machine Learning Mastery and Sentdex’s blog.

Sentdex has a YouTube channel by the same name. His channel has some awesome content.  You should check him out.

EvilPort Talks About His Image Processing Project

Godson: Can you tell us more about your Gesture Driven Virtual Keyboard using OpenCV + Python (what are the requirements for the project, any limitations, and a link to the source code)?

Dibakar Saha:  The Virtual Keyboard is something that I am very proud of.

The logic is really simple.

First, you design the keyboard the way you want.

In my case, I made the keyboard so that it consisted of the 26 letters of the alphabet and a space bar. The design is done with simple maths and nothing else. I mean, first you decide the width and height of a key and where you are going to put the first key, i.e. ‘q’.

According to the position of the first key, you can keep on adding the key width to get the position of the next key. And to get the position of the next row you just add the key height.

Now simply loop it.

You will get the keyboard.

The function that I used to make the keys is OpenCV’s rectangle() function.
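The layout arithmetic above can be sketched as follows. This is not the project's actual code: the QWERTY row strings, the function names `build_keys`, `draw_keys` and `key_at`, the key sizes and the width of the space bar are all assumptions made for illustration. Only `draw_keys` needs OpenCV; it uses `cv2.rectangle()` as mentioned in the answer.

```python
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]  # assumed QWERTY rows, first key 'q'

def build_keys(x0, y0, key_w, key_h):
    """Return {label: (x1, y1, x2, y2)} for the 26 letters and a space bar."""
    keys = {}
    y = y0
    for row in ROWS:
        x = x0
        for letter in row:
            keys[letter] = (x, y, x + key_w, y + key_h)
            x += key_w   # next key in the row: add the key width
        y += key_h       # next row: add the key height
    # Space bar under the last row; five key widths is an arbitrary choice.
    keys[" "] = (x0, y, x0 + 5 * key_w, y + key_h)
    return keys

def draw_keys(frame, keys):
    """Draw every key on an OpenCV image (requires the opencv-python package)."""
    import cv2  # imported here so the layout code runs without OpenCV
    for label, (x1, y1, x2, y2) in keys.items():
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, label, (x1 + 10, y2 - 10),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

def key_at(keys, px, py):
    """Return the label of the key that contains point (px, py), or None."""
    for label, (x1, y1, x2, y2) in keys.items():
        if x1 <= px < x2 and y1 <= py < y2:
            return label
    return None

keys = build_keys(x0=50, y0=100, key_w=50, key_h=50)
```

`key_at` is the lookup a pointer would need: given the pointer's position in the frame, it tells you which key is under it.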

Now for the click gesture, the logic is simple here too.

First of all, what I do is colour segmentation.

I wear a yellow (or any coloured) paper on my finger.

Using colour segmentation I can separate the yellow paper from the rest of the image. Then I take the centre of the paper. The centre acts as a pointer for which key is to be pressed.
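As a rough sketch of the idea, the snippet below segments a tiny hand-made frame in plain Python and computes the area and centre of the matching region. It is a stand-in, not the project's code: a real OpenCV program would more likely convert the frame with `cv2.cvtColor`, threshold it with `cv2.inRange` and find the centre with `cv2.moments`. The function name `segment`, the HSV bounds for "yellow" and the 4x4 frame are assumptions.

```python
def segment(frame, lower, upper):
    """Return (area, centre) of the pixels whose channels all lie in [lower, upper].

    frame is a list of rows; each row is a list of (h, s, v) tuples.
    centre is (x, y) in pixel coordinates, or None when nothing matches.
    """
    area = sum_x = sum_y = 0
    for y, row in enumerate(frame):
        for x, pixel in enumerate(row):
            if all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper)):
                area += 1
                sum_x += x
                sum_y += y
    if area == 0:
        return 0, None
    return area, (sum_x / area, sum_y / area)

# Tiny 4x4 frame: a 2x2 "yellow" patch on a dark background.
DARK, YELLOW = (0, 0, 20), (30, 200, 200)
frame = [[DARK] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (2, 3):
        frame[y][x] = YELLOW

# Assumed HSV bounds for yellow; real values depend on lighting and camera.
area, centre = segment(frame, lower=(20, 100, 100), upper=(35, 255, 255))
```

The centre is simply the average position of the matching pixels, and the area is their count.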

Now, if you think about it carefully, a click is when the area of the paper (as seen by the camera) first increases and then decreases. The area increases when you bring the paper closer to the camera and decreases when you move it away.

This is what my program detects. Of course, some thresholds are set, or else every little change in area would be considered a click.
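One possible way to turn "area grows, then shrinks" into a click is a small state machine fed with the paper's area on every frame. The class name `ClickDetector`, the ratio thresholds `grow` and `shrink`, and the slow baseline update are assumptions for illustration; the source only says that thresholds are used.

```python
class ClickDetector:
    """Report a click when the area rises well above its resting value and falls back."""

    def __init__(self, grow=1.3, shrink=1.1):
        self.grow = grow        # area/baseline ratio that counts as "moved closer"
        self.shrink = shrink    # ratio at or below which the paper has "moved back"
        self.baseline = None    # resting area of the paper
        self.pressed = False

    def update(self, area):
        """Feed one frame's area; return True on the frame that completes a click."""
        if area <= 0:
            return False                      # paper not visible in this frame
        if self.baseline is None:
            self.baseline = area              # first sighting sets the resting area
            return False
        ratio = area / self.baseline
        if not self.pressed:
            if ratio >= self.grow:
                self.pressed = True           # paper came closer to the camera
            else:
                # Small changes only nudge the baseline; they never count as a click.
                self.baseline = 0.9 * self.baseline + 0.1 * area
            return False
        if ratio <= self.shrink:
            self.pressed = False              # paper moved back: click complete
            return True
        return False

detector = ClickDetector()
areas = [100, 102, 99, 150, 160, 105, 100]
clicks = [detector.update(a) for a in areas]  # True only at index 5
```

The two thresholds are deliberately different so that a hand hovering near one threshold does not produce a burst of clicks.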

The libraries needed for the Gesture driven Virtual Keyboard are:

  • OpenCV
  • NumPy

There are some limitations though.

Firstly, fast typing is very hard.

Secondly, you need to wear a piece of yellow paper on your finger, which is sometimes uncomfortable.

Demo video of the Gesture Driven Virtual Keyboard using OpenCV + Python

Godson: Your project, Motion Gesture Recognition within 200 lines using OpenCV + Python, is amazing. Can you tell us more about it?

Dibakar Saha: The motion gesture recognition is a simple Image Processing project.

I did not use any machine learning or deep learning or neural networks for this project.

There is both an advantage and a disadvantage.

The advantage is that you can open the program and run it right away without any training or testing.

The disadvantage is that if you want to add any new gestures then the gesture must consist of only straight lines.

Now let’s see the logic for this project.

Here again, I do colour segmentation.

So you need to wear a yellow paper or any other coloured paper. After separation of the yellow paper from the picture, I take the centre position of the paper. This centre acts as the pencil to draw the motion of the paper.

Now think about it.

If you need to make a square, you move your hand north first, then east, then south, and finally west.

Right?

My code actually does that. It detects the direction of movement of the paper. All the shapes that my code can detect are stored as a list of directions. For example, the letter N consists of North, South West and North.

So if the sequence of movement directions matches any of the defined shapes, a specific action is taken by emulating keyboard presses.
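The direction-matching idea can be sketched roughly as below. This is an illustration under assumptions, not the project's code: the function names, the `min_move` threshold, the 2:1 rule for separating straight from diagonal moves and the `GESTURES` table are all hypothetical. Image coordinates are assumed, so y grows downward and "north" means a decreasing y.

```python
def direction(p, q, min_move=20):
    """Compass direction of the move from p to q, or None if the move is too small."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    if max(abs(dx), abs(dy)) < min_move:
        return None
    ns = "N" if dy < 0 else "S"   # y grows downward in image coordinates
    ew = "E" if dx > 0 else "W"
    if abs(dx) > 2 * abs(dy):     # mostly horizontal
        return ew
    if abs(dy) > 2 * abs(dx):     # mostly vertical
        return ns
    return ns + ew                # diagonal, e.g. "SE"

def to_directions(points, min_move=20):
    """Collapse a path of centre points into a list of distinct consecutive directions."""
    result = []
    anchor = points[0]
    for point in points[1:]:
        d = direction(anchor, point, min_move)
        if d is None:
            continue              # ignore jitter until the paper has really moved
        if not result or result[-1] != d:
            result.append(d)
        anchor = point
    return result

# Hypothetical gesture table: each shape is a tuple of directions.
GESTURES = {("N", "E", "S", "W"): "square"}

def recognise(points):
    """Return the name of the matching gesture, or None."""
    return GESTURES.get(tuple(to_directions(points)))

square_path = [(100, 300), (100, 200), (100, 100), (200, 100), (300, 100),
               (300, 200), (300, 300), (200, 300), (100, 300)]
shape = recognise(square_path)
```

Once a shape is recognised, the matching action could be triggered with a keyboard-emulation call such as `pyautogui.press`; which key belongs to which gesture is up to the program.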

The very first version that I created supported only one-hand gestures. But the present version supports two-hand gestures as well.

Video Demo of this Project

The required libraries for the Motion Gesture Recognition project are:

  • OpenCV
  • Pyautogui
  • Imutils
  • Thread

Godson: Could you tell us about the Android App you created to control cars using the phone’s accelerometer in NFS Most Wanted 2013 for PC?

Dibakar Saha: The app which I call “Need For Speed Most Wanted 2013” is the most interesting and easiest project of all three.

The basic concepts used here are:

  • Socket Programming
  • DirectInput using Python
  • Real-time reading and sending of accelerometer data

What happens here is this.

A TCP server is created on the PC which has NFS MW 2013 installed and running. The phone is the TCP client.

The phone takes the server IP address and port number to connect to the server.

Screenshot of the phone as the TCP client

The phone sends accelerometer data via sockets to the TCP server on the PC. The server then interprets the data and, according to the interpretation, feeds a DirectX input to the game.

Now DirectX input is specifically needed as most of the games that are created today expect a DirectX input and not a virtual key input.

So if you are thinking of using Pyautogui, you are out of luck here. I learnt that after spending almost 6 hours on it, trying different delays and so on.
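The PC side of this setup might look roughly like the sketch below. Everything specific in it is an assumption: the wire format (one `x,y,z` line per reading), the port number, the use of the y axis for steering, its sign convention, the dead zone and all function names. The actual DirectInput call is left as a placeholder, because the source does not say how the input is emitted.

```python
import socket

def parse_reading(line):
    """Parse one 'x,y,z' text line (assumed format) into three floats."""
    x, y, z = (float(v) for v in line.strip().split(","))
    return x, y, z

def steering_from_tilt(y_accel, dead_zone=1.5):
    """Map the assumed steering axis (m/s^2) to 'left', 'right' or None."""
    if y_accel < -dead_zone:
        return "left"
    if y_accel > dead_zone:
        return "right"
    return None   # phone held roughly level: no steering input

def press_key(action):
    """Placeholder: a real program would send a DirectInput key event here."""
    print("steer", action)

def serve(host="0.0.0.0", port=5000):
    """Accept one phone connection and turn its readings into steering actions."""
    with socket.create_server((host, port)) as server:
        conn, _addr = server.accept()
        with conn, conn.makefile("r") as stream:
            for line in stream:   # one accelerometer reading per line
                _x, y, _z = parse_reading(line)
                action = steering_from_tilt(y)
                if action:
                    press_key(action)
```

Calling `serve()` on the PC blocks until the phone connects to the given port and then processes readings until the phone disconnects.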

The very first version of this project had very bad handling. The handling as of today is much smoother.

Video Demo of the Project

The app as of today also has some extra features, like changing the car, turning down the phone’s screen brightness, and a sensitivity factor for handling the car, which the user can set based on their car and usage.

Dibakar Saha's Android App to control cars using the phone’s accelerometer in NFS Most Wanted 2013 for PC

The biggest limitation of this project is that you need a Wi-Fi router so that the phone is able to connect to the PC.

The source code is available here.

Godson: It was nice talking to you EvilPort. Any word of advice for a newbie programmer?

For someone that would love to contact you, how can they?

Dibakar Saha: For any newbie programmer, I would suggest that they learn the C programming language very well.

And by very well I mean learning the basics of C very well.

I am specifically talking about C as it is to me the “father of all programming languages”.

Also, if you know C you can grasp any other programming language very easily. That is just a suggestion.

What is very important for any programmer is their problem-solving skills.

In my case, if I am facing a problem which seems unsolvable, I start to break the problem into smaller sub-problems. I keep on breaking it down until all the sub-problems can be solved. If all the sub-problems are solved, the problem itself gets solved.

Also, nowadays you have StackOverflow, Google, XDA-developers and other sites, which one should make full use of. They will help you solve your problems much more easily.

Here are my contact details-

Conclusion

Did you enjoy this post?

Let’s have the rest of the conversation via the comment section.

Should you have any contribution or question regarding this interview, kindly use the comment box below.

Try being social as well – share this post on your Social Media accounts, I will really appreciate that.
