Sumit Raj Discusses Python Web Development, Chatbot, Natural Language Processing and Python Development in India

Sumit Kumar Raj currently works as principal engineer at LodgIQ. He lives in India and studied in Bangalore.

His interests are building bots, natural language processing, machine learning and data science.

He is the most viewed writer on Python web frameworks on Quora.

You can find more information about Sumit Kumar Raj in his resume at the end of the discussion.

The conversation is a bit lengthy, but it is informative and interesting as well. Enjoy!

Don’t forget to join my newsletter to get more exclusive content that I only share with my subscribers.

Sumit Raj at Pycon Pune 2017

GODSON: Good day Sumit Raj, thanks for taking time out for this interview. Could you please tell us about yourself: your full name, hobbies, nationality, education and experience in programming?
SUMIT RAJ: Hi Godson, thanks for having me. My full name is Sumit Kumar Raj. I like watching cricket, swimming and learning new things. I am Indian by nationality and graduated with a major in Information Science Engineering. I started coding well before I graduated because I always wanted to learn it, regardless of anything else. So I can say I have more than five years of experience in programming and building applications.
GODSON: So which type of applications do you build? Web, GUI?
SUMIT RAJ: Most of my development work is on the web. I like web development more because it requires you to think broadly and gives you scope for end-to-end development.
GODSON: What are the major challenges of learning programming, especially Python, in India, and how did you overcome them?
SUMIT RAJ: I started learning programming almost 10 years ago. I come from a very small town in the state of Bihar in India, where we had very limited resources. People used to wish for 24x7 electricity, so broadening my learning was the biggest challenge. You could not learn anything unless you had expert people around you.
After a couple of years, I moved to Bangalore, the Silicon Valley of India, to pursue my degree. There is a great community of Python developers in Bangalore called BangPypers, and they help you very proactively to learn Python. This is also the reason why I like to help beginners. Python’s biggest strength and advantage over other languages is its community and the culture of giving back to it.
I had my arms and ammunition with me: a laptop and an internet connection. I never stopped from there. I never confined myself to the college curriculum to learn programming; I started attending conferences and meetups and learning online on my own. I took a lot of courses on Coursera. Whenever I came across a programming term I had not heard before, I would pause my reading or the video and Google it. This is really painful, but trust me, it really helps in the long term. So never skip even the smallest thing you don’t know about the subject you are passionate about.
GODSON: How can someone in India join BangPypers?
GODSON: So are you a full stack developer, or a back-end or front-end developer?
SUMIT RAJ: I am a full stack developer with more weight on back-end technologies.
GODSON: Are you trying to say that if you follow the school curriculum, you won’t end up being a good Python programmer?

SUMIT RAJ: More or less, yes. I know I am being candid here, but that’s the truth unless the college’s curriculum and teaching techniques are up to date rather than ages old.

The college system forces you onto a timeline with a fixed set of things to learn. You can learn the basics, but there are lots of things you need to know outside the curriculum to be a better programmer.

Also, in India, not many colleges have Python in their syllabus. In my case, I got to know about it when I started exploring hacking techniques, ended up using Python and fell in love with it.

GODSON: So which Python web framework is your favorite, and why?
SUMIT RAJ: Python has a variety of frameworks, and it’s quite difficult to choose one and call it your favorite. I started with Flask and liked it a lot because it’s very simple to set up and build things on. At the same time, for bigger and more complex web applications, Django is preferred because it comes with a built-in ORM and templating engine.
GODSON: To save our readers from searching Google, can you tell us what ORM and templating engine mean?
SUMIT RAJ: ORM: ORM stands for Object-Relational Mapping. It provides a very high-level abstraction over a database, which lets developers concentrate on writing code instead of database queries to create, read, update and delete data and schemas in their database. It’s like a Google Translate box that lets you interact with all the languages of the world without the need to learn them.
Templating engine: Quoting from the Django documentation: “Being a web framework, Django needs a convenient way to generate HTML dynamically. The most common approach relies on templates. A template contains the static parts of the desired HTML output as well as some special syntax describing how dynamic content will be inserted.” The documentation goes on to point to a hands-on example of creating HTML pages with templates.
In simple terms, suppose you have to build 1,000 houses with the same architecture. The templating engine helps you define the static parts (the master bedroom, living room, bathroom, balcony) once and reuse them for every house, while leaving room for variation: one house may have a recliner in the living room and another may just have a couple of bean bags.
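The house analogy can be made concrete with a minimal sketch. It uses `string.Template` from the Python standard library rather than Django’s own template language, purely to stay self-contained. The page layout, the `render_house` helper and the sample data are illustrative assumptions, not anything from the interview.

```python
from string import Template

# The static parts of the page are written once; $-placeholders mark
# the parts that change from house to house.
HOUSE_PAGE = Template(
    "<h1>House $number</h1>\n"
    "<p>Living room furniture: $furniture</p>"
)

def render_house(number, furniture):
    """Fill the shared template with the details of one house (hypothetical helper)."""
    return HOUSE_PAGE.substitute(number=number, furniture=furniture)

# Same template, different dynamic content.
houses = [(1, "a recliner"), (2, "a couple of bean bags")]
pages = [render_house(number, furniture) for number, furniture in houses]
print(pages[0])
```

A full templating engine such as Django’s adds loops, conditionals and template inheritance on top of this basic substitution idea.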
GODSON: Which web framework would you advise a beginner to start with, and why?
SUMIT RAJ: I strongly recommend starting with Flask. The reason is that it’s lightweight and easy to install, set up and get working. Even if you know that you are going to work heavily on Django, learn Flask for at least a week. One week is enough for Flask; then move to Django, as the majority of companies tend to choose Django as their framework.
GODSON: So when it comes to web development, is Python the top language you can use? Can you use Python for both front-end and back-end?
SUMIT RAJ: No, I would not say that Python is the top language. Every language has its own charm. But yes, Python is the talk of the town, and you can’t ignore it when it comes to very quick development and productivity. For the front-end there are some GUI tools like Tkinter and wxPython, and for the back-end you can use Python with any of the web frameworks we talked about. For a web application front-end you need to use HTML, CSS, JavaScript and so on.
GODSON: How would you compare Django and Flask as tools for prototyping and building products that may be rewritten later?
SUMIT RAJ: Both are good, but for rapid prototyping I would go with Flask. The reason is that you don’t want to grapple with so many complexities up front unless you are a pro in Django. Flask can have all the things that Django provides by itself. Most of the prototypes we build mostly read from a database, with very minimal writes, or use stub data. This is where Flask is handy, because you can work without defining models or an ORM. I strongly suggest assessing the needs of the web application you are building; it may turn out that you build the prototype in Flask but are better off rewriting it in Django.
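To illustrate what a read-mostly prototype on stub data looks like without models or an ORM, here is a rough sketch. It deliberately uses a bare WSGI callable from the standard library instead of Flask, so that it runs without third-party packages; a Flask route would play the same role. The `/hotels` path, the `STUB_HOTELS` data and the `call` helper are all hypothetical.

```python
import json

# Hypothetical stub data standing in for a real database table.
STUB_HOTELS = [
    {"id": 1, "name": "Hotel A", "price": 120},
    {"id": 2, "name": "Hotel B", "price": 95},
]

def app(environ, start_response):
    """Tiny WSGI app: one read-only JSON endpoint backed by stub data."""
    if environ.get("PATH_INFO") == "/hotels":
        status, payload = "200 OK", STUB_HOTELS
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]

def call(path):
    """Invoke the app in-process, without starting a server."""
    seen = {}

    def start_response(status, headers):
        seen["status"] = status

    body = b"".join(app({"PATH_INFO": path, "REQUEST_METHOD": "GET"}, start_response))
    return seen["status"], json.loads(body)

print(call("/hotels"))
```

Swapping the stub list for a real query later does not change the shape of the response, which is what makes this style convenient for throwaway prototypes.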
GODSON: So apart from web development, is there any project you are working on in areas like artificial intelligence or machine learning?
SUMIT RAJ: Yes. On the professional front, we use AI/ML techniques heavily for predicting hotel prices at our company. On the personal projects front, I am working on AI-based chatbots, which require ML techniques. Currently I am not able to give much time to it because of my day job, but very soon I will come up with something.
GODSON: Which database technologies do you use, and how do you determine which one is best for any project you want to do?
SUMIT RAJ: We use MongoDB in our organization for all purposes. Figuring out the best database is a question in itself, since each project has its own requirements. If you are dealing with big data, where you have to do very large-scale data processing, you can go with MongoDB or Cassandra. The biggest challenge for any data analytics startup is inconsistent data. MongoDB is schemaless and works better in such scenarios. But I must say that MongoDB’s reads are good while its writes are not so good when you scale. If you are developing a web application with lots of user-based interaction, then MySQL or Postgres would be good to go. Both of them work like a charm with Django.
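Relational databases pair well with Django largely because of the ORM idea described earlier: the code works with objects, and the mapper writes the SQL. The toy mapper below is a sketch of that idea only. It is not Django’s ORM, it uses SQLite from the standard library instead of MySQL or Postgres, and the `Table` class and `User` model are invented for illustration.

```python
import sqlite3
from dataclasses import dataclass, fields

@dataclass
class User:
    id: int
    name: str

class Table:
    """Toy mapper between a dataclass and a SQLite table (illustration only)."""

    def __init__(self, conn, model):
        self.conn, self.model = conn, model
        self.name = model.__name__.lower() + "s"
        self.cols = [f.name for f in fields(model)]
        conn.execute(f"CREATE TABLE {self.name} ({', '.join(self.cols)})")

    def add(self, obj):
        """INSERT one object; the caller never writes SQL."""
        marks = ", ".join("?" for _ in self.cols)
        values = [getattr(obj, col) for col in self.cols]
        self.conn.execute(f"INSERT INTO {self.name} VALUES ({marks})", values)

    def filter(self, **conditions):
        """SELECT rows matching column=value pairs and rebuild model objects."""
        sql = f"SELECT {', '.join(self.cols)} FROM {self.name}"
        if conditions:
            sql += " WHERE " + " AND ".join(f"{col} = ?" for col in conditions)
        rows = self.conn.execute(sql, list(conditions.values()))
        return [self.model(*row) for row in rows]

conn = sqlite3.connect(":memory:")
users = Table(conn, User)
users.add(User(1, "Asha"))
users.add(User(2, "Ravi"))
print(users.filter(name="Ravi"))
```

A production ORM also handles schema migrations, relations between tables and query optimization, none of which this sketch attempts.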
GODSON: According to your resume on Quora, your interests are natural language processing, building bots, machine learning, data science and application scalability. Have you developed yourself in these fields, or are you still working on them?
SUMIT RAJ: That’s a pretty good question.
Yes, I started getting into NLP (Natural Language Processing) back in college. I really like working in NLP and talking about it.
I keep doing research on it to learn more from an application perspective.
To be clear, I am not an ML (Machine Learning) expert. I have worked closely with ML and data science teams, and I understand pretty well how this stuff works, but not the underlying maths.
I have been able to develop myself to a good extent because I have not just read the tech blogs but got my hands dirty with it.
There are still lots of things I am working on improving when it comes to application architecture and scalability.
GODSON: So how do you learn Python and keep current with the latest changes in Python libraries?
SUMIT RAJ: Good question. What I do is take the help of social media, meetup groups, blogs and mailing lists.
  • Follow popular Python developers on Twitter.
  • Read blogs written by Python authors.
  • Follow and read the email threads of mailing lists.
  • Read the engineering blogs of companies that use Python.
  • Join Python-related groups on Facebook to ask and answer questions.
Here are some pointers that might be helpful:
  • Follow Daniel Roy Greenfeld, co-author of Two Scoops of Django and open source developer.
  • Follow Raymond Hettinger for cool Python tricks. Raymond is a Python core developer, freelance programmer, consultant and trainer. You can also check his blog.
  • Subscribe to Weekly Python Newsletter.
There is always lots of stuff on the web worth learning that keeps you up to date. Do follow these and read regularly.
GODSON: With machine learning, data science and artificial intelligence, do you think programmers will cause a lot of people to lose their jobs as computers do the work? And do you advise everyone, no matter their profession, to learn at least one programming language?
SUMIT RAJ: Yes, definitely. There is a plethora of work in the software domain that can be automated very easily, and with the power of AI and ML the automation will be intelligent enough to replace jobs that consist of mundane tasks.
As for the second question: not really. If you are a professional swimmer, driver, soccer player or businessman, what good would learning a programming language do you unless you want to use it?
GODSON: Are there good paying jobs for a programmer in India?
SUMIT RAJ: Yes, there are. India is growing very fast, and it has a large pool of talent that comes at a low cost compared to the US.
But what is low for the US converts to high in India. So most of the companies that have offices in India pay a good salary.
If you are a kickass programmer who can build things, you can easily earn 1.5K-2K/month in India, which is good enough to have a stylish life.
People with good problem-solving and implementation skills earn even more.
GODSON: Apart from Python, which other languages do you use, and for what purpose?
SUMIT RAJ: Frankly, I have never felt the need for any other language for any of my work, be it web development, scripting, scraping, data science or automation.
But the one language I would want to explore after Python is Go. I see that most Python programmers are curious to explore it.
GODSON: So can you tell us about some projects you have worked on, which was the most challenging, and why?
SUMIT RAJ: Yes. The most challenging and interesting project I worked on was building a voice-based chatbot for the real estate domain. This was while I was working at CommonFloor.com.
We built a quick prototype to answer real estate queries using a voice-based assistant. We named it AIRA (Artificial Intelligent Real-estate Assistant).

You could just talk to her, and she would help you find a home and resolve your queries.
The biggest challenges in productionizing this were identifying the various location names in India and maintaining the context of the user’s query.
You can watch the demo of the app here:-
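As an aside on the two challenges named in that answer, recognizing location names and keeping context across queries, the toy sketch below shows the general shape of the problem. It is not how AIRA worked; the interview gives no implementation details. The location list, the "2BHK" bedroom pattern and the `Conversation` class are assumptions made up for illustration.

```python
# Hypothetical gazetteer of known localities; a real system would need
# a far larger list plus handling for spelling variants.
KNOWN_LOCATIONS = {"koramangala", "indiranagar", "whitefield"}

class Conversation:
    """Keeps the slots filled so far, so a follow-up query can reuse them."""

    def __init__(self):
        self.context = {}

    def update(self, utterance):
        """Extract any recognized slots and merge them into the running context."""
        words = utterance.lower().replace("?", " ").replace(",", " ").split()
        for word in words:
            if word in KNOWN_LOCATIONS:
                self.context["location"] = word
            elif word.endswith("bhk") and word[:-3].isdigit():
                # Assumed listing shorthand: "2bhk" means two bedrooms.
                self.context["bedrooms"] = int(word[:-3])
        return dict(self.context)

chat = Conversation()
print(chat.update("Show me 2BHK flats in Koramangala"))
print(chat.update("What about Whitefield?"))
```

The second query never mentions bedrooms, yet the returned context still carries that slot from the first query, which is the "maintaining context" part of the problem.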
GODSON: Can you explain scripting, scraping and automation, to save our readers from looking them up on Google?
SUMIT RAJ: Scripting: A script is just a simple program. It could be a single file of code that does various DevOps work or any specific task or assignment.
Scraping: Lots of companies around the world rely heavily on scraping data from the web.
It means getting data from websites that do not provide an API but have public data that anyone can access.
To use that data, you scrape it from the website, clean it and save it in your database so you can use it further without needing that particular website.
Automation: This is a cool thing that Python does fantastically. It means automating any mundane task, anything you feel a machine can do without human intervention.
There are various tools in Python to automate things like testing the UI or auto-scaling servers. Automation reduces product risk and cost.
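The scrape, clean and save steps described above can be sketched with the standard library alone. Here the HTML is an inline string standing in for a downloaded page, and the `listing` class name, the `ListingParser` class and the table layout are hypothetical. Real scrapers typically use dedicated libraries and must respect each site’s terms of use.

```python
import sqlite3
from html.parser import HTMLParser

# Stand-in for a page fetched from a website with no API.
PAGE = """
<ul>
  <li class="listing">  Flat in   Koramangala </li>
  <li class="listing">Villa in Whitefield</li>
  <li class="ad">Sponsored</li>
</ul>
"""

class ListingParser(HTMLParser):
    """Collects the text of every <li class="listing"> element."""

    def __init__(self):
        super().__init__()
        self.in_listing = False
        self.listings = []

    def handle_starttag(self, tag, attrs):
        if tag == "li" and ("class", "listing") in attrs:
            self.in_listing = True
            self.listings.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_listing = False

    def handle_data(self, data):
        if self.in_listing:
            self.listings[-1] += data

def clean(text):
    """Collapse the runs of whitespace left over from the HTML layout."""
    return " ".join(text.split())

# Scrape and clean.
parser = ListingParser()
parser.feed(PAGE)
rows = [(clean(text),) for text in parser.listings]

# Save to a local database so the data can be used without the website.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (title TEXT)")
conn.executemany("INSERT INTO listings VALUES (?)", rows)
print(conn.execute("SELECT title FROM listings").fetchall())
```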
GODSON: Do you use books to learn Python? And can you recommend the best you have seen so far?
SUMIT RAJ: Yes, I did read some books.
If you want to follow a book, I would suggest starting with “A Byte of Python” by Swaroop C H.
My favorite resource is a video course rather than a book, though: Google Python Classes by Nick Parlante. It’s pretty old but good to start with.
I would also recommend Codecademy’s Python classes to develop an interest in Python.
I have answered this kind of question multiple times on Quora with various links.
Feel free to go through them.

GODSON: So to get a good web development job in India, what skills does someone need?
Will they be limited if they only know Django, or must they be familiar with JavaScript frameworks and libraries like Angular or React?
Also, which database technology should they be familiar with?
SUMIT RAJ: India has tons of giant IT companies as well as hundreds of startups. They all use different technologies according to their needs, which results in all kinds of software jobs, be it in Java, .NET, RoR, Python, C, C++ and so on.
You just have to be an expert in one to get a good job in web development.
Yes, you will be limited if you know just one framework or language, because for web development you need to understand how the whole thing works.
Even if you are a back-end developer, it’s critical that you know how your APIs or web services will be consumed to build the web application. Any database is fine, but I recommend that people know at least one structured database (MySQL or Postgres) and one NoSQL database (MongoDB).
GODSON: Someone wants to know: how do you create chatbots? Which Python module should they use? And can you point them to helpful resources?
SUMIT RAJ: If you want to build a voice-based chatbot in relation to IoT, then check out AVS. Here are some of the resources and Python modules you may want to look at to build chatbots.

  • NLTK: The most popular module for NLP (Natural Language Processing) in Python. A lot of popular apps and startups use it to power their NLP apps. See the NLTK 3.0 documentation.
  • TextBlob: If you have a basic understanding of stemming, lemmatizing, POS tagging and Python, you can quickly develop NLP-based applications on top of TextBlob. TextBlob has very easy APIs to train the model as well. See “TextBlob: Simplified Text Processing”.
  • OpenNLP: One of the popular NLP libraries in the domain, from the Apache Software Foundation. It is the favorite library of Java users.

Other than these, while doing R&D I have found a lot of modules that use NLP and provide NLP APIs:

  • spaCy
  • “On the Recursive Neural Networks for Relation Extraction and Entity Recognition”: http://web.engr.illinois.edu/~khashab2/files/2013_RNN.pdf

GODSON: When it comes to cyber security, does Python have any features for that?
SUMIT RAJ: I am not a cyber security expert, but yes, Python rocks when it comes to security.

I don’t think it has anything built in, but the security community writes a lot of open source modules in Python that help with various security needs such as pen testing, malware analysis, reverse engineering and exploitation techniques.

GODSON: Do you have any mentor in Python, and can you briefly describe him or her?
SUMIT RAJ: Not really. I would always suggest that you be your own mentor. Read everything, listen to everyone. Assume everyone knows at least one thing better than you, and that you have to learn that thing from them; this is what I do. Nobody can advise you better than yourself. Always do what you love. I don’t believe in loving what you do; what I mean is, let the thing attract you, and don’t push yourself just because everyone else likes it.
I can’t really name anyone, but I am very thankful to the Python community in India for relentlessly organizing free workshops and meetups and low-cost conferences, which helped me as a newcomer learn Python very easily. You can always reach out to anyone in the community, and they will always respond.
GODSON: Can Python be applied to Information Science Engineering, and how?
SUMIT RAJ: It already is. It’s all semantic web now, and a huge volume of information is being generated every second. Python’s use in machine learning, data science, data mining, mathematics and physics for new innovations are examples of it being applied already.
GODSON: How do you handle bugs in your programs? Do you have any project that gave you a tough time because of a bug, and can you narrate the experience?
SUMIT RAJ: Bugs are part of any software. My approach is always to build a quick and dirty model and improve on top of that. I tend to write code in modules that are further classified by functionality, which makes bug fixing easier after the first cut of the software.
Yes, there is quite an interesting one. It was our final-year project in college, and we were building a sentiment analysis tool in Python using the Twitter APIs. Everything was good. The application was working and ready for the demo a full week ahead of time.
The night before the final demo, our application stopped working, and my teammates were breathing down my neck to get it fixed because I was the most familiar with Python. We tried the whole night but couldn’t figure out what the problem was until 4:30 AM. Around 5:00 AM we found an article on the web saying that Twitter had deprecated their old API (the one we were using) and that the way of consuming the free API had changed just a couple of days earlier. We did everything on our own and had nobody to reach out to for help. The new APIs were a little finicky and took time to get hold of. We had just a couple of hours left before the demo, but after grappling with the issue the whole night without any sleep, we finally fixed the bug and successfully did the demo in the morning.
GODSON: So when developing a web application, do you start from the front-end and move to the back-end, or vice versa? Which pattern do you follow, and why?
SUMIT RAJ: Most development starts in parallel, with at least two developers working: one on the front-end and one on the back-end.
Even if a full stack developer is working on both, it’s not a matter of which starts first but of the specification, which should be defined intuitively. You can start from the front-end, from the back-end or from both in parallel, but you should know what details you need to show in the front-end so that the back-end developer can work on that while the front-end developer is not blocked either.
The front-end developer can still work with stub data to build things according to the specs.
The same goes for the back-end developer: they should know the format of the data the front-end is expecting and should come up with APIs or web services that return dynamic data in that same format, to be consumed as the specs define.
GODSON: Sumit Raj, it is nice having you on my blog. Thanks for sharing these great resources and knowledge with the Python community. Any last words for my esteemed readers? And do you mind telling them how they can contact you?
SUMIT RAJ: Sure, thanks a lot for having me. One thing I would convey is: don’t take shortcuts while learning if you really want to learn and benefit from something. Always avoid negativity and seek positivity everywhere. I will be very happy to help any of your blog readers. They can send me an email or ask me a question directly on Quora. For more about me, drop in here: http://sumitraj.in

More Photos of Sumit Raj.

Sumit Raj talks on Natural Language Processing at the Indian Institute of Science, Bangalore, one of the premier graduate schools of India.

Sumit Raj at PyCon India 2014 in Bangalore.

If this interview was informative, please kindly share. Feel free to leave a comment.

THANKS.
