Presentation: Hands-On Feature Engineering for Natural Language Processing
This presentation is now available to view on InfoQ.com
Abstract
Think of Grammarly, Autotext and Alexa: many software applications are full of natural language, and the opportunities are endless. The latest advances in NLP, such as Word2vec, GloVe, ELMo and BERT, are easily accessible through open source Python libraries. There is no better time for software engineers to develop NLP applications.
Feature engineering is the secret sauce for creating robust NLP models, because features are the input parameters for NLP algorithms, which generate their output based on those input features.
The aim of this talk is to share various NLP feature engineering techniques, from Bag-of-Words to TF-IDF to word embeddings, covering feature engineering for ML models as well as for emerging deep learning approaches.
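As a rough illustration of the first two techniques named above, the sketch below builds bag-of-words counts and TF-IDF weights using only the Python standard library. It is not taken from the talk: the whitespace tokenizer, the toy documents, the function names and the unsmoothed `tf * log(N / df)` weighting are all assumptions chosen for brevity, and library implementations typically add smoothing and normalization.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive lowercase whitespace tokenizer (assumption; a real pipeline
    # would likely use a proper tokenizer such as spaCy's).
    return text.lower().split()

def bag_of_words(docs):
    # One Counter of raw term counts per document.
    return [Counter(tokenize(d)) for d in docs]

def tf_idf(docs):
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        # Term frequency times unsmoothed inverse document frequency.
        vectors.append({t: (n / total) * math.log(n_docs / df[t])
                        for t, n in c.items()})
    return vectors

# Hypothetical toy corpus.
docs = ["the cat sat", "the dog sat", "the cat ran"]
bow = bag_of_words(docs)
weights = tf_idf(docs)
```

In this toy corpus a term that occurs in every document, such as "the", receives a weight of zero, while a term unique to one document, such as "dog", is weighted highest. That is the intuition behind preferring TF-IDF over raw counts.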
The talk will cover the end-to-end details, including contextual and linguistic feature extraction, vectorization, n-grams, topic modeling and named entity resolution, which are based on concepts from mathematics, information retrieval and natural language processing. We will also dive into more advanced feature engineering strategies, such as word2vec, GloVe and fastText, that leverage deep learning models.
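For readers unfamiliar with n-grams, here is a minimal sketch of how they can be extracted from a token list. The helper name `ngrams` and the sample sentence are hypothetical and not drawn from the talk.

```python
def ngrams(tokens, n):
    # Slide a window of size n over the tokens; yields nothing if the
    # sequence is shorter than n.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical example sentence, split on whitespace.
tokens = "natural language processing is fun".split()
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

Each n-gram can then be counted like a single token, which lets a bag-of-words style model capture some local word order.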
In addition, attendees will learn how to combine NLP features with numeric and categorical features and analyze the feature importance from the resulting models.
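One simple way to combine the three kinds of features is to concatenate them into a single vector per record, as sketched below. The record fields (`text`, `length`, `channel`), the helper names and the use of raw counts for the text part are assumptions made for illustration only. Analyzing feature importance would additionally require fitting a model, which is not shown here.

```python
from collections import Counter

def build_vocab(texts):
    # Sorted vocabulary so that column order is deterministic.
    return sorted({tok for t in texts for tok in t.lower().split()})

def one_hot(value, categories):
    # 1.0 in the position of the matching category, 0.0 elsewhere.
    return [1.0 if value == c else 0.0 for c in categories]

def combine(record, vocab, categories):
    # Text counts + one numeric feature + one-hot categorical feature.
    counts = Counter(record["text"].lower().split())
    text_part = [float(counts[t]) for t in vocab]
    return text_part + [float(record["length"])] + one_hot(record["channel"], categories)

# Hypothetical records mixing text, numeric and categorical fields.
records = [
    {"text": "great product", "length": 13, "channel": "web"},
    {"text": "bad product", "length": 11, "channel": "email"},
]
vocab = build_vocab([r["text"] for r in records])
categories = sorted({r["channel"] for r in records})
matrix = [combine(r, vocab, categories) for r in records]
```

Keeping the vocabulary and category lists sorted means every column of `matrix` can be traced back to a named feature, which is what later makes importance scores interpretable.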
The following Python libraries will be used to demonstrate the aforementioned feature engineering techniques: spaCy, Gensim, fastText and Keras.