QCon New York June 15-19, 2020 | Hands-On Feature Engineering for Natural Language Processing

This presentation is now available to view on InfoQ.com

Abstract

Think of Grammarly, Autotext and Alexa: so many software applications are full of natural language that the opportunities are endless. Recent advances in NLP such as Word2vec, GloVe, ELMo and BERT are easily accessible through open source Python libraries. There is no better time for software engineers to develop NLP applications.

Feature engineering is the secret sauce for creating robust NLP models, because features are the input parameters for NLP algorithms, and these algorithms generate their output based on the input features.

The aim of this talk is to share various NLP feature engineering techniques, from Bag-of-Words to TF-IDF to word embeddings, covering feature engineering for ML models as well as for emerging deep learning approaches.
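To make the first two techniques concrete, here is a minimal sketch of Bag-of-Words counts reweighted with TF-IDF. It is not taken from the talk: it uses only the Python standard library, the three toy documents are invented for illustration, the helper names (`tokenize`, `bag_of_words`, `tf_idf`) are hypothetical, and the weighting shown (term frequency times `log(N / df)`) is just one common variant among several.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def bag_of_words(docs):
    """Return (vocabulary, count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors

def tf_idf(vectors):
    """Reweight counts: tf = count / document length, idf = log(N / df)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Document frequency: in how many documents does each term appear?
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    weighted = []
    for vec in vectors:
        length = sum(vec)
        weighted.append([
            (vec[j] / length) * math.log(n_docs / df[j]) if length else 0.0
            for j in range(n_terms)
        ])
    return weighted

# Toy corpus, invented for illustration only.
docs = ["the cat sat", "the dog sat", "the cat ran"]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)

for doc, row in zip(docs, weights):
    print(doc, [round(w, 3) for w in row])
```

In this variant a word that appears in every document, such as "the" here, gets a weight of zero, while a word unique to one document gets the largest weight. Library implementations typically add smoothing and normalization, so their numbers will differ from this sketch.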

The talk will cover the end-to-end details, including contextual and linguistic feature extraction, vectorization, n-grams, topic modeling and named entity resolution, all of which are based on concepts from mathematics, information retrieval and natural language processing. We will also dive into more advanced feature engineering strategies such as word2vec, GloVe and fastText that leverage deep learning models.

In addition, attendees will learn how to combine NLP features with numeric and categorical features and analyze the feature importance from the resulting models.
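One simple way to picture this combination is to concatenate the columns from each feature type into a single row per record. The sketch below is an assumption about the general idea, not the speaker's method: the records, field names (`review`, `price`, `channel`) and helper functions are all hypothetical, and it uses plain word counts, min-max scaling and one-hot encoding from the standard library rather than the libraries named in the talk.

```python
from collections import Counter

# Hypothetical records: one text, one numeric and one categorical field.
records = [
    {"review": "great fast shipping", "price": 10.0, "channel": "web"},
    {"review": "slow shipping", "price": 30.0, "channel": "store"},
    {"review": "great product", "price": 20.0, "channel": "web"},
]

def text_counts(texts):
    """Bag-of-words counts over a shared, sorted vocabulary."""
    vocab = sorted({tok for t in texts for tok in t.lower().split()})
    rows = []
    for t in texts:
        counts = Counter(t.lower().split())
        rows.append([float(counts[term]) for term in vocab])
    return vocab, rows

def min_max_scale(values):
    """Rescale numbers to [0, 1] so they are comparable with the counts."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

def one_hot(values):
    """One column per category, 1.0 where the record has that category."""
    categories = sorted(set(values))
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows

vocab, text_rows = text_counts([r["review"] for r in records])
prices = min_max_scale([r["price"] for r in records])
categories, channel_rows = one_hot([r["channel"] for r in records])

# Keep a name for every column so that model importances can be traced back.
feature_names = (
    [f"word={w}" for w in vocab]
    + ["price_scaled"]
    + [f"channel={c}" for c in categories]
)
matrix = [t + [p] + c for t, p, c in zip(text_rows, prices, channel_rows)]

for name, value in zip(feature_names, matrix[0]):
    print(name, value)
```

Keeping `feature_names` aligned with the matrix columns is what makes a later importance analysis readable: whatever score a model assigns to column `j` can be reported against `feature_names[j]`.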

The following Python libraries will be used to demonstrate these feature engineering techniques: spaCy, Gensim, fastText and Keras.

Speaker: Susan Li

Sr Data Scientist at Kognitiv Corporation

I am Susan Li, the Sr. Data Scientist at Kognitiv, where I specialize in machine learning and NLP. I’m passionate about helping organizations realize the potential of big data and advanced analytics, and about helping individuals enhance their data literacy skills. I frequently write and speak about predictive analytics, machine learning and NLP for technical and general audiences. In my free time, I can be found training for the next half marathon.

Find Susan Li at

Speaker page

Medium

https://www.linkedin.com/in/susanli/

Track: Machine Learning for Developers

Location: Soho Complex, 7th fl.

Duration: 4:10pm - 5:00pm

Day of week:

Slides: Download Slides

Similar Talks

Front End Architecture in a World of AI

Thijs Bernolet

Machine-Learned Indexes - Research from Google

Alex Beutel

Using AI to Optimize SQL Query Plans and Performance

Kirk Lewis

Tracks

Non-Technical Skills for Technical Folks

Trust, Safety, & Security

Data Engineering for the Bold

Machine Learning for Developers

Software Defined Infrastructure: Kubernetes, Service Meshes, & Beyond

21st Century Languages

High-Performance Computing: Lessons from FinTech & AdTech

Modern Java Innovations

Architecting for Success when Failure is Guaranteed

Building High-Performing Teams

Human Systems: Hacking the Org

Modern CS in the Real World

Developing/Optimizing Clients for Developers

Microservices / Serverless (Patterns & Practices)

Architectures You've Always Wondered About
