A summary of a technical paper attempting to mitigate gender bias in word embeddings

Photo by Dainis Graveris on Unsplash

Word embeddings are the bread and butter of Natural Language Processing. But are they free from inherent bias? Imagine googling “cool programmer t-shirts” and Google responding with only male-form tees. With BERT, which has shown signs of gender bias, being incorporated into Google Search, one doesn’t need to imagine. Keeping the “intelligence” that defines AI inherently bias-free is imperative as these systems see ever-increasing use in our daily lives.


Summary of an unconventional approach towards pre-trained language models to determine if they can function as general-purpose decoders

These days BERT, ELMo and ERNIE remind one of pre-trained language models rather than Sesame Street characters, such has been their dominance of the Natural Language Processing landscape. These models can serve as general-purpose encoders, and can even perform some tasks, like text classification, without further modification. However, limited research has been conducted on the reverse case: exploiting these models as general-purpose decoders. This article summarizes a paper by researchers at New York University that tries to ascertain exactly this, namely whether these models can recover an arbitrary sentence from its encoded representation.

Abstract

To show that encoded representations capable of recovering a sentence exist, the paper introduces methods for feeding these representations into a recurrent language model trained autoregressively, and for mapping sentences into and out of this “reparametrized” space, while keeping the main language model parameters frozen. …


An insight into the state-of-the-art ranking systems that can be used for Information Retrieval.

Ranking documents is an essential task in information retrieval. (Photo by Florian Schmetz on Unsplash)

Machine Learning and Artificial Intelligence are currently driving innovation in Computer Science, and they are being applied to a multitude of fields across disciplines. However, traditional ML models can still be broadly categorized as solutions to two types of problems.

  1. Classification: assigning a particular instance of data to one of several buckets, depending on its features.
  2. Regression: producing a continuous real number as the output for a given feature set.

A relatively less explored application of Machine Learning is ordering data by relevance, which is useful in Information Retrieval systems such as search engines. Models of this type focus on the relative ordering of items rather than on an individual label (classification) or score (regression), and are categorized as Learning To Rank models. …
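To make the point about relative ordering concrete, here is a minimal sketch in Python. It is not taken from the article; the function name `pairwise_accuracy` and the toy relevance and score lists are assumptions made purely for illustration. It measures the fraction of item pairs that a set of scores places in the correct order, which shows why two models with very different absolute scores can be equally good rankers.

```python
from itertools import combinations


def pairwise_accuracy(relevance, scores):
    """Fraction of item pairs (with different relevance) that the scores order correctly.

    Hypothetical helper for illustration only.
    """
    correct = 0
    total = 0
    for i, j in combinations(range(len(relevance)), 2):
        if relevance[i] == relevance[j]:
            continue  # equally relevant items carry no ordering information
        total += 1
        # Correct when the more relevant item also receives the higher score.
        if (relevance[i] - relevance[j]) * (scores[i] - scores[j]) > 0:
            correct += 1
    return correct / total if total else 1.0


# Toy data (assumed): graded relevance of four documents for one query.
relevance = [3, 2, 1, 0]
scores_a = [0.9, 0.7, 0.4, 0.1]    # one model's scores
scores_b = [90.0, 7.0, 0.4, -5.0]  # very different values, same ordering
scores_c = [0.1, 0.7, 0.4, 0.9]    # best and worst documents swapped

print(pairwise_accuracy(relevance, scores_a))
print(pairwise_accuracy(relevance, scores_b))
print(pairwise_accuracy(relevance, scores_c))
```

`scores_a` and `scores_b` would differ sharply under a regression metric, yet they rank the documents identically and receive the same pairwise accuracy. `scores_c` keeps similar magnitudes but misorders most pairs, so it is the worse ranker.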

About

Devansh Goenka

Full-time Software Engineer, Machine Learning enthusiast and foodie.
