Sanjay Abraham

“Cell states” in Long Short-Term Memory (LSTM) – Artificial Neural Networks functioning more like the human brain

One of the shortfalls of the Recurrent Neural Network (RNN) is building models for problems with long-term dependencies.

RNNs tend to forget information, references, and context, which makes them unsuitable for such problems.

RNNs are good at handling sequential data, but they run into problems when the relevant context is ‘far away’.

Example:

‘I have lived in France for the last 20 years and I know ____.’ The answer here must be ‘French’.

But if some more words come in between, ‘I have lived in France for the last 20 years and have worked as a consultant for the last 10 years, and I know ____’, it becomes difficult for an RNN to predict ‘French’, as the relevant context gets lost among the intervening words.

The vanishing gradient problem in RNNs is that gradients shrink as they are propagated back through many time steps, so the influence of early inputs fades and the network struggles to learn long-range dependencies. Unlike an RNN, which remembers or forgets information in bulk, an LSTM does so selectively, using a mechanism called the “cell state”.
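A minimal sketch of why the gradient vanishes, assuming a toy scalar RNN with a recurrent weight of 0.9 (the numbers are illustrative only): back-propagating through t time steps multiplies the gradient by that weight t times, so it shrinks exponentially.

```python
# Toy illustration: in a scalar RNN, the gradient reaching an input
# t steps in the past is scaled by roughly w**t (w = recurrent weight).
# With |w| < 1, the signal from distant inputs shrinks exponentially.
w = 0.9          # assumed scalar recurrent weight, illustrative only
grad = 1.0
for t in range(1, 51):
    grad *= w    # one backward step through time
    if t in (1, 10, 25, 50):
        print(f"gradient after {t:2d} steps: {grad:.2e}")
# gradient after  1 steps: 9.00e-01
# gradient after 10 steps: 3.49e-01
# gradient after 25 steps: 7.18e-02
# gradient after 50 steps: 5.15e-03
```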

Sequence prediction problems can be really complex at times, and an LSTM can selectively remember patterns over much longer stretches of input.

The cell state works like a ‘conveyor belt’ running along the sequence: information can be added, deleted, or modified as it moves from one time step to the next. Because the cell state is updated additively through gates, rather than by repeated multiplication, this largely mitigates the vanishing gradient problem.
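A minimal NumPy sketch of one LSTM time step, showing the conveyor belt in code (the weights are random and the gate names are my own labels, not from any particular library):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step: the cell state c is the 'conveyor belt'."""
    z = W @ np.concatenate([h_prev, x]) + b
    n = h_prev.size
    f = sigmoid(z[0 * n:1 * n])   # forget gate: what to delete from the belt
    i = sigmoid(z[1 * n:2 * n])   # input gate: what to add to the belt
    g = np.tanh(z[2 * n:3 * n])   # candidate values to add
    o = sigmoid(z[3 * n:4 * n])   # output gate: what to read off the belt
    c = f * c_prev + i * g        # additive update of the cell state
    h = o * np.tanh(c)            # hidden state passed to the next step
    return h, c

# Toy dimensions and random weights, just to show the data flow.
rng = np.random.default_rng(0)
n_hidden, n_input = 4, 3
W = rng.standard_normal((4 * n_hidden, n_hidden + n_input)) * 0.1
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x in rng.standard_normal((5, n_input)):   # a sequence of 5 inputs
    h, c = lstm_step(x, h, c, W, b)
print("final cell state:", c)
```

The key line is the cell-state update: the forget gate decides what fraction of the old state survives, and the input gate decides what new information is written. That is exactly the selective remembering and forgetting described above.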

The LSTM is so far the most successful RNN variant for solving complex problems. What makes it a successful artificial neural network topology is that, a little like the human brain, it selectively remembers or forgets information based on context.

Because it works somewhat like the human brain, the biggest advantage of the LSTM is that it can augment design with rapid prototyping and testing: the design of products, materials, structures, and more.

It could help turn the toxic into the life-saving (discovering new drugs with better precision to improve prognosis). It could help design food (with higher nutritional value, that tastes better, costs less, and stays fresh for longer).

It could help turn the linear into the circular (designing out waste and pollution through better material and product design and business models). It could help turn school dropouts into learners (designing community-centric curricula and pedagogy). The use cases and applications are many: it could even correct design flaws in our existing materials, products, business models, and more.

It has also proven very effective in areas like natural language processing (NLP), where it trains on sequences of data to predict the most suitable output.
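As an illustration, here is a minimal PyTorch sketch of an LSTM trained to predict the next token in a sequence (the vocabulary size, model dimensions, and the `NextTokenLSTM` name are made up for this example):

```python
import torch
import torch.nn as nn

class NextTokenLSTM(nn.Module):
    """Predicts the next token at every position of a sequence."""
    def __init__(self, vocab_size=1000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):                 # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embed(tokens))   # h: (batch, seq_len, hidden)
        return self.head(h)                    # logits over the vocabulary

model = NextTokenLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Dummy batch of token IDs; each position is trained to predict the next one.
tokens = torch.randint(0, 1000, (8, 20))       # (batch=8, seq_len=20)
logits = model(tokens[:, :-1])
loss = loss_fn(logits.reshape(-1, 1000), tokens[:, 1:].reshape(-1))
loss.backward()
optimizer.step()
print("training loss:", loss.item())
```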

For product design, replace that dataset of millions of phrases and sentences with a database of millions of characteristics and formulae of chemicals, metals, polymers, and so on, and the LSTM can predict the most suitable alloy or material. Replace it with millions of circuit designs, and it can predict the most suitable architecture.

The possibilities are endless.
