In statistics, there are various types of tests for validating a single group or multiple groups. Here a group means a feature in the data set, and these features can be either categorical or numerical.

With the help of the T-test and the Chi-square test, we can conclude that if…

Deep learning is a subset of machine learning in artificial intelligence (AI) that uses multi-layer neural networks capable of learning from data, including unsupervised learning from data that is unstructured or unlabeled. It is also known as deep neural learning or deep neural networks.

Categories in Deep Learning:

Deep learning can be broadly divided into three major categories.

In the previously discussed Bag of Words (BOW) and TF-IDF approaches, semantic information is not stored. BOW gives equal weight to every word in the corpus, whereas TF-IDF gives more importance to uncommon words.

Semantic means that the order of the words in a sentence and the relations between them are important. Like…
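The difference between the two weightings described above can be sketched in plain Python. This is a minimal illustration, assuming a toy three-sentence corpus and the common tf × log(N/df) weighting; other TF-IDF variants exist, and the corpus and function names are hypothetical, not taken from the original articles.

```python
import math
from collections import Counter

# Toy corpus (an assumption, for illustration only).
corpus = [
    "the food was good",
    "the service was bad",
    "the food was not good",
]
docs = [sentence.split() for sentence in corpus]
vocab = sorted({word for doc in docs for word in doc})

def bow_vector(doc):
    """Raw term counts: every occurrence of every word counts equally."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]

def tfidf_vector(doc):
    """tf * log(N / df): words found in fewer documents get more weight."""
    counts = Counter(doc)
    n_docs = len(docs)
    vector = []
    for word in vocab:
        tf = counts[word] / len(doc)             # term frequency in this document
        df = sum(1 for d in docs if word in d)   # number of documents containing the word
        vector.append(tf * math.log(n_docs / df))
    return vector

bow = dict(zip(vocab, bow_vector(docs[2])))
tfidf = dict(zip(vocab, tfidf_vector(docs[2])))
print(bow)
print(tfidf)
```

In this sketch the word "the" appears in every sentence, so its TF-IDF weight drops to zero, while "not" appears in only one sentence and gets the largest weight. Neither vector records word order, which is the semantic information that is lost.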

As we know from my previous article on Bag of Words, BOW converts sentences into vectors of words, encoding each word as either 0 (when the word is not in the sentence) or 1 (when it is).

Bag of Words just creates a set of vectors containing the count…
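The 0/1 encoding described above can be sketched in a few lines of plain Python. This is only an illustration under simple assumptions: whitespace tokenization, lowercasing, and a made-up set of sentences. The function name `binary_bow` is hypothetical.

```python
def binary_bow(sentences):
    """Return the vocabulary and one 0/1 vector per sentence."""
    tokenized = [sentence.lower().split() for sentence in sentences]
    # The vocabulary is every distinct word in the corpus, sorted for a stable order.
    vocab = sorted({word for tokens in tokenized for word in tokens})
    # 1 if the word occurs in the sentence, 0 otherwise.
    vectors = [[1 if word in tokens else 0 for word in vocab] for tokens in tokenized]
    return vocab, vectors

vocab, vectors = binary_bow(["good food", "bad food", "good good service"])
print(vocab)
for vector in vectors:
    print(vector)
```

Replacing the `1 if word in tokens else 0` expression with `tokens.count(word)` would give the count-based variant instead of the binary one.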

When dealing with a corpus, we come across many words that we use in Natural Language Processing (NLP) applications to get meaningful insights. To do that, we need to convert those words into something a model can understand. Here we are going to discuss something known as “Bag of Words”.

Natural language is language that humans can read, such as text and messages. Processing such language by machine for use in different applications is called Natural Language Processing, or NLP.

Some practical examples of NLP are sentiment analysis, analyzing restaurant reviews, and Google/Alexa voice search, which converts speech into text…

Cosine similarity is used to determine the similarity between documents or vectors. Mathematically, it measures the cosine of the angle between two vectors projected in a multi-dimensional space. There …
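As a minimal sketch of that definition, the cosine of the angle between two vectors is their dot product divided by the product of their lengths. The function name and the example vectors below are assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

print(cosine_similarity([1, 2, 3], [2, 4, 6]))  # same direction
print(cosine_similarity([1, 0], [0, 1]))        # perpendicular
```

Vectors pointing in the same direction score close to 1 regardless of their length, and perpendicular vectors score 0, which is why the measure suits document vectors of very different sizes.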

The ANOVA test (specifically one-way ANOVA here) is used in statistics to determine whether there are any statistically significant differences between the means of three or more independent (unrelated) groups, and to draw a conclusion from that result.
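To show what the test actually computes, here is a small sketch of the one-way ANOVA F statistic in plain Python: the variance between group means divided by the variance within groups. The function name and the three sample groups are made up for illustration, and the sketch stops at the F statistic without computing a p-value.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a list of groups."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n_total
    means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

f_stat, df_between, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_between, df_within)
```

A large F value suggests that at least one group mean differs from the others. Turning F into a p-value requires the F distribution; in practice a library routine such as `scipy.stats.f_oneway` reports both.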

There are various types of distributions used in machine learning to describe how data is distributed in a population or in a sample of a data set. In machine learning they are used to visualize the data distribution and to detect outliers in a data set.

Topics covered

Gaussian Distribution

Z-Distribution

T-Distribution

Normal Distribution or Gaussian Distribution or Bell Curve: