A lot happened even before the perceptron existed. In 1943, Warren McCulloch and Walter Pitts proposed a simple mathematical model of the biological neuron.
The above diagram is a rough representation of a biological neuron. A neuron has a cell body that receives input signals through tree-like branches called dendrites. The cell body processes those inputs and, based on the result, passes an electrical signal along a long, stretched fiber called the axon. The signal then reaches the branching ends, called axon terminals, which connect to the dendrites of neighboring neurons. This is how neurons are interconnected with one another to form a biological neural network.
McCulloch and Pitts applied the same logic and invented an artificial neuron, now known as the MCP neuron. This neuron can perform digital computations with the help of boolean operators like AND, OR, and NOT.
I will demonstrate the concept of the MCP neuron using the above diagram. The MCP neuron works only with binary values, so x1 and x2 can each be either 0 or 1.
Mathematically, x1, x2 ∈ {0, 1}
To compute y, we sum x1 and x2 and check whether the sum reaches the threshold θ:
if (x1 + x2) ≥ θ, then y = 1
if (x1 + x2) < θ, then y = 0
What if we have more than two input variables? Then we define g(x) as the function that sums up all the inputs.
Mathematically, we can represent it as
g(x) = x1 + x2 + x3 + … + xn = Σ xi
then,
y = f(g(x)) =
1, if g(x) ≥ θ
0, if g(x) < θ
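To make this concrete, here is a minimal Python sketch of this thresholding rule (the function name mcp_output is my own, used just for illustration):

```python
def mcp_output(inputs, theta):
    """MCP neuron: fire (1) if the sum of the binary inputs reaches the threshold."""
    g = sum(inputs)              # g(x) = x1 + x2 + ... + xn
    return 1 if g >= theta else 0

print(mcp_output([1, 1], theta=2))  # 2 >= 2, so y = 1
print(mcp_output([1, 0], theta=2))  # 1 < 2, so y = 0
```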
With one artificial neuron, we can design basic boolean logic gates such as AND and OR. In an AND gate, both x1 and x2 need to be 1 to get y = 1; otherwise y is 0. In an OR gate, either x1 or x2 being 1 is enough to get y = 1; otherwise y is 0. Both gates appear in the code sketch after the worked example below.
Imagine a small neuron has
x1 = 1, x2 = 1 → input variables
w1 = 1, w2 = 1 → weights based on how important they are
θ = 2 → threshold
g(x) = x1·w1 + x2·w2
= 1(1) + 1(1)
= 2 ≥ θ, so y = 1
This is exactly the AND gate: the sum reaches the threshold only when both inputs are 1.
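Here is a quick sketch of the weighted version (again with hypothetical names); notice that with θ = 2 it behaves as an AND gate, and with θ = 1 as an OR gate:

```python
def weighted_mcp(inputs, weights, theta):
    """Weighted MCP neuron: fire (1) if g(x) = sum of w_i * x_i reaches theta."""
    g = sum(w * x for x, w in zip(inputs, weights))
    return 1 if g >= theta else 0

# AND gate: both inputs must be 1 for g(x) to reach theta = 2
print(weighted_mcp([1, 1], [1, 1], theta=2))  # y = 1
print(weighted_mcp([1, 0], [1, 1], theta=2))  # y = 0

# OR gate: a single active input already reaches theta = 1
print(weighted_mcp([0, 1], [1, 1], theta=1))  # y = 1
print(weighted_mcp([0, 0], [1, 1], theta=1))  # y = 0
```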
Let's consider two neurons, A and B:
Neuron A:
x1 = 1, x2 = 1, x3 =1
w1 = 1 (excitatory), w2 = 1 (excitatory), w3 = -2 (inhibitory),
θ = 2
g(x) = 1(1) + 1(1) + 1(-2)
= 0 < θ
then y = 0
This means that even though the two inputs x1 and x2 say 'yes', x3 says 'no', so ultimately the neuron's output is 0. This is how a strong inhibitory input can veto the excitatory ones.
Neuron B:
Neuron A’s output will be fed as the input for Neuron B.
=> x4 = 0 (since the output for neuron A is 0)
=> x5 = 1 (new input)
=> w4 = 1 (excitatory)
=> w5 = 1 (excitatory)
θ = 1
g(x) = x4·w4 + x5·w5 = 0(1) + 1(1) = 1 ≥ θ, so y = 1
Neuron B fires even though Neuron A did not, because its fresh input x5 alone reaches the lower threshold.
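In code, the two-neuron chain looks like this (reusing the hypothetical weighted_mcp function from the previous sketch):

```python
# Neuron A: x3 is inhibitory (w3 = -2), so it vetoes the two excitatory inputs
y_a = weighted_mcp([1, 1, 1], [1, 1, -2], theta=2)  # g(x) = 0 < 2, so y_a = 0

# Neuron B: takes Neuron A's output as x4, plus a fresh input x5 = 1
y_b = weighted_mcp([y_a, 1], [1, 1], theta=1)       # g(x) = 1 >= 1, so y_b = 1
print(y_a, y_b)  # 0 1
```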
It is a remarkable mathematical result: McCulloch and Pitts showed that networks of such neurons can, in principle, implement any boolean function.
The weights and θ (threshold) are adjusted manually by engineers; the neuron cannot adjust its weights itself. If the configuration you pass in is correct, it computes perfectly, but it cannot learn anything from the data on its own. It is a calculator, not a thinker: if we have predefined logic, we can use it for computation, but it is useless for finding patterns by itself.
In 1958, Frank Rosenblatt's perceptron brought a revolutionary change in this respect.
When data is fed to the perceptron, it automatically adjusts its weights just by looking at the data. The weights come to remember patterns, e.g., high horsepower tends to mean a fast car. In the MCP neuron, θ is fixed by hand, but the perceptron learns its own weights and effective threshold from the data, which makes it act a little more like a brain.
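To illustrate the difference (this is a minimal sketch of the classic perceptron learning rule, not Rosenblatt's original setup), here is a perceptron that learns the AND gate from its truth table; the bias b plays the role of -θ:

```python
def train_perceptron(samples, epochs=10, lr=1.0):
    """Learn weights and a bias (acting as -theta) from labeled examples."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            y = 1 if (w[0] * x1 + w[1] * x2 + b) >= 0 else 0
            error = target - y
            # Perceptron rule: nudge weights and bias toward the correct answer
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b

# AND-gate truth table as training data
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(data)
print(w, b)  # e.g. weights [2.0, 1.0] and bias -3.0, which separate AND correctly
```

Unlike the MCP neuron, nothing here was set by hand: the weights and the effective threshold emerge from the data itself.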