Adonis Diaries


Neural Network? Sciences or networking? 

I took a couple of graduate courses on neural networks in 1989, when the field was just beginning, covering how the networks are modeled and how experiments are run and interpreted using this psychology-inspired computer learning algorithm.

What Does a Neural Network Actually Do?

There has been a lot of renewed interest lately in neural networks (NNs) due to their popularity as a model for deep learning architectures (there are non-NN based deep learning approaches based on sum-products networks and support vector machines with deep kernels, among others).

Perhaps due to their loose analogy with biological brains, the behavior of neural networks has acquired an almost mystical status. This is compounded by the fact that theoretical analysis of multilayer perceptrons (one of the most common architectures) remains very limited, although the situation is gradually improving.

To gain an intuitive understanding of what a learning algorithm does, I usually like to think about its representational power, as this provides insight into what can, if not necessarily what does, happen inside the algorithm to solve a given problem.

I will do this here for the case of multilayer perceptrons. By the end of this informal discussion I hope to provide an intuitive picture of the surprisingly simple representations that NNs encode.

I should note at the outset that what I will describe applies only to a very limited subset of neural networks, namely the feedforward architecture known as a multilayer perceptron.

There are many other architectures that are capable of very different representations. Furthermore, I will be making certain simplifying assumptions that do not generally hold even for multilayer perceptrons. I find that these assumptions help to substantially simplify the discussion while still capturing the underlying essence of what this type of neural network does. I will try to be explicit about everything.

Let’s begin with the simplest configuration possible: two input nodes wired to a single output node. Our NN looks like this:

Figure 1

The label associated with a node denotes its output value, and the label associated with an edge denotes its weight. The topmost node h represents the output of this NN, which is:

h = f\left(w_1 x_1+w_2 x_2+b\right)

In other words, the NN computes a linear combination of the two inputs x_1 and x_2, weighted by w_1 and w_2 respectively, adds an arbitrary bias term b and then passes the result through a function f, known as the activation function.

There are a number of different activation functions in common use and they all typically exhibit a nonlinearity. The sigmoid activation f(a)=\frac{1}{1+e^{-a}}, plotted below, is a common example.

Figure 2
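The formula for h and the sigmoid activation can be sketched in a few lines of Python. This is only an illustration: the function names and the particular weights and bias are assumptions, not values from the figures.

```python
import math


def sigmoid(a: float) -> float:
    """Sigmoid activation f(a) = 1 / (1 + e^(-a))."""
    return 1.0 / (1.0 + math.exp(-a))


def neuron(x1: float, x2: float, w1: float, w2: float, b: float, f=sigmoid) -> float:
    """Single output node: h = f(w1*x1 + w2*x2 + b)."""
    return f(w1 * x1 + w2 * x2 + b)


# Hypothetical weights and bias, chosen only for illustration.
h = neuron(x1=1.0, x2=2.0, w1=0.5, w2=-0.25, b=0.0)
print(h)  # the weighted sum is 0.5 - 0.5 + 0 = 0, and sigmoid(0) = 0.5
```

Swapping a different function in for f changes the activation without touching the weighted sum.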

As we shall see momentarily, the nonlinearity of an activation function is what enables neural networks to represent complicated input-output mappings.

The linear regime of an activation function can also be exploited by a neural network, but for the sake of simplifying our discussion, we will choose an activation function without a linear regime. In other words, f will be a simple step function:

Figure 3

This will allow us to reason about the salient features of a neural network without getting bogged down in the details.

In particular, let’s consider what our current neural network is capable of. The output node can generate one of two values, and this is determined by a linear weighting of the values of the input nodes. Such a function is a binary linear classifier.

As shown below, depending on the values of w_1 and w_2, one regime in this two-dimensional input space yields a response of 0 (white) and the other a response of 1 (shaded):

Figure 4
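As a rough sketch, the same single node with a step activation behaves as a half-plane test. The boundary x_1 + x_2 = 1 used here is hypothetical, and returning 1 when the weighted sum is exactly 0 is an assumed convention.

```python
def step(a: float) -> int:
    """Step activation: 1 when a >= 0, else 0 (behavior at a == 0 is an assumption)."""
    return 1 if a >= 0 else 0


def classify(x1: float, x2: float, w1: float, w2: float, b: float) -> int:
    """Binary linear classifier: fires on one side of the line w1*x1 + w2*x2 + b = 0."""
    return step(w1 * x1 + w2 * x2 + b)


# Hypothetical decision boundary: the line x1 + x2 = 1.
w1, w2, b = 1.0, 1.0, -1.0
print(classify(2.0, 2.0, w1, w2, b))  # 1 (shaded side)
print(classify(0.0, 0.0, w1, w2, b))  # 0 (white side)
```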

Let’s now add two more output nodes (a neural network can have more than a single output). I will need to introduce a bit of notation to keep track of everything. The weight associated with an edge from the jth node in the first layer to the ith node in the second layer will be denoted by w_{ij}^{(1)}. The output of the ith node in the nth layer will be denoted by a_i^{(n)}.

Thus x_1 = a_1^{(1)} and x_2 = a_2^{(1)}.

Figure 5

Every output node in this NN is wired to the same set of input nodes, but the weights are allowed to vary. Below is one possible configuration, where the regions triggering a value of 1 are overlaid and colored in correspondence with the colors of the output nodes:

Figure 6
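A minimal Python sketch of such a three-output layer, using the w_{ij}^{(1)} indexing introduced above. The weight values, the first-layer biases b1 (which the text does not name), and the convention that the step function returns 1 at exactly 0 are all assumptions made for illustration.

```python
def step(a: float) -> int:
    """Step activation: 1 when a >= 0, else 0."""
    return 1 if a >= 0 else 0


def layer(W, b, a_prev):
    """Compute a_i = f(sum_j W[i][j] * a_prev[j] + b[i]) for every node i."""
    return [
        step(sum(w_ij * a_j for w_ij, a_j in zip(row, a_prev)) + b_i)
        for row, b_i in zip(W, b)
    ]


# Hypothetical weights: W1[i][j] plays the role of w_{ij}^{(1)}.
W1 = [
    [1.0, 0.0],    # node 1 fires when x1 >= 0
    [0.0, 1.0],    # node 2 fires when x2 >= 0
    [-1.0, -1.0],  # node 3 fires when x1 + x2 <= 1
]
b1 = [0.0, 0.0, 1.0]

print(layer(W1, b1, [0.25, 0.25]))  # [1, 1, 1]
print(layer(W1, b1, [2.0, 2.0]))    # [1, 1, 0]
```

Each output is an independent linear classifier over the same two inputs; nothing combines them yet.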

So far we haven’t really done anything, because we just overlaid the decision boundaries of three linear classifiers without combining them in any meaningful way. Let’s do that now, by feeding the outputs of the top three nodes as inputs into a new node.

I will hollow out the nodes in the middle layer to indicate that they are no longer the final output of the NN.

Figure 7

The value of the single output node at the third layer is:

a_1^{(3)} = f \left(w_{11}^{(2)} a_1^{(2)}+w_{12}^{(2)} a_2^{(2)}+w_{13}^{(2)} a_3^{(2)}+b_1^{(2)}\right)

Let’s consider what this means for a moment. Every node in the middle layer is acting as an indicator function, returning 0 or 1 depending on where the input lies in \mathbb{R}^2.

We are then taking a weighted sum of these indicator functions and feeding it into yet another nonlinearity. The possibilities may seem endless, since we are not placing any restrictions on the weight assignments.

In reality, characterizing the set of NNs (with the above architecture) that exhibit distinct behaviors does require a little bit of work–see Aside–but the point, as we shall see momentarily, is that we do not need to worry about all such possibilities.

One specific choice of assignments already gives the key insight into the representational power of this type of neural network. By setting all weights in the middle layer to 1/3, and setting the bias of the middle layer (b_1^{(2)}) to -1, the activation function of the output neuron (a_1^{(3)}) will output 1 whenever the input lies in the intersection of all three half-spaces defined by the decision boundaries, and 0 otherwise.

Since there was nothing special about our choice of decision boundaries, we are able to carve out any arbitrary polygon and have the NN fire precisely when the input is inside the polygon (in the general case we set the weights to 1/k, where k is the number of hyperplanes defining the polygon).

Figure 8
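The 1/k construction can be checked with a small Python sketch. The three half-planes (bounding a triangle) are hypothetical, and the step function is assumed to return 1 at exactly 0, which is what lets a weighted sum of exactly 1 cancel the bias of -1. Exact fractions are used so that 1/k summed k times is exactly 1 for any k.

```python
from fractions import Fraction


def step(a) -> int:
    """Step activation: 1 when a >= 0, else 0."""
    return 1 if a >= 0 else 0


def half_planes(x1: float, x2: float) -> list:
    """Middle layer: three hypothetical half-planes that bound a triangle."""
    return [step(x1), step(x2), step(1 - x1 - x2)]


def inside_polygon(x1: float, x2: float) -> int:
    """Output node: weights 1/k on every middle node, bias -1."""
    a2 = half_planes(x1, x2)
    k = len(a2)
    weight = Fraction(1, k)
    bias = -1
    return step(sum(weight * a for a in a2) + bias)


print(inside_polygon(0.25, 0.25))  # 1: all three half-planes are satisfied
print(inside_polygon(2.0, 2.0))    # 0: only two are satisfied (2/3 - 1 < 0)
print(inside_polygon(-1.0, 0.5))   # 0: only two are satisfied
```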

This fact demonstrates both the power and limitation of this type of NN architecture.

On the one hand, it is capable of carving out decision boundaries comprised of arbitrary polygons (or more generally polytopes). Creating regions comprised of multiple polygons, even disjoint ones, can be achieved by adding a set of neurons for each polygon and setting the weights of their respective edges to 1/k_i, where k_i is the number of hyperplanes defining the ith polygon.

This explains why, from an expressiveness standpoint, we don’t need to worry about all possible weight combinations, because defining a binary classifier over unions of polygons is all we can do. Any combination of weights that we assign to the middle layer in the above NN will result in a discrete set of values, up to one unique value per region formed by the union or intersection of the half-spaces defined by the decision boundaries, that are inputted to the a_1^{(3)} node.

Since the bias b_1^{(2)} can only adjust the threshold at which a_1^{(3)} will fire, then the resulting behavior of any weight assignment is activation over some union of polygons defined by the shaded regions.

Thus our restricted treatment, where we only consider weights equal to 1/k, already captures the representational power of this NN architecture.

A few caveats merit mention.

First, the above says nothing about representational efficiency, only power. A more thoughtful choice of weights, presumably identified by training the NN using backpropagation, can provide a more compact representation comprised of a smaller set of nodes and edges.

Second, I oversimplified the discussion by focusing only on polygons. In reality, any intersection of half-spaces is possible, even ones that do not result in bounded regions.

Third, and most seriously, feedforward NNs are not restricted to step functions for their activation functions. In particular, modern NNs that utilize Rectified Linear Units (ReLUs) most likely exploit their linear regions.

Nonetheless, the above simplified discussion illustrates a limitation of this type of NN. While they are able to represent any boundary with arbitrary accuracy, this would come at a significant cost, much like the cost of polygonally rendering smoothly curved objects in computer graphics.

In principle, NNs with sigmoidal activation functions are universal approximators, meaning they can approximate any continuous function with arbitrary accuracy. In practice I suspect that real NNs with a limited number of neurons behave more like my simplified toy models, carving out sharp regions in high-dimensional space, but on a much larger scale.

Regardless, NNs still provide far more expressive power than most other machine learning techniques, and my focus on \mathbb{R}^2 disguises the fact that even simple decision boundaries, operating in high-dimensional spaces, can be surprisingly powerful.

Before I wrap up, let me highlight one other aspect of NNs that this “union of polygons” perspective helps make clear.

It has long been known that an NN with a single hidden layer, i.e. the three-layer architecture discussed here, is equal in representational power to a neural network with arbitrary depth, as long as the hidden layer is made sufficiently wide.

Why this is so is obvious in the simplified setting described here, because unions of sets of unions of polygons can be flattened out in terms of unions of the underlying polygons. For example, consider the set of polygons formed by the following 10 boundaries:

Figure 9

We would like to create 8 neurons that correspond to the 8 possible activation patterns formed by the polygons (i.e. fire when the input is in none of them (1 case), exactly one of them (3 cases), exactly two of them (3 cases), or all three of them (1 case)).
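The count of 8 can be verified by enumeration. This sketch assumes three polygons, each either containing the input (1) or not (0); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Every in/out combination for three polygons.
patterns = list(product((0, 1), repeat=3))

# Group the patterns by how many polygons contain the input.
by_count = Counter(sum(p) for p in patterns)

print(len(patterns))             # 8
print(sorted(by_count.items()))  # [(0, 1), (1, 3), (2, 3), (3, 1)]
```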

In the “deep” case, we can set up a four-layer NN such that the second layer defines the edges, the third layer defines the polygons, and the fourth layer contains the 8 possible activation patterns:

Figure 10

The third layer composes the second layer, by creating neurons that are specific to each closed region.

However, we can just as well collapse this into the following three-layer architecture, where each neuron in the third layer “rediscovers” the polygons and how they must be combined to yield a specific activation pattern:

Figure 11

Deeper architectures allow deeper compositions, where more complex polygons are made up of simpler ones, but in principle all this complexity can be collapsed onto one (hidden) layer.

There is a difference in representational efficiency however, and the two architectures above illustrate this important point.

While the three-layer approach is just as expressive as the four-layer one, it is not as efficient: the three-layer NN has a 2-10-8 configuration, resulting in 100 parameters (20 edges connecting first to second layer plus 80 edges connecting second to third layer), while the four-layer NN, with a 2-10-3-8 configuration, only has 74 parameters.
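The two parameter counts follow from multiplying the sizes of adjacent layers. A short Python check, counting edge weights only (as the figures above do) and leaving out biases; the function name is illustrative.

```python
def edge_count(layers):
    """Number of edges in a fully connected feedforward net with the given layer sizes."""
    return sum(m * n for m, n in zip(layers, layers[1:]))


print(edge_count([2, 10, 8]))     # 2*10 + 10*8 = 100
print(edge_count([2, 10, 3, 8]))  # 2*10 + 10*3 + 3*8 = 74
```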

Herein lies the promise of deeper architectures, by enabling the inference of complex models using a relatively small number of parameters. In particular, lower-level features such as the polygons above can be learned once and then reused by higher layers of the network.

That’s it for now. I hope this discussion provided some insight into the workings of neural networks.

If you’d like to read more, see the Aside, and I also recommend this blog entry by Christopher Olah which takes a topological view of neural networks.


How were living organisms created?

From “A short history of nearly everything” by Bill Bryson

When it was created, Earth had no oxygen in its environment.

Cyanobacteria, or blue-green algae, break down water by absorbing the hydrogen and releasing the oxygen as waste, which is actually very toxic to every anaerobic organism.

Our white blood cells actually use oxygen to kill invading bacteria.  This process of releasing oxygen is called photosynthesis, undoubtedly the most important single metabolic innovation in the history of life on the planet.

It took two billion years for the atmosphere to reach about 20% oxygen, since the oxygen was first absorbed in oxidizing every conceivable mineral on Earth, rusting it and sinking it to the bottom of the oceans.

Life started when special bacteria used oxygen to summon up enough energy to work and photosynthesize.

Mitochondria, tiny organisms, manipulate oxygen in a way that liberates energy from foodstuffs. They are very hungry, and so small that a billion of them could be packed into a grain of sand.

Mitochondria maintain their own DNA, RNA, and ribosomes and behave as if they think things might not work out between us.

They look like bacteria, divide like bacteria and sometimes respond to antibiotics in the same way bacteria do; they live in cells but do not speak the same genetic language.

The truly nucleated cells are called eukaryotes and we ended up with two kinds of them: those that expel oxygen, like plants, and those that take in oxygen, like us.

A single-celled eukaryote contains 400 million bits of genetic information in its DNA, enough to fill 80 books of 500 pages. It took a billion years for eukaryotes to learn to assemble into complex multicellular beings.

Microbes or bacteria form an intrinsic unit with our body and our survival.  They are in the trillions, grazing on our fleshy plains and breaking down our foodstuff and our waste into useful elements for our survival.

They synthesize vitamins in our guts, convert food into sugar and polysaccharides and go to war on alien microbes; they pluck nitrogen from the air and convert it into useful nucleotides and amino acids for us, a process that is extremely difficult to manufacture industrially.

Microbes continue to regenerate the air that we breathe with oxygen.  Microbes are very prolific and can split and generate 280 billion offspring within a day.

Once in every million divisions or so, a microbe may produce a mutant with a slight characteristic that lets it resist antibiotics.

Most troubling, microbes are able to evolve rapidly, acquire the genes of the mutants, and in effect become a single invincible super-organism; any adaptive change that occurs in one area of the bacterial province can spread to any other.

Microbes are generally harmless unless, by accident, they move from a specialized location in the body to another location such as the blood stream, for example, or are attacked by viruses, or our white blood cells go on a rampage.

Microbes can live almost anywhere; some were found in nuclear power generators feeding on uranium, some in the deep seas, some in sulfuric environments, some in extreme climates, and some can survive in sealed bottles for hundreds of years, as long as there is anything to feed on.

Viruses known as phages can infect bacteria. Viruses are not alive: they are nucleic acid, inert and harmless in isolation, and visible only with an electron microscope. Viruses have barely ten genes; even the smallest bacteria require several thousand. But introduce them into a suitable host and they burst into life.

Viruses prosper by hijacking the genetic material of a living cell and reproduce in a fanatical manner.  About 5,000 types of virus are known and they afflict us with the flu, smallpox, rabies, yellow fever, Ebola, polio and AIDS.

Viruses burst upon the world in some new and startling form and then vanish as quickly as they came after killing millions of individuals in a short period.

There are billions of species. Tropical rainforests, which cover only 6% of the Earth’s surface, harbor more than half of its animal life and two thirds of its flowering plants.

A quarter of all prescribed medicines are derived from just 40 plants, with another 16% coming from microbes.

The discovery of new flowering plants might provide humanity with chemical compounds that have passed the “ultimate screening program” over billions of years of evolution.

A tenth of the weight of a six-year-old pillow is made up of mites, living or dead, and mite dung; washing at low temperature just gets the lice cleaner!

Einstein speaks on theoretical physics; (Nov. 18, 2009)

The creative character of the theoretical physicist is that the products of his imagination are so indispensably and naturally impressed upon him that they are no longer images of the spirit but evident realities. Theoretical physics comprises a set of concepts and logical propositions that can be deduced in the normal way. Those deduced propositions are assumed to correspond exactly to our individual experiences. That is why, in a theoretical book, the deduction exercises represent the entire work.

Newton had no hesitation in believing that his fundamental laws were derived directly from experience. At that period the notions of space and time presented no difficulties: the concepts of mass, inertia, force, and their direct relationships seemed to be delivered directly by experience. Newton realized that no experience could correspond to his notion of absolute space, which implies absolute inertia, or to his reasoning about action at a distance; nevertheless, the success of the theory for over two centuries prevented scientists from realizing that the base of this system is entirely fictitious.

Einstein said “the supreme task of a physicist is to search for the most general elementary laws and then acquire an image of the world by pure deductive power. The world of perception determines rigorously the theoretical system though no logical route leads from perception to the principles of theory.” Mathematical concepts can be suggested by experience, the sole criterion for the usefulness of a mathematical construct, but never deduced from it. The fundamental creative principle resides in mathematics.

Attempts to deduce the validity of the Newtonian system of mechanics logically from experiments were doomed to failure. Research by Faraday and Maxwell on electromagnetic fields initiated the rupture with classical mechanics. The question arose: “If light is constituted of material particles, then where does the matter go when light is absorbed?” Maxwell thus introduced partial differential equations to account for deformable bodies in the wave theory. Electric and magnetic fields are treated as dependent variables; thus, physical reality no longer had to be conceived as material particles but as continuous fields governed by partial differential equations. Maxwell’s equations nevertheless still emulate the concepts of classical mechanics.

Max Planck had to introduce the hypothesis of quanta (for small particles moving at slow speed but with sufficient acceleration), which was later confirmed, in order to compute the results of thermal radiation that were incompatible with classical mechanics (still valid for situations at the limit). Max Born pronounced: “Mathematical functions have to determine by computation the probabilities of discovering the atomic structure in one location or in movement.”

Louis de Broglie and Schrödinger demonstrated the operation of field theory with continuous functions. Since in the atomic model there is no way of locating a particle exactly (Heisenberg), we may conserve the entire electrical charge at the limit where the density of the particle is considered nil. Dirac and Lorentz showed how the field and the particles of electrons interact as of equal value to reveal reality. Dirac observed that it would be illusory to describe a photon theoretically, since we have no means of confirming whether a photon passed through a polarizer placed obliquely on its path.

Einstein is persuaded that nature represents what we can imagine exclusively in mathematics as the simplest system in concepts and principles to comprehend nature’s phenomena. For example, if the metric of Riemann is applied to a continuum of four dimensions then the theory of relativity of gravity in a void space is the simplest. If I select fields of anti-symmetrical tensors that can be derived then the equations of Maxwell are the simplest in void space.

The “spins” that describe the properties of electrons can be related to the mathematical concept of “semi-vectors” in 4-dimensional space, which can describe two different kinds of elementary particles of equal charge but opposite sign. Those semi-vectors describe the magnetic field of elements in the simplest way, as well as the properties of electrical particles. There is no need to localize any particle rigorously; we can simply propose a portion of 3-dimensional space where, at the limit, the electrical density disappears but the total electrical charge, represented by a whole number, is retained. The enigma of quanta can thus be entirely resolved if such a proposition is revealed to be exact.


Until the first quarter of the 20th century the sciences were driven by sheer mathematical constructs. This was a natural development, since most experiments in the natural sciences were done by varying one factor at a time; experimenters never used more than one independent variable or more than one dependent variable (the objective measured variable, or the data). Although the theory of probability was very advanced, the field of practical statistical analysis of data was not yet developed; it was a real pain and very time consuming to do all the computations by hand for even slightly complex experimental designs. Sophisticated and specialized statistical packages for different fields of research evolved after computers, the mass number crunchers, were invented.

Thus, early theoretical scientists refrained from complicating their constructs simply because the experimental scientists could not practically deal with complex mathematical constructs. The theoretical scientists promoted the concept or philosophy that theories should be the simplest possible, with the fewest axioms (fundamental principles), and did their best to imagine one general causative factor that affected the behavior of natural phenomena or would be applicable to most natural phenomena.

This is no longer the case. The good news is that experiments are more complex and show interactions among the factors. Nature is complex; no matter how you control an experiment to reduce the number of manipulated variables to a minimum, there is always more than one causative factor, and these factors are interrelated and interact to produce effects.

Consequently, sophisticated experiments with their corresponding data are making the mathematician’s job more straightforward when pondering a particular phenomenon. It is possible to synthesize two phenomena at a time before generalizing to a third one; mathematicians have no need to jump to general concepts in one step; they can consistently move forward on a firm basis of data. Mathematics will remain the best synthesis tool for comprehending the behaviors of nature and man.

It is time to account for all the possible causative factors, especially those usually neglected because they are rare in probability of occurrence (at the very tail end of the probability graphs) or because their contributing effects are imagined to be small: it is those rare events that have surprised man with catastrophic consequences.

Theoretical scientists of nature’s variability should acknowledge that nature is complex. Simple and beautiful general equations are out the window. Studying nature is worth a set of equations! (You may read my post “Nature is worth a set of equations”)

Cognitive mechanisms; (Dec. 26, 2009)

Before venturing into this uncharted territory let me state that there is a “real universe” that each one of us perceives differently: if this real world didn’t exist then there would be nothing to perceive. The real world does not care about our notions of time and space. No matter how we rationalize about the real world, our system of comprehension is strictly linked to our brain/senses systems of perception. The way animals perceive the universe is different from ours. All we can offer are bundles of hypotheses that can never be demonstrated or confirmed, even empirically. The best we can do is to extend the hypothesis that our perceived universe correlates (a qualitative, coherent resemblance) with the real universe. The notions of time, space, and causality are within our perceived universe. Each individual has his own “coherent universe” that is as valid as any other perception. What rational logic and empirical experiments have discovered as “laws of nature” applies only to our perceived universe; mainly to that of what is conveniently labeled the category of grown-up “normal people” who do not suffer major brain disturbances or defects.

Man uses symbols such as language, alphabets, mathematical forms, and musical symbols to record his cognitive performances. The brain uses a “binary code” of impressions and intervals of non-impressions to register a codified impression. Most probably, the brain creates all kinds of cells and chemicals to categorize, store, classify, and retrieve various impressions; the rationale is that, no matter how brief an impression is, the trillions and trillions of impressions would otherwise saturate the intervals between sensations in no time.

We are born with 25% of the total number of synapses that a grown-up will form. Neurons have mechanisms for transferring from one section of the brain to others when frequent, focused cognitive processes are needed. A child can perceive one event following another, but to him it has no meaning beyond simple observation. A child is not surprised by magic outcomes; what is out of the ordinary for a grown-up is as valid a phenomenon as any other to him (an elephant can fly). We know that visual and auditory sensations pass through several filters (processed data) before being perceived by the brain. The senses of smell and taste circumvent these filters and are sensed by the limbic system (the primeval brain) before this data is passed on to cognition.

The brain attaches markers or attributes to the impressions that it receives. Four markers that I call “exogenous markers” attach to impressions as they are “registered” or perceived in the brain, coming from the outside world through our senses. At least four other markers, which I label “endogenous markers”, are attached during internal cognitive processing, when information is re-structured or re-configured during the dream periods: massive computations are needed on stored data to transform it into other readily useful data before endogenous markers are attributed to it for registering in other memory banks. There are also markers that I call “reverse-exogenous”, attached to information meant to be exported from the brain to the outside world. They are mainly of two kinds: body language information (such as head, hand, shoulder, or eye movements) and the types recorded on external means such as writing, painting, sculpting, singing, playing instruments, or performing art work.

The first exogenous marker directs impressions from the senses in their order of succession. The child recognizes that this event followed the other one within a short period of occurrence. His brain can “implicitly” store that the two events follow in succession, in a qualitative order (for example, the duration of the succession is shorter or longer than that of another succession). I label this marker the “time recognizer”, in the qualitative meaning of sensations.

The second marker registers and then stores an impression as a spatial configuration. At this stage, the child is able to recognize the concept of space but in a qualitative order; for example, this object is closer or further from the other object. I call this marker “space recognizer”.

The third marker is the ability to delimit a space when focusing on a collection of objects. Without this ability to first limit the range of observation (or sensing in general) it would be hard to register parts and bits of impressions within a first cut of a “coherent universe”. I label this marker “spatial delimiter”.

The fourth marker attaches “strength” or “weight” of occurrence as the impression is recognized in the database. The child cannot count but the brain is already using this marker for incoming information. In a sense, the brain is assembling events and objects in a special “frequency of occurrence” database during dream periods, and the information is retrieved in qualitative order of strength of sensations in frequency. I call this attribute “count marker”.

The fifth marker is an endogenous attribute: this marker is attached within the internal export/import of information in the brain. This attribute is a kind of “correlation” quantity that indicates same/different trends of behavior of events or objects. In a sense, this marker will internally sort out data as “analogous” or contrary collections along a time scale. People have a tendency to equate correlation with a cause and effect relation, but it is not one. A correlation quantity can be positive (two variables have the same behavioral trend in a system) or negative (diverging trends). With the emergence of the fifth marker the brain has grown past a quantitative threshold in synapses and neurons and can start massive computations on impressions stored in the large original database, or what is called “long-term memory”.

The sixth marker is kind of a “probability quantity” that permits the brain to order objects according to “plausible” invariant properties in space (for example objects or figures are similar according to a particular property, including symmetrical transformations). I label this the “invariant marker” and it re-structures collections of objects and shapes in structures such as hereditary, hierarchical, network, or circular.

The seventh marker I call the “association attribute”. Methods of deduction, inductions, and other logical manipulations are within these kinds of data types.  They are mostly generated from rhetorical associations such as analogies, metaphors, antonyms, and other categories of associations. No intuition or creative ideas are outside the boundary of prior recognition of the brain.  Constant focus and work on a concept generate complex processing during the dream stage. The conscious mind recaptures sequences from the dream state and most of the time unconsciously. What knowledge does is decoding in formal systems the basic processes of the brain and then re-ordering what seems as chaotic firing in brain cells.  Symbols were created to facilitate rules writing for precise rationalization.

The eighth marker I call the “design marker”; it recognizes interactions among variables and interacts with reverse-exogenous markers, since a flow with outside perceptions is required for comprehension. Simple perceived relationships between two events or variables are usually trivial and mostly wrong; for example, thunder follows lightning and is thus wrongly interpreted as lightning generating thunder. Simple interactions are of the existential kind, as in the Pavlov reactions where existential rewards, such as food, are involved in order to generate the desired reactions. The Pavlov reaction laws apply to man too. Interactions among more than two variables are complex for the mind to interpret and require plenty of training and exercise. Designing experiments is a very complex cognitive task and not amenable to intuition: it requires learning and training to appreciate the various causes and effects among the variables.

The first kinds of “reverse exogenous” markers can be readily witnessed in animals such as in body language of head, hand, shoulder, or eye movements; otherwise Pavlov experiments could not be conducted if animals didn’t react with any external signs. In general, rational thinking retrieves data from specialized databases “cognitive working memory” of already processed data and saved for pragmatic utility. Working memories are developed once data find outlets to the external world for recording; thus, pure thinking without attempting to record ideas degrades the cognitive processes with sterile internal transfer without new empirical information to compute in.

An important reverse-exogenous marker is sitting still, concentrating, emptying our mind of external sensations, and relaxing the mind of conscious efforts of perceiving the knowledge “matter” in order to experience the “cosmic universe”.

This article was not meant to analyze emotions or moral value systems.  It is very probable that the previously described markers are valid for moral value systems, with less computation applied to the data transferred to the “moral working memory”. I believe that more sophisticated computations are performed on moral data than on emotional data, since a system is constructed for frequent “refreshing” with age and experience.

I conjecture that emotions are generated from the vast original database and the endogenous correlation marker is the main computation method: the reason is that emotions are related to complex and almost infinite interactions with people and community; thus, the brain prefers not to consume time and resources on complex computations that involve many thousands of variables interacting simultaneously. Thus, an emotional reaction in the waking period is not necessarily “rational” but of the quick and dirty resolutions kinds. In the dream sessions, emotionally loaded impressions are barely processed because they are hidden deep in the vast original database structure and are not refreshed frequently to be exposed to the waking conscious cognitive processes; thus, they flare up within the emotional reaction packages.

Note: The brain is a flexible organic matter that can be trained and developed by frequent “refreshing” of interactions with the outside world of sensations. Maybe animals lack the reverse exogenous markers to record their cognitive capabilities; more likely, it is because their cognitive working memory is shriveled that animals didn’t grow the appropriate limbs for recording sensations: evolution didn’t endow them with external performing limbs for writing, sculpting, painting, or making music. The fact that chimps were trained to externalize cognition at a level comparable to the capabilities of a 5-year-old suggests that attaching artificial limbs compatible with human tools to chimps, cats, or dogs would demonstrate that these animals can give far better cognitive performance than expected.

This is a first draft to get the project going. I appreciate developed comments and references.

How random and unstable are your phases? (Dec. 7, 2009)

There are phenomena in the natural world that behave randomly, or in what seems a chaotic manner, such as percolation and the “Brownian movement” of gases.  The study of phases in equilibrium among chaotic, random, and unstable physical systems was undertaken first by physicists and then taken up by modern mathematicians.

The mathematician Wendelin Werner (Fields Medal) researched how the borders that separate two phases in equilibrium among random and unstable physical systems behave; he published “Random Planar Curves…”

Initially, the behavior of identical elements (particles) in large number might produce deterministic or random results in various cases.

For example, if we toss a coin many times we might guess that heads and tails will be equal in number of occurrences; the trick is that we cannot say that either head or tail is in majority.
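The coin example can be tried out with a few lines of Python. This is only a minimal sketch: the function name `toss_coin` and the seed are my own choices for illustration. It shows that the proportion of heads settles near one half as the number of tosses grows, while the raw gap between heads and tails does not shrink (it typically grows like the square root of the number of tosses), so we still cannot say in advance which side will be in the majority.

```python
import random


def toss_coin(n_tosses, rng):
    """Toss a fair coin n_tosses times and return (heads, tails)."""
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads, n_tosses - heads


if __name__ == "__main__":
    rng = random.Random(2009)  # arbitrary fixed seed so that runs are repeatable
    for n in (100, 10_000, 1_000_000):
        heads, tails = toss_coin(n, rng)
        # Print: number of tosses, proportion of heads, raw gap |heads - tails|
        print(n, heads / n, abs(heads - tails))
```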

The probabilistic situations inspired the development of purely mathematical tools.  The curves between the phases in equilibrium appear to be random, but have several characteristics:

First, the curves have self-similarity, which means that the study of a small portion can be generalized to the macro-level with the same properties, as in “fractal curves”;

Second, even if the general behavior is chaotic, a few properties remain the same (mainly, the random curves have the same “fractal dimension” or irregular shape);

Third, these systems are very unstable (unlike the game of heads and tails) in the sense that changing the behavior of a small portion leads to large changes by propagation on a big scale.  Thus, these systems are classified mathematically as belonging to infinite complexity theories.

Themes of unstable and random systems were first studied by physicists and a few of them received Nobel Prizes such as Kenneth Wilson in 1982.

The research demonstrated that such systems are “invariant” by transformations (they used the term re-normalization) that permit passages from one scale to a superior scale.  A concrete example is percolation.

Let us take a net resembling beehives where each cavity (alveolus) is colored black or red using the head and tail flipping technique of an unbiased coin. Then, we study how these cells are connected randomly on a plane surface.
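To make the beehive picture concrete, here is a small Python sketch. It rests on assumptions of mine that are not in the original text: the hexagonal cells are laid out on an n-by-n rhombus (as on a Hex game board), each cell is indexed by (row, column) with six neighbours, and the names `random_colouring`, `crosses` and `crossing_frequency` are hypothetical. The code colors every cell black or red with a fair coin and checks whether black cells connect the left edge to the right edge.

```python
import random
from collections import deque

# The six neighbours of a hexagonal cell in (row, col) coordinates on a rhombus board.
NEIGHBOURS = [(0, 1), (0, -1), (1, 0), (-1, 0), (1, -1), (-1, 1)]


def random_colouring(n, rng):
    """Return an n x n grid of booleans: True = black, False = red (fair coin)."""
    return [[rng.random() < 0.5 for _ in range(n)] for _ in range(n)]


def crosses(grid, colour, left_to_right):
    """Breadth-first search: does `colour` join left-right (or top-bottom) edges?"""
    n = len(grid)
    if left_to_right:
        starts = [(r, 0) for r in range(n)]
    else:
        starts = [(0, c) for c in range(n)]
    queue = deque(s for s in starts if grid[s[0]][s[1]] == colour)
    seen = set(queue)
    while queue:
        r, c = queue.popleft()
        if (c if left_to_right else r) == n - 1:
            return True  # reached the opposite edge
        for dr, dc in NEIGHBOURS:
            rr, cc = r + dr, c + dc
            if (0 <= rr < n and 0 <= cc < n
                    and (rr, cc) not in seen and grid[rr][cc] == colour):
                seen.add((rr, cc))
                queue.append((rr, cc))
    return False


def crossing_frequency(n, trials, rng):
    """Fraction of random colourings in which black crosses from left to right."""
    hits = sum(crosses(random_colouring(n, rng), True, True) for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    print(crossing_frequency(20, 500, random.Random(7)))
```

On such a board, exactly one of two things happens for any coloring: either black joins left to right, or red joins top to bottom (the same fact that guarantees the game of Hex never ends in a draw). With a fair coin, the two cases are therefore equally likely.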

The Russian Stas Smirnov demonstrated that the borders exhibit “conformal invariance”, a concept developed by Bernhard Riemann in the 19th century using complex numbers. “Conformal invariance” means that it is always possible to warp a rubber disk that is covered with a thin crisscross pattern so that lines that intersect at right angles before the deformation still intersect at right angles after the deformation.  The set of transformations that preserve angles is large and can be written as power series, a kind of polynomial of infinite degree. The transformations in the percolation problem conserve the proportion of distances or similitude.

The late Oded Schramm had this idea: suppose two countries share a disk; one country controls the left border and the other the right border; suppose that the common border crosses the disk. If we investigate a portion of the common border, then we want to forecast the behavior of the next portion.

This task requires iterations of random conformal transformations using computation of the fractal dimension of the interface. We learn that random behavior on the micro-level exhibits the same behavior on the macro-level; thus, resolving these problems requires algebraic and analytical tools.

The other case is the “Brownian movement” that consists of trajectories where one displacement is independent of the previous displacement (stochastic behavior).  The interfaces of the “Brownian movement” are different in nature from percolation systems.
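The idea of independent displacements can be illustrated with a simple random walk in the plane. The sketch below is an illustration under my own assumptions (unit-length steps, uniformly random directions, hypothetical function names), not a model of a real gas. Because each step ignores the previous one, the average squared distance from the starting point after n steps comes out close to n.

```python
import math
import random


def random_walk(n_steps, rng):
    """Take n_steps unit steps, each in an independent random direction."""
    x = y = 0.0
    for _ in range(n_steps):
        angle = rng.uniform(0.0, 2.0 * math.pi)  # independent of all earlier steps
        x += math.cos(angle)
        y += math.sin(angle)
    return x, y


def mean_squared_displacement(n_steps, n_walks, rng):
    """Average of x^2 + y^2 over many independent walks."""
    total = 0.0
    for _ in range(n_walks):
        x, y = random_walk(n_steps, rng)
        total += x * x + y * y
    return total / n_walks


if __name__ == "__main__":
    rng = random.Random(42)  # arbitrary seed for repeatability
    for n in (10, 100, 400):
        print(n, mean_squared_displacement(n, 2000, rng))
```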

Usually, mathematicians associate a “critical exponent” (or “intersection exponent”) with the probability that two movements will never meet, at least for a long time.  Two physicists, Duplantier and Kyung-Hoon Kwon, advanced the idea that these critical exponents belong to a table of numbers of algebraic origin. Mathematical demonstrations of the “conjecture” or hypothesis of Benoit Mandelbrot on fractal dimension used the percolation interface system.

Werner said: “With the collaboration of Greg Lawler we progressively comprehended the relationship between the interfaces of percolation and the borders of the Brownian movement.  Armed with Schramm’s theory, we knew that our theory was going to work and would prove the conjecture related to Brownian movement.”

Werner went on: “It is unfortunate that the specialized media failed to mention the great technical feat of Grigori Perelman in demonstrating the Poincaré conjecture.  His proof was not your run-of-the-mill deductive process with progressive purging and generalization; it was an analytic and human proof where hands get dirty in order to control a bundle of possible singularities.

These kinds of demonstrations require good knowledge of “underlying phenomena”.”

As to what he considers a difficult problem, Werner said: “I take a pattern and then count the paths of length “n” so that they do not intersect twice at a particular junction. This number increases exponentially with the number n; we think there is a corrective term of the type n to the power 11/32.  We can now guess the reason for that term but we cannot demonstrate it so far.”
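The counting problem Werner describes can be tried by brute force for small n. The sketch below assumes the “pattern” is the ordinary square grid (my assumption; the quote does not name the lattice) and counts the paths of n steps from the origin that never visit the same junction twice. The function name is hypothetical. The counts grow roughly exponentially, and the conjecture quoted above concerns a correction factor of the form n to the power 11/32 on top of that exponential growth.

```python
def count_self_avoiding_walks(n):
    """Count n-step paths on the square grid that never revisit a junction."""

    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1  # one complete path found
        total = 0
        x, y = pos
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)  # backtrack before trying the next direction
        return total

    return extend((0, 0), {(0, 0)}, n)


if __name__ == "__main__":
    previous = None
    for n in range(1, 11):
        count = count_self_avoiding_walks(n)
        ratio = count / previous if previous else None
        print(n, count, ratio)  # the ratio of successive counts settles slowly
        previous = count
```

Brute force stops being practical after a few dozen steps, which is part of why the exponent remains a conjecture rather than something settled by computation.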

The capacity to predict the behavior of a phenomenon by studying a portion of it rests on recognizing an invariant; once an invariant is recognized, a theory can most probably find counterparts in the real world. For example, virtual imaging techniques use invariance among objects. It has been proven that vision is an operation of the brain adapting to geometric invariants that are characteristic of the image we see.

Consequently, stability in the repeated signals generates perception of reality.  In math, these are called “covariance laws” when systems of reference are changed: for example, the Galilean transformations in classical mechanics and the Poincaré transformations in special relativity.
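A small numerical sketch may help to see what stays the same under these changes of reference. It is restricted, by my own choice, to one space dimension and to units where the speed of light is c = 1; the function names are hypothetical. The Galilean transformation leaves time and the distance between simultaneous events unchanged, while the Lorentz boost (the part of the Poincaré transformations that changes velocity) leaves the quantity (ct)^2 - x^2 unchanged.

```python
import math


def galilean(t, x, v):
    """Classical change of frame: time is untouched, positions shift by v*t."""
    return t, x - v * t


def lorentz(t, x, v, c=1.0):
    """Lorentz boost along x with velocity v (requires |v| < c)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (t - v * x / c ** 2), gamma * (x - v * t)


def interval(t, x, c=1.0):
    """Spacetime interval (c t)^2 - x^2, the invariant of special relativity."""
    return (c * t) ** 2 - x ** 2


if __name__ == "__main__":
    t, x, v = 2.0, 1.0, 0.6
    print(interval(t, x), interval(*lorentz(t, x, v)))   # same value in both frames
    print(interval(t, x), interval(*galilean(t, x, v)))  # not preserved by Galileo
```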

In a sense, math is codifying the processes of sensing by the brain using symbolic languages and formulations.

568.  Einstein speaks on theoretical physics; (Nov. 18, 2009)


569.  I am mostly the other I; (Nov. 19, 2009)


570.  Einstein speaks on General Relativity; (Nov. 20, 2009)


571.  Einstein speaks of his mind processes on the origin of General Relativity; (Nov. 21, 2009)


572.  Everyone has his rhetoric style; (Nov. 22, 2009)

Einstein speaks on General Relativity; (Nov. 20, 2009)

I have already posted two articles in the series “Einstein speaks on…” This article describes Einstein’s theory of special relativity and then his concept for General Relativity. It is a theory meant to extend the physics of fields (for example electrical and magnetic fields, among others) to all natural phenomena, including gravity. Einstein declares that there was nothing speculative in his theory but that it was adapted to observed facts.

The fundamentals are that the speed of light is constant in a vacuum and that all inertial systems are equally valid (each inertial system has its own metric time). Michelson’s experiment has demonstrated these fundamentals. The theory of special relativity adopts the continuum of space coordinates and time as absolute since they are measured by clocks and rigid bodies, with a twist: the coordinates become relative because they depend on the movement of the selected inertial system.

The theory of General Relativity is based on the verified numerical correspondence of inertial mass and weight. This discovery is obtained when coordinate systems possess relative accelerations with respect to one another; thus each inertial system has its own field of gravitation. Consequently, the movement of solid bodies, as well as the movement of clocks, does not correspond to Euclidean geometry. The coordinates of space-time are no longer independent. This new kind of metric already existed mathematically thanks to the works of Gauss and Riemann.

Ernst Mach realized that classical mechanics movement is described without reference to the causes; thus, there are no movements but those in relation to other movements.  In this case, acceleration in classical mechanics can no longer be conceived with relative movement; Newton had to imagine a physical space where acceleration would exist and he logically announced an absolute space that did not satisfy Newton but that worked for two centuries. Mach tried to modify the equations so that they could be used in reference to a space represented by the other bodies under study.  Mach’s attempts failed in regard of the scientific knowledge of his time.

We know that space is influenced by the surrounding bodies and, so far, I cannot think that General Relativity may satisfactorily surmount this difficulty except by considering space as a closed universe, assuming that the average density of matter in the universe has a finite value, however small it might be.

Einstein speaks on theoretical sciences; (Nov. 15, 2009)

I intend to write a series on “Einstein speaks” on scientific methods, theoretical physics, relativity, pacifism, national-socialism, and the Jewish problem.

In matter of space two objects may touch or be distinct.  When distinct, we can always introduce a third object in between. Interval thus stays independent of the selected objects; an interval can then be accepted as real as the objects. This is the first step in understanding the concept of space. The Greeks privileged lines and planes in describing geometric forms; an ellipse, for example, was not intelligible except as it could be represented by point, line, and plane. Einstein could never adhere to Kant’s notion of “a priori” simply because we need to search the characters of the sets concerning sensed experiences and then to extricate the corresponding concepts.

The Euclidian mathematics preferred using the concepts of objects and the relation of the position among objects. Relations of position are expressed as relations of contacts (intersections, lines, and planes); thus, space as a continuum was never considered.  The will to comprehend by thinking the reciprocal relations of corporal objects inevitably leads to spatial concepts.

In the Cartesian system of three dimensions all surfaces are given as equivalent, irrespective of arbitrary preferences to linear forms in geometric constructs. Thus, it goes way beyond the advantage of placing analysis at the service of geometry. Descartes introduced the concept of a point in space according to its coordinates and geometric forms became part of a continuum in 3-dimensional space.

The geometry of Euclid is a system of logic where propositions are deduced with such exactitude that no demonstration provokes any doubt. Anyone who could not get excited about and interested in such an architecture of logic could not be initiated into theoretical research.

There are two ways to apprehend concepts. The first method (analytical logic) resolves the following problem: “how do concepts and judgments depend on one another?” The answer is by mathematics; however, this assurance is gained at the prohibitive price of not having any content from sensed experiences, even indirectly. The other method is to intuitively link sensed experiences with extracted concepts, though no logical research can confirm this link.

For example: suppose we ask someone who never studied geometry to reconstruct a geometry manual devoid of any diagrams. He may use the abstract notions of point and line, reconstruct the chain of theorems, and even invent other theorems with the given rules. This is a pure game of words for the gentleman until he figures out, from his personal experience and by intuition, tangible meanings for point and line; only then does geometry acquire a real content.

Consequently, there is this eternal confrontation between the two components of knowledge: empirical methodology and reason. Experimental results can be considered as the deductive propositions, and then reason constitutes the structure of the system of thinking. The concepts and principles explode as spontaneous inventions of the human spirit. The scientific theoretician has no knowledge of the images of the world of experience that determined the formation of his concepts, and he suffers from this lack of personal experience of the reality that corresponds to his abstract constructs.  Generally, abstract constructs are forced upon us, to be acquired by habit. Language uses words linked to primitive concepts, which exacerbates the difficulty of explaining abstract constructs.

The creative character of the scientific theoretician is that the products of his imagination are so indispensably and naturally impressed upon him that they are no longer images of the spirit but evident realities. The set of concepts and logical propositions, where the capacity for deduction is exercised, corresponds exactly to our individual experiences.  That is why, in a theoretical book, deduction represents the entire work.  That is what is going on in Euclid’s geometry: the fundamental principles are called axioms and thus the deduced propositions are not based on commonplace experiences. If we envision this geometry as the theory of the possibilities of the reciprocal position of rigid bodies, and thus understand it as a physical science without suppressing its empirical origin, then the resemblance between geometry and theoretical physics is striking.

The essential goal of theory is to divulge the fundamental elements that are irreducible, as rare and as evident as possible; an adequate representation of possible experiences has to be taken into account.

Knowledge deduced from pure logic is void; logic cannot offer knowledge extracted from the world of experience if it is not associated with reality in two-way interactions. Galileo is recognized as the father of modern physics and of natural sciences simply because he fought his way to impose empirical methods. Galileo has impressed upon scientists that experience describes and then proposes a synthesis of reality.

Einstein is persuaded that nature represents what we can imagine exclusively in mathematics as the simplest system of concepts and principles to comprehend nature’s phenomena. Mathematical concepts can be suggested by experience, the unique criterion for the utilization of a mathematical construct, but never deduced from it. The fundamental creative principle resides in mathematics. The follow-up article “Einstein speaks on theoretical physics” will provide ample details on Einstein’s claim.


Einstein said “We admire the Greeks of antiquity for giving birth to western science.” Most probably, Einstein was not versed in the history of sciences and was content with modern sciences since Kepler in the 17th century: maybe he didn’t need to know the history of sciences and how the European Renaissance received a strong impulse from Islamic sciences that stretched for 800 years before Europe woke up from the Dark Ages. Thus, my critique is not related to Einstein’s scientific comprehension but to the faulty perception that sciences originated in the Greece of antiquity.

You can be a great scientist (theoretical or experimental) but not be versed in the history of sciences; the drawback is that people respect the saying of great scientists even if they are not immersed in other fields; especially, when he speaks on sciences and you are led to assume that he knows the history of sciences.  That is the worst misleading dissemination venue of faulty notions that stick in people’s mind.

Euclid was born and raised in Sidon (current Lebanon), continued his education in Alexandria, and wrote his manuscript on Geometry in the Greek language.  Greek was one of the languages of the educated and of scholars in the Near East from 300 BC, when Alexander conquered this land with his Macedonian army, to 650 AD.  If the US agrees that whoever writes in English should automatically be conferred US citizenship, then I have no qualm with that concept.  Euclid was not Greek simply because he wrote in Greek. Would the work of Euclid be underestimated if it had been written in Aramaic, the language of the land?

Einstein spoke on Kepler at great length as the leading modern scientist who started modern astronomy by formulating a mathematical model of the movements of the planets. The Moslem scientist and mathematician Ibn Al Haitham set the foundation for required math learning in the year 850 (over 900 years before Europe Renaissance); he said that arithmetic, geometry, algebra, and math should be used as the foundations for learning natural sciences. Ibn Al Haitham said that it is almost impossible to do science without a strong math background.  Ibn Al Haitham wrote mathematical equations to describe the cosmos and the movement of planets. Maybe the great scientist Kepler did all his work alone without the knowledge of Ibn Al Haitham’s analysis, but we should refrain from promoting Kepler as the discoverer of modern astronomy science. It also does not stand to reason that the Islamic astronomers formulated their equations without using 3-dimensional space: Descartes is considered the first to describe geometrical forms with coordinates in 3-dimensional space.

I have a problem with Newton’s causal factor; (Nov. 13, 2009)

Let me refresh your memory of Newton’s explanation of the causal factor that moves planets in specific elliptical trajectories.  Newton related the force that attracts objects onto the ground to the field of acceleration (gravitation field) that it exerts on the mass of an object. Thus, objects are attracted to one another “at distance and simultaneously” by other objects; thus, this attractive force causes movements in foreseeable trajectories. Implicitly, Newton is saying that it is the objects (masses or inertia) that are creating the acceleration or the field of gravity.

If this is the theory then, where is the cause in this relation?  Newton is no fool; he knew that he didn’t find the cause but was explaining an observation.  He had two alternatives: either to venture into philosophical concepts of the source for gravity or get at the nitty-gritty business of formulating what is observed.  Newton could easily have taken the first route since he spent most of his life studying theological matters. Luckily for us, he opted for the other route.

Newton then undertook to invent mathematical tools such as differentiation and integration to explain his conceptual model of how nature functions. Newton could then know, at a specific location of an object, where the object was at the previous infinitesimal time dT and predict where it will be dT later.  The new equation could explain the cause of the elliptical trajectories of planets as Kepler discovered empirically and as Galileo proved by experiments done on falling objects.
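The step-by-step reasoning with a small time dT can be imitated on a computer. The sketch below is not Newton’s own procedure: it uses a modern stepping scheme (velocity Verlet) with a small but finite dT, units in which G times the central mass equals 1, and function names of my own choosing. Starting from a position and a velocity, it repeatedly computes the inverse-square acceleration and advances the planet by dT; the resulting path is a closed ellipse around the central mass.

```python
import math


def acceleration(x, y, gm=1.0):
    """Inverse-square attraction toward a central mass at the origin."""
    r_cubed = (x * x + y * y) ** 1.5
    return -gm * x / r_cubed, -gm * y / r_cubed


def simulate_orbit(x, y, vx, vy, dt, n_steps, gm=1.0):
    """Advance the planet n_steps times by dt; return final state and radii."""
    radii = []
    ax, ay = acceleration(x, y, gm)
    for _ in range(n_steps):
        vx += 0.5 * dt * ax          # half-step update of the velocity
        vy += 0.5 * dt * ay
        x += dt * vx                 # full-step update of the position
        y += dt * vy
        ax, ay = acceleration(x, y, gm)
        vx += 0.5 * dt * ax          # second half-step with the new acceleration
        vy += 0.5 * dt * ay
        radii.append(math.hypot(x, y))
    return (x, y, vx, vy), radii


def energy(x, y, vx, vy, gm=1.0):
    """Kinetic plus potential energy per unit mass (should stay constant)."""
    return 0.5 * (vx * vx + vy * vy) - gm / math.hypot(x, y)


if __name__ == "__main__":
    # Start at distance 1 with less than circular speed: an elongated orbit.
    state, radii = simulate_orbit(1.0, 0.0, 0.0, 0.8, dt=1e-3, n_steps=8000)
    print("closest approach:", min(radii), "farthest:", max(radii))
    print("energy drift:", energy(*state) - energy(1.0, 0.0, 0.0, 0.8))
```

For these starting values the ellipse predicted by Kepler’s laws has its farthest point at distance 1 and its closest point at 0.64/1.36, about 0.47, which the stepped simulation reproduces closely.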

For two centuries, scientists applied the mechanical physics of Newton, which explained most of the experimental observations such as the kinetic theory of heat, the conservation of energy laws, the theory of gases, and the nature of the second law of thermodynamics.   Even the scientists working on electromagnetic fields started by inventing concepts based on Newton’s premises of continuous matter and of an absolute space and time.  Scientists even invented the notion of “ether” filling the void, with physical characteristics that might explain phenomena not coinciding with Newton’s predictions.

Then, modern physics had to finally drop the abstract concept of simultaneous effects at a distance.  Modern physics adopted the concept that masses are not immutable entities, and that the speed of light in a vacuum is finite and acts as a speed limit. Newton’s laws are valid for movements at small speeds. Thus, partial differential equations were employed to explain the theory of fields. Observations of thermal radiation, radioactivity, and spectra have led to envisioning the theory of discrete packets of energy.

Newton was no fool.  He already suspected that his system was restrictive and had many deficiencies. First, Newton discovered experimentally that the observable geometric scales (distances of material points) and their course in time do not completely define the movements physically (the bucket experiment).  There must exist “something else” other than masses and distances to account for. He admits that space must possess physical characteristics of the same nature as masses for movements to have meaning in his equations. To be consistent with his approach of not introducing concepts that are not directly attached to observable objects, Newton had to postulate the concept of an absolute space and absolute time framework.

Second, Newton declares that his principle of the reciprocal action of gravity has no ambition for a definitive explanation but a rule deduced from experiment.

Third, Newton is aware that the perfect correspondence of weight and inertia does not offer any explanation.  None of these three logical objections can be used to discredit the theory. They were unsatisfied desires of a scientific mind to reach a unifying conception of nature’s phenomenon.  The causal and differential laws are still debatable and nobody dares reject them completely and for ever.

Let me suggest this experiment: we isolate an object in the void, in a chamber that denies access to outside electromagnetic and thermal effects, and we stabilize the object in a suspension sort of levitating. Now we approach other objects (natural or artificially created) in the same isolated condition as the previous one. What would happen?

Would the objects move at a certain distance? Would they be attracted? At what masses is movement generated? How many objects should be introduced before any kind of movement is generated? What network structure of the objects initiates movements? Would they start spinning on themselves before they oscillate as one mass (a couple) in clockwise and counterclockwise fashion around a fictitious axis? How long before any movement is witnessed? What would be the spinning speed, if any; the speed of the One Mass; any acceleration before steady state movement?  I believe that the coefficient G will surface from the data gathered and might offer satisfactory answers to the cause of movements.

The one difficult problem in this experiment is the kind of mechanism for keeping the objects in suspension against gravity. These various mechanisms would play the role of manipulated variable.

My hypothesis is that it is the movements of atoms, electrons, and all the moving particles within masses that are the cause that generates the various fields of energy that get objects in movement.  Gravity is just the integration of all these fields of energy (at the limit) into one comprehensive field called gravity. If measured accurately, G should be different at every point in space/time.  We have to determine the area that we are interested in for the integral of G at the limit of that area.  Given that human activities are changing the earth and its climatic ecosystem, I think G has changed dramatically in many locations and needs to be measured accurately for potential catastrophic zones on earth.

Learning paradigm for our survival; (Nov. 9, 2009)

Einstein, the great theoretical physicist, confessed that most theoretical scientists are constantly uneasy until they discover, from their personal experiences, natural correspondences with their abstract models.  I am not sure if this uneasiness is alive before or after a mathematician is an expert professional. 

For example, mathematicians learn Riemann’s metrics in four-dimensional spaces and solve the corresponding problems. How many of them were briefed that this abstract construct, which was invented two decades before relativity, was to be used as foundation for modern science? Would these kinds of knowledge make a difference in the long run for professional mathematicians?

During the construction of theoretical (mathematical) models, experimental data contribute to revising models to take into account real facts that do not match previous paradigms. I got to thinking: if mathematicians receive scientific experimental training at the university and are exposed to various scientific fields, they might become better mathematicians by becoming aware of the scientific problems and be capable of matching purely mathematical models to the corresponding natural or social phenomena that are defying comprehension.

By the way, I am interested to know if there are special search engines for mathematical concepts and models that can be matched to those used in fields of sciences.  By now, it would be absurd if no projects have worked on sorting out the purely mathematical models and theories that are currently applied in sciences.

I got this revelation.  Schools use different methods for comprehending languages and natural sciences.  Kids are taught the alphabet, words, syntax, grammar, and spelling, and then much later are asked to compose essays.  Why is this process not applied in learning natural sciences?

Why are students learning math not asked to write essays on how the formulas and equations they have learned apply to natural or social realities?

I have strong disagreement on the pedagogy of learning languages:

First, we know that children learn to talk years before they can read. Why then are kids not encouraged to tell verbal stories before they can read?  Why are kids’ stories not recorded and translated into written words to encourage the kids to realize that what they read is indeed another storytelling medium?

Second, we know that kids have excellent capabilities to memorize verbally and visually entire short sentences before they understand the fundamentals. Why don’t we develop their cognitive abilities before we force upon them the traditional malignant methodology?  The proven outcomes are that kids are devoid of verbal intelligence, hate to read, and would not attempt to write, even after they graduate from universities.

Arithmetic, geometry, algebra, and math are used as the foundations for learning natural sciences. The Moslem scientist and mathematician Ibn Al Haitham set the foundation for required math learning, in the year 850, if we are to study physics and sciences. Al  Haitham said that it is almost impossible to do science without strong math background. 

Ibn Al Haitham wrote math equations to describe the cosmos and its movement over 9 centuries before Kepler emulated Ibn Al Haitham’s analysis. Currently, Kepler is touted as the discoverer of modern astronomy science.

We learn to manipulate equations; we then are asked to solve examples and problems by finding the proper equations that correspond to the natural problem (actually, we are trained to memorize the appropriate equations that apply to the problem given!).  Why are we not trained to compose a story that corresponds to an equation, or set of equations (model)?

If kids are asked to compose essays as the final outcome of learning languages, why are students not trained to compose the natural phenomena from a given set of equations? Would not that be the proper meaning of comprehending the physical world or even the world connected with human behavior?

Would not the skill of modeling a system be more meaningful and straightforward after we learn to compose a world from a model or set of equations?  Consequently, scientists and engineers, by researching natural phenomena and man-made systems that correspond to the mathematical models, would be challenged to learn about natural phenomena. Thus, their modeling abilities would be enhanced, more valid, and more instructive!

If mathematicians are trained to compose or view the appropriate natural phenomenon and human behavior from equations and mathematical models, then the scientific communities in natural and human sciences would be far richer in quality and quantity.

Our survival needs mathematicians to be members of scientific teams.  This required inclusion would be the best pragmatic means into reforming math and sciences teaching programs.

Note: This post is a revised version of “Oh, and I hate math: Alternative teaching methods (February 8, 2009)”.



