Adonis Diaries


Neural Network? Sciences or networking? 

I took a couple of graduate courses on neural networks in 1989, when the field was just emerging, covering how these models are built and how experiments are designed and interpreted using this psychology-inspired computer learning algorithm.

What Does a Neural Network Actually Do?

There has been a lot of renewed interest lately in neural networks (NNs) due to their popularity as a model for deep learning architectures (there are non-NN deep learning approaches based on sum-product networks and support vector machines with deep kernels, among others).

Perhaps due to their loose analogy with biological brains, the behavior of neural networks has acquired an almost mystical status. This is compounded by the fact that theoretical analysis of multilayer perceptrons (one of the most common architectures) remains very limited, although the situation is gradually improving.

To gain an intuitive understanding of what a learning algorithm does, I usually like to think about its representational power, as this provides insight into what can, if not necessarily what does, happen inside the algorithm to solve a given problem.

I will do this here for the case of multilayer perceptrons. By the end of this informal discussion I hope to provide an intuitive picture of the surprisingly simple representations that NNs encode.

I should note at the outset that what I will describe applies only to a very limited subset of neural networks, namely the feedforward architecture known as a multilayer perceptron.

There are many other architectures that are capable of very different representations. Furthermore, I will be making certain simplifying assumptions that do not generally hold even for multilayer perceptrons. I find that these assumptions help to substantially simplify the discussion while still capturing the underlying essence of what this type of neural network does. I will try to be explicit about everything.

Let’s begin with the simplest configuration possible: two inputs nodes wired to a single output node. Our NN looks like this:

Figure 1

The label associated with a node denotes its output value, and the label associated with an edge denotes its weight. The topmost node h represents the output of this NN, which is:

h = f\left(w_1 x_1+w_2 x_2+b\right)

In other words, the NN computes a linear combination of the two inputs x_1 and x_2, weighted by w_1 and w_2 respectively, adds an arbitrary bias term b and then passes the result through a function f, known as the activation function.

There are a number of different activation functions in common use and they all typically exhibit a nonlinearity. The sigmoid activation f(a)=\frac{1}{1+e^{-a}}, plotted below, is a common example.

Figure 2
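To make this concrete, here is a minimal Python sketch (my own illustration, with arbitrary example weights) of the single output node and the sigmoid activation defined above:

```python
import math

def sigmoid(a):
    # Sigmoid activation: squashes any real input into (0, 1).
    return 1.0 / (1.0 + math.exp(-a))

def neuron(x1, x2, w1, w2, b, f=sigmoid):
    # Single output node: weighted sum of the two inputs plus a bias,
    # passed through the activation function f.
    return f(w1 * x1 + w2 * x2 + b)

# Example weights and bias (chosen arbitrarily for illustration):
h = neuron(0.5, 0.5, 1.0, 1.0, -1.0)  # sigmoid(0.5 + 0.5 - 1) = sigmoid(0) = 0.5
```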

As we shall see momentarily, the nonlinearity of an activation function is what enables neural networks to represent complicated input-output mappings.

The linear regime of an activation function can also be exploited by a neural network, but for the sake of simplifying our discussion, we will choose an activation function without a linear regime. In other words, f will be a simple step function:

Figure 3

This will allow us to reason about the salient features of a neural network without getting bogged down in the details.

In particular, let’s consider what our current neural network is capable of. The output node can generate one of two values, and this is determined by a linear weighting of the values of the input nodes. Such a function is a binary linear classifier.

As shown below, depending on the values of w_1 and w_2, one regime in this two-dimensional input space yields a response of 0 (white) and the other a response of 1 (shaded):

Figure 4
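In code, this binary linear classifier is just the same single-node computation with a step activation (a quick sketch of my own; the particular weights are arbitrary):

```python
def step(a):
    # Step activation: 1 when the weighted sum is positive, 0 otherwise.
    return 1 if a > 0 else 0

def classify(x1, x2, w1, w2, b):
    # Binary linear classifier: the line w1*x1 + w2*x2 + b = 0
    # splits the input plane into a 0-region and a 1-region.
    return step(w1 * x1 + w2 * x2 + b)

# With w1 = 1, w2 = -1, b = 0 the decision boundary is the diagonal x2 = x1:
print(classify(2.0, 1.0, 1.0, -1.0, 0.0))  # below the diagonal -> 1
print(classify(1.0, 2.0, 1.0, -1.0, 0.0))  # above the diagonal -> 0
```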

Let’s now add two more output nodes (a neural network can have more than a single output). I will need to introduce a bit of notation to keep track of everything. The weight associated with an edge from the jth node in the first layer to the ith node in the second layer will be denoted by w_{ij}^{(1)}. The output of the ith node in the nth layer will be denoted by a_i^{(n)}.

Thus x_1 = a_1^{(1)} and x_2 = a_2^{(1)}.

Figure 5

Every output node in this NN is wired to the same set of input nodes, but the weights are allowed to vary. Below is one possible configuration, where the regions triggering a value of 1 are overlaid and colored in correspondence with the colors of the output nodes:

Figure 6

So far we haven’t really done anything, because we just overlaid the decision boundaries of three linear classifiers without combining them in any meaningful way. Let’s do that now, by feeding the outputs of the top three nodes as inputs into a new node.

I will hollow out the nodes in the middle layer to indicate that they are no longer the final output of the NN.

Figure 7

The value of the single output node at the third layer is:

a_1^{(3)} = f \left(w_{11}^{(2)} a_1^{(2)}+w_{12}^{(2)} a_2^{(2)}+w_{13}^{(2)} a_3^{(2)}+b_1^{(2)}\right)

Let’s consider what this means for a moment. Every node in the middle layer is acting as an indicator function, returning 0 or 1 depending on where the input lies in \mathbb{R}^2.

We are then taking a weighted sum of these indicator functions and feeding it into yet another nonlinearity. The possibilities may seem endless, since we are not placing any restrictions on the weight assignments.

In reality, characterizing the set of NNs (with the above architecture) that exhibit distinct behaviors does require a little bit of work (see Aside), but the point, as we shall see momentarily, is that we do not need to worry about all such possibilities.

One specific choice of assignments already gives the key insight into the representational power of this type of neural network. By setting all weights in the middle layer to 1/3, and setting the bias of the middle layer (b_1^{(2)}) to -1, the activation function of the output neuron (a_1^{(3)}) will output 1 whenever the input lies in the intersection of all three half-spaces defined by the decision boundaries, and 0 otherwise.

Since there was nothing special about our choice of decision boundaries, we are able to carve out any arbitrary polygon and have the NN fire precisely when the input is inside the polygon (in the general case we set the weights to 1/k, where k is the number of hyperplanes defining the polygon).

Figure 8
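Here is a minimal Python sketch of this construction for a triangle bounded by three half-planes (the specific boundaries are my own choice for illustration). I take the step function to fire at zero, so that the 1/k weights with a bias of -1 fire exactly on the intersection:

```python
def step(a):
    # Fire when the activation is non-negative.
    return 1 if a >= 0 else 0

# Three hidden units, one per half-plane bounding a triangle:
#   x1 >= 0,  x2 >= 0,  x1 + x2 <= 1
hidden = [
    (1.0, 0.0, 0.0),    # (w1, w2, b) encoding x1 >= 0
    (0.0, 1.0, 0.0),    # x2 >= 0
    (-1.0, -1.0, 1.0),  # x1 + x2 <= 1
]

def in_polygon(x1, x2):
    # Hidden layer: each unit is an indicator for one half-plane.
    a = [step(w1 * x1 + w2 * x2 + b) for (w1, w2, b) in hidden]
    # Output unit: weights 1/k and bias -1 fire only when every
    # hidden unit fires, i.e. on the intersection of the half-planes.
    k = len(hidden)
    return step(sum(ai / k for ai in a) - 1.0)

print(in_polygon(0.2, 0.2))  # inside the triangle -> 1
print(in_polygon(0.8, 0.8))  # outside (x1 + x2 > 1) -> 0
```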

This fact demonstrates both the power and limitation of this type of NN architecture.

On the one hand, it is capable of carving out decision boundaries comprised of arbitrary polygons (or more generally polytopes). Creating regions comprised of multiple polygons, even disjoint ones, can be achieved by adding a set of neurons for each polygon and setting the weights of their respective edges to 1/k_i, where k_i is the number of hyperplanes defining the ith polygon.

This explains why, from an expressiveness standpoint, we don’t need to worry about all possible weight combinations: defining a binary classifier over unions of polygons is all we can do. Any combination of weights that we assign to the middle layer in the above NN will result in a discrete set of values, up to one unique value per region formed by the union or intersection of the half-spaces defined by the decision boundaries, that are input to the a_1^{(3)} node.

Since the bias b_1^{(2)} can only adjust the threshold at which a_1^{(3)} will fire, then the resulting behavior of any weight assignment is activation over some union of polygons defined by the shaded regions.

Thus our restricted treatment, where we only consider weights equal to 1/k, already captures the representational power of this NN architecture.

A few caveats merit mention.

First, the above says nothing about representational efficiency, only power. A more thoughtful choice of weights, presumably identified by training the NN using backpropagation, can provide a more compact representation comprised of a smaller set of nodes and edges.

Second, I oversimplified the discussion by focusing only on polygons. In reality, any intersection of half-spaces is possible, even ones that do not result in bounded regions.

Third, and most seriously, feedforward NNs are not restricted to step functions for their activation functions. In particular, modern NNs that utilize Rectified Linear Units (ReLUs) most likely exploit their linear regions.

Nonetheless, the above simplified discussion illustrates a limitation of this type of NN. While it is able to represent any boundary with arbitrary accuracy, this comes at a significant cost, much like the cost of polygonally rendering smoothly curved objects in computer graphics.

In principle, NNs with sigmoidal activation functions are universal approximators, meaning they can approximate any continuous function with arbitrary accuracy. In practice I suspect that real NNs with a limited number of neurons behave more like my simplified toy models, carving out sharp regions in high-dimensional space, but on a much larger scale.

Regardless, NNs still provide far more expressive power than most other machine learning techniques, and my focus on \mathbb{R}^2 disguises the fact that even simple decision boundaries, operating in high-dimensional spaces, can be surprisingly powerful.

Before I wrap up, let me highlight one other aspect of NNs that this “union of polygons” perspective helps make clear.

It has long been known that an NN with a single hidden layer, i.e. the three-layer architecture discussed here, is equal in representational power to a neural network with arbitrary depth, as long as the hidden layer is made sufficiently wide.

Why this is so is obvious in the simplified setting described here, because unions of sets of unions of polygons can be flattened out in terms of unions of the underlying polygons. For example, consider the set of polygons formed by the following 10 boundaries:

Figure 9

We would like to create 8 neurons that correspond to the 8 possible activation patterns formed by the polygons (i.e. fire when the input is in none of them (1 case), one of them (3 cases), two of them (3 cases), or all of them (1 case)).
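The count of 8 is just the number of binary membership patterns over three regions, which a quick sketch can enumerate:

```python
from itertools import product

# Each input point either lies inside a given polygon (1) or outside it (0),
# so three polygons yield 2**3 = 8 possible activation patterns.
patterns = list(product([0, 1], repeat=3))
print(len(patterns))  # 8
```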

In the “deep” case, we can set up a four-layer NN such that the second layer defines the edges, the third layer defines the polygons, and the fourth layer contains the 8 possible activation patterns:

Figure 10

The third layer composes the second layer, by creating neurons that are specific to each closed region.

However, we can just as well collapse this into the following three-layer architecture, where each neuron in the third layer “rediscovers” the polygons and how they must be combined to yield a specific activation pattern:

Figure 11

Deeper architectures allow deeper compositions, where more complex polygons are made up of simpler ones, but in principle all this complexity can be collapsed onto one (hidden) layer.

There is a difference in representational efficiency however, and the two architectures above illustrate this important point.

While the three-layer approach is just as expressive as the four-layer one, it is not as efficient: the three-layer NN has a 2-10-8 configuration, resulting in 100 parameters (20 edges connecting first to second layer plus 80 edges connecting second to third layer), while the four-layer NN, with a 2-10-3-8 configuration, only has 74 parameters.
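The edge counts above are easy to verify with a one-line sketch (my own helper, counting only weights, not biases, as the post does):

```python
def edge_count(layers):
    # Number of weight parameters (edges) in a fully connected feedforward
    # network with the given layer sizes, e.g. (2, 10, 8); biases not counted.
    return sum(m * n for m, n in zip(layers, layers[1:]))

print(edge_count((2, 10, 8)))     # three-layer NN: 2*10 + 10*8 = 100
print(edge_count((2, 10, 3, 8)))  # four-layer NN: 2*10 + 10*3 + 3*8 = 74
```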

Herein lies the promise of deeper architectures, by enabling the inference of complex models using a relatively small number of parameters. In particular, lower-level features such as the polygons above can be learned once and then reused by higher layers of the network.

That’s it for now. I hope this discussion provided some insight into the workings of neural networks.

If you’d like to read more, see the Aside, and I also recommend this blog entry by Christopher Olah which takes a topological view of neural networks.

Update: HN discussion here.

How were living organisms created?

From “A short history of nearly everything” by Bill Bryson

When it was created, Earth had no oxygen in its environment.

Cyanobacteria or algae break down water by absorbing the hydrogen and releasing the oxygen as waste, which is actually very toxic to every anaerobic organism.

Our white blood cells actually use oxygen to kill invading bacteria.  This process of releasing oxygen is called photosynthesis, undoubtedly the most important single metabolic innovation in the history of life on the planet.

It took two billion years for our environment to accumulate 20% of oxygen, since oxygen was absorbed to oxidize every conceivable mineral on Earth, rust the mineral, and sink it in the bottom of oceans.

Life started when special bacteria used oxygen to summon up enough energy to work and photosynthesize.

Mitochondria, tiny organelles, manipulate oxygen in a way that liberates energy from foodstuffs. They are voracious little things, so small that a billion of them could be packed into a grain of sand.

Mitochondria maintain their own DNA, RNA, and ribosomes, and behave as if they think things might not work out between us.

They look like bacteria, divide like bacteria, and sometimes respond to antibiotics the way bacteria do; they live in our cells but do not speak the same genetic language.

The truly nucleated cells are called eukaryotes and we ended up with two kinds of them: those that expel oxygen, like plants, and those that take in oxygen, like us.

A single-celled eukaryote contains 400 million bits of genetic information in its DNA, enough to fill 80 books of 500 pages each. It took a billion years for eukaryotes to learn to assemble into complex multi-cellular beings.

Microbes or bacteria form an intrinsic unit with our body and are essential to our survival. They number in the trillions, grazing on our fleshy plains and breaking down our foodstuff and waste into elements useful for our survival.

They synthesize vitamins in our guts, convert food into sugar and polysaccharides and go to war on alien microbes; they pluck nitrogen from the air and convert it into useful nucleotides and amino acids for us, a process that is extremely difficult to manufacture industrially.

Microbes continue to regenerate the air that we breathe with oxygen. Microbes are very prolific: a single one can split and generate 280 billion offspring within a day.

In every million divisions, a microbe may produce a mutant with a slight characteristic that can resist antibodies.

The most troubling is that microbes are endowed with the ability to evolve rapidly and acquire the genes of the mutants and become a single invincible super-organism; any adaptive change that occurs in one area of the bacterial province can spread to any other.

Microbes are generally harmless unless, by accident, they move from a specialized location in the body to another location such as the blood stream, for example, or are attacked by viruses, or our white blood cells go on a rampage.

Microbes can live almost anywhere; some were found in nuclear power generators feeding on uranium, some in the deep seas, some in sulfuric environments, some in extreme climates, and some can survive in sealed bottles for hundreds of years, as long as there is anything to feed on.

Viruses or phages can infect bacteria. A virus is not alive; it is nucleic acid, inert and harmless in isolation and visible only under an electron microscope. Viruses barely have ten genes; even the smallest bacteria require several thousand. But introduce them into a suitable host and they burst into life.

Viruses prosper by hijacking the genetic material of a living cell and reproduce in a fanatical manner.  About 5,000 types of virus are known and they afflict us with the flu, smallpox, rabies, yellow fever, Ebola, polio and AIDS.

Viruses burst upon the world in some new and startling form and then vanish as quickly as they came after killing millions of individuals in a short period.

There are billions of species. Tropical rainforests, which cover only 6% of the Earth’s surface, harbor more than half of its animal life and two-thirds of its flowering plants.

A quarter of all prescribed medicines are derived from just 40 plants, and another 16% come from microbes.

The discovery of new flowery plants might provide humanity with chemical compounds that have passed the “ultimate screening program” over billions of years of evolution.

A tenth of the weight of a six-year-old pillow is made up of mites, living or dead, and mite dung; washing at low temperature just gets the mites cleaner!

Einstein speaks on theoretical physics; (Nov. 18, 2009)

The creative character of the theoretical physicist is that the products of his imagination are so indispensably and naturally impressed upon him that they are no longer images of the spirit but evident realities. Theoretical physics consists of a set of concepts and logical propositions that can be deduced from them. Those deductive propositions are assumed to correspond exactly to our individual experiences. That is why, in a theoretical book, the deductive exercises represent the entire work.

Newton had no hesitation in believing that his fundamental laws were provided directly from experience. At that period the notions of space and time presented no difficulties: the concepts of mass, inertia, force, and their direct relationships seemed to be directly delivered by experience. Newton realized that no experience could correspond to his notion of absolute space, which implicates absolute inertia, or to his reasoning about action at a distance; nevertheless, the success of the theory for over two centuries prevented scientists from realizing that the base of this system is absolutely fictive.

Einstein said: “the supreme task of a physicist is to search for the most general elementary laws and then acquire an image of the world by pure deductive power. The world of perception determines rigorously the theoretical system, though no logical route leads from perception to the principles of theory.” Mathematical concepts can be suggested by experience, the unique criterion for the utilization of a mathematical construct, but never deduced from it. The fundamental creative principle resides in mathematics.

Attempts to deduce logically from experiments the validity of the Newtonian system of mechanics were doomed to failure. Research by Faraday and Maxwell on electromagnetic fields initiated the rupture with classical mechanics. There was this interrogation: “if light is constituted of material particles, then where does the matter disappear to when light is absorbed?” Maxwell thus introduced partial differential equations to account for deformable bodies in the wave theory. Electrical and magnetic fields are considered dependent variables; thus, physical reality no longer had to be conceived as material particles but as continuous partial differential fields; but Maxwell’s equations still emulate the concepts of classical mechanics.

Max Planck had to introduce the hypothesis of quanta (for small particles moving at slow speed but with sufficient acceleration), which was later confirmed, in order to compute the results of thermal radiation that were incompatible with classical mechanics (still valid for situations at the limit). Max Born pronounced: “Mathematical functions have to determine by computation the probabilities of discovering the atomic structure in one location or in movement.”

Louis de Broglie and Schrödinger demonstrated the field theory’s operation with continuous functions. Since in the atomic model there is no way of locating a particle exactly (Heisenberg), we may conserve the entire electrical charge at the limit where the density of the particle is considered nil. Dirac and Lorentz showed how the field and the particles of electrons interact on equal footing to reveal reality. Dirac observed that it would be illusory to theoretically describe a photon, since we have no means of confirming whether a photon passed through a polarizer placed obliquely on its path.

Einstein was persuaded that nature represents what we can imagine exclusively in mathematics as the simplest system of concepts and principles to comprehend nature’s phenomena. For example, if the metric of Riemann is applied to a continuum of four dimensions, then the theory of relativity of gravity in a void space is the simplest. If we select fields of anti-symmetrical tensors that can be derived, then the equations of Maxwell are the simplest in void space.

The “spins” that describe the properties of electrons can be related to the mathematical concept of “semi-vectors” in 4-dimensional space, which can describe two kinds of elementary particles of equal charges but of different signs. Those semi-vectors describe the magnetic field of elements in the simplest way, as well as the properties of electrical particles. There is no need to localize any particle rigorously; we can just propose a portion of 3-dimensional space where, at the limit, the electrical density disappears but the total electrical charge, represented by a whole number, is retained. The enigma of quanta can thus be entirely resolved if such a proposition is revealed to be exact.


Until the first quarter of the 20th century, the sciences were driven by sheer mathematical constructs. This was a natural development, since most experiments in the natural sciences were done by varying one factor at a time; experimenters never used more than one independent variable and one dependent variable (the objective measuring variable, or the data). Although the theory of probability was very advanced, the field of practical statistical analysis of data was not yet developed; it was a real pain and very time consuming to do all the computations by hand for even slightly complex experimental designs. Sophisticated and specialized statistical packages for different fields of research evolved only after computers, those mass number crunchers, were invented.

Thus, early theoretical scientists refrained from complicating their constructs simply because experimental scientists could not practically deal with complex mathematical constructs. The theoretical scientists therefore promoted the concept, or philosophy, that theories should be as simple as possible, with the fewest axioms (fundamental principles), and did their best to imagine one general causative factor that affected the behavior of natural phenomena or would be applicable to most of them.

This is no longer the case. The good news is that experiments are more complex and show interactions among the factors. Nature is complex; no matter how you control an experiment to reduce the number of manipulated variables to a minimum, there is always more than one causative factor, interrelated and interacting to produce effects.

Consequently, sophisticated experiments with their corresponding data are making the mathematician’s job more straightforward when pondering a particular phenomenon. It is possible to synthesize two phenomena at a time before generalizing to a third; mathematicians have no need to jump to general concepts in one step; they can consistently move forward on a firm basis of data. Mathematics will remain the best synthesis tool for comprehending nature and man’s behaviors.

It is time to account for all the possible causative factors, especially those that are rare in probability of occurrence (at the very end tail of the probability graphs) or imagined to have little contributing effect: it is those rare events that have surprised man with catastrophic consequences.

            Theoretical scientists of nature’s variability should acknowledge that nature is complex. Simple and beautiful general equations are out the window. Studying nature is worth a set of equations! (You may read my post “Nature is worth a set of equations”)

Cognitive mechanisms; (Dec. 26, 2009)

Before venturing into this uncharted territory, let me state that there is a “real universe” that each one of us perceives differently: if this real world didn’t exist, then there would be nothing to perceive. The real world cares little about the notions of time and space. No matter how we rationalize about the real world, our system of comprehension is strictly linked to our brain/senses system of perception. The way animals perceive the universe is different from our perception. All we can offer are bundles of hypotheses that can never be demonstrated or confirmed, even empirically. The best we can do is to extend the hypothesis that our perceived universe correlates (in a qualitative, coherent resemblance) with the real universe.

The notions of time, space, and causality lie within our perceived universe. Each individual has his own “coherent universe” that is as valid as any other perception. What rational logic and empirical experiments have discovered as “laws of nature” applies only to our perceived universe; mainly to what is conveniently labeled the category of grown-up “normal people” who do not suffer major brain disturbances or defects.

Man uses symbols such as language, alphabets, mathematical forms, and musical notation to record his cognitive performances. The brain uses a “binary code” of impressions and intervals of non-impressions to register a codified impression. Most probably, the brain creates all kinds of cells and chemicals to categorize, store, classify, and retrieve various impressions; the rationale is that, no matter how fast an impression is, the trillions and trillions of impressions would otherwise saturate the intervals between sensations in no time.

We are born with 25% of the total number of synapses that a grown-up will form. Neurons have mechanisms for transferring from one section of the brain to other parts when frequent, focused cognitive processes are needed. A child can perceive one event following another, but this has no further meaning beyond simple observation. A child is not surprised by magical outcomes; what is out of the ordinary for a grown-up is as valid a phenomenon as any other to him (an elephant can fly). We know that vision and auditory sensations pass through several filters (processed data) before being perceived by the brain. The senses of smell and taste circumvent these filters and are sensed by the limbic (primeval) brain before the data is passed on to cognition.

The brain attaches markers or attributes to the impressions it receives. Four markers, which I call exogenous markers, attach to impressions as they are “registered” or perceived in the brain, coming from the outside world through our senses. At least four other markers, which I label “endogenous markers”, are attached to internal cognitive processing, appended to information when re-structuring or re-configuration is performed during dream periods: massive computations are applied to stored data before it is transformed into other, readily useful data and the endogenous markers are attributed to it for registering in other memory banks. There are also markers that I call “reverse-exogenous”, attached to information meant to be exported from the brain to the outside world. They are mainly of two kinds: body-language information (such as head, hand, shoulder, or eye movements) and types recorded on external media such as writing, painting, sculpting, singing, playing instruments, or performing artwork.

The first exogenous marker orders impressions from the senses in their succession. The child recognizes that one event followed another within a short period of occurrence. His brain can “implicitly” store that the two events follow in succession in a qualitative order (for example, the duration of one succession is shorter or longer than another). I label this marker the “time recognizer”, in the qualitative sense of sensations.

The second marker registers and then stores an impression as a spatial configuration. At this stage, the child is able to recognize the concept of space, but in a qualitative order; for example, this object is closer to or further from the other object. I call this marker the “space recognizer”.

The third marker is the ability to delimit a space when focusing on a collection of objects. Without this ability to first limit the range of observation (or sensing in general), it would be hard to register parts and bits of impressions within a first cut of a “coherent universe”. I label this marker the “spatial delimiter”.

The fourth marker attaches a “strength” or “weight” of occurrence as the impression is recognized in the database. The child cannot count, but the brain is already using this marker for incoming information. In a sense, the brain assembles events and objects into a special “frequency of occurrence” database during dream periods, and the information is retrieved in qualitative order of strength of sensation in frequency. I call this attribute the “count marker”.

The fifth marker is an endogenous attribute: it is attached within the internal export/import of information in the brain. This attribute is a kind of “correlation” quantity that indicates same or different trends in the behavior of events or objects. In a sense, this marker internally sorts out data as “analogous” or contrary collections along a time scale. People tend to associate correlation with a cause-and-effect relation, but it is not one. A correlation quantity can be positive (two variables have the same behavioral trend in a system) or negative (diverging trends). With the emergence of the fifth marker, the brain has grown a quantitative threshold in synapses and neurons for starting massive computations on impressions stored in the large original database, or what is called “long-term memory”.

The sixth marker is a kind of “probability quantity” that permits the brain to order objects according to “plausible” invariant properties in space (for example, objects or figures are similar according to a particular property, including symmetrical transformations). I label this the “invariant marker”; it re-structures collections of objects and shapes into structures such as hereditary, hierarchical, network, or circular ones.

The seventh marker I call the “association attribute”. Methods of deduction, induction, and other logical manipulations fall within these kinds of data types. They are mostly generated from rhetorical associations such as analogies, metaphors, antonyms, and other categories of association. No intuition or creative idea lies outside the boundary of the brain’s prior recognition. Constant focus and work on a concept generate complex processing during the dream stage. The conscious mind recaptures sequences from the dream state, most of the time unconsciously. What knowledge does is decode, in formal systems, the basic processes of the brain and then re-order what seems like chaotic firing in brain cells. Symbols were created to facilitate the writing of rules for precise rationalization.

The eighth marker I call the “design marker”; it recognizes interactions among variables and interacts with the reverse-exogenous markers, since a flow with outside perceptions is required for comprehension. Simple perceived relationships between two events or variables are usually trivial and mostly wrong; for example, thunder follows lightning and is thus wrongly interpreted as lightning generating thunder. Simple interactions are of the existential kind, as in the Pavlov reactions, where existential rewards, such as food, are involved in order to generate the desired reactions. The Pavlov reaction laws apply to man too. Interactions among more than two variables are complex to interpret in the mind and require plenty of training and exercise. Designing experiments is a very complex cognitive task and not amenable to intuition: it requires learning and training to appreciate the various causes and effects among the variables.

The first kind of “reverse-exogenous” marker can readily be witnessed in animals, such as in the body language of head, hand, shoulder, or eye movements; otherwise Pavlov’s experiments could not have been conducted, had the animals not reacted with any external signs. In general, rational thinking retrieves data from specialized databases, a “cognitive working memory” of already-processed data saved for pragmatic utility. Working memories develop once data finds outlets to the external world for recording; thus, pure thinking without attempting to record ideas degrades the cognitive processes into sterile internal transfer, with no new empirical information to compute on.

An important reverse-exogenous marker is sitting still, concentrating, emptying our mind of external sensations, and relaxing the mind of conscious efforts of perceiving the knowledge “matter” in order to experience the “cosmic universe”.

This article was not meant to analyze emotions or moral value systems. It is very probable that the markers described above are valid for moral value systems, with less computation applied to the data transferred to the “moral working memory”. I believe more sophisticated computations are performed on these data than on emotional data, since a system is constructed for frequent “refreshing” with age and experience.

I conjecture that emotions are generated from the vast original database, with the endogenous correlation marker as the main computation method. The reason is that emotions are related to complex and almost infinite interactions with people and community; the brain prefers not to spend time and resources on computations that involve many thousands of variables interacting simultaneously. Thus, an emotional reaction in the waking period is not necessarily “rational” but of the quick-and-dirty kind. In dream sessions, emotionally loaded impressions are barely processed, because they are hidden deep in the vast original database and are not refreshed frequently enough to be exposed to the waking conscious cognitive processes; thus, they flare up within the emotional reaction packages.

Note: The brain is a flexible organic matter that can be trained and developed by frequent “refreshing” of interactions with the outside world of sensations. Maybe animals lack the reverse exogenous markers needed to record their cognitive capabilities; more likely, their cognitive working memory is shriveled because evolution did not endow them with external limbs suited to writing, sculpting, painting, or making music. The fact that chimps have been trained to externalize cognition at the level of a 5-year-old child suggests that attaching artificial limbs compatible with human tools to chimps, cats, or dogs would demonstrate that they can give far better cognitive performances than expected.

This is a first draft to get the project going. I appreciate developed comments and references.

How random and unstable are your phases? (Dec. 7, 2009)

There are phenomena in the natural world that behave randomly, or at least seem chaotic, such as percolation and the “Brownian movement” of gases. The study of phases in equilibrium in chaotic, random, and unstable physical systems was taken up first by physicists and then by modern mathematicians.

The mathematician Wendelin Werner (Fields Medal) researched how the borders that separate two phases in equilibrium in random, unstable physical systems behave; he published “Random Planar Curves…”

The behavior of identical elements (particles) in large numbers may produce deterministic results in some cases and random results in others.

For example, if we toss a coin many times, we might guess that heads and tails will occur in roughly equal numbers; the trick is that we cannot predict which of the two will end up in the majority.
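This can be checked with a quick simulation. The sketch below (plain Python, with arbitrary parameters of my own choosing) repeats the n-toss experiment many times: the proportion of heads hugs 1/2, yet the average absolute margin between heads and tails keeps growing with n, so neither side’s majority is predictable.

```python
import random

def coin_toss_margins(n_tosses, trials=1000, seed=42):
    """Repeat an experiment of n fair coin tosses many times.
    Return (average proportion of heads, average |heads - tails| margin)."""
    rng = random.Random(seed)
    total_heads = 0
    total_margin = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
        total_heads += heads
        total_margin += abs(2 * heads - n_tosses)  # heads - tails
    return total_heads / (trials * n_tosses), total_margin / trials

prop, margin = coin_toss_margins(10_000)
# prop sits very close to 0.5, while margin is large (it grows roughly
# like the square root of n), so the majority side stays unpredictable.
```

The growing margin is the same phenomenon that makes the systems discussed below unstable at large scales despite their local regularity.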

These probabilistic situations inspired the development of purely mathematical tools. The curves between phases in equilibrium appear random, but they share several characteristics:

First, the curves have auto-similarity, meaning that the study of a small portion can be generalized to the macro-level with the same properties, as with “fractal curves”;

The second characteristic is that, even if the general behavior is chaotic, a few properties remain the same (mainly, the random curves have the same “fractal dimension”, or irregular shape);

The third is that these systems are very unstable (unlike the game of heads and tails), in the sense that changing the behavior of a small portion leads, by propagation, to large changes on a big scale. Thus, these systems are classified mathematically as belonging to infinite complexity theories.

Themes of unstable and random systems were first studied by physicists, a few of whom received Nobel Prizes, such as Kenneth Wilson in 1982.

The research demonstrated that such systems are “invariant” under transformations (physicists use the term renormalization) that permit passage from one scale to a superior scale. A concrete example is percolation.

Let us take a net resembling a beehive, where each cavity (alveolus) is colored black or red using the heads-and-tails flipping of an unbiased coin. Then we study how these cells connect randomly on a plane surface.
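A minimal percolation experiment can be sketched in a few lines. For simplicity I use a square grid rather than the honeycomb lattice of the text (the honeycomb case only changes the neighbor geometry, not the idea): each cell is colored by a fair coin flip, and a breadth-first search tests whether a black cluster connects the top edge to the bottom edge.

```python
import random
from collections import deque

def percolates(size, p=0.5, seed=1):
    """Color each cell of a size x size grid black with probability p
    (a fair-coin flip when p = 0.5), then test whether a connected
    black cluster joins the top row to the bottom row."""
    rng = random.Random(seed)
    grid = [[rng.random() < p for _ in range(size)] for _ in range(size)]
    seen = [[False] * size for _ in range(size)]
    # Start the search from every black cell in the top row.
    queue = deque((0, c) for c in range(size) if grid[0][c])
    for _, c in queue:
        seen[0][c] = True
    while queue:
        r, c = queue.popleft()
        if r == size - 1:
            return True  # a black path spans the grid
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < size and 0 <= nc < size and grid[nr][nc] and not seen[nr][nc]:
                seen[nr][nc] = True
                queue.append((nr, nc))
    return False
```

Running this for many seeds at various p values shows the abrupt change of behavior near the critical probability, which is precisely where the random interface curves studied by Smirnov and Werner appear.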

The Russian Stanislav Smirnov demonstrated that the borders exhibit “conformal invariance”, a concept developed by Bernhard Riemann in the 19th century using complex numbers. “Conformal invariance” means that it is always possible to warp a rubber disk covered with thin crisscross patterns so that lines intersecting at right angles before the deformation still intersect at right angles after it. The set of transformations that preserve angles is large and can be written as series of whole numbers, a kind of polynomial of infinite degree. The transformations in the percolation problem conserve the proportions of distances, that is, similitude.

The late Oded Schramm had this idea: suppose two countries share a disk, one controlling the left part and the other the right, with the common border crossing the disk. If we know a portion of the common border, we want to forecast the behavior of the next portion.

This task requires iterating random conformal transformations and computing the fractal dimension of the interface. We learn that random behavior on the micro-level exhibits the same behavior on the macro-level; thus, resolving these problems requires algebraic and analytical tools.

The other case is the “Brownian movement”, which consists of trajectories where each displacement is independent of the previous one (stochastic behavior). The interfaces of the “Brownian movement” are different in nature from those of percolation systems.
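The defining property, each displacement independent of all previous ones, is easy to simulate. The sketch below (my own illustration, using independent Gaussian steps as a discrete stand-in for Brownian motion) also exhibits the classic signature of such motion: the mean squared end-to-end distance grows linearly with the number of steps.

```python
import random

def mean_square_displacement(n_steps, trials=500, seed=3):
    """Average squared end-to-end distance of a 2-D walk whose steps
    are independent Gaussian displacements (discrete Brownian motion)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = y = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, 1.0)  # each step ignores all previous ones
            y += rng.gauss(0.0, 1.0)
        total += x * x + y * y
    return total / trials

# For unit-variance steps in 2-D the expected value is 2 * n_steps,
# so mean_square_displacement(100) comes out near 200.
```

This linear growth of the squared displacement, rather than of the displacement itself, is what distinguishes stochastic trajectories from ballistic ones.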

Usually, mathematicians associate a probability, the “critical exponent” or “intersection exponent”, with the event that two movements never meet, at least for a long time. Two physicists, Duplantier and Kyung-Hoon Kwon, advanced the idea that these critical exponents belong to a table of numbers of algebraic origin. The mathematical demonstration of Benoit Mandelbrot’s “conjecture”, or hypothesis, on fractal dimension used the percolation interface system.

Werner said: “In collaboration with Greg Lawler, we progressively comprehended the relationship between the interfaces of percolation and the borders of Brownian movement. Armed with Schramm’s theory, we knew that our approach was going to work and prove the conjecture related to Brownian movement.”

Werner went on: “It is unfortunate that the specialized media failed to mention the great technical feat of Grigori Perelman in demonstrating the Poincaré conjecture. His proof was not your run-of-the-mill deductive process with progressive purging and generalization; it was an analytic and human proof, where hands get dirty in order to control a bundle of possible singularities. These kinds of demonstrations require good knowledge of the underlying phenomena.”

As to what he considers a difficult problem, Werner said: “I take a pattern and then count the paths of length n that never pass twice through the same junction. This number increases exponentially with n; we think there is a corrective term of the type n to the power 11/32. We can now guess the reason for that term, but we cannot yet demonstrate it.”
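The paths Werner describes are known as self-avoiding walks. A brute-force count on the square lattice (my choice of lattice for illustration; the exact counts depend on the lattice, while the conjectured n^(11/32) correction is universal in 2-D) reproduces the exponential growth he mentions for small n:

```python
def count_self_avoiding_walks(n):
    """Count walks of n steps on the square lattice, starting at the
    origin, that never visit the same site twice."""
    def extend(pos, visited, steps_left):
        if steps_left == 0:
            return 1
        x, y = pos
        total = 0
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in visited:
                total += extend(nxt, visited | {nxt}, steps_left - 1)
        return total
    return extend((0, 0), {(0, 0)}, n)

counts = [count_self_avoiding_walks(n) for n in range(1, 5)]
# → [4, 12, 36, 100]
```

The brute-force enumeration blows up exponentially, which is exactly why closed-form results such as the 11/32 exponent are so hard to demonstrate.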

If we can predict the behavior of a phenomenon by studying a portion of it, then, once an invariant is recognized, the theory can most probably find counterparts in the real world; for example, virtual-imaging techniques exploit invariance among objects. It has been shown that vision is an operation of the brain adapted to the geometric invariants that characterize the images we see.

Consequently, stability in repeated signals generates the perception of reality. In math, “covariance laws” describe what happens when systems of reference are changed: for example, the Galileo transformations in classical mechanics and the Poincaré transformations in restricted relativity.
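As a toy illustration of such a covariance law (my own example, not from the text): under a Galilean change of frame, x' = x − v·t, measured velocities shift by v, but accelerations are unchanged, which is why Newton’s second law keeps the same form in every inertial frame.

```python
def position(t):
    """Trajectory x(t) = 3 t^2 + 2 t + 1 in the rest frame (acceleration = 6)."""
    return 3.0 * t * t + 2.0 * t + 1.0

def galilean(x, t, v):
    """Galileo transformation to a frame moving at speed v: x' = x - v t."""
    return x - v * t

def acceleration(f, t, h=1e-3):
    """Second-order central-difference estimate of f''(t)."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

# The same trajectory observed from a frame moving at v = 5.
def position_moving(t):
    return galilean(position(t), t, 5.0)

# Both observers measure acceleration ~6: the invariant of the transformation.
```

The acceleration is the stable, repeated signal shared by both frames; the positions and velocities are the frame-dependent appearances.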

In a sense, math codifies the brain’s processes of sensing, using symbolic languages and formulations.


Einstein speaks on General Relativity; (Nov. 20, 2009)

I have already posted two articles in the series “Einstein speaks on…”. This article describes Einstein’s theory of restricted relativity and then his concept of General Relativity, a theory meant to extend the physics of fields (for example, electric and magnetic fields, among others) to all natural phenomena, including gravity. Einstein declared that there was nothing speculative in his theory; it was adapted to observed facts.

The fundamentals are that the speed of light is constant in the void and that all systems of inertia are equally valid (each system of inertia has its own metric of time). The Michelson experiment demonstrated these fundamentals. The theory of restricted relativity keeps the continuum of space coordinates and time, measured by clocks and rigid bodies, but with a twist: the coordinates become relative, because they depend on the movement of the selected system of inertia.

The theory of General Relativity is based on the verified numerical correspondence between inertial mass and weight. This insight emerges when coordinate systems possess relative accelerations with respect to one another; thus each system of inertia has its own field of gravitation. Consequently, the movement of solid bodies no longer corresponds to Euclidean geometry, and neither does the movement of clocks. The coordinates of space-time are no longer independent. This new kind of metric already existed mathematically, thanks to the works of Gauss and Riemann.

Ernst Mach realized that classical mechanics describes movement without reference to its causes; there are no movements but those relative to other movements. In that case, acceleration can no longer be conceived as purely relative movement; Newton had to imagine a physical space in which acceleration would exist, and he logically announced an absolute space, which did not satisfy him but worked for two centuries. Mach tried to modify the equations so that they could refer to a space represented by the other bodies under study; his attempts failed, given the scientific knowledge of his time.

We know that space is influenced by the surrounding bodies, and so far I cannot see how General Relativity may surmount this difficulty satisfactorily, except by considering space as a closed universe, assuming that the average density of matter in the universe has a finite value, however small it might be.



