Franka Waaldijk's math & science & philosophy blog

A conceptual model for human image recognition: combining passive memory with active imagination

Posted on February 15, 2015 by franka waaldijk

When I was in college (in the ’80s), the question of why humans outperform computers in image recognition was already receiving some attention. At the time an idea came to me, and it still seems relevant enough to write down. No doubt similar ideas have been brought forward, but some repetition will do no harm.

In this post I describe a simple conceptual model of how human image recognition could work, given the obvious limitation on human memory capacity when compared to computers. A key observation is that although we are good at passive recognition, our active visual memory seems very limited. We do not seem to store entire images in our memory. If we are asked to visualize objects, faces, scenes in our mind we usually find that it is very hard to really produce `detailed’ mental imagery.

The model also offers an explanation for the déjà vu phenomenon.

Recent developments and background
Nowadays, there are areas of image recognition / classification in which computers outperform humans, so the question has evolved a bit. But still, in the general field of image recognition the feeling is that humans are generally better than computers…so far.

Stanford University (Andrej Karpathy with Fei-Fei Li) in collaboration with Google has recently announced a significant improvement in artificial-intelligence image recognition (New York Times article November 2014, see here for the Stanford technical paper).

Even more recently Amirhossein Farzmahdi et al. (at the Institute for Research on Fundamental Sciences in Tehran) published a paper on neural-network based face-recognition software (review, for the paper on arXiv see here), derived from studies of primate brains in relation to face recognition. Although still not nearly as good as humans, at least the software shows traits similar to human face-recognition performance.

Holistic face-processing seems to be the human way (`hotly debated yet highly supported’ according to the abstract of the above paper), and neuroscience describes specialized areas in the brain for face recognition.

A conceptual model for human image recognition
Enough background. On to the conceptual model promised in the title. A main question to me in college was:

How can one devise a recognition machinery which does not take up enormous memory?

A key observation seems that although we are good at passive recognition, our active visual memory is very limited. We do not store entire images in our memory. If we are asked to visualize objects, faces, scenes in our mind we will find that it is very hard to really produce `detailed’ mental imagery.

Nonetheless, given some time, we can come up with more and more details. And of course we are extremely good at passive recognition. Even if the face we see has been altered by lighting, aging, facial hair, you name it. But can we always immediately place a name to a face? No we can’t. We often struggle: `… I’m sure I know this person from somewhere, but was it high school? Some holiday? The deli near my previous job? …’

And then slowly, we can enhance our recognition by going down such paths, imagining the person a bit younger perhaps, or with a shovel, or in this deli with an employee’s uniform…until we hit on a strong recognition sense and say: `Hey Nancy, wow, I almost didn’t recognize you with those sunglasses and short hair, it’s been a long time.’

This leads to the following conceptual model. Possibly, our image recognition uses two components: one-dimensional passive recognition and more-dimensional active imagination.

The first component is one-dimensional passive recognition. By this I mean that visual data is generally not stored, but memory-processed in such a way that when similar visual data are observed, a sense of recognition is triggered. One-dimensional: from 0 (no recognition at all) to 1 (sure sense of recognition).

So when we observe say a face, our brain does not store actual `pixels’, but instead creates some sort of tripwire. Or better still: a collection of tripwires. These tripwires then give off a signal when a similar face is observed. The more similarity, the stronger the signal (which produces the sensation: `hey I’ve seen this face before (or close)’).

Then the second component comes into play: more-dimensional active imagination. By this I mean an active imaging, which changes components of the observed image, with the express purpose of amplifying the tripwire signal (the passive recognition sense). Suppose I look at the face before me, imagine it without beard, and the tripwire signal gets stronger… then I am one step closer to recognizing the face. Next I picture this person in my old college, but the signal gets weaker…so next I search in my job history…and I hit a stronger signal upon my third job (still don’t know who it is but I am getting closer)… etc.

In this way, without storing large `files’, it should be possible to reach high levels of passive recognition. This does depend on creating very good tripwires, and having a good active imagination. Such a system would favour `holistic’ recognition (in concurrence with scientific findings), because details are not stored separately.
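To make the interplay of the two components a bit more concrete, here is a minimal Python sketch. It is only a toy under stated assumptions: a face is reduced to a set of named features, a tripwire is such a set, the signal is a simple overlap score (Jaccard similarity), and the alterations (remove the sunglasses, add a deli uniform, …) are hypothetical names invented for this illustration. Nothing here is claimed to be how the brain, or any existing software, works. The one thing the sketch tries to respect is that the imagination loop never reads the tripwires themselves; it only asks `how strong is the signal now?’.

```python
def tripwire_signal(features, tripwires):
    """Passive recognition: one number from 0 (nothing) to 1 (sure).

    Toy assumption: the signal is the best Jaccard overlap between the
    observed feature set and any of the tripwires.
    """
    best = 0.0
    for wire in tripwires:
        union = features | wire
        if union:
            best = max(best, len(features & wire) / len(union))
    return best


def imagine_and_recognize(observed, tripwires, alterations, max_steps=20):
    """Active imagination: greedily apply whichever alteration amplifies
    the tripwire signal most, and stop when no alteration helps."""
    current = frozenset(observed)
    signal = tripwire_signal(current, tripwires)
    for _ in range(max_steps):
        candidates = [alter(current) for alter in alterations]
        best = max(candidates,
                   key=lambda c: tripwire_signal(c, tripwires),
                   default=current)
        best_signal = tripwire_signal(best, tripwires)
        if best_signal <= signal:
            break  # imagination no longer strengthens the recognition
        current, signal = best, best_signal
    return current, signal


# Hypothetical alterations: imagine the face without / with some feature.
def without(feature):
    return lambda face: face - {feature}


def with_added(feature):
    return lambda face: face | {feature}


# The tripwire left behind by Nancy at the deli (hypothetical features).
tripwires = [frozenset({"round face", "long hair", "deli uniform"})]
observed = {"round face", "short hair", "sunglasses"}
alterations = [without("sunglasses"), without("short hair"),
               with_added("long hair"), with_added("deli uniform"),
               with_added("shovel")]

final, signal = imagine_and_recognize(observed, tripwires, alterations)
print(sorted(final), signal)
```

In this toy run the signal climbs step by step as the sunglasses and short hair are imagined away and the long hair and uniform are imagined in, while the shovel never helps. A greedy search like this can of course get stuck; that would correspond nicely to the `I know this person from somewhere’ feeling that never resolves.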

That’s almost all for now. In the recent news on image recognition software I haven’t seen the idea of `active imaging to enhance passive recognition’ come up (but that doesn’t mean it is not used). Oh, and finally: how does this model explain déjà vu?

Well that is really easy. According to the model, déjà vu occurs when a tripwire is falsely yet strongly triggered. The brain is flooded with a strong sense of recognition, which has no base in a factual previous experience. If you have ever experienced déjà vu, you will likely do so again :-). If it concerns a situation (`I’ve been in this exact situation before’) you could try to see if you can predict what will happen next. According to this model, you can’t, but still the feeling of recognition will only slowly die away.

[Update 17 Feb:]
In this recent article on face detection what I call `tripwire’ is called a `detector’, and a series of tripwires is called a `detector cascade’.

Posted in Uncategorized | Tagged active imagination, déjà vu, detector cascade, face detection, facial recognition, human image recognition, image recognition software, memory, neural networks, passive recognition | 2 Comments

Collective Intelligence seems a bigger threat than Artificial Intelligence

Posted on February 13, 2015 by franka waaldijk

Recently both Stephen Hawking and Bill Gates have voiced their concern over Artificial Intelligence (AI), warning that AI could possibly become a threat to humanity in the future.

This prompts me to (finally) write down some thoughts on Collective Intelligence (CI), which is also sometimes referred to as swarm intelligence or hive intelligence (hive mind) when not dealing with humans. CI refers to the idea that humans can create a hive mind – even unknowingly. (As a primer you could read Collective Intelligence in Humans: A Literature Review by Juho Salminen.)

Of course a fundamental question in regard to hive intelligences is: does an intelligent hive have self-awareness? Somehow we `always’ associate intelligence with self-awareness, but to me this might well be because we have a hard time picturing intelligences which differ from ours. However, even if a CI made up of humans had self-awareness, these humans would be unlikely to be aware of this. Do ants know that their ant hill is intelligent?

To me it seems likely that CI is already a reality. In this view there already are non-human intelligences which are stronger than human intelligence. Consider any large human organization (corporation, religion, country, …) and consider whether it displays signs of hive intelligence (such as seen in ant hills):

Large human organizations (LHOs) have a strong tendency to self-preservation.
LHOs compete fiercely for resources.
LHOs are largely independent of the individuals of which they are comprised. Anyone is replaceable, although some replacements impact more than others.
LHOs learn and adapt. They retain memories. They have active long-term strategies as well as survival tactics.
The individuals which help form the LHO are usually quite differentiated according to the tasks they perform. The factory worker is unlikely to be able to come up with marketing and sales strategies; vice versa the marketing and sales analyst is unlikely to be able to craft the product to be sold.
Communication `internal’ to the LHO is usually quite different from communication with other LHOs. There are secrets, there are barriers, there is misunderstanding, there is difference in speed and informality of communication.
Internal efficiency is a key driving force in the development of LHOs. There is a continuous pressure to perform more efficiently. This pressure comes from the fierce competition for resources, and any LHO which does not adapt quickly enough, efficiently enough, will be swept aside and dismantled (devoured) by those who do.
There is pressure on individuals to conform to the `code’ or `identity’ of the LHO to which they belong.

If the above meets a little with your recognition, then I can continue to where I was headed:

CI poses a bigger threat to humans than AI.

Why? Let’s see. Have you lately had any thoughts similar to:

I am on a treadmill, we are all on a treadmill. Fast is seldom fast enough. Good is only good enough for a very short time.
If I don’t conform to `the norm’ I will be cast aside, left behind, ridiculed, ignored.
If I were completely free and independent of income concerns, I would do things very differently.
If I were completely free and independent of social concerns, I would do things very differently.
I have to live up to the expectations of a) employer b) peers c) family d) friends e) society f) myself …
I have to keep up with the latest developments. New technology, social platforms, new hypes and raves, the news, I have to be up-to-date.
I have to communicate, participate in networks, just in order to get by socially and professionally.
I have to profile myself, promote myself, market myself, advertise myself, prove myself more and more. Just doing my job does not cut it anymore. To administrators, to peers, it is important that I am innovative, pushing borders, and pushing myself to new `heights’.
I have to be seen as a responsible-enough member of society. Law-abiding and not amoral.
I have to find money for a) my project b) my research c) my prototype d) my dream … In order to raise this money I need to convince people that a), b), c), d)… is more worthy than those of others.
…

I can go on like this, but I hope my point is clear. Most of us are being `forced’ by various LHOs to conform more and more to role patterns that are beneficial to these LHOs but possibly detrimental to us.

The ant hill only cares about having enough able workers and soldiers to survive and hopefully thrive and expand. It does not care about what kind of life these workers and soldiers lead.

Moreover, if ants stray too far from the ant hill and pick up too many strange smells, they are no longer recognized as `own’ and thus become prone to attack from the other ants. To me this mirrors the increasing difficulty for individuality in our society.

In the past decades it has become more and more difficult to operate on an individual basis. The individual voice is slowly being drowned out. Non-conformity becomes harder. The worth of our endeavours is increasingly being measured in terms of social response to these endeavours. Citation counts, Facebook likes, number of followers, and … money. Money is an easily underestimated factor in the workings of CI, but it is the natural `reward’ for any CI’s exertion. It can easily be compared to packets of sugar for the ant hill.

Modern ICT has tremendously increased the capabilities for CIs to expand rapidly. Which is why I expect to see the above effects crystallize more clearly in the near future.

So, to recap, I believe we are already seeing Collective Intelligences at work, influencing our lives more heavily than we would like. Personally, I can only hope that we are capable of preventing CIs from taking over completely, but to be honest I doubt it.

And if it ever came to a contest between AI and CI, my money would be on the latter…

[Update 16 Feb:]
Thanks to Toby Bartels for pointing out on Google+ that CI and AI can be seen more compellingly as two sides of the same coin:

“I’m not sure that there’s much difference. An artificial general intelligence (that is, the sort of artificial intelligence that worries people, as opposed to specialized expert systems) is unlikely to be developed by an individual in a garage. It’ll be developed by a corporation (or worse, a military), and it will work against us regardless of whether it stays in or escapes its box.”

Posted in Uncategorized | Tagged Artificial Intelligence, Collective Intelligence, hive intelligence, hive mind, society, swarm intelligence | Leave a comment

The arrow of time (4): entropy and reversibility, pictured in designs

Posted on June 11, 2014 by franka waaldijk

So…I do not see any convincing answers in physics to the basic question of `what is time?’. To wrap up this complicated subject for now, I will show some half-designs for the IMAPP symposium (which I did not elaborate on since the Francesco del Cossa design that I showed earlier clearly was superior), and reformulate earlier questions on time and entropy.

The first question already is hard to formulate without falling into much inaccuracy as well as absurdity. But here goes anyway: suppose we have two situations / configurations S₁ and S₂ of a closed system U [for Universe] such that S₁ is exactly the same as S₂ in every conceivable non-time-related way (particles, waves, constellations,…down to every last photon). Then I would say that

A: Within U there is no time difference between S₁ and S₂, in other words they are also time identical.
B: Therefore time in U corresponds to (some measure of) the difference between configurations of U.

Hence my earlier `formula’ ΔTime $\approx$ ΔEntropy.

Thus the question of (ir)reversibility, known as `arrow of time’, in my eyes could well be a tautology. To see this, consider the statement: `we cannot lower the entropy of the system U when going forward in time’. When ΔTime `equals’ ΔEntropy this more or less becomes equivalent to: `we cannot lower the entropy of the system U when the entropy is increasing’…

Also, it raises the question whether time is a `local’ phenomenon (non-uniform). One half-design that I made for the IMAPP symposium centered around this entropy idea:

(click for enlargement, you might notice different `reversal arrows’ which I added pictorially, to express the questions surrounding this subject)

Next, in my eyes the question of causality and reversibility is intimately connected to our own consciousness. We seem to experience things exclusively in the present, but! we do not even know what `experience’ and `the present’ mean. Anything we experience stems from neurons firing in our brain; anything we see/hear/sense in this way has a time lag as compared to the stimulus which provoked our senses…

Somehow we retain memories (unreliable!) from past events, and we experience time as moving forward, probably because our consciousness is hardwired that way. See Immanuel Kant’s Kritik der Reinen Vernunft; we quote from Wikipedia:

Kant proposed a “Copernican Revolution” in philosophy. Kant argued that our experiences are structured by necessary features of our minds. In his view, the mind shapes and structures experience so that, on an abstract level, all human experience shares certain essential structural features. Among other things, Kant believed that the concepts of space and time are integral to all human experience, as are our concepts of cause and effect. One important consequence of this view is that one never has direct experience of things, the so-called noumenal world, and that what we do experience is the phenomenal world as conveyed by our senses. These claims summarize Kant’s views upon the subject–object problem.

In my humble and ignorant opinion, Kantian philosophy is not eclipsed by Einstein’s relativity and its concomitant spacetime. A real discussion of relativistic spacetime is beyond both me and the scope of this series of blog posts, but perhaps it is relevant to notice that causality in relativistic spacetime hinges on `light cones’ (image by Stib):

When it comes to reversibility and the arrow of time, the Kantian crux seems to me: what do we mean by the word causality?

If our consciousness were hardwired `the other way round’, could we not perceive reality as follows: a billiard ball rolling `kauses’ a billiard cue to hit the ball which in turn `kauses’ a billiard player to appear at the billiard table etc. etc.

With this in mind I made the following Escherian half-design:

(click for enlargement)

and since I found this half-design to be too sterile, I also made another one based more on entropy and human experience (which is approximate, vague, sketchy):

(click for enlargement)

This last design was a close contender (but lost to the Del Cossa design, see the first posts in this series), however it lacked the depiction of a human interaction, intervention, … I also tried yet another half-design, before finally picking the Del Cossa design:

(click for enlargement, the arrow shooter comes from a centaur sculpture in the Basilique Sainte-Marie-Madeleine of Vézelay, the original photo was taken by Vassil)

In the end, the Del Cossa design, apart from its visual strength, had another interesting feature which proved decisive: the golden circle held by its arrow-bearing protagonist. To me this circle symbolized both mathematics and two other conundrums of time: can there be a first moment in time? Is time circular (another way of looking at reversibility)?

(click for enlargement, almost final design, just the sponsors omitted)

Hope you enjoyed this cross-over between science, philosophy and graphical design!

Postscript: if you’ve come this far, then the following very recent article should interest you: new quantum theory could explain the flow of time. It seems that every few years or so, a new insight into `the arrow of time’ is claimed…which in a way illustrates how hard the problems surrounding time really are.

Entropy, entanglement, energy dispersal… they all start with an E so perhaps I could just pimp up my `formula’ thus: ΔT = H(ΔE), where H is some appropriate function (multiplication with constant would be nice but is probably far too simplistic).

Isn’t cosmology just the most marvelous religion? The really amazing part of physics to me is that we actually succeed in increasing our capabilities to manipulate Nature, even when (in my eyes) we remain largely ignorant of the real mechanisms at work. On the other hand, I’m highly pessimistic about whether this increase in manipulative activities will be beneficial to humanity, and life on earth in general.

[End of series]

Posted in Uncategorized | Tagged arrow of time, entropy, Escher, graphical design, IMAPP, Immanuel Kant, quantum entanglement, spacetime, time | Leave a comment

The arrow of time (3): what is it, actually?

Posted on June 8, 2014 by franka waaldijk

No, but honestly: what is meant by the term `the arrow of time’? I cannot get my head around it, unfortunately. This seemingly puts me in an unenviable minority position, enhanced by my obvious ignorance of relevant theories in modern physics.

My problem is this: If we do not know at all what time is, then how can we determine that there is an arrow of time?

As an illustration, let us look at the second law of thermodynamics, as related to the arrow of time via entropy (quote from wikipedia: entropy (arrow of time)):

Entropy is the only quantity in the physical sciences (apart from certain rare interactions in particle physics; see below) that requires a particular direction for time, sometimes called an arrow of time. As one goes “forward” in time, the second law of thermodynamics says, the entropy of an isolated system can increase, but not decrease. Hence, from one perspective, entropy measurement is a way of distinguishing the past from the future. However in thermodynamic systems that are not closed, entropy can decrease with time: many systems, including living systems, reduce local entropy at the expense of an environmental increase, resulting in a net increase in entropy. Examples of such systems and phenomena include the formation of typical crystals, the workings of a refrigerator and living organisms.

Entropy, like temperature, is an abstract concept, yet, like temperature, everyone has an intuitive sense of the effects of entropy. Watching a movie, it is usually easy to determine whether it is being run forward or in reverse. When run in reverse, broken glasses spontaneously reassemble, smoke goes down a chimney, wood “unburns”, cooling the environment and ice “unmelts” warming the environment. No physical laws are broken in the reverse movie except the second law of thermodynamics, which reflects the time-asymmetry of entropy. An intuitive understanding of the irreversibility of certain physical phenomena (and subsequent creation of entropy) allows one to make this determination.

By contrast, all physical processes occurring at the microscopic level, such as mechanics, do not pick out an arrow of time. Going forward in time, an atom might move to the left, whereas going backward in time the same atom might move to the right; the behavior of the atom is not qualitatively different in either case. It would, however, be an astronomically improbable event if a macroscopic amount of gas that originally filled a container evenly spontaneously shrunk to occupy only half the container.

In previous posts I already conjectured a similar view on time and entropy, perhaps a bit more radical: ΔTime $\approx$ ΔEntropy.

The above conjecture is vague and needs improvement; the gist is that entropy and time are cut from the same cloth. But as I said I am rather hampered by my overwhelming ignorance of modern physics.

Still, it seems to me that what is usually called `the arrow of time’ depends on an experimentally unconfirmed ideal view of time as an independent and qualitatively different dimension (an `objective clock’ or perhaps `subjective clock’ which runs independently of other physical processes/attributes, at least on the nanoscale; precisely here might lie the difficulty in reconciling quantum mechanics with general relativity).

One should read at this point the Stanford Encyclopedia of Philosophy entry on time, to hopefully help gain some insight in what I’m trying to say.

[Also I think I remember a vivid related portrayal of time by Kurt Vonnegut in Slaughterhouse-Five (I read this over 30 years ago, so inaccuracy is inevitable). As I remember this portrayal, time is similar to the other dimensions, which leads to all things existing in four equivalent dimensions…only our human consciousness is like a train which moves in a certain fixed direction. And from that train we can only look out through a very narrow window (the present), hence we see the landscape pass us by in a more or less linear time fashion, moment after moment. If we were able to break free from the train, then our sensation of time would be radically changed.]

If I may, let me put forward an aphorism which I discovered through my telescope on a meteorite made of antimatter :-). It perhaps illustrates my thoughts on a possible reversal of time: namely that time could well be a phenomenon produced by our consciousness, in other words an anthropomorphic artefact:

“We anti-time humans have no memory to speak of, alas!, and can only rely on our often patchy foresight of our future”

Posted in Uncategorized | Tagged anti-time, antimatter, arrow of time, entropy, fourdimensionalism, Kurt Vonnegut, Slaughterhouse-Five, time | Leave a comment

The arrow of time (2): Francesco del Cossa

Posted on June 5, 2014 by franka waaldijk

[continued from the previous post:]

(click for enlargement)

The Arrow of Time (almost final design)

The basis of the design is formed by a detail from the fresco `Allegory of March: the triumph of Minerva’ by Francesco del Cossa (approx. 1430–1477).

Of course this is not meant as an art blog, but nonetheless I like to point out sources and alterations. You can see (if you are so inclined) that I did some work with Photoshop and Illustrator to revitalize the fresco detail sufficiently for use in an A0 poster.

The visual strength of the design (to which I referred in the previous post) for me stems from multiple factors. One of these is the pretty superficial clue of the arrow which is held by the central figure…but a more content-related factor is the wear-and-tear of the fresco itself. We see that it is old…because of the wear and the damage, which are the physical manifestation of time (increase in entropy; whether this supports the idea of an arrow of time will be discussed in later posts).

Finally, this mysterious circle held in the other hand to me signifies a combination of mathematics, infinity, divinity…and mystery of course.

Other designs that I made sometimes were more intriguing, but they lacked this direct intuitive appeal.

(to be continued)

Posted in Uncategorized | Tagged Allegory of March, arrow of time, entropy, Francesco del Cossa, Triumph of Minerva | Leave a comment

The arrow of time (1)

Posted on June 4, 2014 by franka waaldijk

Three years ago I was asked to design a poster for the symposium `The Arrow of Time’, organized by IMAPP (Institute for Mathematics, Astrophysics and Particle Physics).

I would like to show the final design that I made, and also some other partial designs. Perhaps more relevant to this blog’s purpose, I also would like to pose the question: does the arrow of time really exist? Let me start however with the poster:

(click for enlargement)

The Arrow of Time (almost final design, since for the final design I had to add the logos of the sponsors, which seldom is an improvement [yet I managed to avoid real disruption])

I chose this design over other designs (which sometimes were stronger conceptually) because of its visual strength.

(To be continued)

Posted in Uncategorized | Tagged arrow of time, art, graphical design, IMAPP | Leave a comment

All bets are off, to disprove the strong physical Church-Turing Thesis (foundations of probability, digital physics and Laplacian determinism 3)

Posted on January 17, 2014 by franka waaldijk

(continued from previous post)

Let H₀ be the hypothesis: `the real world is non-computable’ (popularly speaking, see previous post), and H₁ be PCTT⁺ (also see the previous post).

For comparison we introduce the hypothesis H₂: `the real world produces only rational (real) numbers’.

H₂ is assumed to have been the original world view of ancient Greek mathematicians (Pythagoreans), before their discovery that $\sqrt{2}$ is irrational (which is `rumoured’ to have caused a shock but I cannot find a reliable historical reference for this).

The rational numbers in $[0,1]$ have Lebesgue measure $0$ , so we can start constructing $(T_{40,m})_{m\in \mathbb{N}}$ such that $T_{40}=\bigcup_{m\in \mathbb{N}} T_{40,m}$ has Lebesgue measure less than $2^{-40}$ , and such that $T_{40}$ contains all rational numbers in $[0,1]$ .
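For readers who like to see such a covering spelled out, here is a minimal Python sketch of one possible construction. The particular enumeration of the rationals and the choice of interval lengths $2^{-(n+m+2)}$ are my own assumptions for illustration (any summable choice would do), and the function names are made up. The $m$-th rational simply gets an open interval around it, and the lengths sum to $2^{-(n+1)}$ , which is less than $2^{-n}$ .

```python
from fractions import Fraction
from itertools import islice


def rationals_in_unit_interval():
    """Enumerate every rational in [0, 1] exactly once: 0, 1, 1/2, 1/3, 2/3, ..."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            f = Fraction(p, q)
            if f.denominator == q:  # skip p/q that are not in lowest terms
                yield f
        q += 1


def covering(n):
    """Yield open intervals T_{n,m} = (a, b) around the m-th rational.

    Interval m has length 2**-(n+m+2), so all lengths together sum to
    2**-(n+1), which is below 2**-n.
    """
    for m, r in enumerate(rationals_in_unit_interval()):
        half = Fraction(1, 2 ** (n + m + 3))
        yield (r - half, r + half)


# The first 1000 intervals of T_40, in exact rational arithmetic.
first = list(islice(covering(40), 1000))
total = sum(b - a for a, b in first)
print(total < Fraction(1, 2 ** 40))
print(any(a < Fraction(1, 3) < b for a, b in first))
```

Note that the intervals around $0$ and $1$ stick out of $[0,1]$ a little; this does no harm, since we only need the rationals to be covered and the total length to stay small.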

If we then take our coin-flip-randomly produced $x\in [0,1]$ , I personally don’t think that we will encounter an $m\in\mathbb{N}$ for which we see that $T_{40,m}$ contains $x$ .

This opinion is supported by the fact that we can easily construct a non-rational number…at least in theory. Take for instance $e$ , the base of the natural logarithm, which equals $\sum_{n\in\mathbb{N}}\frac{1}{n!}$ (to stay within $[0,1]$ one should strictly speaking take $e-2$ , but we will simply write $e$ ). We can in fact construct $T_{40}$ in such a way that $T_{40}$ does not contain $e$ , and assume this to be the case here.

On the one hand, this does not depend on infinity, since we can simply look at approximations of $e$ . We construct $T_{40}$ such that for any $m\in\mathbb{N}$ the $(2m+2)^{\rm th}$ binary approximation to $e$ is positively apart from $T_{40,m}$ . On the other hand, any finite approximation to $e$ is still rational…and so we can only construct $e$ as an irrational number in the sense described above.

With regard to the existence of non-computable reals, the situation in my humble opinion is very different. We cannot construct a non-computable real, as result of the Church-Turing Thesis (which I have no reason to doubt). Any construction of a real which we recognize as such will consist of a finite set of construction instructions…in other words a Turing machine.

So to make a reasonable case for the existence of non-computable reals, we are forced to turn to Nature. In the previous post, we flipped our coin to produce a random $x$ in $[0,1]$ . We argued that finding $m\in\mathbb{N}$ for which $S_{40,m}$ contains $x$ would force us to reject the hypothesis H₀ (`the real world is non-computable’).

So what result in this coin-tossing experiment could force us to reject H₁, the strong physical Church-Turing thesis (PCTT⁺, `the universe is a computer’)?

To be able to reject H₁ in the scientific hypothesis-testing way, we should first assume H₁. [This might pose a fundamental problem, because if we really assume H₁, then our perception of probability might change, and we might have to revise the standard scientific hypothesis-testing way which seems to be silently based on H₀. But we will for the moment assume that the scientific method itself needs no amendment under H₁.]

Under H₁, $x$ has to fall in some $S_{40,m}$ . Failure to do so, even if we let $m\in\mathbb{N}$ grow very large, might indicate that H₁ is false. For scientific proof we should have at our disposal some number $M\in\mathbb{N}$ such that (under H₁) the probability that $x$ is not in $\bigcup_{m\in \mathbb{N}, m<M} S_{40,m}$ is less than $2^{-40}$ .

This reverse probability has had me puzzled for some time, and sent me on the quest for a probability distribution on the natural numbers. In the thread `drawing a natural number at random’ I argued that some indication could be taken from Benford's law, and for discrete cases from Zipf’s law. Anyway, very tentatively, the result of this thread was to consider relative chances only. If for $1\leq n,m \in \mathbb{N}$ we denote the relative Benford chance of drawing $n$ vs. drawing $m$ by: $\frac{P_B(n)}{P_B(m)}$ , then we find that $\frac{P_B(n)}{P_B(m)} = \frac{\log{\frac{n+1}{n}}}{\log{\frac{m+1}{m}}}$ . The relative Zipf chance of drawing $n$ vs. drawing $m$ would be given by $\frac{P_Z(n)}{P_Z(m)} = \frac{m}{n}$ .
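A small Python sketch may help to get some feel for these relative chances. The function names are made up for this illustration; the formulas are just the two ratios given above (the base of the logarithm cancels in the Benford ratio, so the natural logarithm is used).

```python
import math


def relative_benford(n, m):
    """P_B(n) / P_B(m) = log((n+1)/n) / log((m+1)/m), for n, m >= 1."""
    return math.log1p(1 / n) / math.log1p(1 / m)


def relative_zipf(n, m):
    """P_Z(n) / P_Z(m) = m / n, for n, m >= 1."""
    return m / n


# How strongly is 1 favoured over 9 or 10?
print(relative_benford(1, 9), relative_zipf(1, 10))

# For large arguments the Benford ratio approaches the Zipf ratio m / n.
print(relative_benford(1000, 2000), relative_zipf(1000, 2000))
```

The second comparison illustrates why both laws can be seen as discrete versions of the same $\frac{1}{x}$ density: for large $n$ , $\log\frac{n+1}{n}$ is very close to $\frac{1}{n}$ .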

In both cases, the relevant density function is $f(x)=\frac{1}{x}$ . The important feature of this distribution is twofold:

1) The smaller natural numbers are heavily favoured over the larger. (`Low entropy’).

2) There is no $M\in\mathbb{N}$ such that even the relative probability of drawing $m\in\mathbb{N}$ larger than $M\in\mathbb{N}$ becomes less than $2^{-40}$ . (Because $\log x$ tends to infinity).
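The remark in 2) can be worked out in a few lines. What follows is only a sketch, measuring everything relative to the chance of drawing $1$ , and with $N$ as an auxiliary upper bound that is afterwards sent to infinity:

```latex
% Relative weight of the tail M < m <= N, measured against drawing 1.
% Zipf: each term is 1/m, and the harmonic sum grows like a logarithm.
\sum_{m=M+1}^{N} \frac{P_Z(m)}{P_Z(1)} \;=\; \sum_{m=M+1}^{N} \frac{1}{m}
  \;\geq\; \int_{M+1}^{N+1} \frac{dx}{x} \;=\; \log\frac{N+1}{M+1}
  \;\longrightarrow\; \infty \quad (N\to\infty)

% Benford: the sum telescopes exactly.
\sum_{m=M+1}^{N} \frac{P_B(m)}{P_B(1)}
  \;=\; \frac{1}{\log 2}\sum_{m=M+1}^{N} \log\frac{m+1}{m}
  \;=\; \frac{1}{\log 2}\,\log\frac{N+1}{M+1}
  \;\longrightarrow\; \infty \quad (N\to\infty)
```

So for every fixed $M$ the numbers up to $M$ carry only a finite relative weight, whereas the numbers beyond $M$ carry unbounded relative weight. However large we choose $M$ , the tail never becomes negligible.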

Fools rush in where angels fear to tread. I know, and so let me fall squarely in the first category. Yet this train of thought might provoke some smarter people to come up with better answers, so I will just continue. I do not believe these relative chances can simply be applied here, there are too many unknowns and assumptions. But it cannot do harm to try and get some feel for the reverse probability needed to disprove H₁.

For this tentative argument then, disregarding some technical coding issues, we consider (under H₁) our coin-flip random $x$ to equal some computable $x_s$ computed by a Turing machine with random number $s\in\mathbb{N}$ , drawn from some extremely large urn with low entropy (favouring the smaller natural numbers).

Even with this favouring of the smaller natural numbers, we still cannot begin to indicate $M\in\mathbb{N}$ such that (under H₁) the probability that $x$ is not in $\bigcup_{m\in \mathbb{N}, m<M} S_{40,m}$ is less than $2^{-40}$. Perhaps if we knew the size of the urn (which in this case would seem to be the universe itself) we could say something more definite about $M$. But all things considered, it seems to me that $M$ could easily be astronomically large, far larger than our limited computational resources can ever handle.

In other words: all bets are off, to disprove H₁.

And so also, if H₁ is true, it could very well take our coin-flip experiment astronomically long to find this out.

But I still think the experiment worthwhile to perform.

Posted in Uncategorized | Tagged Benford's law, foundations of probability theory, Laplacian determinism, random natural number, strong physical Church-Turing thesis | Leave a comment

An experiment to (dis)prove the strong physical Church-Turing Thesis (foundations of probability, digital physics and Laplacian determinism 2)

Posted on January 14, 2014 by franka waaldijk

There seems to be a pervasive role of `information’ in probability, entropy and hence in physics. But the precise nature of this role escapes me, I’m afraid. I may have said before somewhere in this thread that I do not put too much faith in the classical model of probability as described by Laplace (see previous post, showing Laplace stated similar doubts himself).

One reason for this is an argument/experiment related to digital physics which has not received enough attention, I believe. I equate the term `digital physics’ with the strong physical Church-Turing thesis PCTT⁺: `Every real number produced by Nature is a computable real number‘ (the Universe is a computer).

The argument/experiment runs like this:

1. Denote with $[0,1]_{_{\rm REC}}$ the recursive unit interval, that is the set of computable reals in $[0,1]$ . We can effectively construct coverings of $[0,1]_{_{\rm REC}}$ which classically have arbitrarily small Lebesgue measure. In fact we can for any $n$ give a countable sequence of intervals $(S_{n,m})_{m\in \mathbb{N}}$ such that the recursive interval $[0,1]_{_{\rm REC}}$ is covered by $(S_{n,m})_{m\in \mathbb{N}}$ , and such that the sum of the lengths of the intervals $(S_{n,m})_{m\in \mathbb{N}}$ does not exceed $2^{-n}$ . (see [Bridges&Richman1987] Varieties of Constructive Mathematics Ch. 3, thm. 4.1; the coverings are not constructively measurable because the measure limit cannot be achieved constructively, but this doesn’t affect the probability argument).

2. Flipping a coin indefinitely yields a (for practical purposes potentially infinite) sequence $x\in\{0,1\}^{\mathbb{N}}$; we can see $x$ as a binary real number in $[0,1]$. Let $\mu$ denote the standard Lebesgue measure, and let $A\subseteq [0,1]$ be Lebesgue measurable. Then in classical probability theory the probability that $x$ is in $A$ equals $\mu(A)$. (Assuming the coin is `fair’, which leads to a uniform probability distribution on $[0,1]$.)

3. Let H₀ be the hypothesis: `the real world is non-computable’ (popularly speaking), and H₁ be PCTT⁺ (mentioned above). Then letting the test size $\alpha$ be $2^{-40}$ , we can start constructing $(S_{40,m})_{m\in \mathbb{N}}$ . Notice that $S_{40}=\bigcup_{m\in \mathbb{N}}(S_{40,m})$ has Lebesgue measure less than $2^{-40}$ . H₀ is meant to be interpreted mathematically as: classical mathematics is a correct description of physics, the Lebesgue measure of the non-computable reals in $[0,1]$ equals $1$ , and the uniform probability distribution applies for a coin-flip-randomly produced real in $[0,1]$ .

4. Therefore, under H₀, the probability that $x$ is in $S_{40}$ is less than $2^{-40}$. If we ever discover an $m$ such that $x$ is in the interval $S_{40,m}$, then according to the rules of hypothesis testing I think we would have to discard H₀, and accept H₁, that is PCTT⁺.

5. Even if the uniform probability distribution is not perfectly satisfied, the above argument still obtains. Any reasonable probability distribution function (according to H₀) will be uniformly continuous on $[0,1]$ , yielding a uniform correspondence between positive Lebesgue measure and positive probability of set membership.
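In practice we only ever see finitely many coin flips, so `$x$ is in $S_{40,m}$’ has to be decided from a finite prefix of $x$. The following Python sketch shows one way to do that, under assumptions of my own choosing: intervals are open and given as pairs of exact fractions, and the function names `cylinder`, `surely_inside` and `first_hit` are hypothetical, not taken from any paper. A prefix counts as a hit only if every real number starting with that prefix lies inside the interval.

```python
from fractions import Fraction


def cylinder(bits):
    """Closed interval [lo, hi] of all reals in [0, 1] whose binary
    expansion starts with the given coin flips (a list of 0s and 1s)."""
    lo = sum((Fraction(b, 2 ** (k + 1)) for k, b in enumerate(bits)), Fraction(0))
    hi = lo + Fraction(1, 2 ** len(bits))
    return lo, hi


def surely_inside(bits, interval):
    """True if every continuation of `bits` lies in the open interval (a, b)."""
    a, b = interval
    lo, hi = cylinder(bits)
    return a < lo and hi < b


def first_hit(bits, intervals):
    """Index m of the first interval containing the whole cylinder, else None."""
    for m, interval in enumerate(intervals):
        if surely_inside(bits, interval):
            return m
    return None


# Toy example with two made-up intervals (not an actual covering S_{40,m}).
toy_intervals = [(Fraction(0), Fraction(1, 4)), (Fraction(5, 8), Fraction(7, 8))]
print(first_hit([1, 0, 1, 1], toy_intervals))
```

With more flips the cylinder shrinks, so a prefix that gives no hit yet may still give one later; only a hit is conclusive at a finite stage.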

This seems to me a legitimate scientific experiment, which can be carried out. An interesting form would be to have people add their flips of a coin to the sequence $x$ . I am really curious what the implications are. But several aspects of this experiment remain unclear to me.

I’ve been trying to attract attention to the possibility of carrying out this experiment, so far rather unsuccessfully. Perhaps someone will point out a fallacy in the reasoning, otherwise I think it should be carried out.

Still, there is a snag of course. Assuming H₁, that is PCTT⁺, we are `sure’ to see $x$ fall in some $S_{40,m}$ …but how long would we have to wait for the right $m$ to crop up?

This question then becomes the subject of the reverse hypothesis test: assuming H₁, can we determine $M\in \mathbb{N}$ such that with probability less than $2^{-40}$ we do not see $x$ fall into any $S_{40,m}$ for $m\leq M$ ?

If so we could use the experiment also to disprove PCTT⁺.

Finally, if we should in this way somehow `prove’ PCTT⁺, what remains of the standard scientific method of statistical hypothesis testing?

All these questions were raised in my paper `On the foundations of constructive mathematics — especially in relation to the theory of continuous functions‘ (2005, circulated as preprint since 2001).

I have yet to receive an answer…so here another invitation to comment. Don’t hesitate to point out where I go wrong.

Notice that a similar experiment can be done for the rational numbers (also of zero Lebesgue measure). I’m confident that such an experiment would not statistically yield that all reals are rational, but the reverse question remains interesting. These reverse questions were the motivation for the thread on `drawing a natural number at random’. This type of question is heavily entropy-related, I feel, and I will discuss this in the next post.
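For the rational case the covering is easy to write down explicitly. The Python sketch below is one standard construction, chosen here for illustration (the post does not fix a particular one): enumerate the rationals in $[0,1]$ and put around the $m$-th rational an interval of length $2^{-(n+m+1)}$, so that the lengths sum to less than $2^{-n}$. The function names are hypothetical. The covering of the computable reals in the experiment above is harder, since it requires running an enumeration of Turing machines.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def rationals_in_unit_interval():
    """Enumerate every rational number in [0, 1] exactly once."""
    yield Fraction(0)
    yield Fraction(1)
    for q in count(2):
        for p in range(1, q):
            if gcd(p, q) == 1:  # skip unreduced duplicates such as 2/4
                yield Fraction(p, q)


def covering(n):
    """Yield open intervals (lo, hi); the m-th one is centred on the m-th
    rational and has length 2^-(n+m+1)."""
    for m, r in enumerate(rationals_in_unit_interval()):
        half = Fraction(1, 2 ** (n + m + 2))
        yield (r - half, r + half)


def total_length(n, how_many):
    """Exact sum of the lengths of the first `how_many` intervals."""
    lengths = (hi - lo for lo, hi in islice(covering(n), how_many))
    return sum(lengths, Fraction(0))


# The partial sums stay below the test size 2^-40, however many we take.
print(total_length(40, 1000) < Fraction(1, 2 ** 40))
```

The intervals around 0 and 1 stick out of $[0,1]$ slightly, which only makes the bound on the covered part of $[0,1]$ more generous.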

Finally, at this moment I consider PCTT⁺ the best scientific formulation of Laplacian determinism, which explains the title of these posts.

Posted in Uncategorized | Tagged digital physics, foundations of probability theory, Laplace, Laplacian determinism, strong physical Church-Turing thesis | Leave a comment

Foundations of probability, digital physics and Laplacian determinism

Posted on January 9, 2014 by franka waaldijk

In this thread of posts from 2012, the possibility of drawing a natural number at random was discussed. In the previous post I revisited an entropy-related solution giving relative chances. This solution also explains Benford’s law.

In the 2012 thread, I was working on two fundamental questions, the first of which

QUESTION 1 Is our physical world finite or infinite?

was treated to some degree of satisfaction. But its relation to the second question still needs exposition. So let me try to continue the thread here by returning to:

QUESTION 2 What is the role of information in probability theory?

In my (math) freshman’s course on probability theory, this question was not raised. Foundations of probability were in fact ignored even in my specialization area: foundations of mathematics. Understandable from a mathematical point of view perhaps…but not from a broader foundational viewpoint which includes physics. I simply have to repeat what I wrote in an earlier post:

(Easy to illustrate the basic problem here, not so easy perhaps to demonstrate why it has such relevance.) Suppose we draw a marble from a vase filled with an equal amount of blue and white marbles. What is the chance that we draw a blue marble?

In any high-school exam, I would advise you to answer: 50%. In 98% of university exams I would advise the same answer. Put together that makes … just kidding. The problem here is that any additional information can drastically alter our perception of the probability/chance of drawing a blue marble. In the most dramatic case, imagine that the person drawing the marble can actually feel the difference between the two types of marbles, and therefore already knows which colour marble she has drawn. For her, the chance of drawing a blue marble is either 100% or 0%. For us, who knows? Perhaps some of us can tell just by the way she frowns what type of marble she has drawn…?

It boils down to the question: what do we mean by the word `chance’? I quote from Wikipedia:

The first person known to have seen the need for a clear definition of probability was Laplace. As late as 1814 he stated:

The theory of chance consists in reducing all the events of the same kind to a certain number of cases equally possible, that is to say, to such as we may be equally undecided about in regard to their existence, and in determining the number of cases favorable to the event whose probability is sought. The ratio of this number to that of all the cases possible is the measure of this probability, which is thus simply a fraction whose numerator is the number of favorable cases and whose denominator is the number of all the cases possible.

— Pierre-Simon Laplace, A Philosophical Essay on Probabilities

This description is what would ultimately provide the classical definition of probability.

One easily sees however that this `definition’ avoids the main issue. Laplace did not always avoid this main issue however:

Laplace ([1776a]; OC, VIII, 145):

Before going further, it is important to pin down the sense of the words chance and probability. We look upon a thing as the effect of chance when we see nothing regular in it, nothing that manifests design, and when furthermore we are ignorant of the causes that brought it about. Thus, chance has no reality in itself. It is nothing but a term for expressing our ignorance of the way in which the various aspects of a phenomenon are interconnected and related to the rest of nature.

and:

We may regard the present state of the universe as the effect of its past and the cause of its future. An intellect which at a certain moment would know all forces that set nature in motion, and all positions of all items of which nature is composed, if this intellect were also vast enough to submit these data to analysis, it would embrace in a single formula the movements of the greatest bodies of the universe and those of the tiniest atom; for such an intellect nothing would be uncertain and the future just like the past would be present before its eyes.

— Pierre-Simon Laplace, A Philosophical Essay on Probabilities

In the meantime I came across some work by Albert Tarantola, and this work is really heartening! In a seminal paper Inverse problems = quest for information (together with Bernard Valette) Tarantola already states that we should consider any probability distribution as an information state (subjective even) regarding the phenomenon under study, and vice versa: every information state can be described by a probability distribution function on some appropriate model space.

Now we’re talking!

To surprise me further: Tarantola describes the quandary that measure density functions like $f(x)= \frac{1}{x}$ cause (the integral diverges) and offers exactly the same solution: look at relative probabilities of events, instead of absolute probabilities. To top it off, Tarantola emphasizes that the measure density function $f(x)=\frac{1}{x}$ plays a very important role in inverse problems…
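As a small worked illustration of this remedy (a sketch of the standard computation, not taken from Tarantola's paper; the endpoints $a,b,c,d$ are introduced here): for $0<a<b$ the measure that the density $f(x)=\frac{1}{x}$ assigns to $[a,b]$ is finite, even though the total measure of $(0,\infty)$ is not. So relative probabilities of intervals are well defined:

$$\int_a^b \frac{dx}{x}=\log\frac{b}{a}, \qquad \frac{P([a,b])}{P([c,d])}=\frac{\log\frac{b}{a}}{\log\frac{d}{c}}.$$

Taking $[a,b]=[n,n+1]$ and $[c,d]=[m,m+1]$ gives $\frac{\log\frac{n+1}{n}}{\log\frac{m+1}{m}}$, which is exactly the relative chance of drawing $n$ vs. drawing $m$ from the previous post.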

So now I need to study this all, in order also to join these ideas to the perspective of digital physics and Laplacian determinism.

(to be continued)

Posted in Uncategorized | Tagged Albert Tarantola, Bernard Valette, information and probability theory, inverse problems, Laplace, Laplacian determinism | 1 Comment

An entropy-related derivation of Benford’s law

Posted on January 8, 2014 by franka waaldijk

In this thread of posts from 2012, the possibility of drawing a natural number at random was discussed. A solution offering relative chances was given, and stated to be in accordance with Benford’s law.

Then, due to unforeseen circumstances, the thread remained unfinished. In fact it stopped precisely at the point for which I had started the thread in the first place :-).

I wish to return to this thread, but it seems worthwhile to repeat the result mentioned above in a different wording. Why? Well, searching on “Benford’s law” I didn’t find any comparable entropy-related derivation (perhaps I should say motivation) in the literature/internet. And perhaps more telling, Theodore Hill (when deriving Benford’s law from considering a `random’ mix of distributions, see A Statistical Derivation of the Significant-Digit Law) explicitly mentions the difficulty of drawing a natural number at random, if the sum probability must be 1, which would seem to exclude the density function $\frac{1}{x}$ .

So the trick to turn to relative chances $\frac{P(n)}{P(m)}$ seems new [except it isn’t, see the next post], and it does yield Benford’s law. The entropy-related motivation for these relative chances also seems to be new. The (relative) chances involved in drawing a natural number at random will play a role in the discussion to come (on Laplacian determinism, digital physics and foundations of probability theory). But first let us explicitly derive Benford’s law from these chances:

Solution to `drawing a natural number at random’:

* We can only assign relative chances, and the role of the natural number $0$ remains mysterious.

* For $1\leq n,m \in \mathbb{N}$ let’s denote the relative chance of drawing $n$ vs. drawing $m$ by: $\frac{P(n)}{P(m)}$ .

* For $1\leq n,m \in \mathbb{N}$ , we find that $\frac{P(n)}{P(m)} = \frac{\log{\frac{n+1}{n}}}{\log{\frac{m+1}{m}}}$

(* An alternative `discrete’ or `Zipfian’ case $P_{\rm discrete}$ can perhaps be formulated, yielding: for $1\leq n,m \in \mathbb{N}$ , we find that $\frac{P_{\rm discrete}(n)}{P_{\rm discrete}(m)} = \frac{m}{n}$ .)

The entropy-related motivation for these chances can be found in this thread of posts from 2012.

Now to arrive at Benford’s law, we consider the first digit of a number $N$ in base 10 (the argument is base-independent though) to be drawn at random according to our entropy-related chances. We then see that the relative chance of drawing a 1 compared to drawing a 2 equals $\frac{\log 2}{\log\frac{3}{2}}$, which in turn equals $\frac{\log_{10} 2}{\log_{10}\frac{3}{2}}$, since a ratio of logarithms does not depend on the base. The relative weights $\log\frac{i+1}{i}$ for $i=1,\ldots,9$ telescope to a total of $\log 10$. Since the sum of the probabilities of drawing 1, 2,…,9 equals 1, one finds that the chance of drawing $i$ as first digit for $N$ equals $\log_{10}\frac{i+1}{i}$.
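The normalization step is easy to check numerically. The following Python sketch (the function name `first_digit_probabilities` is my own, introduced for illustration) starts from the relative weights $\log\frac{d+1}{d}$ only, normalizes them over the digits 1 to 9, and prints the result next to Benford's closed form.

```python
import math


def first_digit_probabilities(base=10):
    """Normalize the relative chances log((d+1)/d) over the digits 1..base-1."""
    weights = {d: math.log((d + 1) / d) for d in range(1, base)}
    total = sum(weights.values())  # telescopes to log(base)
    return {d: w / total for d, w in weights.items()}


probs = first_digit_probabilities()
for d, p in probs.items():
    # Left: normalized relative chance. Right: Benford's law log10((d+1)/d).
    print(d, round(p, 4), round(math.log10((d + 1) / d), 4))
```

The natural logarithm is used for the weights on purpose: the base drops out in the normalization, which is the base-independence mentioned above.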

A second-digit law can be derived similarly, for example adding the relative chances for 11, 21, 31, 41, 51, 61, 71, 81, 91 vs. the sum of the relative chances for 12, 22, 32, 42, 52, 62, 72, 82, 92 to arrive at the relative chance of drawing a 1 as second digit vs. the chance of drawing a 2 as second digit. And so on for third-digit laws etc.

This shows that the second-digit distribution is more uniform than the first-digit distribution, and that each next-digit law is more uniform than its predecessor, which fits with our entropy argument.
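The comparison between first and second digit can be sketched in Python as follows. The assumption, following the example in the text, is that we restrict to the two-digit numbers 10,…,99 and sum the relative weights $\log\frac{n+1}{n}$ per second digit; the function names and the crude `spread` measure (largest minus smallest probability) are introduced here for illustration only.

```python
import math


def first_digit_probabilities():
    """Normalized relative chances for the first digits 1..9."""
    weights = {d: math.log((d + 1) / d) for d in range(1, 10)}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}


def second_digit_probabilities():
    """Sum the relative chances of 10k+d over first digits k = 1..9,
    for each second digit d = 0..9, then normalize."""
    weights = {}
    for d in range(10):
        weights[d] = sum(
            math.log((10 * k + d + 1) / (10 * k + d)) for k in range(1, 10)
        )
    total = sum(weights.values())  # telescopes to log(100/10)
    return {d: w / total for d, w in weights.items()}


def spread(probabilities):
    """Largest minus smallest probability; 0 for a uniform distribution."""
    return max(probabilities.values()) - min(probabilities.values())


first = first_digit_probabilities()
second = second_digit_probabilities()
print(round(spread(first), 4), round(spread(second), 4))
```

A smaller spread for the second digit is what the claim of increasing uniformity predicts.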

In the next post the thread on information and probability will be continued.

Posted in Uncategorized | Tagged Benford's law, digital physics, entropy, foundations of probability theory, Laplacian determinism, random natural number | Leave a comment
