Icon ASCII : A Love Letter


Icon My Neural Network isn't working! What should I do?


Icon Phase-Functioned Neural Networks for Character Control


Icon 17 Line Markov Chain


Icon 14 Character Random Number Generator


Icon Simple Two Joint IK


Icon Generating Icons with Pixel Sorting


Icon Neural Network Ambient Occlusion


Icon Three Short Stories about the East Coast Main Line


Icon The New Alphabet


Icon "The Color Munifni Exists"


Icon A Deep Learning Framework For Character Motion Synthesis and Editing


Icon The Halting Problem and The Moral Arbitrator


Icon The Witness


Icon Four Seasons Crisp Omelette


Icon At the Bottom of the Elevator


Icon Tracing Functions in Python


Icon Still Things and Moving Things


Icon water.cpp


Icon Making Poetry in Piet


Icon Learning Motion Manifolds with Convolutional Autoencoders


Icon Learning an Inverse Rig Mapping for Character Animation


Icon Infinity Doesn't Exist


Icon Polyconf


Icon Raleigh


Icon The Skagerrak


Icon Printing a Stack Trace with MinGW


Icon The Border Pines


Icon You could have invented Parser Combinators


Icon Ready for the Fight


Icon Earthbound


Icon Turing Drawings


Icon Lost Child Announcement


Icon Shelter


Icon Data Science, how hard can it be?


Icon Denki Furo


Icon In Defence of the Unitype


Icon Maya Velocity Node


Icon Sandy Denny


Icon What type of Machine is the C Preprocessor?


Icon Which AI is more human?


Icon Gone Home


Icon Thoughts on Japan


Icon Can Computers Think?


Icon Counting Sheep & Infinity


Icon How Nature Builds Computers


Icon Painkillers


Icon Correct Box Sphere Intersection


Icon Avoiding Shader Conditionals


Icon Writing Portable OpenGL


Icon The Only Cable Car in Ireland


Icon Is the C Preprocessor Turing Complete?


Icon The aesthetics of code


Icon Issues with SDL on iOS and Android


Icon How I learned to stop worrying and love statistics


Icon PyMark


Icon AutoC Tools


Icon Scripting xNormal with Python


Icon Six Myths About Ray Tracing


Icon The Web Giants Will Fall


Icon PyAutoC


Icon The Pirate Song


Icon Dear Esther


Icon Unsharp Anti Aliasing


Icon The First Boy


Icon Parallel programming isn't hard, optimisation is.


Icon Skyrim


Icon Recognizing a language is solving a problem


Icon Could an animal learn to program?




Icon Pure Depth SSAO


Icon Synchronized in Python


Icon 3d Printing


Icon Real Time Graphics is Virtual Reality


Icon Painting Style Renderer


Icon A very hard problem


Icon Indie Development vs Modding


Icon Corange


Icon 3ds Max PLY Exporter


Icon A Case for the Technical Artist


Icon Enums


Icon Scorpions have won evolution


Icon Dirt and Ashes


Icon Lazy Python


Icon Subdivision Modelling


Icon The Owl


Icon Mouse Traps


Icon Updated Art Reel


Icon Tech Reel


Icon Graphics Aren't the Enemy


Icon On Being A Games Artist


Icon The Bluebird


Icon Everything2


Icon Duck Engine


Icon Boarding Preview


Icon Sailing Preview


Icon Exodus Village Flyover


Icon Art Reel




Icon One Cat Just Leads To Another

Recognizing a language is solving a problem

Created on Nov. 13, 2011, 12:57 a.m.

All though Could an Animal learn to Program I've been talking as if the ability to recognize a language is some kind of general case in problem solving. Any problem which can be formalized can be re-realized in the form of a language recognition problem. And more than that - the complexity of the problem corresponds to how complicated a machine, grammar or algorithm you need to recognize it. This is at first a very unintuitive notion. Unfortunately how the problem complexity maps to the language class it a bit beyond this blog, but I'll try to give some form of explanation on why the rest is true.

First, what does it mean to recognize a language? In formal language theory it means that given some string you can do some kind of algorithm or computation and either accept or reject the string based on if it is in the language or not. So we have two concepts of the language. The set of strings this machine accepts but also the actual specification of the machine itself. From the machine we can generate the set of strings, simply by running it on every possible string and only including the ones it accepts on. But it isn't always possible to know what the machine looks like given a set of strings. Asking what this machine looks like is actually one way of asking "How do we solve this problem?".

So how do we phrase a problem in terms of language recognition? It is impossible to give a general answer to this so it is better to give some examples. We'll start with a "yes, no" problem as those are a bit easier. For example: "I have some shape and I want to know if it has 10 sides".

The idea is to represent this shape as a string and then check if it is in the language of "ten sided shapes" using some machine. One way to do this would be to number the letters "A-Z" as 0 to 26 and then use these to encode the coordinates X,Y of each corner of the shape. We then just concatenate these into a string and feed it into our machine. A triangle with points at (0,0) (2,8) (5,4) would become the string "AACIFE". Due to the geometric fact that polygons in this encoding must have the same number of edges as corners, our machine would then just get the length of the string, divide it by two (each coordinate is two symbols long) and see if the result was equal to ten. In other words our machine would recognize all strings of length 20. So given this encoding, the language of strings of length 20 is the same as the set of shapes with ten sides.

But what about functions? How do we solve 2+2 by recognizing a language? Well a function can be thought of in terms of it's mapping. Addition maps two number to one, and the way we encode this as a string it to group both the inputs and the outputs together into one string. If we use our previous encoding, but imagining a unique symbol for all numbers (not just 0 to 26), and we also using a special symbol (say "|") to separate our inputs and outputs then we can think of the language of addition being the set of strings: "X|Y|Z" where X,Y and Z are symbols in our alphabet and the value of Z is equal the value of X plus the value of Y.

A machine which recognizes this language is simple. It first checks the string is in the correct format (three symbols separated by | symbols). It then finds the encoded value of X, the encoded value of Y and checks that the value of Z is equal to X plus Y. If all of these things are the case it accepts, otherwise it rejects. You can see that given any string this will only accept strings in the correct format and which represent an addition in our encoding. If we want to find an output Z given that we know X and Y we just enumerate for all different Zs until the machine accepts our string. Then we know we have the correct value for Z.

This probably seems backwards and weird but it is a very powerful way of thinking about problem solving and really gets the core fundamentals of it.

Here is a demonstration of the power of this model for problem solving, and I'll warn you that the maths is about to get heavy.

We can find out how many problems, of the total number of possible problems are actually solvable.

First, if you can, imagine the set of all possible languages, including languages with an infinite number of strings in them. This itself an infinite set so the answer to the above problem is not going to be a fixed number, but we can try to consider what proportion of this infinite set we can possibly recognize.

Lets look at the languages with a non-infinte number of strings first. These are easy. We can recognize these just be enumerating all the strings and checking if the input is equal to any of them. Just like we did for palindromes of length three. Sure, some languages might be huge but technically they are still recognizable in this way.

Next the infinite languages. This is a bit harder but we can definitely recognize some of these. For example there are an infinte number of possible strings in the English language, yet I can recognize that perfectly fine. The addition language I explained above is also infinite and we can recognize that. We can perform addition on numbers of any size. So perhaps it is possible to recognize all infinite languages?

Unfortunately not. Consider a language which is infinite but the strings in it have no seeming pattern or relation. Like the decimal expansion of pi it just goes on forever and you never know what the next string will be next. And there are lots of these languages. In fact the vast majority of languages will be like this and there is no way to generate or recognize them because they just express no logic or reasoning behind them.

The only infinite languages we can recognize are the ones with some kind of "pattern" in, like the addition language, palindromes, or even English. For the mathematicians here, this relation actually corresponds to the size of Natural Numbers in comparison to the size of the Real Numbers. There are an infinite number of languages, and an infinite number of recognizeable languages, but the number of total languages is a "bigger" kind of infinity.

And that is really nice way to see what the serious maths behind theory of computation is about. In short - how much of a "pattern" do you need in an infinite language for it to become recognizable by some machine. But we don't stop there, we even get into classifying these different kinds of patterns so we can say what problems are similar. This helps us solve problems in future because we can quickly see how we solved previous similar ones.

The language of English, with all of it's infinite possible strings, might not appear to have much of a pattern if you were to just list strings at random. But the really amazing thing is that there is a pattern, which you are recognizing right now just by reading this text. This is what mathematicians are doing all the time, just more formally. Looking for these weird patterns in infinite alien languages and recognizing what before was impossible to understand.

github twitter rss