Tuesday, December 22, 2015

Compiler design part 1 : Failure accepted buuuuut.... Esoteric programming language

Ok I am busted, 24x7 challenge didn't work.
Fine... I am like that and I am moving on.
I have had this weird obsession with designing compilers. I even wrote a couple of blog entries about it, so to speak.
One of the weirdest things about it, I must say, is the journey. I wanted to start a series for that sake alone, but I couldn't succeed because I was going down a path that was wrong for me. See, everyone has their own path to reaching their destiny, but for some reason some of those paths match and a few of them hold up. All we have to do is discover them and compare notes. So here I am, presenting my notes.

First things first, opinion. In my opinion, compiler design is like integration, and in this context I am a fifth grade student by my knowledge. If I directly mug up the integration formulae, I may very well solve the problem at hand. Then again, using integration is tricky, isn't it? The application part, preparing for real world problems, requires more than just skill; it requires knowledge. Okay, what if I start from limits and build up the knowledge? That's fine, and some theoretical grounding and rigour is needed, but something else would still be missing. Some call it practice, I say experience. The joy of solving one problem and the joy of solving many problems are different. The pain of losing and rediscovering a solution and the loss of leaving 50 other questions unsolved are different. The first in both cases is experience and the second is practice. In short, my opinion is that designing a compiler requires a broad literature survey followed by immense practice.

Next, paths already followed. The best resource available online on the subject may be the compiler design course by Stanford. It is fine and all, but it is rushed. For a simple guy like me, starting off with a different LLVM based compiler is counter-intuitive and, frankly, very complex. I must say that I am quite below the standard, but in my defense, so are many. I mean, how many people in CSE or otherwise have thought of writing their own compiler? Filter out the passion, plagiarism and needed-for-diploma categories, and there are only a few hobbyists who actually think about this. The problem is that these people tend not to be mathematicians and lack a general perspective. They just give code and say voila, there you go. It's fine and all, but I am still missing a piece... You didn't explain your language. There is this constant buzz about languages. Non-self-hosting ones are criticized for not being able to host themselves. Self-hosting ones are criticized for doing just that and not being able to do anything else. This is a genuine concern for language developers. It's not just a hobby, it's an art. So the paths followed are different and not easy.

Next, grouping of ideas. Compiler designers differ in ideas and methodologies, mostly because of their backgrounds. There is the Lisp family and there is the C family. The Lisp family has Scheme for creating new languages, but Scheme itself is so huge that few people dare to create a new one in it. Similar things can be said about Java and C#, which actually provide reflection, though hardly anyone knows it. Also, C has lex and yacc and Java has ANTLR. These are very good tools if you ask me, but they are too advanced for beginners to fathom the knowledge that went into creating them.
Also, Ruby builders are creating new things, but it goes against the whole idea of creating something useful. Creating QBasic in Ruby is a huge step backwards. Then we have esoteric languages, which do not claim to be self-hosting and are meant to be a hobby or a joke. These are languages like brainfuck or ArnoldC, which are declared Turing complete up front but lack provisions for actual programming. Python might be a better place to sit back and relax, but you just can't make it work because of Python's nature of being utterly indecisive. Even the structure of Python screams modern, which implies it is not meant for the traditional approach.

The procedure of my approach might become clear from here on, and some of it might even contradict the initial assumptions above, but that's fine. You'll see that right away. Without further ado, I suggest to whoever is reading this blog: write an interpreter for an esoteric language. The point here is not to learn compiler design. It's like teaching a fifth grade student about progressions. It sits right on that border of understanding things and also leads to the concept of limits. It looks like a joke from the outside, but it is nevertheless a tool for many great things. That bridge is necessary just to understand lexical analysis and the parsing mechanism. As homework, should you take this blog seriously... write an interpreter each for brainfuck, Whitespace and ArnoldC. Then create your own equivalent Turing complete language just by combining these ideas; with those interpreters behind you, it should be easy.
This will teach two things for sure, though. One, what in the world a Turing complete language is, and two, how easy it is to make a list of commands look like a programming language.
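To give a taste of the brainfuck homework, here is a minimal sketch of such an interpreter. It is only one way to do it, not a reference implementation. Python is used purely to keep the sketch short; the same loop ports directly to Java or C. The function names (tokenize, match_brackets, run) are made up for this illustration, and the 30000-cell tape, the 8-bit wrapping cells, the wrapping tape pointer and reading 0 at end of input are common conventions that I am assuming, since the language itself leaves them open.

```python
def tokenize(source):
    """Lexical analysis: keep the eight command characters, drop everything else."""
    return [ch for ch in source if ch in "><+-.,[]"]


def match_brackets(tokens):
    """Parsing: pair every '[' with its ']' and reject unbalanced programs."""
    jumps = {}
    stack = []
    for pos, tok in enumerate(tokens):
        if tok == "[":
            stack.append(pos)
        elif tok == "]":
            if not stack:
                raise SyntaxError(f"unmatched ']' at token {pos}")
            start = stack.pop()
            jumps[start] = pos
            jumps[pos] = start
    if stack:
        raise SyntaxError(f"unmatched '[' at token {stack[-1]}")
    return jumps


def run(source, input_text="", tape_size=30000):
    """Execute a brainfuck program and return its output as a string."""
    tokens = tokenize(source)
    jumps = match_brackets(tokens)
    tape = [0] * tape_size          # assumed convention: fixed-size tape
    ptr = 0                         # data pointer
    pc = 0                          # index of the current token
    inp = iter(input_text)
    out = []
    while pc < len(tokens):
        tok = tokens[pc]
        if tok == ">":
            ptr = (ptr + 1) % tape_size
        elif tok == "<":
            ptr = (ptr - 1) % tape_size
        elif tok == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # assumed: 8-bit wrapping cells
        elif tok == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif tok == ".":
            out.append(chr(tape[ptr]))
        elif tok == ",":
            ch = next(inp, None)
            tape[ptr] = ord(ch) % 256 if ch is not None else 0  # assumed: EOF reads 0
        elif tok == "[" and tape[ptr] == 0:
            pc = jumps[pc]          # skip the loop body
        elif tok == "]" and tape[ptr] != 0:
            pc = jumps[pc]          # go back to the start of the loop
        pc += 1
    return "".join(out)


if __name__ == "__main__":
    print(run("++++++++[>++++++++<-]>+."))  # prints A
```

Notice how little there is to it: the "lexer" is a filter, the "parser" is a stack that matches brackets, and the rest is a loop over a list of commands. That is the second lesson in miniature.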

The final step in the first course is to write a text editor. I know, this has nothing to do with the compiler, but let me remind you of QBASIC and Turbo C. These two came with an IDE of their own. Funny much? Yes, nobody today is interested in terminals except for the Linux community, which by the way is working on GUIs wholesale. So living by old PDP-7 standards is useless. This also gives me an idea to suggest the one and only language you should use for language creation. It is, again, Java. I know, I just wrote something quite against it earlier, but hear me out. It gives an opportunity for packaging and a GUI at low cost. C is a great language to implement interpreters in with its simple syntax, but it is nowhere near creating a proper native GUI. I suggest polyglot programming wherever possible, and I have heard that that's what professionals do.

Final notes here now. Write an interpreter before going forth on compilers and trust me, this will help in the future.
