Friday, July 6, 2012

How I Would Reverse Engineer an ET Craft

If I were to reverse engineer an ET craft, I would start by modelling it in a computer. I would not waste time trying to understand it. I would push buttons, note the results, and build models of the structure until my model matched what I observed. I would use scanning microscopes and feed those results into my model. I would fly it and make careful observations and measurements.
I would then optimize that model. I would take this optimized result and stick it into its own library. Would I bother philosophizing about it? No; why would I? Would I question it, or declare that something it did was impossible? No; why should I? Just put it into the computer.
Now, let's say my boss came to me and said, "We want to use a part of this technology in another application." What would I do? I would create another computer program that would tell me how to do that. I would not hesitate, hire thousands of experts, or waste billions of dollars. I would get a couple of computer programmers and a supercomputer.
End of story. Reality is reality. We don't know all there is to know and why should we care to build some abstraction of something anyway? Results are all we were ever after, not these mind games.
Now we know why the black projects go so slow and are so expensive. The abstraction-junkies are on a rampage...

Thursday, July 5, 2012

All Physics Theories Are Illusions

Honestly, I don't like the name string theory at all. Why? Because it implies a basic starting point, a string. From the very start this limits the possibilities of what this theory can give us.
I would rather it be called The Unified Physics Model. I would rather that people concentrate on the idea of unification than on any particular limited human concept. Why? Because any theory or model is really just an abstraction and will always be an abstraction, so let's not limit ourselves, nor beat around the bush and confuse the layman.
For example, H is the symbol for hydrogen, but hydrogen is really an electron orbiting a nucleus. An electron, in turn, is really a particle with negative charge and spin. And negative charge means it produces a negative electric field, is attracted to positive fields, and is repelled by other negative fields.
But the promise of a unified theory is not bunk at all; in fact it is required. In fact, it is logical. The naming we put on things, the labelling, that is the problem. But we don't need to know what the theory even is, nor how it really even works.
"Oh come on, are you nuts?" you might ask. "What are you really trying to say?"
I am saying, point blank, that our very outdated methods of science itself are preventing us from making further progress. They are getting us lost in names, conventions, and abstractions, all in a vain attempt to put a perfect set of equations down on the blackboard. How many years must someone study before they get a working model of all these abstractions? Honestly, there are not enough years in a human life to understand every area of physics and chemistry.
This begs the question: why, then, are we trying to master these names, abstractions, and math? Why are we continuing to do something that a computer could do better?
And this is the crux of my argument. The computer only sees bits and bytes. Likewise, why should we see any more than this? Why do we need to know anything more than: The computer models the reality, what more do I need to know?
I propose that science is quickly becoming outdated. We focus on the mechanism, and on finding a better way to express it abstractly, and in the process lose sight of what our true goal should have been all along: RESULTS.
I propose that a unified theory is, in reality, not necessary and is, in fact, a total distraction. We should immediately stop looking for it! Likewise, we should drop all labels for anything. I will simply call it "reality." I apply no theories, just pure math, and feed it into my computer. And use as many dimensions as you deem useful!
"But you need some model, some abstraction from which to base your math" you may further argue.
True, you need some starting point, but I argue it is not important what you pick as long as it works; as long as it predicts a correct result. And you should be willing to change it in a moment if some other model proves more efficient.
For example, I wrote an ephemeris years ago which was accurate to a minute of arc. My program applied standard Kepler dynamics to approximate the location of the planet. It then used a Fourier transform to simulate the perturbations and get a more accurate result.
But let's be honest, why did I even bother modelling the Kepler dynamics? I could have jumped straight to the Fourier transform and been done with it. OK, sometimes it was useful to start with a simplification, but I know a competitor who had an accuracy of one second of arc and only used Fourier transforms based on a much more accurate ephemeris.
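To give a rough idea of the two-step approach I mean, here is a minimal sketch (not my original program; the orbital elements and Fourier coefficients below are made-up placeholders):

// Sketch: Keplerian approximation plus a truncated Fourier-series correction.
// All numbers here are illustrative, not real ephemeris data.
#include <cmath>
#include <cstdio>

// Solve Kepler's equation M = E - e*sin(E) for the eccentric anomaly E
// by simple fixed-point iteration.
static double eccentricAnomaly(double meanAnomaly, double eccentricity)
{
    double E = meanAnomaly;
    for (int i = 0; i < 20; ++i)
        E = meanAnomaly + eccentricity * std::sin(E);
    return E;
}

int main()
{
    const double meanAnomaly  = 1.2;    // radians, hypothetical
    const double eccentricity = 0.0934; // roughly Mars-like

    // Step 1: plain two-body Kepler dynamics.
    double E = eccentricAnomaly(meanAnomaly, eccentricity);
    double trueAnomaly = 2.0 * std::atan2(std::sqrt(1.0 + eccentricity) * std::sin(E / 2.0),
                                          std::sqrt(1.0 - eccentricity) * std::cos(E / 2.0));

    // Step 2: add a truncated trigonometric (Fourier) series that soaks up
    // the perturbations. In practice the coefficients are fitted against a
    // more accurate ephemeris; these are placeholders.
    double correction = 0.0005 * std::sin(2.0 * meanAnomaly)
                      + 0.0002 * std::cos(3.0 * meanAnomaly);

    std::printf("true anomaly = %f radians\n", trueAnomaly + correction);
    return 0;
}

The point stands either way: the computer does not care whether the correction terms carry a name like "perturbation theory"; it just needs coefficients that make the prediction match.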
So, if I want to build a rocket, why do I even need to know or care what the names of the materials are, or even start to wonder what shape I should start with? Is it really important? Why should I guess, when a computer can do it all more accurately?
We are getting lost in the abstractions of reality and losing sight of optimal solutions! Our drive for perfection should be in the results, not in the abstraction.
Personally, I would rather tell the computer, "I need to go to the Moon and return in safety, now tell me what I need to do." The computer would find the most optimal way to get to the Moon and print out a list of optimal instructions.
This is the future. It is not a dumbing down of science. It is finally facing reality and getting our egos out of the way.

Saturday, May 5, 2012

Mice suck

Everyone is completely out of their minds. Why? The mouse is why. The mouse is totally a fad which has long outworn its welcome. In the future, we won't point and click; we will speak, type, gesture and think our way to our goal. Our computers will do more to guess what we want. This is what we should be focusing our efforts on, not yet another GUI. I remember how the Mac made the mouse popular back in the '80s. We were so wowed by the amazing graphics of that monochrome 10-inch CRT.

I was so much faster without a mouse, in the text-only days.  I remember working with CAD without a mouse and it was so much easier.  Everything was faster and better without a mouse. 

The world is totally crazy about how much time it spends on the GUI, the visual appearance of something.

I am not against graphics when graphics are required. But until that point, I much prefer a minimalistic approach. Everything should be commands, with command completion, shortcuts, aliases, etc. I should be able, for example, to speak to the program I am using via speech recognition.

When WYSIWYG is required, such as in a spreadsheet, there should be a standard way of navigating from cell to cell, etc. Granted, most Windows programs have this, but it's not always clear what it is and it's not always enforced.

I sometimes think, for example, how much easier it would be if I could edit my video using text, instead of clicking around.  I am slowly migrating my entire OS environment BACK to text.

Wednesday, May 2, 2012

Why Java?

Is Java really relevant? After perusing the Java class library and comparing the language to C++, I came to ask myself: Why Java? On the bad side, it runs slower than C++. It does, however, tout a more thorough framework than the STL, but then again, there are plenty of frameworks for C++ which include smart pointers and garbage collectors as well as every imaginable class and function. In fact, for every argument in favor of Java, I can make the same or a better argument for C++. The only place Java shines is on mobile phones and devices with small footprints, which run many different tiny applications. In that case, portability and size win out.

Where both languages seem to fail is with functional programming and "terseness." Neither handles implicit typing very well, the way the "Caml-like" languages do. Both make for a lot of unnecessary keystrokes. Java, however, is even less terse than C++. In fact, it is the least terse language out there.

The truth is that C++ could be used in almost every place Java is used. Performance and portability are usually much better with C++.

What I don't understand is why Java was invented at all. At the time, compiling to bytecode probably seemed like a good idea; you could squeeze a few more bytes out of your project. It was argued that it could run everywhere, just make the VM...

The C languages in general are nowadays considered quite verbose. OCaml would be a wonderful replacement for C++ if, however, the garbage collector were optional. C# makes use of value types, something Java does not offer. This allows objects to be passed directly, bypassing the heap.

In summation, why I avoid Java is threefold:
1. I haven't found a situation where Java does better than what I can do in C++.
2. Java is too verbose.
3. Java interfaces terribly with the operating system.

In general, I prefer to develop applications in C++ on top of a portable framework, like Milan, and then wrap any OS-specific calls with #ifdefs. This way I am guaranteed that my application will perform exactly as expected, without any performance surprises which prove impossible to tweak.
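
To illustrate the kind of wrapping I mean, here is a minimal sketch (not taken from Milan itself; the helper name is hypothetical, and it assumes Windows or Linux):

#include <string>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#include <limits.h>
#endif

// Hypothetical helper: return the local machine name portably,
// with the OS-specific call hidden behind #ifdefs.
std::string GetMachineName()
{
#ifdef _WIN32
    char buffer[MAX_COMPUTERNAME_LENGTH + 1];
    DWORD size = sizeof(buffer);
    if (GetComputerNameA(buffer, &size))
        return std::string(buffer, size);
    return "unknown";
#else
    char buffer[HOST_NAME_MAX + 1] = {0};
    if (gethostname(buffer, sizeof(buffer)) == 0)
        return buffer;
    return "unknown";
#endif
}

The calling code never sees the platform difference; everything above the wrapper stays identical on every OS.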

If everyone developed with Milan and C++, we would never have the case of the buggy and slow application, which runs well on one machine and horribly on another.

Monday, April 9, 2012

Function size counting

Someone asked me recently how to count the size of a function in C++.

I use function size counting all the time and it has lots and lots of uses. Is it reliable? No way. Is it standard C++? No way. But that's why you need to check it in the disassembler to make sure it worked, every time you release a new version. Compiler flags can mess up the ordering.

#include <windows.h> // for UINT32; the __asm blocks below are MSVC-specific

static void funcIwantToCount()
{
    // do stuff
}

// Delimiter function: emits four 0xCC bytes so there is a marker
// sitting right after the function above in the image.
static void funcToDelimitMyOtherFunc()
{
    __asm _emit 0xCC
    __asm _emit 0xCC
    __asm _emit 0xCC
    __asm _emit 0xCC
}

// Scan forward from funcaddress until the 0xCCCCCCCC marker is found.
int getlength(void *funcaddress)
{
    int length = 0;
    while (*(UINT32 *)((unsigned char *)funcaddress + length) != 0xCCCCCCCC)
        ++length;
    return length;
}

It seems to work better with static functions. Global optimizations can kill it.
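
For completeness, a call site might look like this (my sketch, not part of the original question; it assumes the linker really did keep funcToDelimitMyOtherFunc() directly after funcIwantToCount()):

#include <cstdio>

int main()
{
    // Measure funcIwantToCount by scanning forward to the 0xCC marker
    // emitted by funcToDelimitMyOtherFunc().
    int size = getlength((void *)funcIwantToCount);
    std::printf("funcIwantToCount is %d bytes\n", size);
    return 0;
}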

P.S. I hate people asking why you want to do this, saying it's impossible, etc. Stop asking these questions, please. It makes you sound stupid. Programmers are often asked to do non-standard things, because new products almost always push the limits of what's available. If they don't, your product is probably a rehash of what's already been done. Boring!!!

Saturday, March 24, 2012

The overhead of std::string

Recently, I have been interested in the overhead of std::string and decided to do an investigation using Visual Studio 2010. I optimized my code for size and wrote a few different scenario functions. First, I noted that the size of the particular std::string I am using is 20 bytes. That seems quite high. So, I turned off run-time type checking and noted the size was still 20 bytes.

OK. What next? I decided to write several functions which return a string in various ways. I know that even though this is a release build with all size optimizations turned to maximum, function alignment might be off, so I also took a look at the code in the debugger. I noticed that all the functions were packed close together without any nops in between.

First, I will show you just the sizes of the various functions (a compilable sketch of them follows the list):
  1. const char *ReturnAConstCharString() { return "test"; } = 6 bytes
  2. String ReturnAString() { return String("test"); } = 22 bytes
  3. void FillingAStringReference(String &reference) { reference = "test"; } = 30 bytes
  4. auto_ptr<String> ReturnAutoPtr() { return auto_ptr<String>(new String("test")); } = 43 bytes
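
Reconstructed as compilable code, the four functions would look roughly like this (a sketch of my setup; I am treating String as a typedef for std::string and pulling auto_ptr from <memory>):

#include <string>
#include <memory>

typedef std::string String;

// The four scenarios measured above, in source form.
const char *ReturnAConstCharString()            { return "test"; }
String ReturnAString()                          { return String("test"); }
void FillingAStringReference(String &reference) { reference = "test"; }
std::auto_ptr<String> ReturnAutoPtr()           { return std::auto_ptr<String>(new String("test")); }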

Wow, not what I expected at all. The one that most people think is the most optimized solution (using a reference) apparently takes more space than just returning a string. Naturally, I would have expected the const char * version to be the lightest, and 6 bytes is extremely light. However, one can't do much with such a function; it has the same effect as referencing a static variable.

Looking at the underlying code, only (4) had a loop. So the auto_ptr, performance-wise, would probably be the poorest.

For most practical situations, the best solution is (2): simply return the String. I wonder if this is always true for all classes? Probably not for bigger classes, however.

The next thing I wanted to investigate is the return load of each call. I mean, are they all the same weight, or do they come with an overhead? The requirement for my return-overhead functions is that they all return their values into a std::string. I also decided just to count instructions in the debugger, as there was really no other good way of doing it.

  1. string value = ReturnAConstCharString(); = 17 bytes and 7 instructions (2 calls).
  2. string value = ReturnAString(); = 43 bytes and 14 instructions (3 calls).
  3. string value; FillingAStringReference(value); = 9 bytes and 4 instructions (1 call).
  4. auto_ptr<String> value = ReturnAutoPtr(); = 20 bytes and 8 instructions (2 calls).

Now, this starts to paint a clearer picture of the overhead of each method. The truth is that filling a string passed by reference carries the least overhead, at least at the call site. So, when a function is called frequently, filling a reference can save a lot of space.

  1. Const char returning: 6 + 17 = 23 bytes.
  2. Returning a string: 22 + 43 = 65 bytes.
  3. Fill a reference: 30 + 9 = 39 bytes.
  4. auto_ptr: 43 + 20 = 63 bytes.

OK, so our original assumptions are starting to prove correct. References seem to be outpacing returning a string. This is probably true in terms of performance as well. But what about practically, in a program which calls each function 5 times? Five seems like a good number for a small program.

  1. Const char returning: 6 + 5*17 = 91 bytes. 10 calls.
  2. Returning a string: 22 + 5*43 = 237 bytes. 15 calls.
  3. Fill a reference: 30 + 5*9 = 75 bytes. 5 calls.
  4. auto_ptr: 43 + 5*20 = 143 bytes. 10 calls.

Slightly larger programs probably end up calling functions which return strings at least 100 times, but probably contain up to 30 such functions. What would that look like?

  1. Const char returning: 30*6 + 100*17 = 180 + 1700 = 1880 bytes. 230 calls.
  2. Returning a string: 30*22 + 100*43 = 660+4300= 4960 bytes. 330 calls.
  3. Fill a reference: 30*30 + 100*9 = 900 + 900 = 1800 bytes. 130 calls.
  4. auto_ptr: 30*43 + 100*20 = 1290+2000 = 3290 bytes. 230 calls.

Still, I find the idea of returning a String so much easier than filling a reference, and frankly, adding 3K extra for this convenience is, in my mind, acceptable. However, many programmers may feel that 3K is too much, or that the overhead is too great.

The main reason is that returning a string actually requires far less typing than all the other options, and for that reason alone, I usually pick it. When speed becomes an issue, I slip back to using references. I have used auto_ptrs in the past, but my feeling is that auto_ptrs are more appropriate for larger classes which exceed 30 bytes. Also not mentioned here is the overhead of allocating memory on the heap: each call to new is much more costly than using a stack variable.

Just remember the old adage, "premature optimization is the root of all evil", and you will be just fine.


Wednesday, March 21, 2012

Fast Unique File Name Generation

A colleague and I recently got into a discussion about generating unique file names, for temporary files. We discussed the different ways of doing this, using GUIDs or Windows GetTempFileName() function and other options.

I started writing file system drivers over 20 years ago, so I have watched a lot of stack traces go down the file system stack. Generating a GUID is much, much faster, since it requires far less overhead than searching for a unique file name. GetTempFileName actually creates a file, which means it has to call through the entire file system driver stack (who knows how many calls that would be, plus a switch into kernel mode). GetRandomFileName sounds like it would be faster; however, I would trust the GUID call to be faster still. What people don't realize is that even testing for the existence of a file requires a complete call through the driver stack. It actually results in an open, a get-attributes, and a close (which is at least 3 calls, depending on the level). In reality it is a minimum of 20 function calls and a transition to kernel mode. A GUID's guarantee of uniqueness is good enough for most purposes.

My recommendation was to generate the name and create the file only if it doesn't already exist. If it does exist, throw an exception and catch it, then generate a new GUID and try again. That way, you have zero chance of errors and can sleep easy at night.
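
A minimal sketch of that approach on Windows might look like the following (the helper names and the name format are mine, purely for illustration; CoCreateGuid needs ole32.lib, the GUID is shortened in the file name for readability, and the collision is detected via GetLastError rather than an exception, though the retry idea is the same):

#include <windows.h>
#include <objbase.h>   // CoCreateGuid
#include <stdexcept>
#include <string>
#include <cstdio>

// Hypothetical helper: build a candidate file name from a fresh GUID.
static std::string MakeGuidFileName(const std::string &directory)
{
    GUID guid;
    CoCreateGuid(&guid);
    char name[80];
    std::sprintf(name, "%08lX-%04X-%04X-%02X%02X.tmp",
                 guid.Data1, guid.Data2, guid.Data3,
                 guid.Data4[0], guid.Data4[1]);
    return directory + "\\" + name;
}

// Create the temporary file; if the name collides, try again with a new GUID.
HANDLE CreateUniqueTempFile(const std::string &directory, std::string &outPath)
{
    for (;;)
    {
        outPath = MakeGuidFileName(directory);
        HANDLE file = CreateFileA(outPath.c_str(), GENERIC_READ | GENERIC_WRITE,
                                  0, NULL, CREATE_NEW,
                                  FILE_ATTRIBUTE_TEMPORARY, NULL);
        if (file != INVALID_HANDLE_VALUE)
            return file;                        // the name was unique
        if (GetLastError() != ERROR_FILE_EXISTS)
            throw std::runtime_error("temp file creation failed");
        // Collision (astronomically unlikely): loop and pick a new GUID.
    }
}

Note that CREATE_NEW does the existence check and the create in a single trip through the driver stack, so there is no separate "does it exist" round trip.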

On a side note, checking for errors is so overdone. Code should be designed to crash if assumptions are wrong, or to catch exceptions and deal with the problem then. It's much faster to push and pop an address on the exception stack than to check for an error on every call to every function.