Kurt Guntheroth's Old Hands Blog: August 2011

Sunday, August 28, 2011

Learning About Robustness

Something I think about from time to time is, "What do Old Hands think about code, that is different from what rookies think?" Here are some examples.

When I was a rookie, I believed programs should be precise. If the program parsed some input, that input should conform. I believed that a program was robust if it printed a detailed error message and halted, so I could quickly fix the input.

As an experienced developer, I realized that processing the input was far more important than ensuring that every comma was in the right place. I came to believe that a program was robust if it accepted a generous superset of the expected input syntax. I remember hearing this advice as a rookie and snorting with derision at the thought of anything so sloppy. This is another difference between Old Hands and rookies.
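To make "a generous superset of the expected input syntax" concrete, here is a minimal sketch in Python. It assumes the expected input is a comma-separated list of numbers; the function name parse_numbers and the particular leniencies (semicolons, stray whitespace, doubled or trailing commas) are my own illustration, not taken from any real program. Fields that cannot be read are set aside rather than halting the whole run.

```python
import re


def parse_numbers(text):
    """Parse a list of numbers, accepting a generous superset of 'a,b,c'.

    Returns (numbers, skipped): the values that parsed, and the fields
    that did not, so the caller can report them without stopping.
    """
    # Treat commas, semicolons, and any run of whitespace as separators.
    fields = re.split(r"[,;\s]+", text.strip())
    numbers = []
    skipped = []
    for field in fields:
        if not field:
            continue  # empty field left by a leading or trailing separator
        try:
            numbers.append(float(field))
        except ValueError:
            skipped.append(field)  # remember the bad field, keep going
    return numbers, skipped
```

For example, parse_numbers("1, 2,,3 ;4,") yields the four numbers and an empty skipped list, where the rookie version would have stopped at the doubled comma.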

When I became an Old Hand, I realized that any program you run frequently provides a service. Any time the program halts prematurely, it fails in its reason for being, which is to provide the service. I discovered that even a program that is not working correctly may be more useful than a program that won't run. I learned that a program is robust if it provides its service, all the time, under the widest range of conditions. A robust program should not fail, and if it does fail it should recover, and if it cannot recover, it should restart, and if it can't restart, another program should restart it. The code to provide all this robustness amounted to as much as 50% of the total, but I no longer viewed that as wasteful excess.
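The "if it fails, restart it" layer can be sketched in a few lines of Python. This is only an illustration under my own assumptions: supervise, max_restarts, delay, and log are hypothetical names, and the sketch covers the in-process case only. The last line of defense, another program that restarts this one, would live outside this code.

```python
import time


def supervise(service, max_restarts=5, delay=0.0, log=print):
    """Call service(); if it raises, log the failure and start it again.

    Gives up and re-raises once the service has failed more than
    max_restarts times, so that an outer supervisor can take over.
    """
    failures = 0
    while True:
        try:
            return service()
        except Exception as exc:
            failures += 1
            if failures > max_restarts:
                log(f"giving up after {failures} failures: {exc!r}")
                raise
            log(f"service failed ({exc!r}); restart {failures} of {max_restarts}")
            time.sleep(delay)  # brief pause before trying again
```

Re-raising at the end is a deliberate choice: a program that cannot restart itself should at least fail loudly enough for whatever is watching it to notice.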

When I was a student, I never checked return codes. "How could a system function possibly fail?" I thought. I was naive enough to assume that the people who write system functions never made mistakes, and hardware never failed.

When I was a rookie, I discovered that even if a function failed only once in a million calls, I would see it fail. A million calls just doesn't take that long when a program is running flat out all day. Such a program would terminate unexpectedly, usually saying no more than "segmentation fault".

As an experienced developer, I learned that functions fail all the time, but I hadn't known it, because I wasn't checking the return codes! Usually functions failed because the argument values were illegal. I remembered spending hours looking for why my program wasn't working, when the functions were telling me exactly what was wrong. I meticulously checked every error return, and reported failure up to higher levels of the code. I also began writing functions that did more checking and logging, because when these functions failed, they told me what I was doing wrong.
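Here is a rough sketch of that habit in Python, which reports failure with exceptions rather than return codes, though the idea is the same. Everything in it is hypothetical: read_port, start, ConfigError, and the settings dictionary are made up for illustration. The low-level function checks its argument and says exactly what is wrong; the caller logs the failure and passes it up to the next level.

```python
import logging

log = logging.getLogger("example")


class ConfigError(Exception):
    """Raised when a setting is missing or has an illegal value."""


def read_port(settings):
    """Return the port number from a settings dict, or explain why not."""
    if "port" not in settings:
        raise ConfigError("missing 'port' setting")
    raw = settings["port"]
    try:
        port = int(raw)
    except (TypeError, ValueError):
        raise ConfigError(f"port must be an integer, got {raw!r}") from None
    if not 1 <= port <= 65535:
        raise ConfigError(f"port {port} is outside 1..65535")
    return port


def start(settings):
    """Check the low-level result, log a failure, and report it upward."""
    try:
        port = read_port(settings)
    except ConfigError as exc:
        log.error("cannot start: %s", exc)
        raise  # let the higher level decide what to do next
    return f"listening on port {port}"
```

The point is the error message: "port 70000 is outside 1..65535" tells you what you did wrong, where a bare failure would have cost hours.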

Once, when I was old enough to know better, I wrote a library of meticulous functions that checked every return code, to check out some functionality in Windows. It took days. A colleague who was an Old Hand bodged together an informal but usable tool in a couple of hours because he didn't check return codes. I learned something that day.

Wednesday, August 3, 2011

Why I Took Computer Science

I went to college to get educated, back when people thought they could be successful even if they didn't go to college. (Taking manufacturing jobs. Wonder how that worked out for them?) I took this two-credit class called Math 111A, which featured programming the pdp-8 and the CDC Cyber 6600. And programming the pdp-8 emulator on the CDC 6600.

Of course this course was taught by a grad student. And it was arguably the worst instruction I received in college. Someone asked how to name variables. The teacher said, "Name them anything you like. Call them Kirk, Spock, and McCoy." So my first program had variables called Kirk, Spock, and McCoy. I didn't understand the difference between symbolic constants that named memory locations (variables) and symbolic constants that named constant values. It was amazing I ever got my programs to run.

There was an actual pdp-8, a filing-cabinet-sized minicomputer with almost 8,000 gates(!) and actual magnetic core memory. You fed it programs using the paper tape reader on an ASR-33 Teletype. First you toggled in a simple loader on the front panel switches. Then you used that loader to load the RIM loader off paper tape. Then you used the RIM loader to load your program. The machine was so mechanically flaky that it was even money it would stay up this long.

On the strength of this vast experience, I applied for a programming job, which ended up earning me spending money the rest of the way through school.

My college GPA intersected with the Department of Computer Science's requirements during exactly one academic quarter, which coincidentally turned out to be the quarter I applied. After that the decision seemed to have been made.

The summer between my Junior and Senior year was The Energy Crisis; the first time energy stopped being ridiculously cheap and infinitely available. Campus authorities went around turning off the A/C to all the buildings on campus. The only exception was the Hospital. And the Academic Computer Center. Seems the mainframes like it cool. This cemented my already firm intention to go into software.

After school, I took my first full-time software job at Fluke in Everett WA. Of course by then I had a lot invested in being a software engineer, but there were two more events that confirmed my decision. I watched a summer EE intern destroy an irreplaceable prototype display tube. He powered it up. There was too much current in one column driver, and the column wire burned up, making this sad little "tink" noise as it died. This increased the current in all the other columns. You could hear it die, "Tink. Tink. Tink, tink, tink, tink-tink-tink-tink-tink." It was totally not his fault. The driver chips for this display weren't available yet, and we were overdriving chips for a lower voltage display. But he felt so bad. I liked the notion (not completely correct) that when software crashes all you lose is time.

I had an EE colleague at Fluke named Jim Lenker. Jim was a thrill-seeker. He drove too fast. He went scuba-diving alone. He was missing a finger on one hand that he had cut off in an accident. But I noticed that when he worked with high-voltage circuits, he put his left hand in his back pocket to prevent making a circuit across his heart. Computers were all 5 Volts at the time. You can scarcely feel 5 Volts on your tongue. I liked that.

Why I Went to College

This is the story I always tell when somebody asks me why I went to college. One reason the story is interesting is that it isn't even my story. It belongs to my wife's brother Dan, who is a construction welder. Here's the story.

A ditch box is two steel plates held about 3 feet apart by braces. You put a ditch box in the ground when you have to work in a ditch because ditches tend to collapse in wet weather, and this is Seattle we're talking about.

So, my brother-in-law Dan is working outdoors, six feet below grade, in a ditch. It's the fifteenth of December. It's 35 degrees F and drizzling rain. Dan is soaking wet and standing in ice-cold water up to his knees. He is welding an additional cross brace into the ditch box, presumably to keep it from collapsing. He is using an electric arc-welder, and he can feel the current flowing over his wet body to ground.

...and that's why I went to college.

Of course that isn't really why I went to college. Well, it kinda is. I went to college because I wasn't really a grownup when I was 18, and the choice was to go to college and live comfortably at home, or go to work and live on my own. Besides, there was just this background assumption in my family that everybody would go to college. But it was clear in my mind that I would rather work indoors in an air-conditioned office, and sitting down if I liked. And I would be happiest if the most dangerous thing I did on a daily basis was drive to work. And that's why I went to college.
