Last week, the vintage IBM 1401 computer at the Computer History Museum started behaving strangely: it wouldn't halt. More specifically, if you had two HALT instructions in a row, it would halt for the first, but when you continued, it blew past the second and crashed mysteriously. Here's how we fixed it... 1/N
The IBM 1401 computer lets you run code a single cycle at a time, so you can see what happens at each step. (The arrow on the control panel.) But when I tried to single-step through the bad HALT instruction, the computer immediately died in a distant memory location. Now we had two problems. 2/N
I hooked up an oscilloscope and discovered that the computer wasn't running a single instruction cycle. Instead, it ran a few cycles (the pulses below), stopped briefly, ran a few more, irregularly stopping and starting, "building up steam" until it ran thousands of cycles. But why this pattern?
You push the green START button on the console to start the computer after a HALT, and also to single-step it. Hmmm. For convenience, there's also a START button on the card reader and one on the printer. We found that everything worked fine with those buttons. Hmmm.
It turned out that the computer's START button had a strand of wire that shorted one of its contacts. As a result, it would keep re-starting as long as you held the button, blowing through the HALT or single-step until it crashed. The contacts bounced a bit, causing the random gaps we saw.
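To make the "bursts with gaps" pattern concrete, here's a toy model in Python. It is not the 1401's actual circuitry. It assumes, purely for illustration, that the shorted button lets the machine run one cycle for every sample in which the bouncing contact is closed, and the sample values in `held` are made up.

```python
from itertools import groupby


def bursts(contact_samples):
    """Return the lengths of consecutive runs of closed-contact samples.

    contact_samples is a sequence of 0/1 values (hypothetical data):
    1 means the contact is closed, 0 means it bounced open.
    Assumption: the machine runs one cycle per closed sample.
    """
    return [len(list(run)) for closed, run in groupby(contact_samples) if closed]


# One button press with two bounce gaps (made-up sample values).
held = [1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1]

burst_lengths = bursts(held)       # cycles run in each burst between gaps
total_cycles = sum(burst_lengths)  # far more than the single cycle we asked for

print(burst_lengths, total_cycles)
```

In this sketch a healthy button would give exactly one cycle per press, while the shorted one gives a burst for every stretch the contact stays closed, which matches the irregular stop-and-start pattern on the oscilloscope.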