Thriller story: how Pokitto almost lost its RTC

About a week ago, I got a very worrying message from Daniel. He had built 12 of the new Pokitto boards with the RTC (Real-Time Clock). The clock worked, but none of them - not a single one - was able to enter bootloader mode and connect to the PC over USB.

This was bad news indeed. We had a working design that had suddenly developed a fatal flaw. With you, dear Pokitto backers & community, in mind I thought: “this could hit the shipping time badly”. My second thought was that unless we found the bug fast, we would remove the RTC and go back to the original design. My apology message to the community was already written!

Throughout my “mini-holiday” Daniel and I sent messages back and forth: “check ISP enable lines”, “check reset signal”, “check VBUS voltages” … all were correct. And yet: no USB connection. Daniel quickly dispatched 4 boards to me.

Now, it is hard to explain how tricky it is to debug something like this. The fault can be in so many places. It can be a hardware fault, it can be a broken chip somewhere, or it can be that the addition of the real-time clock and its auxiliary power supply changes the behavior of the main chip's Power Management Unit (responsible for power saving modes). Worst of all is the fact that the USB routines are in the ROM of the chip: there is no way to step through code that is burned inside the chip. USB is notoriously difficult to debug: the protocol is quite complex and events happen very fast. You need good tools to be able to follow what's going on.

Yesterday the boards arrived and I quickly assembled 2 Pokittos and began the detective work, not really knowing what to look for.

The whole day I was trying out things, measuring signal voltages, comparing schematics and eliminating possible sources of problems one by one. Daniel had already done these steps, but no harm in double checking.

At 10 in the evening I finally had a breakthrough. I had written a small program that makes the Pokitto behave like a USB Serial device and connect to the PC as a virtual COM port. First I ran it on the faulty version 4 board. No show. Then I put it on the working v3 Pokitto: wham bang! COM port pops up and a “Hello world!” message appears on the PC. Repeating the test gave the same results. Now I knew the USB was the culprit. Not the RTC or any other of the thousands of things.

Because the USBSerial was now a software program on the chip instead of a hidden ROM routine, I was able to step through the USB handshake routine (called the “chirp”) and spot the point of failure. Something was pooping - yes, no kinder word for it - pooping on the USB data lines, messing up the signal.

I wrote about my findings to Daniel. The reply came almost immediately: check the ESD (electrostatic discharge) protection chip on the USB data lines.

And sure enough, a chip and feature that had been on the Pokitto PCB from day 1 of the design had chosen this revision of the board to suddenly rotate 90 degrees in place, spewing crap on the USB lines. And all the time we were cursing the new RTC feature! Turning the chip around automagically solved all problems. Everything works again.

It was a close call, dear readers. Pokitto almost lost its Real-Time Clock. If the error had not been found within 2-3 days, we would have gone back to the previous design in order not to screw up the project timeline completely.

But now I have a working v4 Pokitto on my table with Real-Time Clock and we are back in the game!

Mugshot of the culprit:

Wow. First of all, well done to both of you for finding the fault. Second, thanks again for keeping us all up to date with behind-the-scenes info. We all really appreciate it.

Thanks for the hard work, and for resolving the bug so quickly. You guys are real professionals!

Edit: Very (painfully) entertaining to read too. Especially for this audience!

Just wondering, are the Early Bird boards tested for functionality before shipping? Or is there a risk something like this could happen since they aren’t assembled?

Don’t worry. This was not normal: this happened because we added the RTC and changed the pick n’ place program. These are still prototype boards: you can see there is no silk screen print yet on the board (component names).

Edit: what happens next is I thoroughly test these new boards and then we start manufacturing real boards (takes about 1-2 weeks). This is the reason why the addition of the RTC delays us a bit. We can’t ship unless we are sure everything works as intended.

Of course all production boards will be tested before shipping. We have a test program that tests all components, and the boards can be tested prior to assembly.

Very nice, thank you Jonne

That was a thriller! Was it the way that the equipment was set up to place the components for this run that was the ultimate culprit?

Yes. The orientation of the chip was simply programmed incorrectly in the p’n’p machine. But it was a very crafty bug, because everything apart from the USB bootloader was working OK, and we didn’t suspect the chip at first, since it was a component that had caused no issues in the previous 3 rounds.
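
For anyone curious how a slip like this could be caught in software: one option is to diff the rotation values in the pick and place data between two board revisions and review anything that changed. The snippet below is only a rough sketch of that idea, not the check we actually use. The CSV layout, the column names `ref` and `rotation`, and the sample data are all made up for illustration.

```python
import csv
import io


def load_placements(csv_text):
    """Map reference designator -> rotation in degrees, normalised to 0..359."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {row["ref"]: float(row["rotation"]) % 360 for row in rows}


def rotation_changes(old_csv, new_csv):
    """Return {ref: (old_rotation, new_rotation)} for parts present in both
    revisions whose rotation differs. Newly added parts are ignored here."""
    old = load_placements(old_csv)
    new = load_placements(new_csv)
    shared = sorted(old.keys() & new.keys())
    return {ref: (old[ref], new[ref]) for ref in shared if old[ref] != new[ref]}


# Hypothetical placement data: U2 stands in for a part that existed in the
# previous revision but ended up with a different rotation in the new one.
V3 = "ref,rotation\nU1,0\nU2,90\nD1,180\n"
V4 = "ref,rotation\nU1,0\nU2,180\nD1,180\nU3,0\n"

changes = rotation_changes(V3, V4)
for ref, (before, after) in changes.items():
    print(f"{ref}: {before:g} -> {after:g} degrees")
```

A changed rotation is not necessarily wrong, since a footprint may have been redrawn, but a list like this would at least flag parts that were supposed to be untouched between rounds.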

Excellent work Sherlock Jonne! Thank you for working so hard on this project for us!

It’s always the little things. Well, good news: the more bugs are squashed, the sooner I may be able to get one!

Wow, what stress!!!

Excellent news, so now we will have a fully working Pokitto with RTC :wink:

Thanks a lot
