Maybe you are one of the people who are considering starting a project that combines embedded software development with electronics.
I hope you found this text before you started your project. My hope is that by reading through it, you can be spared some of the pain that we went through. Probably you won’t be, but at least you can prepare mentally. Or you are wise and are beginning to think of another project.
If not, in this part of the “lessons learned from two orders of magnitude” series of posts, I will share with you four lessons regarding embedded development, loosely following a presentation that I gave at the Hasso-Plattner-Institute (HPI) in 2023:
Lesson #1: Anything will break
What do you expect when you buy an electronics product from a vendor? Exactly – it works! Over the last 50 years, the microelectronics and software industries grew from a niche into one of the main drivers of economic growth. Their products are no longer used by a small group of nerdy enthusiasts but by nearly everybody on planet Earth. Thus, the days of blue screens, fixing networking problems at LAN parties and hanging internet connections (except on trains in Germany and in the Berlin metropolitan area) are long forgotten, since the products had to become very reliable to be used by a large, non-tech-savvy audience.
Once you go from consumer electronics to embedded electronics, it’s important to forget about the mostly nice user experience that we enjoy with our gadgets. If you buy an embedded product, even from a seemingly reputable company, it is safe to assume that some of its features will not work, sometimes precisely the features that you need most, like a reliable on/off switch.
In the realm of embedded electronics, it feels like we are somewhat back in the era of Windows 3.11 and Linux 0.99, where you can basically expect anything to happen at any time for no obvious reason.
Maybe you are reading this article because you are so pissed and disillusioned by your embedded prototype that procrastination and mind idling are the only things possible right now. Maybe you are also asking yourself why it has to be that way. Why can’t they make nicer products, like in consumer electronics?
I believe the reason for this is “economy of scale”. Economy of scale is what drove the massive investments into the consumer electronics industry that were necessary to finance the intense research and development (R&D) and quality assurance (QA) behind the high product quality that we are used to today.
In contrast, the market volume for embedded devices is much lower and the sales numbers for each product are significantly smaller. Take two examples: NVIDIA sells many more GPUs in the video game and data center sectors than Jetsons. Sony sells more imaging sensors for consumer electronics or the automotive industry than for standalone industrial automation applications.
Thus, in the embedded sector we lack the investment necessary to bring the products to the same standard as in consumer electronics. Fixing the multitude of issues in a product that is similarly complex but sells in much smaller quantities just doesn’t pay off.
On top of that, I think that in embedded electronics we CAN dig much deeper (as the systems are more open) and DO dig much deeper (as we use the systems as professional users, not as consumers). Thus, we discover more issues. If we approached our smartphones in a similar way, we would probably find significantly more issues there as well.
Lesson #2: Customer Support does not exist in the “embedded dictionary”
An unfortunate side-effect of the two issues described in the section above is the seemingly non-existent customer support in the embedded electronics world.
If you have an issue, you are usually on your own (luckily, some but seemingly few exceptions exist). Then your best friend is a powerful search tool (an LLM and/or Google), and if you are lucky, your issue has already been addressed. If you are less lucky, the issue is known to the producer of your embedded device of choice, but the company states that they are not going to solve it any time soon. If you have used up all your luck for the year, nobody responds to your post on the company’s forum even after a month, and you have no idea how to move forward.
While this can be very frustrating on the user side, again we have to remember that the variety of possible problems that users face is large, while the resources that can be brought to bear on any particular problem are much smaller. Thus, the companies are left with no choice but to prioritize the most pressing things, and more often than not, it is not your problem that gets priority.
Lesson #3: Never trust cables and connectors.
Let me start this section with an image puzzle.
Why do you think it is not possible to flash the Jetson with the setup shown in the left image, while it is possible with the setup in the right image?
I guess the puzzle is a bit too easy, as you might have guessed the answer already from the title of this section:
Indeed, the original cable delivered by NVIDIA together with the Jetson could not be used to flash it. However, a similar cable from a cheap brand did the job. I’m sorry to disappoint you, but I don’t have an explanation for this, as we didn’t dig further into the “why”. It doesn’t make sense to me either. Sometimes it’s OK to just go with “if it’s stupid and it works, it ain’t stupid”.
Unfortunately, this is not only true for USB cables but for nearly any kind of cable that you might think of: Ethernet, power supply, BNC, and the list could go on for long. In particular, if you are wondering why your camera doesn’t work and you use MIPI cables, check the MIPI cables and swap them for a different standard if you can. That will spare you some headaches.
The main takeaway is: if you are very puzzled why your system or code doesn’t work, don’t forget to check the cables. Really do it, even if the cable seems to be a simple off-the-shelf product and a malfunction seems pretty unlikely.
Lesson #4: There is far more version control to do than just with your repo.
Suppose you have your software stack running nice and stable on one device. Two months later you get components from your vendors to build a new one. Things arrive, everything is assembled exactly to plan, you set up and run your system, and it shows weird, unexpected behavior. What happened?
Maybe the components that you got delivered have a fabrication error, but probably they don’t. They just have a new firmware version that you are not aware of. And guess what, of course it conflicts with a relevant part of your software stack: either some register address changed slightly, or a whole instruction set did, or the update introduced a new bug that was not there before, or it fixed a bug that you exploited!
Whatever the details are, small firmware updates can have a big impact on your embedded system, so keep track of the firmware versions of your components, not just of your own code, and make sure you are not comparing apples with peaches. Sometimes, your bug was actually intended to be somebody’s feature.
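One way to make this kind of drift visible is to store a small manifest of firmware versions for a known-good device and compare every newly built device against it. The following is only a minimal sketch under assumptions: the component names, the version strings and the function compare_firmware are all hypothetical, and how you actually read a firmware version from a component depends entirely on the vendor.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Mismatch:
    component: str
    expected: Optional[str]  # version on the known-good device (None if absent)
    found: Optional[str]     # version on the new device (None if absent)


def compare_firmware(known_good: Dict[str, str],
                     device: Dict[str, str]) -> List[Mismatch]:
    """Return every component whose firmware version differs between two manifests."""
    mismatches = []
    # Walk over the union of component names so that missing or
    # additional components are reported too.
    for component in sorted(known_good.keys() | device.keys()):
        expected = known_good.get(component)
        found = device.get(component)
        if expected != found:
            mismatches.append(Mismatch(component, expected, found))
    return mismatches


# Hypothetical manifests: in practice these would be read from the devices
# and stored next to the code in the repository.
known_good = {"camera_sensor": "1.4.2", "carrier_board": "2.0.0", "imu": "0.9.1"}
new_device = {"camera_sensor": "1.5.0", "carrier_board": "2.0.0", "imu": "0.9.1"}

for m in compare_firmware(known_good, new_device):
    print(f"{m.component}: expected {m.expected}, found {m.found}")
```

Running such a check right after assembly, before any debugging starts, at least tells you whether you are looking at the same system as two months ago.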