Experimenting with Edge-AI to Improve My Cacao Winnower
I can't describe how happy I was when, after all the work I'd put in, my little "bot" actually worked to help winnow my cacao beans.
I trained a small AI model using about 2,000 images to recognize when nibs needed to be added to the vibratory feeder on my winnower and when they didn't. Then, I built a little fixture and added it to my system.
And you know what? It all worked! Crazy right?
If you're curious about how easy it's becoming to integrate AI into everyday things, then this story is for you.
I'll walk you through the high-level journey of how I made it happen, and you don't really have to understand technical things . . . this article is for everyone.
Winnowing cacao is the process of taking the crushed, roasted cacao bean matter that is mixed with the husks and separating them so that you can grind just the bean matter (nibs) into chocolate.
The machine that does all of this is called a winnower.
Here's my one-of-a-kind, home-built winnower (minus the vacuum unit):
Nibs are periodically deposited from that clear container on top onto the feeder just below it, and a constant vibrating mechanism spreads them out as they slowly slide toward the opening of the funnel on the top-left side. I use a dental mold vibrator, which works great. It's right behind that knob thing, the controller.
This evenly distributes the nibs, loosens up the husks, and generally makes it easier for the husks to get separated from the nibs and pulled away as everything falls down that pipe on the left into the collection bucket.
Most of the time, getting the timing of the nib deposits just right works. To help me dial it in, I created a controller for the nib-depositing mechanism that lets you adjust:
- How often to deposit the nibs (the timing of when nibs are dropped)
- How many nibs to allow to drop (the aperture of the chute)
- How long to keep the chute open (the duration of the process)
Here's a close-up of the controller's interface from the picture above:
Check out all those cool knobs! So, you can set this thing up all kinds of ways until nibs flow consistently and evenly out the chute, down the feeder bed and down the separator pipe.
Like I said, this works great . . . mostly.
However, different beans and roasts change the structure and density of the cracked matter, so it's never the same settings twice. You can't just set-it-and-forget-it. Doing so causes one of two problems:
- If you're too fast / too generous with the nibs, the nib matter deposits too quickly and overwhelms the vacuum or, worse, builds up on the feeder, flows over the sides onto the floor, and makes a mess.
- If you're too slow / too stingy with the nibs, you end up wasting time and energy running the vacuum and feeder over nothing, and you slow down your whole process.
Before adding AI to Winnowloompa, I would just set things up and check on it every few minutes to make sure things were good. But this kept me from focusing on other things and slowed down my multitasking. And if I happened to be roasting, it turned into that Lucille Ball skit where they were trying to catch the chocolates falling off the end of the conveyor . . . only this isn't "chocolate" yet.
So . . . I thought, "I need someone to watch Winnowloompa for me."
That's when I decided to give it a small brain-with-an-eye thingy so that it could monitor itself and I could just . . . set-it-and-forget-it!
So . . . they actually make them.
I chose to use an old standard for me, the same one I used when I put AI in a bean sorter: an ESP32-Cam module.
This little thing is loaded with tech stuff!
There's a quick CPU on it, an SD card reader, a camera, Wi-Fi, Bluetooth . . . just perfect for this application . . . a small brain-with-an-eye! Oh, and they cost like $4–5 . . . more powerful than the first IBM PCs . . . insane . . . and cheaper than a large coffee. Now, I promised not to get too technical here, so I'll stop geeking out over this . . . but you get the idea.
I also had to create a case for it. I couldn't leave it all bare; it's electronics. I also needed some kind of mounting system, so I had to design and create that too. That's what you see in the first image of this article all the way back up top (take a quick peek . . . I'll wait) . . . SwitchBot doing its thing.
So, now that I had found my brain-with-an-eye, made a case for it, and figured out how to mount it, I had to figure out how to teach it to do things.
So, it turns out, adding AI to something is simpler than you might think. And it's also harder than you might think. Weird, right?
Hereās the basic process:
- Identify the decision point: Figure out where a decision needs to be made in your system.
- Collect data: Gather examples that teach the AI how to make that decision, just like you'd need to teach a person.
- Train a model: Build a neural network model and train it using the data you collected.
- Integrate it into the system: Add the electronics and connect the model to your system so it can act like a person making that decision.
For Step 1, the best vantage point for watching the nibs is looking down at the bed of the feeder. So, I attached the SwitchBot near the end of the feeder, above the bed and right over the funnel.
This is the perfect place for it to see the entire bed.
Well that was easy.
For Step 2, collecting the data: now that the camera could see the whole bed, I wrote a program for my computer to talk to the SwitchBot and take a picture of the feeder every 5 seconds.
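If you're curious what a capture program like that can look like, here's a minimal Python sketch. It assumes the SwitchBot is serving JPEG frames over Wi-Fi at a /capture-style URL; the address, endpoint, and folder names are placeholders, not my exact setup.

```python
import time
from datetime import datetime
from pathlib import Path

import requests  # pip install requests

CAMERA_URL = "http://192.168.1.50/capture"  # hypothetical SwitchBot address and endpoint
OUT_DIR = Path("winnower_images")
OUT_DIR.mkdir(exist_ok=True)

while True:
    try:
        # Ask the ESP32-Cam's little web server for one JPEG frame of the feeder bed.
        response = requests.get(CAMERA_URL, timeout=5)
        response.raise_for_status()
        filename = OUT_DIR / f"{datetime.now():%Y%m%d_%H%M%S}.jpg"
        filename.write_bytes(response.content)
    except requests.RequestException as err:
        print(f"Camera request failed: {err}")
    time.sleep(5)  # one picture every 5 seconds
```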
Then, I started winnowing about 10 lbs of nibs and gathered the images. I initially tried to pre-categorize the images while I collected them, but that was impossible. Things moved too quickly.
Once I had all of the images, I wrote a second program that let me cycle through the images chronologically and label each one (there's a sketch of that tool just after this list) as:
- Stop. Don't deposit any more nib matter.
- Go. Let the nib matter flow.
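Here's a rough sketch of what a labeling tool like that can look like, using OpenCV to show each image and a keypress to file it into a go/ or stop/ folder. The keys and folder names are just for illustration, not necessarily how my real program was organized.

```python
import shutil
from pathlib import Path

import cv2  # pip install opencv-python

SRC = Path("winnower_images")
for label in ("go", "stop"):
    (SRC / label).mkdir(exist_ok=True)

# Walk the captures chronologically (the timestamped filenames sort correctly).
for image_path in sorted(SRC.glob("*.jpg")):
    frame = cv2.imread(str(image_path))
    cv2.imshow("g = Go, s = Stop, q = quit", frame)
    key = cv2.waitKey(0) & 0xFF
    if key == ord("q"):
        break
    label = "go" if key == ord("g") else "stop"
    shutil.move(str(image_path), SRC / label / image_path.name)

cv2.destroyAllWindows()
```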
This was tricky and time-consuming. I had thousands of images of all sorts of states:
- Nibs, blurred as they fell from the chute on an empty bed
- An entirely filled bed
- Nibs at the top of the bed but not at the end
- Nibs at the end but not at the top
- Nibs that were all the way to one side or another
- Nibs a short time after depositing, not yet very far down the bed
Just imagine taking a movie of 10 lbs of nibs and jumping through the movie every 5 seconds.
I had to look at 2,000+ images many, many times, trying to decide if I wanted the SwitchBot to say "Go" or "Stop" based upon what was in each image.
Some images were very easy to categorize; for others, it could go either way, so I had to make a call. Initially, I wasn't consistent with this, which I ended up having to contend with later (see below).
In the end, I erred on the side of caution and taught it to be conservative. I didn't want to end up back in the place where too many nibs were being dumped on the bed.
So, just imagine how long this took. Step 2 was not so easy and not so quick.
Here are 40 of the 1,100+ "Go" images:
And here are 40 of the 1,200+ "Stop" images:
Now, having all these images labeled, the next trick was to see if I could teach a model to see the difference.
For Step 3, with all of this knowledge about what to say to the controller and when to say it, I was ready to teach SwitchBot's mind.
Time to train an AI model . . . but which model?
Remember, I just wanted to prove whether all of this would work. So, to give it a fighting chance, I decided to start with a base model already proven at classifying images.
So, after some research, I chose to fine-tune an existing model, Google's EfficientNet-B0, which was pre-trained on a standard image classification dataset. Teaching it my simple "Stop" / "Go" distinction on these images was almost guaranteed to succeed.
And it did! The third program I wrote, an AI training program, worked.
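For the technically curious, here's the flavor of that kind of fine-tuning in Python with Keras: load the labeled images, bolt a small two-class head onto a pre-trained EfficientNet-B0, and train. This is a simplified sketch, not my exact program; the folder name, epoch count, and layer choices are placeholders.

```python
import tensorflow as tf

IMG_SIZE = (224, 224)  # EfficientNet-B0's expected input size

# Load the labeled images from the go/ and stop/ subfolders.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "winnower_images", image_size=IMG_SIZE, batch_size=32)

# Start from EfficientNet-B0 pre-trained on ImageNet, without its classifier head.
base = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", pooling="avg")
base.trainable = False  # first pass: only train the new head

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(2, activation="softmax"),  # "Go" vs. "Stop"
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
model.save("winnower_model.keras")
```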
After the additional training was complete, I wrote a fourth program to run through the training data and also images the model had never seen before, to test whether it actually worked.
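A testing program like that can be as short as this sketch, assuming a separate folder of images the model never trained on (the folder name here is hypothetical):

```python
import tensorflow as tf

model = tf.keras.models.load_model("winnower_model.keras")

# A separate folder of captures the model never saw during training.
test_ds = tf.keras.utils.image_dataset_from_directory(
    "winnower_images_holdout", image_size=(224, 224), batch_size=32, shuffle=False)

loss, accuracy = model.evaluate(test_ds)
print(f"Held-out accuracy: {accuracy:.1%}")
```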
So, after training the model and testing it out, I found a few instances where the model was insisting that an image I marked as "Go" was really a "Stop" and vice versa.
The funny thing is . . . I agreed with it!
Think about that.
I taught this model how to look at these images and pick "Go" or "Stop," and on some of the very images I used to train it, it disagreed with me and said, "No. That one should be a Stop. You said it was Go."
And you know what . . . I agreed with it. Like I said above, I wasn't always consistent with the classifications, but the model taught me! Baby Terminator took me to school.
That's why you see those "Go" and "Stop" buttons above. As I ran through the data, I asked the model to "Find a Mismatch." I'd then inspect the image and, if I agreed, reclassify it right there.
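Here's a sketch of the "Find a Mismatch" idea: run every labeled image back through the model and print the ones where its answer disagrees with mine. The folder layout and the go/stop class order follow the earlier sketches and are just for illustration.

```python
from pathlib import Path

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("winnower_model.keras")

# Walk both labeled folders and ask the model about every image I labeled.
for true_label in ("go", "stop"):
    for image_path in sorted(Path("winnower_images", true_label).glob("*.jpg")):
        img = tf.keras.utils.load_img(image_path, target_size=(224, 224))
        batch = np.expand_dims(tf.keras.utils.img_to_array(img), axis=0)
        probs = model.predict(batch, verbose=0)[0]
        predicted = ("go", "stop")[int(np.argmax(probs))]
        if predicted != true_label:
            # The model disagrees with my label; pull it up for a second look.
            print(f"{image_path}: labeled {true_label}, model says "
                  f"{predicted} ({probs.max():.0%} confident)")
```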
Now, some of the images were hard to call. They could go either way, and as you can see on the screen, there's a confidence percentage in the result too. So, ultimately, I decided that an actual "Go" means:
- the model predicted that it was a "Go," and
- it had a confidence score of 80% or higher.
Anything else is a "Stop." In this way, I cut the chances of getting big piles of nibs on my feeder or my floor down to nearly nothing while keeping that feeder as busy as I can.
It's optimized.
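In code, that rule is about as simple as it sounds. Here's a tiny sketch, assuming the same [go, stop] class order as the earlier examples; the 80% threshold is the only real tuning knob.

```python
import numpy as np

GO_CONFIDENCE_THRESHOLD = 0.80  # below this, play it safe and say "Stop"

def decide(probs: np.ndarray) -> str:
    """probs is the model's [go, stop] softmax output for one image."""
    predicted_go = int(np.argmax(probs)) == 0          # index 0 is "go"
    confident = float(probs[0]) >= GO_CONFIDENCE_THRESHOLD
    return "Go" if (predicted_go and confident) else "Stop"

# Example: the model is only 72% sure it's a "Go", so the chute stays closed.
print(decide(np.array([0.72, 0.28])))  # -> "Stop"
```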
After reclassifying some images, I retrained the model, went through the process above, and kept doing this several times until I was happy with the final model.
In the end, the model predicted the right classification 98% of the time, which is literally better than me, since I changed my mind on a few things.
It worked better than I could have hoped.
Now that I had a working model, it was time to weave all of this into my existing system.
Step 4 was actually pretty easy in concept but time-consuming to implement.
First, I needed a way for the controller to talk to the "eyestalk" . . . the SwitchBot. This required finding a free pin on the controller brain.
Fortunately, I had a free digital pin on the microcontroller (that big dark chip above).
I opted to keep the systems electrically decoupled; that is, they were each powered independently. So, I needed a +5 volt line, a ground, and a signal line (digital pin 9 above) in the cable. Fortunately, there was room on the breadboard to solder the resistor, decoupling capacitor, power lines, and a three-pin Dupont connector for the cable (see the yellow squiggles in the diagram).
So, since the ESP32-Cam brain-eye SwitchBot, which works on 3.3-volt digital signals, wasn't going to talk directly to the 5-volt Arduino ATmega328P above, I needed some way for them to exchange messages.
Since the "Go" message is really just a binary thing, I opted for an optocoupler. This is a small chip with a baby LED inside and an LED sensor in front of it. It lets you connect a circuit on one side that powers the LED and a totally different circuit on the other side that senses when that hidden LED turns on. In this way, there's no electricity, no high or low digital signal, being swapped between the two systems. Instead, SwitchBot shines a baby flashlight at the controller to let it know whether the coast is clear, like rebels in the mountains sending signals to spies in the camp.
I then had to rework the code on the microcontroller.
I changed it so that now, when it boots up, if I'm holding down the "Run" button, it starts in "AI mode" and checks for the SwitchBot signal before opening the chute. If the signal isn't "high," i.e., "Go," nothing happens. If I don't hold down the "Run" button when I turn it on, it just acts normally. So, if I want to run with no AI, I still can.
So, this also meant I'd have to build out the part of the circuit inside the SwitchBot that takes the "Go" signal and turns on the baby flashlight.
You can see the three-pin Dupont connector at the bottom right, next to the power socket. That's where the cable to the controller goes. It actually powers the bottom of the circuit from the 5 volts coming from the controller.
Up and to the right is the four-pin optocoupler chip. It's small. I wired a switch to it too, so it can be used for debugging by forcing the signal always on. There's also an on/off switch, the green "Go" LED at the top, the socket for the ESP32-Cam, and, to the right, a switch and cabling for programming the ESP32-Cam. With all of that done, I just needed a long, custom cable between the two, and I was done with the physical integration.
Oh, I forgot to tell you. Right now, the AI model doesn't live in this part. At the moment, it's still on my computer. My computer talks to the SwitchBot over a Wi-Fi connection. The SwitchBot has a baby web server in it.
So, I wrote a fifth program that talks to the SwitchBot and grabs an image every second. It then asks the model whether the image is a "Go" or not. If it is, the computer tells the SwitchBot to turn on the baby flashlight for the controller. If it's a "Stop," it tells the SwitchBot to turn off the baby signaling flashlight. It also toggles the green LED at the top so that we can see what's going on.
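Here's a sketch of what that watcher loop can look like. The capture and signal URLs are placeholders for whatever the SwitchBot's little web server actually exposes, and the decision rule is the same 80% threshold from before.

```python
import time

import numpy as np
import requests
import tensorflow as tf

CAMERA_URL = "http://192.168.1.50/capture"   # hypothetical capture endpoint
SIGNAL_URL = "http://192.168.1.50/signal"    # hypothetical "flashlight" endpoint
GO_CONFIDENCE_THRESHOLD = 0.80

model = tf.keras.models.load_model("winnower_model.keras")

while True:
    # Grab one frame of the feeder bed from the SwitchBot's web server.
    jpeg = requests.get(CAMERA_URL, timeout=5).content
    image = tf.image.resize(tf.io.decode_jpeg(jpeg, channels=3), (224, 224))
    probs = model.predict(np.expand_dims(image.numpy(), axis=0), verbose=0)[0]

    # "Go" only when the model says go with high confidence; otherwise "Stop".
    go = int(np.argmax(probs)) == 0 and probs[0] >= GO_CONFIDENCE_THRESHOLD

    # Tell the SwitchBot to turn the optocoupler "flashlight" (and green LED) on or off.
    requests.get(SIGNAL_URL, params={"state": "on" if go else "off"}, timeout=5)
    time.sleep(1)  # once a second, as described above
```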
This is how I was able to do the last step and integrate everything.
So, that's how I created an AI proof-of-concept on Winnowloompa and proved to myself that AI could actually solve my problem.
It's alive!
I want to move the model processing from my computer to the SwitchBot itself.
The fine-tuned EfficientNet-B0 model is too large to fit on the SwitchBot.
I've done some preliminary testing, and I've been able to scale the input images down from 224 x 224 pixels with three 8-bit color channels (150,528 bytes) to 16 x 16 pixels in greyscale (256 bytes).
The small, custom CNN model I created in this process has roughly the same accuracy as that huge model, which is mind-boggling to consider. I've tested it out, and sure enough, it predicts the right state 99% of the time, slightly better than the bigger model.
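Here's roughly what a CNN that tiny looks like in Keras. The exact layer sizes here are illustrative, but the point stands: with a 16 x 16 greyscale input, the whole network is a couple thousand parameters instead of millions.

```python
import tensorflow as tf

# A deliberately tiny CNN for 16 x 16 greyscale images of the feeder bed.
tiny_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16, 16, 1)),
    tf.keras.layers.Rescaling(1.0 / 255),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation="softmax"),  # "Go" vs. "Stop"
])
tiny_model.compile(optimizer="adam",
                   loss="sparse_categorical_crossentropy",
                   metrics=["accuracy"])
tiny_model.summary()  # a few thousand parameters instead of millions
```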
So, my next step is to figure out how to use TensorFlow Lite for Microcontrollers (TinyML) and get the code that runs on my computer squeezed down and running on the ESP32-Cam by itself.
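A likely first step, sketched below, is converting that tiny model to TensorFlow Lite with full integer quantization so it can run under TensorFlow Lite for Microcontrollers on the ESP32-Cam. The file names are placeholders, and in the real pipeline the representative dataset would be actual feeder-bed captures rather than random numbers.

```python
import numpy as np
import tensorflow as tf

tiny_model = tf.keras.models.load_model("winnower_tiny_model.keras")  # hypothetical file name

def representative_data():
    # A handful of 16 x 16 greyscale samples; random values stand in here.
    for _ in range(100):
        yield [np.random.rand(1, 16, 16, 1).astype(np.float32) * 255]

converter = tf.lite.TFLiteConverter.from_keras_model(tiny_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
Path = __import__("pathlib").Path
Path("winnower_model.tflite").write_bytes(tflite_model)
# The .tflite bytes then get embedded as a C array in the ESP32-Cam firmware.
```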
Stay tuned! Let's see if I can do this . . .