A prosthetic arm attaches to the upper limb, where sensors detect muscle movements. These movements are then classified into hand gestures using machine learning algorithms.
The process involves signal capture, detection, isolation, and classification.
Signal capture and detection are a series of operations in which signals from multiple channels are continuously monitored and analyzed to identify relevant data patterns. During the signal capture phase, sensors and their corresponding channels gather raw data, which is then subjected to analytical processing. The captured signals are transferred to an ARM processor for signal detection, which confirms the presence of a relevant signal. The next step is isolation, wherein unwanted noise and irrelevant information are filtered out to enhance the clarity of the signal under study. Finally, in the classification stage, the isolated signals are categorized into hand gestures as shown in Fig(1).
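The detection and isolation steps can be sketched in standard C++. This is a minimal illustration only, assuming a windowed RMS threshold for detection and a moving-average filter for noise isolation; the function names, threshold, and window sizes are illustrative, not the system's actual parameters.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Detection: a window is "relevant" when its RMS amplitude exceeds a threshold.
double window_rms(const std::vector<double>& w) {
    double sum = 0.0;
    for (double s : w) sum += s * s;
    return std::sqrt(sum / static_cast<double>(w.size()));
}

bool detect_signal(const std::vector<double>& w, double threshold) {
    return window_rms(w) > threshold;
}

// Isolation: a simple moving-average filter that smooths each sample
// with up to k preceding samples, suppressing high-frequency noise.
std::vector<double> moving_average(const std::vector<double>& w, std::size_t k) {
    std::vector<double> out(w.size(), 0.0);
    for (std::size_t i = 0; i < w.size(); ++i) {
        std::size_t lo = (i >= k) ? i - k : 0;
        double sum = 0.0;
        for (std::size_t j = lo; j <= i; ++j) sum += w[j];
        out[i] = sum / static_cast<double>(i - lo + 1);
    }
    return out;
}
```

In the real pipeline these steps run on the ARM processor, with thresholds tuned per channel.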
1) Signal Detection and Isolation :
In the data acquisition stage, EMG (Electromyography) signals are captured from multiple sensors, each corresponding to one channel. Together, the channels represent the muscle movement for a hand gesture. EMG signals from all channels are processed by applying different transformations and converted into time- and frequency-domain features, as shown in Fig(2).
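As an illustration of the time-domain side of this transformation, the sketch below computes three features commonly used for EMG windows: mean absolute value, waveform length, and zero crossings. It is a simplified example rather than the project's actual feature set, and the frequency-domain features (e.g., via FFT) are omitted for brevity.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Mean Absolute Value: average rectified amplitude of the window.
double mean_absolute_value(const std::vector<double>& w) {
    double sum = 0.0;
    for (double s : w) sum += std::fabs(s);
    return sum / static_cast<double>(w.size());
}

// Waveform Length: cumulative amplitude change, a measure of signal complexity.
double waveform_length(const std::vector<double>& w) {
    double len = 0.0;
    for (std::size_t i = 1; i < w.size(); ++i) len += std::fabs(w[i] - w[i - 1]);
    return len;
}

// Zero Crossings: how often the signal changes sign (a rough frequency proxy).
int zero_crossings(const std::vector<double>& w) {
    int count = 0;
    for (std::size_t i = 1; i < w.size(); ++i)
        if ((w[i - 1] < 0.0) != (w[i] < 0.0)) ++count;
    return count;
}
```

Each channel's window yields one such feature vector; concatenating the channels gives the multidimensional feature set described below.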
In real time, this process must be fast and accurate. To make it fast, I tested different C++ linear-algebra libraries, Armadillo and Eigen, for their performance under real-time conditions and integrated the best-performing library into the data acquisition stage.
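A benchmark of this kind can be structured with a small timing harness; the sketch below is illustrative, not the actual test code, and uses std::chrono to measure the average latency of any callable, e.g., the same channel transformation implemented with Armadillo and with Eigen.

```cpp
#include <chrono>

// Runs the callable `f` for `iterations` rounds and returns
// the average wall-clock latency per call in microseconds.
template <typename F>
double avg_latency_us(F&& f, int iterations) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) f();
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::micro>(end - start).count()
           / static_cast<double>(iterations);
}
```

In practice each library's kernel would be passed as the callable and compared across several window sizes and channel counts before integrating the winner.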
2) Feature Extraction :
Feature extraction is crucial to avoid overfitting a model to noise. In this case, EMG signals are captured from multiple channels whose combination forms a hand gesture, so the features are multidimensional and noisy.
I experimented with different feature reduction techniques such as PCA, which projects features onto the directions of highest variance, and LDA, which maximizes class separability, and integrated a feature reduction phase into model training. Adding this phase improved model accuracy significantly.
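To make the variance-capturing idea behind PCA concrete, here is a minimal sketch in plain standard C++ (the project itself used library routines): it computes the covariance matrix of the centered data and recovers the leading principal component by power iteration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns the direction of highest variance (first principal component)
// of the row-major data matrix X, via covariance + power iteration.
std::vector<double> first_principal_component(const Matrix& X, int iters = 100) {
    const std::size_t n = X.size(), d = X[0].size();

    // Per-feature mean, used to center the data.
    std::vector<double> mean(d, 0.0);
    for (const auto& row : X)
        for (std::size_t j = 0; j < d; ++j) mean[j] += row[j] / static_cast<double>(n);

    // Sample covariance matrix of the centered data.
    Matrix C(d, std::vector<double>(d, 0.0));
    for (const auto& row : X)
        for (std::size_t a = 0; a < d; ++a)
            for (std::size_t b = 0; b < d; ++b)
                C[a][b] += (row[a] - mean[a]) * (row[b] - mean[b])
                           / static_cast<double>(n - 1);

    // Power iteration converges to the eigenvector of the largest eigenvalue.
    std::vector<double> v(d, 1.0);
    for (int it = 0; it < iters; ++it) {
        std::vector<double> w(d, 0.0);
        for (std::size_t a = 0; a < d; ++a)
            for (std::size_t b = 0; b < d; ++b) w[a] += C[a][b] * v[b];
        double norm = 0.0;
        for (double x : w) norm += x * x;
        norm = std::sqrt(norm);
        for (std::size_t a = 0; a < d; ++a) v[a] = w[a] / norm;
    }
    return v;
}
```

Projecting each feature vector onto the top few such components yields the reduced representation; LDA differs in that it uses the gesture labels to pick directions that separate classes rather than directions of raw variance.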
3) Gesture classification :
The features generated from the muscle signals need to be fed into a classification model. Choosing the right model for hand gesture classification depends on data properties such as the presence of noise, temporal and frequency patterns, and multidimensional features.
Machine learning models that handle this kind of data while remembering the preceding sequence of actions include RNNs, LSTMs, HMMs, and transformers. Each model has its own pros and cons. The model chosen is the one that maximizes accuracy while being less resource intensive and offering a faster response time.
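The selection criterion just described (maximize accuracy subject to a real-time latency budget) can be stated as a small sketch; the model names and numbers used here are purely illustrative placeholders, not measured results.

```cpp
#include <string>
#include <vector>

struct CandidateModel {
    std::string name;
    double accuracy;    // validation accuracy, 0..1
    double latency_ms;  // measured response time per inference
};

// Pick the most accurate model whose response time fits the real-time budget.
std::string select_model(const std::vector<CandidateModel>& models,
                         double latency_budget_ms) {
    std::string best;
    double best_accuracy = -1.0;
    for (const auto& m : models) {
        if (m.latency_ms <= latency_budget_ms && m.accuracy > best_accuracy) {
            best_accuracy = m.accuracy;
            best = m.name;
        }
    }
    return best;  // empty string when no model fits the budget
}
```
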
Performance optimization in terms of response time is crucial in this phase. For that, I tested different C++ machine learning libraries such as Shark under different input conditions.
CM1K CHIP :
I experimented with a neural net chip named CM1K for performance optimization and accuracy improvement. After successfully testing this chip, I integrated it with the main processor by writing a device driver. The chip yielded a significant improvement in processing time and accuracy.
The CM1K chip features 1024 interconnected neurons working in parallel and capable of learning and recognizing patterns in a few microseconds.
The neurons behave collectively as a K-Nearest Neighbor classifier or a Radial Basis Function network and are trainable. They are especially suited to ill-defined and fuzzy data, high variability of context, and even novelty detection. Last but not least, multiple CM1K chips can be daisy-chained to scale a network from thousands to millions of neurons with the same simplicity of operation as a single chip.
The CM1K chip is a chain of 1024 identical neurons operating in parallel, but also interconnected to make global decisions. A neuron is a memory with associated logic to compare an incoming pattern with the reference pattern stored in its memory and react (i.e., fire) according to its similarity range. A neuron also has a couple of attribute registers, such as a context and a category value.

Once a pattern is broadcast, the neurons communicate briefly with one another (for 16 clock cycles) to determine which one holds the closest match in its memory. The "Winner-Takes-All" neuron deactivates itself when its category is read, thus leaving the lead to the next "Winner-Takes-All", if applicable, and so on. A single CM1K matching a pattern of 256 bytes against 1024 models delivers the equivalent of 192 GigaOps per second.

Learning is initiated by simply broadcasting a category after an input pattern. If it represents novelty, the next available neuron in the chain automatically stores the pattern and its category. If some firing neurons recognize the pattern but with a category other than the category to learn, they auto-correct their influence fields. This intrinsic inhibitory and excitatory behavior makes the CM1K chip a unique component for cognitive computing applications.
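The recognize/learn behavior just described resembles an RCE (Restricted Coulomb Energy) network. The sketch below is a software analogy of that behavior, assuming an L1 distance metric and illustrative influence-field values; it is not the CM1K's register-level interface.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

struct Neuron {
    std::vector<int> pattern;  // stored reference pattern
    int category;              // label assigned at learning time
    int influence;             // max distance at which this neuron fires
};

// L1 (city-block) distance between two equal-length patterns.
int l1_distance(const std::vector<int>& a, const std::vector<int>& b) {
    int d = 0;
    for (std::size_t i = 0; i < a.size(); ++i) d += std::abs(a[i] - b[i]);
    return d;
}

class RceNetwork {
public:
    explicit RceNetwork(int max_influence) : max_influence_(max_influence) {}

    // Learning: wrongly firing neurons shrink their influence fields;
    // a pattern no neuron recognizes correctly commits the next free neuron.
    void learn(const std::vector<int>& pattern, int category) {
        bool recognized = false;
        for (auto& n : neurons_) {
            int d = l1_distance(n.pattern, pattern);
            if (d < n.influence) {
                if (n.category == category) recognized = true;
                else n.influence = d;  // auto-correct the influence field
            }
        }
        if (!recognized) neurons_.push_back({pattern, category, max_influence_});
    }

    // Recognition: "Winner-Takes-All" -- the closest firing neuron's category.
    // Returns -1 when no neuron fires (novelty).
    int classify(const std::vector<int>& pattern) const {
        int best_category = -1;
        int best_distance = -1;
        for (const auto& n : neurons_) {
            int d = l1_distance(n.pattern, pattern);
            if (d < n.influence && (best_distance < 0 || d < best_distance)) {
                best_distance = d;
                best_category = n.category;
            }
        }
        return best_category;
    }

private:
    std::vector<Neuron> neurons_;
    int max_influence_;
};
```

On the chip, all neurons evaluate their distances in parallel and the winner is resolved in hardware, which is where the speedup over this sequential loop comes from.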
My responsibility was to test the hand in terms of accuracy and response time in different test environments so that the whole tool could be used in real-time situations.
To achieve the desired goals, I contributed to development by testing C++ libraries, building the feature extraction process, and integrating the neural net chip, working closely with the core development team.
The outcome of this development was a 5% increase in gesture classification accuracy, along with a significant reduction in response time.