Samuel Hosovsky

Think About Moving Your Hand — The AI-Powered Brain Chip Will Do The Rest

Artificial neural networks, meet your biological relatives: literally (2/5)



Decoding upper extremity movement is a focal point within the Motor BCI community. This focus stems from the high priority that individuals with paralysis place on the restoration of hand and arm functions, as identified by Collinger et al. (2013) and others. Over the past few decades, there has been notable progress in decoding discrete hand movements, poses, gestures, continuous trajectories, and even force, as comprehensively reviewed by Berezutskaya et al. (2022).


At a high level, traditional approaches typically decode neural activity into either discrete movement classes or continuous movements.


Discrete Movements


With discrete classes, the decoder is trained to associate particular activity with particular movement conditions. These may be different types of grasps, reaching a predefined target, preparing to move vs. moving, and so on.


Once a catalog of such associations is built from repeated and precisely timed trials, a low-dimensional map from activity patterns to movements can be established with popular techniques: Principal Component Analysis (“PCA”), which finds the dimensions capturing the most variance without using class labels; Linear Discriminant Analysis (“LDA”), which finds the dimensions that best linearly separate the labeled classes; or K-Nearest Neighbors (“KNN”), which draws a more nuanced (non-parametric) boundary between classes by assigning new neural activity to the majority class among its k most similar recorded activities.
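To make the KNN idea concrete, here is a minimal sketch in plain Python. The firing rates, unit count, and class names are made up for illustration, and the helper `knn_classify` is a hypothetical name, not code from any of the cited studies.

```python
import math
from collections import Counter

def knn_classify(train_x, train_y, query, k=3):
    """Assign `query` to the majority label among its k nearest training trials."""
    # Euclidean distance between the query and every labeled trial
    dists = sorted((math.dist(x, query), label) for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy, made-up firing rates (Hz) of three hypothetical units per trial
train_x = [(20, 5, 3), (22, 6, 2), (19, 4, 4),      # "power" grasp trials
           (4, 18, 9), (5, 21, 8), (6, 19, 10)]     # "pinch" grasp trials
train_y = ["power", "power", "power", "pinch", "pinch", "pinch"]

prediction = knn_classify(train_x, train_y, (21, 5, 3), k=3)
```

In practice the distance would usually be computed in a reduced space (for example after PCA or LDA) rather than on raw rates, but the voting logic stays the same.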


Several studies have demonstrated the decoding of specific gestures, grips, and poses from intracranial neural data, showcasing the potential of Motor BCIs in capturing and interpreting large inventories of complex hand movements through discrete classification.


A landmark study by Aflalo et al. (2015) used LDA to demonstrate how informative the posterior parietal cortex (PPC) is for motor control. They showed that PPC encodes the reach target location before any movement is generated. When the User was presented with one of six 2D targets, the first 190 ms of activity already encoded the planned reach, well before the User was asked to attempt it, and the target could be decoded with above 90% accuracy.


Using KNN, Klaes et al. (2015) decoded a User’s imagined hand shapes from the Rock-Paper-Scissors-Lizard-Spock game. Linear classification techniques have also been applied: Chestek et al. (2013) classified hand postures using ECoG signals in the gamma frequency band over human sensorimotor areas, and Shah et al. (2023b) classified a wide range of hand poses, including the sign language alphabet (Fig. 16).


Figure 16; Shah et al., 2023b: Attempted multi-finger movements are well-represented in the neural activity. “(A) Neural activity was recorded as T5 attempted 38 hand movements on the right hand, consisting of gestures from the American Sign Language (ASL) alphabet & single finger movements. Each movement was attempted 22 times in trials consisting of 1 second prep, 1 second move and 2 second hold periods. (B) A confusion matrix, where the (i, j)th entry is colored by the percentage of trials where movement j was decoded when movement i was cued. A linear support vector classifier was used for decoding. Classification accuracy was 76%, substantially above the chance accuracy of 2.6%.”
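The chance level and the confusion matrix described in the caption can be reproduced with a few lines of Python. This is only a sketch: the three-class labels below are invented, and `confusion_matrix` and `accuracy` are illustrative helpers rather than the authors' analysis code.

```python
def confusion_matrix(cued, decoded, n_classes):
    """Entry [i][j] is the percentage of trials cued as i that were decoded as j."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for i, j in zip(cued, decoded):
        counts[i][j] += 1
    rows = []
    for row in counts:
        total = sum(row)
        rows.append([100.0 * c / total if total else 0.0 for c in row])
    return rows

def accuracy(cued, decoded):
    """Fraction of trials where the decoded class matches the cued class."""
    return sum(c == d for c, d in zip(cued, decoded)) / len(cued)

# Chance level for 38 equally likely movements, as in the caption
chance = 1 / 38            # about 0.026, i.e. 2.6%

# Toy labels for a made-up 3-class problem
cued    = [0, 0, 1, 1, 2, 2]
decoded = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(cued, decoded, 3)
acc = accuracy(cued, decoded)
```

A perfect decoder would put 100 on every diagonal entry; off-diagonal mass shows which movements are confused with each other.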



For datasets categorized into discrete movement conditions, such as various grip types, traditional dimensionality reduction methods like LDA or PCA are reliable in deriving a low-dimensional representation of latent activities that can differentiate these conditions. However, their inattention to the temporal dynamics of neural activity limits their applicability in analyzing continuous behavioral movements.


Additionally, to facilitate real-world grasping, these discrete linear decoders would be required to not only initiate a grasp but also issue a finger splay command to release it, i.e. maintaining valid state transitions similar to existing prosthetic systems.
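One simple way to enforce valid state transitions is a small finite-state machine that sits between the classifier and the prosthesis. The sketch below assumes two hand states and three decoded classes ("rest", "grasp", "splay"); all of these names are hypothetical and chosen only for illustration.

```python
# Allowed transitions for a hypothetical grasp controller:
# (current hand state, decoded class) -> next hand state
TRANSITIONS = {
    ("open", "grasp"): "closed",
    ("closed", "splay"): "open",
}

def step(state, decoded_class):
    """Return the next hand state, ignoring commands that are invalid in `state`."""
    return TRANSITIONS.get((state, decoded_class), state)

state = "open"
history = []
for decoded in ["rest", "grasp", "grasp", "splay", "splay"]:
    state = step(state, decoded)
    history.append(state)
```

Here a repeated "grasp" while the hand is already closed, or a "splay" while it is already open, leaves the state unchanged, so a misclassification cannot trigger an impossible transition.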


Continuous Movements


Predicting the future path of movement based on noisy neural data can be challenging.


The optimal linear estimation (“OLE”) method simply finds the best linear relationship between the measured signals (such as neural activity) and the desired outcome (such as an object’s movement). However, it does not consider the smoothness or dynamics of the movement itself. Even so, OLE-based decoders already perform well, relating population activity to velocity-related variables in as many as 10 kinematic dimensions. After only minutes of training, Wodlinger et al. (2014) used OLE with ridge regression to decode a User’s attempts at moving their hand through 3D space while simultaneously rotating their wrist and pinching, scooping, or grasping to manipulate objects.
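As a rough illustration of a linear decoder fit with ridge regression, the sketch below solves the regularized normal equations (XᵀX + λI) w = Xᵀy in plain Python. The two-unit data set, the λ value, and the helper names `solve` and `ridge_fit` are assumptions made for this example; it has no intercept term and is not the pipeline used by Wodlinger et al.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ridge_fit(X, y, lam=1e-3):
    """Weights w minimizing ||X w - y||^2 + lam * ||w||^2."""
    d = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0) for j in range(d)]
           for i in range(d)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(d)]
    return solve(xtx, xty)

# Made-up calibration data: velocity = 0.5 * rate_1 - 0.25 * rate_2
X = [[10, 4], [8, 12], [2, 6], [14, 2], [6, 6]]
y = [0.5 * a - 0.25 * b for a, b in X]

w = ridge_fit(X, y)
decoded = sum(wi * xi for wi, xi in zip(w, [12, 8]))   # decode a new sample
```

Each decoded sample depends only on the current neural input, which is why a decoder of this kind ignores the smoothness and dynamics of the movement.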

(To play, see footnote link) Video 2; Wodlinger et al., 2014: Movie 8: 10D Action Research Arm Test testing.


Adding another layer to OLE, the Wiener filter not only finds the best linear relationship but also takes into account the statistical properties of the noise, minimizing its effects. It led to some of the earliest Motor BCI achievements, decoding movement variables such as hand position, velocity, gripping force, and even muscle activation (Carmena et al., 2003).


Finally, the Kalman filter also models the dynamics of the movement: it recursively combines a prediction based on past states with the current noisy measurements to estimate the most likely state of the system. It is particularly powerful because it adapts the degree of smoothing to match the dynamics of the movement being decoded, optimizing its performance for real-world applications.
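The predict-and-update recursion can be sketched with a scalar Kalman filter. The dynamics coefficient `a`, the noise variances `q` and `r`, and the observations below are illustrative assumptions; a real decoder would use vector states (for example position and velocity) and parameters fit from calibration data.

```python
def kalman_1d(observations, a=0.9, q=0.05, h=1.0, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter.

    State model:       x_t = a * x_{t-1} + process noise (variance q)
    Observation model: z_t = h * x_t     + measurement noise (variance r)
    """
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: propagate the previous estimate through the dynamics
        x_pred = a * x
        p_pred = a * p * a + q
        # Update: blend prediction and measurement using the Kalman gain
        k = p_pred * h / (h * p_pred * h + r)
        x = x_pred + k * (z - h * x_pred)
        p = (1 - k * h) * p_pred
        estimates.append(x)
    return estimates

noisy = [1.2, 0.7, 1.4, 0.9, 1.1, 0.6, 1.3]   # made-up noisy velocity readings
smoothed = kalman_1d(noisy)
```

The gain `k` is what sets the degree of smoothing: when the measurement noise `r` is large relative to the predicted uncertainty, the filter trusts its dynamics more and the output changes more slowly.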


Its rapid calibration (Brandman et al., 2018) and favorable decoding performance made the Kalman filter a popular linear decoding method (Pistohl et al., 2007; Gilja et al., 2015; Brandman et al., 2018). It has powered the reanimation of a User’s hand through implanted Functional Electrical Stimulation (“FES”) (Ajiboye et al., 2017), as well as cursor control for communication and computer access (Kim et al., 2008; Pandarinath et al., 2017; Nuyujukian et al., 2018).


(To play, see footnote link) Video 3; Ajiboye et al., 2017: Multi-Joint Movements


(To play, see footnote link) Video 4; Nuyujukian et al., 2018: Tablet Cursor Control


Looking beyond Brodmann areas 4 and 6 (BA4 and BA6), Aflalo et al. (2015) showed that PPC encodes not only discrete movement goals but also movement trajectories. They decoded these trajectories into velocities by regressing over the population activity, which showed significant directional tuning.



(To play, see footnote link) Video 5; Aflalo et al. (2015): Closed-loop decoding of attempted movement trajectories from PPC — “The point-to-point task provided a simple environment that allowed EGS to test his ability to spatially position a prosthetic effector. Under free gaze, targets were presented one at a time on the LCD display. During open-loop decoder calibration sessions, following a 250 ms delay relative to target onset, the effector would move automatically to the cued target with an approximately bell-shaped velocity profile. In the closed-loop condition, EGS guided the neurally controlled effector to the target. A trial was considered successful if EGS was able to maintain the position of the effector over the target for 1s. During some sessions, a simple cursor, rendered on the display, was used as the effector. During other sessions, EGS used the MPL, constrained to move in a 2D plane, to point to the targets that appeared on the monitor. Sometimes both the MPL and cursor were shown simultaneously.”



From Neural to Kinematic Manifolds


Motor representations exist in a low-dimensional space that can be identified in a topologically if not geometrically consistent way. That is the expectation for neural representations within the brain.


On the outside, human movement has many degrees of freedom (“DoFs”) (although far fewer than those of even a small population of neurons). Each human hand has 27 DoFs, and each arm has six major DoFs (and several more if the articulated joints of the upper back are included). Thus, the upper extremities alone have at least 2 × (27 + 6) = 66 degrees of freedom. Fortunately, much like neural activity, human kinematics are full of correlations.


For everyday movements, the effective DoFs are approximately 6 to 10 (Gracia-Ibáñez et al., 2020; Ingram et al., 2008; Sîmpetru et al., 2023; Liu et al., 2016). For instance, Ngeo et al. (2014) found that 8 DoFs were sufficient for characterizing most variability in kinematics for a variety of everyday tasks. They did note, however, that a wider range of activities would require a larger number of DoFs.


The aforementioned dimensionality reduction techniques can reveal stereotyped movements known as kinematic synergies (Gracia-Ibáñez et al., 2020). Linear combinations of kinematic synergies can be used to represent the most useful movements, grips, and gestures.



Figure 17; Gracia-Ibáñez et al., 2020: Examples of hand movement synergies extracted using PCA. Reprinted from Figure 2.
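A kinematic synergy of this kind can be sketched as the first principal component of recorded joint angles. The example below uses plain Python with power iteration on the covariance matrix; the three-joint angle data are invented, and `first_synergy` is a hypothetical helper, not the method of any cited paper.

```python
def first_synergy(poses, iters=200):
    """Return (mean pose, first principal component) of joint-angle recordings."""
    n, d = len(poses), len(poses[0])
    mean = [sum(p[j] for p in poses) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in poses]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d                      # power iteration start vector
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v

# Made-up flexion angles (degrees) of three finger joints that mostly move together
poses = [[10, 12, 9], [30, 33, 28], [50, 52, 49], [70, 74, 68], [90, 91, 88]]
mean, synergy = first_synergy(poses)

# A pose is approximated as mean + activation * synergy (one number instead of three)
activation = sum((a - m) * s for a, m, s in zip(poses[0], mean, synergy))
approx = [m + activation * s for m, s in zip(mean, synergy)]
```

Because the joints are strongly correlated in this toy data, a single activation value reconstructs each pose closely, which is the sense in which effective DoFs can be far fewer than anatomical DoFs.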



For instance, studies by Nakanishi et al. (2013) and Wessberg et al. (2000) validate the feasibility of decoding arm and hand movements for prosthetic control in a simplified dimensional space.


Further, Ngeo et al. (2014) demonstrated the estimation of continuous multi-DoF finger joint kinematics using a multi-output Gaussian Process, indicating the feasibility of capturing complex hand movements through simplified models. Moreover, Santello et al. (2016) leveraged hand kinematic synergies to control robotic hands.


The advancements in the field of Motor BCIs are significantly bolstered by the understanding and application of inherent synergies and low-dimensional structures within neural and kinematic domains. This approach simplifies the intricate process of decoding movements such as reaching and grasping, making the development of BCIs not only more manageable but also paving the way for more effective and intuitive systems.


One of the disadvantages of low-dimensional decoding is that it is not always possible to preserve geometric relationships faithfully in kinematic manifolds. Kinematic synergies may not cover some desired movements due to their approximate nature. However, topological relationships are generally preserved in low-dimensional representations thanks to Takens’ theorem. For many applications, this is sufficient.
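Takens’ theorem concerns delay embeddings: under suitable conditions, stacking time-lagged copies of a single observed signal reconstructs the topology of the underlying dynamical system. The sketch below shows only the embedding step; the sine-wave signal and the choices of `dim` and `tau` are arbitrary assumptions for illustration.

```python
import math

def delay_embed(signal, dim=3, tau=2):
    """Stack time-delayed copies of a scalar series into `dim`-dimensional points."""
    n = len(signal) - (dim - 1) * tau
    return [tuple(signal[t + k * tau] for k in range(dim)) for t in range(n)]

# Made-up one-channel observation of an oscillating system
series = [math.sin(0.3 * t) for t in range(50)]
points = delay_embed(series, dim=3, tau=5)
```

Each point is (x(t), x(t + tau), x(t + 2·tau)); for a periodic signal like this one the points trace a closed loop, mirroring the cyclic structure of the underlying state even though distances are not preserved.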


Limits of Linearity


For many daily activities, emulating simple and stereotyped movements through Motor BCI (e.g. moving a cursor) often meets the basic needs of Users. These basic movements, such as simple grasping or reaching, are crucial for performing routine tasks and already enhance the quality of life for individuals with motor impairments.


However, the aspiration to achieve a higher level of dexterity, akin to the nuanced and fluid motions of the human hand, stands as a challenging yet critical goal in advancing Motor BCI development. This ambition necessitates exploring and implementing more advanced decoding techniques that can interpret the complex neural signals associated with intricate movements.


The advent of deep learning and non-linear decoding techniques, such as those investigated by Naufel et al. (2018) and Safaie et al. (2023), has introduced a level of robustness and adaptability not seen in traditional linear decoders. These approaches have shown promise in accommodating a wide variety of wrist movements and enhancing performance across multiple subjects, even demonstrating resilience in the face of data loss, as evidenced by Sussillo et al. (2016).


For instance, Pan et al. (2018) achieved rapid decoding of hand gestures in electrocorticography using Recurrent Neural Networks (“RNNs”), highlighting the ability to characterize temporal patterns of different gestures for improved accuracy in gesture decoding. In a head-to-head comparison of linear Wiener filters and LSTM neural networks (a type of RNN), Naufel et al. (2018) showed that LSTM decoders outperformed state-of-the-art linear decoders on every task.


Figure 18; Naufel et al., 2018: Linear vs. Non-linear decoder performance metrics for decoding wrist-muscle electromyogram data in monkeys from intracortical neural data during wrist movement tasks that involved manipulating objects with either no resistance (movement), spring resistance, or isometric resistance profiles. Reprinted from Figure 9.




The emerging paradigm of matching neural and kinematic manifolds presents several fruitful approaches for developing useful neural prostheses without perfectly reconstructing body part trajectories.


Complicating matters, the transformation of neural activity into behavior is dynamic and nonlinear: nonlinearities arise from the intrinsic dynamics of the observed motor cortical area and from input dynamics from other brain regions, which may include sensory inputs.


Dynamic models of neural population activity (“population models”) increasingly use nonlinear models to account for these nonlinearities and describe (encode) the activity in terms of a low-dimensional manifold embedded in the high-dimensional space of neural recordings. In contrast to dimensionality reduction outcomes, these descriptions also include the temporal structure of the state evolution in its low-dimensional subspace (e.g. continuous movement) and subsequently map (decode) the latent state to behavior (and/or to the other sources of nonlinearity).


This dynamic transformation can be decomposed into several interpretive steps (Fig. 19): the mapping from neural activity to the latent subspace, the state dynamics in this subspace (recursion), and finally the mappings of the state to neural activity and behavior (neural and behavior readouts). In machine learning, Riemannian manifolds are a common framework for extracting latent structures embedded in data because they allow non-linearities to be modeled naturally as part of the metric (the notion of distance) rather than as part of a structure. These manifolds have several advantages for modeling similarity structures within neural data (Yger et al., 2017).
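The decomposition into an input mapping, a recursion, and two readouts can be sketched as a tiny nonlinear state-space model. Everything in the block below is an assumption made for illustration: the 2-D latent state, the tanh nonlinearity, the linear readouts, and the hand-picked weights. It is not the model of Sani et al. (2021), only a generic instance of the same three steps.

```python
import math

def run_latent_model(neural, W_in, A, C_y, C_z):
    """Roll a 2-D latent state forward and read out neural and behavior predictions."""
    x = [0.0, 0.0]
    neural_pred, behavior_pred = [], []
    for y in neural:
        # Readouts: latent state -> predicted neural activity and predicted behavior
        neural_pred.append([sum(c * xi for c, xi in zip(row, x)) for row in C_y])
        behavior_pred.append(sum(c * xi for c, xi in zip(C_z, x)))
        # Input mapping: project the current neural sample into the latent space
        drive = [sum(w * yi for w, yi in zip(row, y)) for row in W_in]
        # Recursion: nonlinear update of the latent state
        x = [math.tanh(sum(a * xi for a, xi in zip(row, x)) + d)
             for row, d in zip(A, drive)]
    return neural_pred, behavior_pred

# Hypothetical, hand-picked parameters (a trained model would learn these)
W_in = [[0.1, 0.0, -0.1], [0.0, 0.2, 0.1]]     # 3 channels -> 2 latent dims
A    = [[0.9, -0.1], [0.1, 0.9]]               # latent dynamics
C_y  = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]    # latent -> 3 channels
C_z  = [1.0, -0.5]                             # latent -> 1-D behavior

neural = [[1.0, 0.5, 0.2], [0.8, 0.6, 0.1], [0.3, 0.9, 0.4], [0.1, 0.7, 0.6]]
neural_pred, behavior_pred = run_latent_model(neural, W_in, A, C_y, C_z)
```

Each prediction at a given time step uses only the latent state built from earlier neural samples, so the model is causal and could in principle run in real time.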


Figure 19; Sani et al., 2021: Steps involved in disentangling the neural dynamics. Reprinted from Figure 1a.



The mapping of behaviorally relevant dynamics to movement can be bolstered by a similar process of reducing the high-dimensional biomechanics of complex movement onto kinematic manifolds which was discussed in previous sections. However, like neural activity, biomechanics of movement also experience nonlinearities and temporal dynamics inaccessible to the aforementioned linear dimensionality reduction techniques.


When Portnova-Fahreeva et al. (2020) trained participants to perform hand gestures, grasps, and activities of daily living (ADL) tasks, they found that their nonlinear autoencoder network (“nAEN”) outperformed linear PCA on all tasks. It represented hand kinematics more accurately and with better separation on a non-linear manifold, accounting for 94% of input variance (versus 78% for PCA).
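A figure like "94% of input variance" is typically a variance-accounted-for (VAF) score comparing the original data with its reconstruction from the low-dimensional code. The sketch below uses one common definition, 1 − SSE/SST; the exact formula used in the study may differ, and the data and helper name are made up.

```python
def variance_accounted_for(original, reconstructed):
    """VAF = 1 - (squared reconstruction error) / (total variance around the mean)."""
    d = len(original[0])
    mean = [sum(row[j] for row in original) / len(original) for j in range(d)]
    sse = sum((o - r) ** 2
              for row_o, row_r in zip(original, reconstructed)
              for o, r in zip(row_o, row_r))
    sst = sum((o - m) ** 2 for row_o in original for o, m in zip(row_o, mean))
    return 1.0 - sse / sst

# Made-up 2-D "kinematic" samples and two extreme reconstructions
original  = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
perfect   = [row[:] for row in original]          # lossless reconstruction
mean_only = [[2.0, 3.0] for _ in original]        # always predicts the mean pose

vaf_perfect = variance_accounted_for(original, perfect)
vaf_mean = variance_accounted_for(original, mean_only)
```

A lossless reconstruction scores 1 and a reconstruction that always returns the mean pose scores 0, so the reported 94% and 78% sit between these two extremes.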


Figure 20; Portnova-Fahreeva et al., 2020: Study setup consisting of three different phases: “(A) American Sign Language (ASL) Gestures; (B) Object Grasps; (C) Activities of Daily Living (ADL) Tasks.” Reprinted from Figure 2.




Figure 21; Portnova-Fahreeva et al., 2020: “Accuracy of SoftMax regression applied to different datasets [American Sign Language (ASL) Gestures, Object Grasps, Activities of Daily Living (ADL) Tasks] across all participants. Regression was applied to original input data (green), reduced non-linear Autoencoder Network (nAEN) 2D (light blue) and reconstructed 20D (dark blue) data, as well as reduced Principal Component Analysis (PCA) 2D (light red) and reconstructed 20D (dark red) data.” Reprinted from Figure 11.



Taking the idea of prioritizing well-defined non-linear kinematic manifolds further, one could employ strategies to create a neural-kinematic manifold catalog. One such example (Fig. 21) is introduced by Agudelo-Toro et al. (2023), who created a training protocol instructing the subjects to perform progressively more complex kinematic tasks (involving more DoFs). They then mapped the subjects’ neural latent trajectories to the kinematic ones, which were expanded into hand configurations with accuracy comparable to native grasping.


Figure 21; Agudelo-Toro et al., 2023: Reprinted from Figure 1.


The endeavor to expand the capabilities of BCIs beyond basic motor functions to more refined and precise control reflects a commitment to pushing the boundaries of what these technologies can achieve, offering users a more comprehensive and empowering interaction with their environment.


 

Part 7 of a series of unedited excerpts from uCat: Transcend the Limits of Body, Time, and Space by Sam Hosovsky*, Oliver Shetler, Luke Turner, and Cai Kinnaird. First published on Feb 29th, 2024, and licensed under CC BY-NC-SA 4.0.



uCat is a community of entrepreneurs, transhumanists, techno-optimists, and many others who recognize the alignment of the technological frontiers described in this work. Join us!


*Sam was the primary author of this excerpt.


