Stratified normalization: Using additional information to improve the neural network’s performance.
A few months ago, I started researching how to classify evoked emotions from EEG recordings, and I quickly ran into one of the most challenging problems in brain imaging: the poor homogeneity of EEG activity across participants. Figure 1 illustrates the problem. The plots were obtained by first extracting features from each EEG recording (one per video and participant) and then embedding the data with the dimensionality reduction tool UMAP. On the left, colors indicate emotions; on the right, colors indicate which participant the data corresponds to. As a consequence, if we train a model to classify emotions on four of the participants and then try to infer the emotions of a fifth, the results will be very poor, if not at chance level.
To solve this issue, the most common approach is to use a calibration stage to pre-train the AI model for each new participant. However, calibration is very time-consuming, so many researchers have been looking for alternative solutions.
Although this article focuses on brain imaging methods, the same problem appears in many other fields, such as image recognition, voice recognition, or facial expression analysis.
In recent years, multiple researchers have proposed solutions to this issue. For brain imaging, most of them are based on selecting robust features. However,
(1) participant-independent models trained with those features still perform worse than participant-dependent models, and
(2) these features are task-dependent, which hinders robust solutions not only across tasks but also across fields.
Therefore, we have studied a new participant-based normalization method for training deep neural networks, named stratified normalization.
The main idea of this method is to provide additional information to the neural network to improve its performance. It is important to note that this additional information needs to be different from the information we are trying to infer. For example, in our dataset we are trying to classify emotions, so the additional information used is the session and participant associated with each sample.
Our analysis is on the task of cross-subject emotion classification from EEG signals, and results have demonstrated that networks trained with stratified normalization significantly outperformed standard training with batch normalization.
To present our paper briefly, I will focus on the proposed method and contrast it with the well-known batch normalization method. To clarify the explanation, I will first briefly describe the dataset used for the analysis. Finally, I will show our results and finish with a short conclusion.
The dataset used is the SEED dataset, a collection of EEG data provided by the BCMI laboratory, which Prof. Bao-Liang Lu leads. Together with the DEAP dataset, it is one of the two most important datasets in EEG-based affective computing.
It contains 62-channel EEG data collected from 15 participants, who each carried out three sessions over the same 15 film clips. Each film clip was assigned an emotional rating (positive, neutral, or negative) by averaging the ratings of 20 participants who were asked to choose one of the three keywords after watching it.
- Participants: 15 participants
- Sessions x Videos: 3 x 15 (45 videos for each participant)
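The dataset layout above can be sketched as a feature tensor. This is only an illustrative sketch: the feature count of 310 is an assumption (62 channels times 5 frequency bands, a common feature set for SEED) and is not stated in this article.

```python
import numpy as np

# Hypothetical SEED layout after feature extraction (illustrative only):
# 15 participants x 3 sessions x 15 film clips x n_features.
# n_features = 310 assumes 62 channels x 5 frequency bands (an assumption).
n_participants, n_sessions, n_clips, n_features = 15, 3, 15, 310

features = np.zeros((n_participants, n_sessions, n_clips, n_features))
labels = np.zeros((n_participants, n_sessions, n_clips), dtype=int)  # e.g. 0=negative, 1=neutral, 2=positive

print(features.shape)  # (15, 3, 15, 310)
print(labels.size)     # 675 labeled clips in total (15 x 3 x 15)
```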
Batch normalization was first introduced by Ioffe and Szegedy (2015) to address the problem of internal covariate shift, an unwanted drift in the distribution of neurons' activations resulting from the learning process. It is the most widely used normalization method in neural networks and has shown great performance in multiple applications.
As explained further below, we slightly adapted the method for our purposes. Figure 2 illustrates our implementation of the batch normalization method. For this method, the normalization is done per feature for each batch.
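As a minimal sketch of the per-feature, per-batch normalization just described (a NumPy illustration of the general idea, not the paper's exact implementation, which also learns scale and shift parameters):

```python
import numpy as np

def batch_normalize(x, eps=1e-5):
    """Normalize each feature (column) using the mean and std of the batch (rows)."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps)  # eps avoids division by zero

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(32, 4))  # batch of 32 samples, 4 features
x_norm = batch_normalize(x)
# After normalization, each feature has mean ~0 and std ~1 across the batch.
```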
Stratified normalization, the method we propose, consists of a feature normalization per participant and session. Figure 3 details our implementation of the stratified normalization method.
Whereas batch normalization normalizes the data per feature within each batch, stratified normalization normalizes it per feature, participant, and session. Its main drawback compared with batch normalization is that it requires either a large batch size or checking how much data there is per class in each batch.
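The grouping idea can be sketched as follows: each (participant, session) stratum is normalized independently, per feature. This is a hedged NumPy illustration with made-up group sizes, not the paper's network-internal implementation.

```python
import numpy as np

def stratified_normalize(x, participant, session, eps=1e-5):
    """Normalize each feature within each (participant, session) stratum."""
    out = np.empty_like(x, dtype=float)
    for p in np.unique(participant):
        for s in np.unique(session):
            mask = (participant == p) & (session == s)
            if not mask.any():
                continue
            grp = x[mask]
            out[mask] = (grp - grp.mean(axis=0)) / (grp.std(axis=0) + eps)
    return out

rng = np.random.default_rng(1)
x = rng.normal(size=(90, 4))                        # 90 samples, 4 features (toy data)
participant = np.repeat(np.arange(3), 30)           # 3 hypothetical participants
session = np.tile(np.repeat(np.arange(3), 10), 3)   # 3 sessions per participant
x_norm = stratified_normalize(x, participant, session)
# Each participant-session group now has per-feature mean ~0 and std ~1,
# which removes participant- and session-specific offsets before classification.
```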
Results and conclusion
Using the architecture displayed in Figure 4, the results are the ones displayed in Figures 5 and 6.
Figures 5 and 6 show the embedding of the predicted values at the neural network's output layer. At the output layer, emotion recognition accuracy is higher and participant identification accuracy is lower for models trained with stratified normalization. Indeed, the UMAP embeddings are more compact for stratified normalization than for batch normalization, and the clusters are easily recognizable by emotion rating rather than by participant number; the spread of colors in the participant plot suggests that most of the participant-specific brain signature is indeed gone.
These results indicate the high applicability of stratified normalization to cross-subject emotion recognition tasks, and suggest that the method could be applied not only to other EEG classification datasets but also to other applications that require domain adaptation.