SIViP DOI 10.1007/s11760-016-0897-2
ORIGINAL PAPER
Extraction of breathing features using MS Kinect for sleep stage detection
Aleš Procházka1,2 · Martin Schätz1,2 · Fabio Centonze3 · Jiří Kuchyňka4 · Oldřich Vyšata4 · Martin Vališ4
Received: 22 November 2015 / Revised: 11 March 2016 / Accepted: 8 April 2016 © Springer-Verlag London 2016
Abstract This paper presents the contactless measuring of breathing using the MS Kinect depth sensor and compares the results obtained with records of breathing taken by polysomnography (PSG). We explore the methods of signal denoising, resampling, and spectral analysis of acquired data as well as feature extraction and Bayesian classification of the features. The proposed methodology was applied to the analysis of long-term monitoring of individuals who were observed simultaneously by PSG and MS Kinect in the sleep laboratory. After time synchronization of polysomnographic and MS Kinect video data, features were extracted from both signals and compared. The average error of the breathing frequency evaluated by MS Kinect relative to that obtained by PSG was 3.75 %. The mean accuracy of the Bayesian classification of features into two classes (i.e. wake or sleep) was 88.90 and 88.95 % for the PSG and MS Kinect measurements, respectively. The strong likeness of features supports the hypothesis that contactless techniques may represent a valid alternative to the present approach of sleep monitoring, thereby allowing data acquisition in the home environment as well.
Keywords Polysomnography · Breathing analysis · Digital signal processing · Range imaging methods · MS Kinect · Feature extraction · Bayesian classification
Electronic supplementary material The online version of this article (doi:10.1007/s11760-016-0897-2) contains supplementary material, which is available to authorized users.
Correspondence: Aleš Procházka, [email protected]
1 Department of Computing and Control Engineering, University of Chemistry and Technology in Prague, 166 28 Prague, Czech Republic
2 Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, 166 36 Prague, Czech Republic
3 Department of Mechanical Engineering, Politecnico di Milano, Milan, Italy
4 Department of Neurology, Faculty of Medicine in Hradec Králové, Charles University in Prague, 500 05 Prague, Czech Republic
1 Introduction
Non-contact methods of breathing tracking have attracted notable research interest in recent years as an alternative to the traditional approach of studying the effects of sleep on breathing. Measuring breathing with contact methods, such as a respiratory belt, is uncomfortable for patients and, in the case of complex sleep analysis, cables from different sensors can substantially affect the quality of sleep. This paper presents how MS Kinect sensors can be used in this area.
Range imaging methods are based upon specific computational technologies that form matrices whose elements carry information about the distance of the corresponding image component from the sensor (also known as the depth map). Figure 1 presents an example of such a depth map of the chest. The record of mean distances of the chest during different sleep stages, as acquired from the depth map, is presented in Fig. 2. This paper focuses on using the depth sensor of the MS Kinect version 2 for contactless measuring of breathing [1,26,30,31].
There are already different methods available for measuring breathing that do not require contact. One of these methods involves the use of radio frequency, which behaves essentially like a radar system [29]. A common form of this system uses Doppler radar, which is based upon the frequency shift of a signal that is caused by the relative velocities of a transmitter and receiver. The transmitted signal reflected from the chest changes in frequency according to the movement of the chest [12,15,32]. Another method uses eddy currents to exploit changes in torso conductance during respiration, which result from variations in the quantities of air, blood, and fluids. This method therefore allows for the measurement of volume changes of air in the lungs.
The breathing depth differs between the separate stages of sleep and is significantly shallower during REM sleep than in the awake stage [9]. There are no significant differences between the sexes either in ventilation (corrected for body surface area) or in end-tidal concentrations during any stage of sleep. There are no significant differences in the levels of ventilation in non-REM sleep, but there can be a significant fall during REM sleep [28]. This reduction in ventilation exists because of a breathing pattern that is considerably shallower in all stages of sleep than during wakefulness. The respiratory rate and the respiratory dynamics or the variation of the breathing pattern provide useful information for diagnostics or therapy.
Fig. 1 An example of the chest depth image that is acquired by the MS Kinect, which includes (a) the whole depth map, (b) the evolution of the average distance in the selected area before (dark line) and after filtering (grey line), and (c) the selected image area
Fig. 2 Section of MS Kinect mean difference values, which includes (a) the segment of mean breathing depth map differences in three classes (Wake, Sleep, Falling asleep: FA), (b) detail of the breathing depth map record, and (c) denoised values of the breathing depth map record that detect the chest movement
Polysomnography (PSG), which is the gold standard for sleep disorders research, is commonly used in sleep laboratories [3,25]. Polysomnographic monitoring is a multiparametric method that records physiological changes occurring during sleep. This method requires a whole-night recording in a sleep laboratory and entails both cardiorespiratory and neurological measurements, which require connecting several sensors to the patient’s body to obtain sufficient data. The evaluation of polysomnographic signals that are acquired in the clinical environment must be done by an expert with the support of specific algorithmic tools [24,27]. Therefore, its use in the home environment is not feasible.
In recent years, many non-contact and non-invasive measurements for the tracking of breathing have been proposed [6,8,14,16,19,20] as an alternative to PSG data acquisition. Specifically, the use of systems based upon video imaging is promising because it avoids physically linking the patient to instrumentation and may not require that the subject be in a known position under static conditions. Recently, the inexpensive MS Kinect device, originally conceived for the video game industry, has found interest in research activities in this field. Its suitability for home use may also result in moving the recording practice to the domestic environment.
The estimation of the respiratory rate is particularly crucial [4,5], and different approaches have been used. Real-time respiratory tracking is achieved by means of a monitoring system that includes a translation surface [30]. A volume change of the thoracoabdominal part of the examinee is measured through 3D reconstruction of the thorax surface [2]. Meanwhile, in [18] an infrared dot pattern is projected, and the trajectories of the dots are tracked over a time window to estimate the movements of the subject.
The present paper addresses the topic of breathing reconstruction by means of depth images that are recorded during the sleep of the patient by MS Kinect (Fig. 1). Features of interest are extracted from video images that are acquired, and the results are compared with the features that are obtained from the corresponding synchronized PSG record, which was obtained as a reference. Breathing features that were detected for several individuals were analysed by the Bayesian classification system using the proposed criterion.
2 Data acquisition
Depth maps were acquired by the MS Kinect version 2 depth sensor and were used to follow chest movement to analyse breathing and to detect disorders. The range of this sensor is 0.5 to 4.5 m and can be extended up to 8 m, and its sampling rate is 30 frames per second. The device features a sensor that is capable of capturing depth maps using “time-of-flight” technology [13]. The resulting matrix has 512 × 424 elements wherein each cell of the matrix reports the value of the distance between the Kinect device and the target (the chest of the subject). The depth sensor of the MS Kinect version 2 is not affected by the outside light, and this type of error is not present in depth maps. However, isolated gross errors that are randomly distributed still exist, and they can substantially affect the captured scene. The method of reducing these errors included the use of the 2D median filter for depth map matrix processing. Figure 1 presents an example of the depth map that is acquired by the MS Kinect and the area that is used for breathing analysis. The evolution of the mean distance of the chest from the MS Kinect is presented in Fig. 2.
For the purpose of acquisition, the software development kit (SDK) 2.0 has been used to develop an algorithm that integrates the depth maps with specific features, such as associating a time stamp with millisecond precision to each frame. The data recordings (about 130 GB per night for one person) took place in the sleep laboratory of the University Hospital Hradec Králové (Czech Republic). In our tests, the patient was recorded for an entire night (duration of 8 h) with the Kinect device positioned by the side of the bed at a distance of approximately 1.5 m. The patient was free to move and was covered by clothes and blankets. The sleep laboratory set-up is presented in Fig. 3a. For the long-term records of the breathing, a wide area was manually selected to include the patient’s chest even during his movement over the whole night. For more detailed respiratory surface motion tracking and modelling [1], specific algorithms should be used.
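The 2D median filtering of a depth map mentioned above can be illustrated by a minimal sketch. The 3 × 3 window, the border handling, the function name `median_filter_2d`, and the toy depth values are assumptions made for illustration only; the paper does not state the window size that was used.

```python
from statistics import median

def median_filter_2d(depth, size=3):
    """Apply a size x size median filter to a depth map given as a list of rows.

    Border pixels use only the neighbours that fall inside the map
    (an assumed border rule, not taken from the paper).
    """
    rows, cols = len(depth), len(depth[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [
                depth[r][c]
                for r in range(max(0, i - half), min(rows, i + half + 1))
                for c in range(max(0, j - half), min(cols, j + half + 1))
            ]
            out[i][j] = median(window)
    return out

# Toy 5 x 5 depth map [mm] with one isolated gross error in the centre.
depth_map = [[1500] * 5 for _ in range(5)]
depth_map[2][2] = 0  # hypothetical dropout pixel
filtered = median_filter_2d(depth_map)
```

In this toy case the isolated outlier is replaced by the value of its neighbours, while the rest of the map is unchanged.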
Fig. 3 Principle of the chest movement analysis presenting (a) the sleep laboratory set-up with the depth image visualized in the range up to 2 m and (b) the block diagram of the MS Kinect that is used for data acquisition, processing, and analysis to detect and classify breathing features. Block diagram stages: (i) Examination, MS Kinect data acquisition (range mapping, noise analysis, de-noising, data fusion); (ii) Processing, video frames processing (depth frames resampling, chest area detection, mean distance evaluation); (iii) Analysis of breathing features (segmentation, frequency detection, regularity estimation, accuracy evaluation)
The results of the following methodological study are based upon 8-h-long MS Kinect and PSG records of four male patients with a diagnosis (Dg) of restless legs syndrome (R) and sleep apnoea (A) and of one healthy individual (H); details are given in the Supplementary tables.
3 Methods
Figure 3b presents the block diagram of MS Kinect data acquisition and the methodological steps of data processing that are used to find breathing features for Bayesian classification.
3.1 Depth frames resampling and denoising
To carry out the analysis of the depth maps, a regular distribution of the frames in time was needed. The sampling frequency of the MS Kinect, though, is not regular. Hence, interpolation and resampling of the depth frames were necessary. The linear interpolation method was used as it guarantees that the depth value of each pixel at each time instant lies within the range of the values of the same pixel in the neighbouring frames. For smoother results, spline interpolation was applied to reduce the sampling rate of the MS Kinect to the frequency of 10 Hz used for the PSG records.
The depth maps that are obtained from the hospital monitoring include errors that are due to sensor inaccuracy. To ensure reliable results, a denoising process on the images was necessary. The wavelet transform was chosen as a tool for the denoising of individual depth frames. The fundamental idea behind the wavelet techniques is to find the low-pass and high-pass image components for their multiscale decomposition, thresholding, and reconstruction. A wide range of the literature [11,23] attests that this method is effective in removing the noise from images because it captures the energy of the image in a restricted number of values and distinguishes the information that is related to its important components from that related to noise only. As proposed in [7,11], the use of the Haar wavelet is sufficient for processing of depth images owing to its simplicity and similar results in comparison with other mother wavelets. Image decomposition into the second level and global thresholding was used in this case. The characteristics of the noise in the acquired images make the traditional thresholding techniques ineffective. Therefore, the denoising of the depth maps was implemented in such a way that only the second-level approximation image is maintained, while all coefficients of the detail images are cancelled as they contain only a minimal part of the image energy.
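The two steps of this section can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the resampling is shown for a single scalar series rather than for every pixel of a frame, and the image size is assumed to be a multiple of 4 in both directions (which holds for 512 × 424 depth maps). Keeping only the level-2 Haar approximation and cancelling all detail coefficients is equivalent to replacing each 4 × 4 block by its mean.

```python
from bisect import bisect_right

def resample_linear(times, values, rate_hz=10.0):
    """Linearly interpolate irregularly sampled values onto a regular grid."""
    step = 1.0 / rate_hz
    n = int((times[-1] - times[0]) / step + 1e-9) + 1
    grid = [times[0] + k * step for k in range(n)]
    out = []
    for t in grid:
        # Index of the first original sample after t, clamped to a valid pair.
        hi = min(max(bisect_right(times, t), 1), len(times) - 1)
        lo = hi - 1
        w = (t - times[lo]) / (times[hi] - times[lo])
        out.append((1.0 - w) * values[lo] + w * values[hi])
    return grid, out

def haar_approx(img):
    """One-level orthonormal 2D Haar approximation: sum of each 2 x 2 block / 2."""
    return [
        [(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 2.0
         for j in range(0, len(img[0]), 2)]
        for i in range(0, len(img), 2)
    ]

def haar_synth_from_approx(approx):
    """Inverse one-level 2D Haar step with all detail coefficients set to zero."""
    out = []
    for row in approx:
        wide = [v / 2.0 for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def denoise_level2(img):
    """Keep only the level-2 Haar approximation of the image."""
    a2 = haar_approx(haar_approx(img))
    return haar_synth_from_approx(haar_synth_from_approx(a2))

# Hypothetical irregular time stamps [s] and a linear test signal.
times = [0.0, 0.03, 0.07, 0.1, 0.2]
values = [2.0 * t for t in times]
grid, resampled = resample_linear(times, values)

# Toy 4 x 4 frame: the result is a constant image equal to the block mean.
frame = [[float(4 * r + c) for c in range(4)] for r in range(4)]
smooth = denoise_level2(frame)
```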
3.2 Depth maps processing
The sequence of depth maps that were acquired by the MS Kinect was analysed over the selected rectangular area $R_1 \times R_2$ of $M$ rows and $N$ columns that cover the chest of the individual. For each matrix $D_{t_k}(i,j)$ and indices $i \in R_1$ and $j \in R_2$ of the area of interest that were acquired at discrete time instances $t_k$, the mean of the depth values over the selected area of $M$ rows and $N$ columns was evaluated by relation

$$y(t_k) = \frac{1}{MN} \sum_{i \in R_1} \sum_{j \in R_2} D_{t_k}(i,j) \qquad (1)$$

The resulting sequence of values $y(t_k)$, $k = 1, 2, \ldots$ indicates the information about the time evolution of breathing that can be detected from this signal. As the MS Kinect depth sensor is not precise enough and its error increases with the increasing distance of the measured object, the output of any depth map processing will be affected by noise components. Typical respiratory rate for a healthy adult at rest is 12–20 breaths per minute (bpm) [15], which corresponds to 0.2–0.33 Hz. This can be used for the selection of the cut-off frequency of the low-pass filter, which is used to filter out noise and to preserve the slow periodic signal of breathing. We used a Savitzky–Golay filter of the second order and span equal to 29, as it is able to preserve steep changes in the signal while smoothing its fast fluctuations. The cut-off frequency of 0.7 Hz (42 bpm) for the FIR filter of order 50 was chosen.
A visually pleasant reconstruction of the patient’s breathing is not strictly necessary for the purpose of extracting the examined features; however, it is important to obtain a reconstruction that enables the clinical recognition of irregular behaviours that are possibly detectable as sleep apnoeas. To do this, two operations have been performed. First, the signal has been smoothed to eliminate the high frequency noise. Then, the low frequency components have been rejected to obtain a signal without irrelevant slow fluctuations.
Synchronization of the MS Kinect and the PSG data set was achieved by computing the cross-correlation and identifying its maximum to detect the lag between the signals. Figure 4 compares two selected synchronized segments in the frequency domain and it presents frequency components for PSG and MS Kinect records in the given range. Coefficients of the smoothing polynomial $g(x) = \sum_{p=0}^{P-1} c_p x^p$ were evaluated in both cases by the least square method to minimize summed squared differences

$$S(c_0, c_1, \ldots, c_{P-1}) = \sum_{k=1}^{K} \left( g(f_k) - s(f_k) \right)^2 \qquad (2)$$

between values of the smoothing function $g(f_k)$ and $K$ values of spectral components $s(f_k)$ for frequencies $f_k$.
Fig. 4 Spectra of two synchronized segments with a duration of 200 s from (a) the PSG record (airflow channel) and (b) the reconstruction using MS Kinect (axes: frequency [Hz], amplitude spectrum; curves: spectrum and its smoothing)
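The synchronization step described in Sect. 3.2, in which the lag between the two records is found at the maximum of their cross-correlation, can be sketched as follows. The function name `xcorr_lag`, the restriction of the search to `max_lag` samples, and the synthetic sine signals are assumptions made for illustration; real records would be resampled to a common rate first.

```python
import math

def xcorr_lag(a, b, max_lag):
    """Return the lag (in samples) of b relative to a that maximizes the
    cross-correlation of the mean-removed signals.

    A positive result means that b is delayed: b[n + lag] best matches a[n].
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    a0 = [v - mean_a for v in a]
    b0 = [v - mean_b for v in b]
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for n, av in enumerate(a0):
            m = n + lag
            if 0 <= m < len(b0):
                acc += av * b0[m]
        if acc > best_val:
            best_lag, best_val = lag, acc
    return best_lag

# Synthetic example: 60 s of 0.25 Hz "breathing" at 10 Hz; the second
# signal is the first one delayed by 7 samples (0.7 s).
rate_hz = 10.0
kinect = [math.sin(2 * math.pi * 0.25 * n / rate_hz) for n in range(600)]
psg = [math.sin(2 * math.pi * 0.25 * (n - 7) / rate_hz) for n in range(600)]
lag = xcorr_lag(kinect, psg, max_lag=15)
```

For a periodic signal the search range should stay below half of the breathing period, otherwise several equally good maxima can occur; the paper addresses this by selecting intervals of regular breathing with a clearly distinguishable correlation peak.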
3.3 Feature extraction and data fusion
The recorded data were analysed using a custom algorithm. First, the area of interest, which corresponded to the patient’s chest, was manually selected. The depth values of the pixels in the region of interest were then averaged and stored to create a time series representing the time evolution of the chest movement. Two features of breathing were considered:
Frequency The spectrum of the signal that was obtained by processing the depth frames was evaluated in the range of frequencies of interest for breathing (from 12 to 20 bpm). Additionally, a smoothing function of the spectrum was evaluated in order to address the case when a single frequency peak was not identifiable.
Regularity It has been shown that the amplitude of the breathing effort signal is more regular during non-REM sleep than during REM sleep [17]. The regularity of the amplitude is quantified by the standard deviation of the peak-to-trough amplitudes. The standard deviation that was extracted from the reconstruction through MS Kinect is not directly comparable with that which was extracted from the PSG signal because the quantities that were measured were different. In order to obtain comparable numerical values, we proposed a new measure, the regularity indicator (RI), which is defined as

$$RI(i) = \frac{\mathrm{std}(d_i)}{\mathrm{mean}(d_i)} \qquad (3)$$

where $d_i$ is the vector of differences between consecutive local maxima and minima in the $i$th segment. The distribution of frequency and regularity features for classification of Wake and REM sleep stages into two classes is presented in Fig. 5.
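The two features can be illustrated by the following sketch. It is an assumption-laden illustration rather than the authors' algorithm: the dominant frequency is taken as the largest discrete Fourier transform magnitude inside the 12–20 bpm band (without the spectrum smoothing), local extrema are detected by a simple sign change of consecutive differences, and the sample standard deviation is used in Eq. (3). All function names and test signals are hypothetical.

```python
import math
from statistics import mean, stdev

def dominant_frequency_bpm(signal, rate_hz, lo_bpm=12.0, hi_bpm=20.0):
    """Return the frequency [bpm] of the largest DFT magnitude in the band."""
    n = len(signal)
    m = sum(signal) / n
    best_f, best_mag = None, -1.0
    for k in range(1, n // 2 + 1):
        f_bpm = 60.0 * k * rate_hz / n
        if f_bpm < lo_bpm or f_bpm > hi_bpm:
            continue
        re = sum((x - m) * math.cos(2 * math.pi * k * i / n)
                 for i, x in enumerate(signal))
        im = sum((x - m) * math.sin(2 * math.pi * k * i / n)
                 for i, x in enumerate(signal))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_f, best_mag = f_bpm, mag
    return best_f

def regularity_indicator(signal):
    """RI = std(d) / mean(d), where d holds the absolute differences between
    consecutive local extrema (peak-to-trough amplitudes)."""
    extrema = [
        signal[i] for i in range(1, len(signal) - 1)
        if (signal[i] - signal[i - 1]) * (signal[i + 1] - signal[i]) < 0
    ]
    d = [abs(b - a) for a, b in zip(extrema, extrema[1:])]
    return stdev(d) / mean(d)

# Synthetic 200-s segments sampled at 10 Hz with a 0.25 Hz (15 bpm) rhythm.
rate_hz = 10.0
t = [n / rate_hz for n in range(2000)]
regular = [math.sin(2 * math.pi * 0.25 * x) for x in t]
irregular = [(1.0 + 0.5 * math.sin(2 * math.pi * 0.02 * x))
             * math.sin(2 * math.pi * 0.25 * x) for x in t]

freq_bpm = dominant_frequency_bpm(regular, rate_hz)
ri_regular = regularity_indicator(regular)
ri_irregular = regularity_indicator(irregular)
```

Because RI divides the spread of the amplitudes by their mean, it is dimensionless, which is what makes the values from the depth record and from the PSG channel comparable.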
Fig. 5 Distribution of regularity (RI) and frequency (F) features of Wake and REM sleep stages that were extracted from 200-s-long segments of five night PSG records of the thoracic effort channel of selected patients (axes: breathing frequency [bpm], regularity indicator; marginal histograms: number of segments)
3.4 Bayesian classification
Classification of sleep features that were associated with time segments of the given length can be performed by the Bayesian classification. We used a matrix $X_{N,R}$ of feature vectors $x_i$ in each row to describe $R$ attributes (mean frequency, regularity indicator, …) for each separate segment $i = 1, 2, \ldots, N$. We then further defined the associated column vector $y_{N,1}$ that specifies the class $c_k$, $k = 1, 2, \ldots, M$ of each segment from the given set of $M$ classes. The set of these classes can specify $c_1$ – wake, $c_2$ – non-REM or $c_3$ – REM stages. During the evaluation process, a function that transforms the space of features $X_{N,R}$ into the vector of classes $y_{N,1}$ was estimated. The probabilistic classification estimates the class $\hat{c}_k$ of the unknown instance $x$:

$$\hat{c}_k = \max_{c_1, c_2, \ldots, c_M} \left( P(c_k \mid x) \right) \qquad (4)$$

The Bayesian probability [22] of an instance $x$ being in class $c_k$, $k = 1, 2, \ldots, M$ is given by

$$P(c_k \mid x) = \frac{P(x \mid c_k)\, P(c_k)}{P(x)} \qquad (5)$$

where $P(x \mid c_k)$ stands for the probability of generating instance $x$ given class $c_k$, $P(c_k)$ represents the probability of the occurrence of class $c_k$, and $P(x)$ stands for the probability of the occurrence of $x$. Assuming the Gaussian distribution, it is possible to find for each class $c_k$ the mean $\mu_{c_k,x_j}$ and variance $\sigma^2_{c_k,x_j}$ of each of its attributes $x_j$. The distribution of each attribute $x_j$ and each class $c_k$ is then defined by

$$p(x_j \mid c_k) = \frac{1}{\sqrt{2\pi \sigma^2_{c_k,x_j}}} \exp\left( - \frac{(x_j - \mu_{c_k,x_j})^2}{2 \sigma^2_{c_k,x_j}} \right) \qquad (6)$$

Then, it is possible to find how likely it is to see observation $x$ for class $c_k$ by the relation

$$P(x \mid c_k) = \prod_{j=1}^{R} P(x_j \mid c_k) \qquad (7)$$

where $R$ is the number of features. Evaluating further the probability of the occurrence of class $c_k$ as

$$P(c_k) = N_{c_k} / N \qquad (8)$$

where $N_{c_k}$ stands for the number of individuals belonging to class $c_k$ and

$$P(x) = \sum_{k=1}^{M} P(x \mid c_k)\, P(c_k) \qquad (9)$$

it is possible to evaluate the Bayesian probability of a class $c_k$ by substituting Eqs. (7), (8), and (9) into Eq. (5).
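The chain of Eqs. (4)–(9) corresponds to a Gaussian naive Bayes classifier, which can be sketched as follows. The function names, the use of the population variance, and in particular the toy feature values (breathing frequency in bpm, regularity indicator) are assumptions made for illustration; they are not data from the study.

```python
import math
from collections import defaultdict

def fit(X, y):
    """Estimate the prior (Eq. 8) and the per-feature mean and variance
    (parameters of Eq. 6) for every class."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)
    model = {}
    for label, rows in groups.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [
            sum((v - m) ** 2 for v in col) / n  # population variance (assumed)
            for col, m in zip(zip(*rows), means)
        ]
        model[label] = (n / len(X), means, variances)
    return model

def posterior(model, x):
    """Return P(c_k | x) for every class using Eqs. (5), (6), (7), and (9)."""
    joint = {}
    for label, (prior, means, variances) in model.items():
        likelihood = 1.0
        for xj, m, v in zip(x, means, variances):
            likelihood *= (math.exp(-(xj - m) ** 2 / (2 * v))
                           / math.sqrt(2 * math.pi * v))
        joint[label] = likelihood * prior
    evidence = sum(joint.values())  # P(x), Eq. (9)
    return {label: p / evidence for label, p in joint.items()}

def classify(model, x):
    """Return the class with the largest posterior probability (Eq. 4)."""
    post = posterior(model, x)
    return max(post, key=post.get)

# Hypothetical training segments: [breathing frequency in bpm, RI].
X = [[17.0, 2.0], [18.0, 2.5], [16.5, 1.8], [17.5, 2.2],
     [14.0, 0.5], [14.5, 0.7], [13.5, 0.4], [14.2, 0.6]]
y = ["wake"] * 4 + ["sleep"] * 4
model = fit(X, y)
label_a = classify(model, [17.2, 2.1])
label_b = classify(model, [14.1, 0.55])
```

For long feature vectors the product in Eq. (7) is usually evaluated as a sum of logarithms to avoid numerical underflow; with two features, as here, the direct product is sufficient.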
4 Results
Exploiting the time stamps that were associated with each depth image, the irregularity in the time distribution of the images was detected. To prevent potential mistakes in the analysis caused by this irregular behaviour, the frames were linearly interpolated and resampled at 10 Hz, which was consistent with the frequency of the PSG record.
Time synchronization of data that were recorded by PSG and MS Kinect was essential to compare breathing features in corresponding segments. By selecting intervals of regular breathing in the MS Kinect reconstruction, we were able to detect a clearly distinguishable peak in the correlation between the data sets. Spectral components that were evaluated in corresponding segments of both signals were then obtained. Figure 4 presents a substantial likeness in the selected case.
The features were extracted from segments that were 200 s long from an entire night of monitoring (approximately 8 h) through both the PSG and the Kinect record. Each segment was also associated with the corresponding sleep stage that was assigned by an expert (Wake / non-REM / REM). In Fig. 6, the obtained features are visualized in a plane for a selected individual. We used this to visualize how segments from different sleep stages tend to belong to different areas of the plane and how this behaviour is detectable from the PSG signal as well as from the Kinect reconstruction.
Fig. 6 Visualization of the features that were extracted from 200-s-long segments from an entire night record (8 h), indicating the sleep stage of each segment for (a) features from the PSG record (thoracic effort channel) and (b) features from the reconstruction using MS Kinect (axes: breathing frequency [bpm], regularity indicator)
Figure 7a presents distribution probabilities of segment features that belong to three classes (Wake, non-REM, REM), as evaluated using a polysomnographic record (thoracic effort channel) and a selected individual. The results for the MS Kinect observations are presented in Fig. 7b. The following Bayesian method, as shown in Fig. 7c, d, allowed for a classification accuracy of 79.6 and 70.0 % for the PSG and MS Kinect observations, respectively. The mean classification accuracy [21] for four individuals is 74.98 and 66.32 % for the PSG and MS Kinect observations, respectively.
Fig. 7 Distribution probabilities of segment features belonging to three classes (Wake, non-REM, REM) evaluated from (a) polysomnographic records (thoracic effort channel) and (b) MS Kinect with (c, d) corresponding Bayesian boundaries of separate classes
Figure 8 shows the results of the Bayesian classification into two classes only (Wake / Sleep) using PSG and MS Kinect data. The classification accuracy for the selected individual was 94.1 and 91.9 % for the PSG and MS Kinect observations, respectively (Fig. 8c, d). The mean classification accuracy for four individuals was 88.90 and 88.95 % for the PSG and MS Kinect observations, respectively.
Fig. 8 Distribution of the probabilities of segment features belonging to two classes (Wake, Sleep), as evaluated by (a) polysomnographic records (thoracic effort channel) and (b) MS Kinect with (c, d) the corresponding Bayesian boundaries of separate classes
Figure 9 compares the accuracy and cross-validation (by the leave-one-out method) of different methods (Bayesian, decision tree and k-nearest neighbours) for classification of features detected by PSG and MS Kinect. Bayesian methods provide results similar to those of the other methods, but they allow a simple graphical interpretation.
Fig. 9 Comparison of accuracy and cross-validation for classification by different methods (Bayesian, decision tree and k-nearest neighbours) for PSG and MS Kinect records of selected individuals and classification into two classes (legend: Bayes, DecTree, 3NN, 5NN, 7NN; horizontal axis: date)
The comparison of mean breathing features that were evaluated from polysomnographic and MS Kinect records for the wake and sleep stages is presented in Fig. 10. The mean error of 3.75 % in the mean frequency evaluated by PSG and MS Kinect indicates similarity between these two methods. The regularity indicator that is presented in Fig. 10b allows for better separation between the wake stage (with a larger range of the chest movement) and sleep (with shallow breathing).
Fig. 10 Comparison of mean breathing features, as evaluated (a) from polysomnographic and (b) MS Kinect records for the wake and sleep stages (axes: breathing frequency [bpm], regularity indicator)
The distribution of mean breathing features, as evaluated from polysomnographic and MS Kinect records for selected individuals, is presented in Fig. 11.
Fig. 11 Distribution of mean breathing features, as evaluated (a) from polysomnographic and (b) MS Kinect records for selected individuals (axes: breathing frequency [bpm], regularity indicator)
The results obtained show the possibility of measuring breathing with the MS Kinect depth sensor as an alternative to other contact methods. Processing of video sequences provides breathing features similar to those that were obtained from the PSG. The paper presents the possibility of the long-term monitoring of sleep by MS Kinect and global detection of breathing that can replace PSG methods. The depth sensor moreover allows detailed respiratory motion tracking [1,31] and chest surface modelling.
5 Conclusion
In this paper, we presented a vision-based method for reconstructing the breathing of sleeping subjects. The system uses depth maps to visualize the breathing behaviour and to extract frequency and regularity features. The validity of the proposed method was examined by comparing the extracted features to the accepted reference (PSG). The approach is non-invasive [10] and makes use of inexpensive instrumentation. The methodology proposed consists of denoising images with the 2D median filter, computing the difference of chest position using the difference of depth maps, and post-processing using an FIR filter for final denoising.
The respiratory rate can be estimated with considerable accuracy when the breathing is regular (non-REM sleep). When the breathing behaviour is not regular, a dominant frequency is hardly identifiable, and the regularity feature is more significant. The results include evaluation of the mean error of 3.75 % in the mean frequency evaluated by PSG and MS Kinect. The mean accuracy for Bayesian classification of features into two classes was 88.90 and 88.95 % for the PSG and MS Kinect observations, respectively.
Future research will include an analysis of methods allowing better classification accuracy and detection of breathing disorders in the home environment. To make the whole evaluation more efficient, distributed processing methods for big data analysis will be applied as well.
Acknowledgments All measured data have been recorded in the University Hospital Hradec Králové.
References
1. Alnowami, M., Alnwaimi, B., Tahavori, F., Copland, M., Wells, K.: A quantitative assessment of using the Kinect for Xbox360 for respiratory surface motion tracking. In: Proc. of SPIE 8316, Medical Imaging, pp. 1–10 (2012)
2. Aoki, H., Miyazaki, M., Nakamura, H., Furukawa, R., Sagawa, R., Kawasaki, H.: Non-contact respiration measurement using structured light 3-D sensor. In: Proc. of SICE Annual Conf., pp. 614–618 (2012)
3. Assefa, S., Diaz-Abad, M., Korotinsky, A., Tom, S., Scharf, S.M.: Comparison of a simple obstructive sleep apnea screening device with standard in-laboratory polysomnography. Sleep Breath. First online: Aug. 2015, 1–5 (2015)
4. Bernacchia, N., Scalise, L., Casacanditella, L., Ercoli, I., Marchionni, P., Tomasini, E.: Non contact measurement of heart and respiration rates based on Kinect. In: Int. Symp. on Medical Meas. and Appl., pp. 1–5 (2014)
5. Burba, N., Bolas, M., Krum, D., Suma, E.: Unobtrusive measurement of subtle nonverbal behaviors with the Microsoft Kinect. In: Proc. IEEE Virt. Reality, pp. 1–4 (2012)
6. Carlson, B., Neelon, V., Hsiao, H.: Evaluation of a non-invasive respiratory monitoring system for sleeping subjects. Physiol. Meas. 20(1), 53–63 (1999)
7. Centonze, F.: Image processing and three-dimensional modeling using Microsoft Kinect v2 in analysis of sleep disorders. Thesis, Politecnico di Milano, Italy (2015)
8. Dafna, E., Tarasiuk, A., Zigel, Y.: Sleep-wake evaluation from whole-night non-contact audio recordings of breathing sounds. Plos One 10(2), e0117382 (2015)
9. Douglas, N.J., White, D.P., Pickett, C.K., Weil, J.V., Zwillich, C.W.: Respiration during sleep in normal man. Thorax 37(11), 840–844 (1982)
10. Erden, F., Velipasalar, S., Alkar, A.Z., Cetin, A.E.: Sensors in assisted living. IEEE Signal Process. Mag. 33(2), 36–44 (2016)
11. Hošťálková, E., Vyšata, O., Procházka, A.: Multi-dimensional biomedical image de-noising using Haar transform. In: Proc. of 15th Int. Conf. on Digital Signal Processing, pp. 175–178 (2007)
12. Kagawa, M., Ueki, K., Tojima, H., Matsui, T.: Noncontact screening system with two microwave radars for the diagnosis of sleep apnea-hypopnea syndrome. In: Proc. of the 35th Annual Int. Conf. of the IEEE: Engineering in Medicine and Biology Society, pp. 2052–2055 (2013)
13. Kolb, A., Barth, E., Koch, R., Larsen, R.: Time-of-Flight Sensors in Computer Graphics. In: Eurographics 2009 - State of the Art Reports, pp. 119–134 (2009)
14. Krüger, B., Vögele, A., Lassiri, M., Herwartz, L., Terkatz, T., Weber, A., Garcia, C., Fietze, I., Penzel, T.: Sleep detection using de-identified depth data. J. Mob. Multimed. 10(3&4), 327–342 (2014)
15. Lee, Y.S., Pathirana, P.N., Steinfort, C.L., Caelli, T.: Monitoring and analysis of respiratory patterns using microwave Doppler radar. IEEE J. Eng. Health Med. 2, 1–12 (2014)
16. Lee, J., Hong, M., Ryu, S.: Sleep Monitoring System Using Kinect Sensor. Int. J. Distrib. Sens. Netw., Article ID 875371 (2015)
17. Long, X., Foussier, J., Fonseca, P., Haakma, R., Aarts, R.: Respiration amplitude analysis for REM and NREM sleep classification. In: Int. Conf. of the IEEE Engineering in Medicine and Biology Society, pp. 5017–5020 (2013)
18. Martinez, M., Stiefelhagen, R.: Breath rate monitoring during sleep using near-IR imagery and PCA. In: 21st Int. Conf. on Pattern Recognition (ICPR), vol. 48, pp. 3472–3475 (2012)
19. Metsis, V., Kosmopoulos, D., Athitsos, V., Makedon, F.: Noninvasive analysis of sleep patterns via multimodal sensor input. Pers. Ubiquitous Comput. 18, 19–26 (2014)
20. Penne, J., Schaller, C., Hornegger, J., Kuwert, T.: Robust real-time 3D respiratory motion detection using time-of-flight cameras. Int. J. Comput. Assist. Radiol. Surg. 3(5), 427–431 (2008)
21. Procházka, A., Vyšata, O., Ťupa, O., Yadollahi, M., Vališ, M.: Discrimination of axonal neuropathy using sensitivity and specificity statistical measures. Neural Comput. Appl. 25(6), 1349–1358 (2015)
22. Procházka, A., Vyšata, O., Vališ, M., Ťupa, O., Schätz, M., Mařík, V.: Bayesian classification and analysis of gait disorders using image and depth sensors of Microsoft Kinect. Digit. Signal Process. 47, 169–177 (2015)
23. Rai, R., Sontakke, T.: Implementation of image denoising using wavelet thresholding techniques. Int. J. Comput. Technol. Electron. Eng. 1(2), 6–10 (2011)
24. Rodríguez-Sotelo, J.L., Osorio-Forero, A., Jiménez-Rodríguez, A., Cuesta-Frau, D., Cirugeda-Roldán, E., Peluffo, D.: Automatic sleep
123
25.
26.
27.
28.
29.
30.
31.
32.
stages classification using EEG entropy features and unsupervised pattern analysis techniques. Entropy 16, 6573–6589 (2014) Redmond, S., Heneghan, C.: Cardiorespiratory-based sleep staging in subjects with obstructive sleep apnea. IEEE Trans. Biomed. Eng. 53(3), 485–496 (2006) Schatz, M., Centonze, F., Kuchynka, J., Tupa, O., Vysata, O., Geman, O., Prochazka, A.: Statistical recognition of breathing by MS Kinect depth sensor. In: Int. Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), pp. 1–4 (2015) Sen, B., Peker, M., Cavusoglu, A., Celabi, F.V.: A comparative study on classification of sleep stage based on EEG signals using feature selection and classification algorithms. J. Med. Syst. 38(18), 1–21 (2014) Stradling, J.R., Chadwick, G.A., Frew, A.J.: Changes in ventilation and its components in normal subjects during sleep. Thorax 40(5), 364–370 (1985) Taheri, T., Anna, A.S.: Non-Invasive Breathing Rate Detection Using a Very Low Power Ultra-wide-band Radar. In: IEEE Int. Conf. on Bioinformatics and Biomedicine (BIBM), pp. 70–83 (2014) Xia, J., Siochi, R.A.: A real-time respiratory motion monitoring system using kinect: proof of concept. Med. Phys. 39(5), 2682– 2685 (2012) Yu, M.C., Liou, J.L., Kuo, S.W., Lee, M.S., Hung, Y.P.: Noncontact respiratory measurement of volume change using depth camera. In: IEEE Int. Conf. Engineering in Medicine and Biology Society, pp. 2371–2374 (2012) Zaffaroni, A., Kent, B., O’Hare, E., et al.: Assessment of sleepdisordered breathing using a non-contact bio-motion sensor. J. Sleep Res. 22(2), 231–236 (2014)