Machine learning-based available bandwidth estimation

Sukhpreet Kaur Khangura

doi:10.15488/9166

Details

Original language	English
Qualification	Doctor of Engineering
Awarding Institution	Leibniz University Hannover
Supervised by	Fidler, M., Supervisor
Date of Award	12 Dec 2019
Place of Publication	Hannover
Publication status	Published - 2019

Abstract

Today’s Internet Protocol (IP), the Internet’s network-layer protocol, provides a best-effort service to all users without any guaranteed bandwidth. However, for certain applications that have stringent network performance requirements in terms of bandwidth, it is important to provide Quality of Service (QoS) guarantees in IP networks. The end-to-end available bandwidth of a network path, i.e., the residual capacity that is left over by other traffic, is determined by its tight link, that is, the link that has the minimal available bandwidth. The tight link may differ from the bottleneck link, i.e., the link with the minimal capacity.
Passive and active measurements are the two fundamental approaches used to estimate the available bandwidth in IP networks. Unlike passive measurement tools that are based on the non-intrusive monitoring of traffic, active tools are based on the concept of self-induced congestion. The dispersion, which arises when packets traverse a network, carries information that can reveal relevant network characteristics. Using a fluid-flow probe gap model of a tight link with First-in, First-out (FIFO) multiplexing, accepted probing tools measure the packet dispersion to estimate the available bandwidth. Difficulties arise, however, if the dispersion is distorted compared to the model, e.g., by non-fluid traffic, multiple tight links, clustering of packets due to interrupt coalescing, and inaccurate time-stamping in general. It is recognized that modeling these effects is cumbersome if not intractable. To alleviate the variability of noise-afflicted packet gaps, the state-of-the-art bandwidth estimation techniques use post-processing of the measurement results, e.g., averaging over several packet pairs or packet trains, linear regression, or a Kalman filter. These techniques, however, do not overcome the basic assumptions of the deterministic fluid model.
While packet trains and statistical post-processing help to reduce the variability of available bandwidth estimates, they cannot resolve systematic deviations such as the underestimation bias in case of random cross traffic and multiple tight links. The limitations of the state-of-the-art methods motivate us to explore the use of machine learning in end-to-end active and passive available bandwidth estimation.
We investigate how to benefit from machine learning while using standard packet train probes for active available bandwidth estimation. To reduce the amount of required training data, we propose a regression-based scale-invariant method that is applicable without prior calibration to networks of arbitrary capacity. To reduce the amount of probe traffic further, we implement a neural network that acts as a recommender and can effectively select the probe rates that reduce the estimation error most quickly. We also evaluate our method with other regression-based supervised machine learning techniques.
Furthermore, we propose two different multi-class classification-based methods for available bandwidth estimation. The first method employs reinforcement learning that learns through the network path’s observations without having a training phase. We formulate the available bandwidth estimation as a single-state Markov Decision Process (MDP), i.e., a multi-armed bandit problem, and implement the ε-greedy algorithm to find the available bandwidth, where ε is a parameter that controls the exploration vs. exploitation trade-off. We propose another supervised learning-based classification method to obtain reliable available bandwidth estimates with a reduced amount of network overhead in networks where the available bandwidth changes very frequently. In such networks, a reinforcement learning-based method may take longer to converge as it has no training phase and learns in an online manner.
We also evaluate our method with different classification-based supervised machine learning techniques. Furthermore, considering the correlated changes in a network’s traffic through time, we apply filtering techniques on the estimation results in order to track the available bandwidth changes.
Active probing techniques provide flexibility in designing the input structure. In contrast, the vast majority of Internet traffic is Transmission Control Protocol (TCP) flows that exhibit a rather chaotic traffic pattern. We investigate how the theory of active probing can be used to extract relevant information from passive TCP measurements. We extend our method to perform the estimation using only sender-side measurements of TCP data and acknowledgment packets. However, non-fluid cross traffic, multiple tight links, and packet loss in the reverse path may alter the spacing of acknowledgments and hence increase the measurement noise. To obtain reliable available bandwidth estimates from noise-afflicted acknowledgment gaps, we propose a neural network-based method.
We conduct a comprehensive measurement study in a controlled network testbed at Leibniz University Hannover. We evaluate our proposed methods under a variety of notoriously difficult network conditions that have not been included in the training, such as randomly generated networks with multiple tight links, heavy cross traffic burstiness, delays, and packet loss. Our testing results reveal that our proposed machine learning-based techniques are able to identify the available bandwidth with high precision from active and passive measurements. Furthermore, our reinforcement learning-based method without any training phase shows accurate and fast convergence to available bandwidth estimates.
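
The bandit formulation mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the reward definition (a probe rate earns a reward proportional to its rate only if the output rate stays within a tolerance of the input rate), and all parameter values are assumptions chosen for illustration. The simulated path uses the standard single-link FIFO fluid model, in which a probe at rate r_in leaves at rate r_in if r_in does not exceed the available bandwidth C − λ, and at rate C·r_in/(r_in + λ) otherwise, with capacity C and cross-traffic rate λ.

```python
import random


def fluid_output_rate(r_in, capacity, cross_rate):
    """Output rate of a constant-rate probe after one FIFO link (fluid model)."""
    available = capacity - cross_rate
    if r_in <= available:
        return r_in  # no self-induced congestion: the rate is preserved
    # Congested: probe and cross traffic share the capacity proportionally.
    return capacity * r_in / (r_in + cross_rate)


def epsilon_greedy_estimate(probe_rates, measure, epsilon=0.1,
                            rounds=2000, tol=0.02, seed=0):
    """Pick the probe rate with the highest average reward (epsilon-greedy).

    Each probe rate is one arm of the bandit. The reward below is a
    hypothetical choice for this sketch, not taken from the thesis.
    """
    rng = random.Random(seed)
    n_arms = len(probe_rates)
    counts = [0] * n_arms      # how often each arm was played
    values = [0.0] * n_arms    # running mean reward per arm
    top = max(probe_rates)

    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                       # explore
        else:
            arm = max(range(n_arms), key=values.__getitem__)  # exploit
        r_in = probe_rates[arm]
        r_out = measure(r_in)
        # Assumed reward: favor the largest rate that passes undistorted.
        reward = r_in / top if r_out >= (1 - tol) * r_in else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean

    best = max(range(n_arms), key=values.__getitem__)
    return probe_rates[best]


if __name__ == "__main__":
    rates = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # Mbit/s, illustrative
    path = lambda r: fluid_output_rate(r, capacity=100.0, cross_rate=40.0)
    print(epsilon_greedy_estimate(rates, path))
```

In this sketch ε sets the fraction of rounds spent on randomly chosen probe rates; a larger ε explores more rates at the cost of more probes spent on rates that are not currently the best estimate. Real measurements are noisy, so the tolerance `tol` and the number of rounds would need tuning for an actual path.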

Cite this

Machine learning-based available bandwidth estimation. / Khangura, Sukhpreet Kaur.
Hannover, 2019. 132 p.

Research output: Thesis › Doctoral thesis

Khangura, SK 2019, 'Machine learning-based available bandwidth estimation', Doctor of Engineering, Leibniz University Hannover, Hannover. https://doi.org/10.15488/9166

Khangura, S. K. (2019). Machine learning-based available bandwidth estimation. [Doctoral thesis, Leibniz University Hannover]. https://doi.org/10.15488/9166

Khangura SK. Machine learning-based available bandwidth estimation. Hannover, 2019. 132 p. doi: 10.15488/9166

Khangura, Sukhpreet Kaur. / Machine learning-based available bandwidth estimation. Hannover, 2019. 132 p.

@phdthesis{92478786f8d1492bb12eaddb6880b1d6,

title = "Machine learning-based available bandwidth estimation",

author = "Khangura, {Sukhpreet Kaur}",

year = "2019",

doi = "10.15488/9166",

language = "English",

school = "Leibniz University Hannover",

}

TY - BOOK

T1 - Machine learning-based available bandwidth estimation

AU - Khangura, Sukhpreet Kaur

PY - 2019

Y1 - 2019

U2 - 10.15488/9166

DO - 10.15488/9166

M3 - Doctoral thesis

CY - Hannover

ER -

By the same author(s)

Age- and deviation-of-information of hybrid time- and event-triggered systems: What matters more, determinism or resource conservation?

Statistical Age-of-Information Bounds for Parallel Systems: When Do Independent Channels Make a Difference?

The Tiny-Tasks Granularity Trade-Off: Balancing Overhead Versus Performance in Parallel Systems

A Min-plus Model of Age-of-Information with Worst-case and Statistical Bounds

Performance and Scaling of Parallel Systems with Blocking Start and/or Departure Barriers