![Page 1: Aggregation and Degradation in JetStream: Streaming ...mfreed/docs/jet... · 0 10 20 30 40 50 60 70 80 90 Experiment time (minutes) 0 100 200 300 BW (Mbits/sec) 400 Degradation Keeps](https://reader033.vdokument.com/reader033/viewer/2022060517/604acb053e42da00040760a7/html5/thumbnails/1.jpg)
Ariel Rabkin Princeton University
Aggregation and Degradation in JetStream: Streaming Analytics in the Wide Area
Work done with Matvey Arye, Siddhartha Sen, Vivek S. Pai, and Michael J. Freedman
Today’s Analytics Architectures
• Backhaul is inefficient and inflexible
MillWheel (Google) Storm
Tomorrow’s Architecture: JetStream
• Backhaul is inefficient and inflexible
• Goal: optimize use of WAN links by exposing them to streaming system.
JetStream
Backhaul is Intrinsically Inefficient
[Figure: bandwidth (y-axis) over time (x-axis, two days), comparing the bandwidth available with the bandwidth needed for backhaul.]
Buyer’s remorse: wasted bandwidth
Analyst’s remorse: system overload or missing data
Stream Processing Basics
Some Operators in JetStream:
• Filtering (count > 100)
• Sampling (drop 90% of data)
• Image Compression
• Quantiles (95th percentile)
• Query stored data
[Diagram: input data flowing through stream operators distributed across Site A, Site B, and Site C.]
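As a rough illustration of how such operators chain, here is a minimal Python sketch of the filtering and sampling operators named above. The `Record` type, the function names, and the every-nth sampling rule are assumptions made for this example; they are not JetStream's API.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Record:
    """Hypothetical record shape used only for this sketch."""
    url: str
    count: int


def filter_by_count(records: Iterable[Record], threshold: int) -> Iterator[Record]:
    """Filtering operator: pass only records whose count exceeds the threshold."""
    for record in records:
        if record.count > threshold:
            yield record


def sample_every_nth(records: Iterable[Record], n: int) -> Iterator[Record]:
    """Sampling operator: keep one record in every n (n=10 drops 90% of data)."""
    for index, record in enumerate(records):
        if index % n == 0:
            yield record


# Hypothetical input: 100 records with counts 0, 10, 20, ..., 990.
input_data: List[Record] = [Record(url=f"/page/{i}", count=i * 10) for i in range(100)]

# Chain the operators: filter first, then sample the survivors.
filtered = list(filter_by_count(input_data, threshold=100))
sampled = list(sample_every_nth(filtered, n=10))
```

Because each operator consumes and produces an iterator, operators can be composed in any order, which is the property the dataflow picture relies on.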
The JetStream System
What: Streaming with aggregation and degradation as first-class primitives
Where: Storage and processing at edge
Why: Maximize goodput using aggregation and degradation
How: Data cubes and feedback control
An Example Query
How popular is every URL?
[Diagram: two CDN sites, each receiving several request streams.]
Mechanism 1: Storage with Aggregation
Every minute, compute request counts by URL
[Diagram: at each CDN site, the request streams feed a Local Aggregation and Storage stage.]
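The per-minute aggregation step can be sketched in a few lines of Python. The `(timestamp, url)` request format and the function name are hypothetical, chosen only to make the example self-contained.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical request log entry: (unix_timestamp_seconds, url).
Request = Tuple[int, str]


def counts_by_minute_and_url(requests: Iterable[Request]) -> Counter:
    """Aggregate raw requests into request counts keyed by (minute_start, url)."""
    counts: Counter = Counter()
    for timestamp, url in requests:
        minute_start = timestamp - (timestamp % 60)  # truncate to the minute
        counts[(minute_start, url)] += 1
    return counts


requests = [
    (0, "www.mysite.com/a"),
    (15, "www.mysite.com/a"),
    (59, "www.yoursite.com"),
    (60, "www.mysite.com/a"),
    (119, "www.mysite.com/b"),
]
local_counts = counts_by_minute_and_url(requests)
```

Five raw requests collapse into four (minute, URL) cells here; with realistic traffic, many requests per URL per minute collapse into a single cell, which is where the bandwidth saving comes from.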
Mechanism 2: Adaptive Degradation
Every minute, compute request counts by URL
[Diagram: at each CDN site, the request streams feed a Local Aggregation and Storage stage with Adjustable Filtering.]
Requirements for Storage Abstraction
• Update-able (locally and incrementally): Stored Data += Data
• Merge-able (without accuracy penalty): Data + Data = Merged Representation
• Data size is reducible (with predictable accuracy cost)
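To make the three requirements concrete, here is a small Python sketch of one structure that satisfies them: a fixed-range histogram. The class, its method names, and the bucket layout are assumptions for illustration; the slides do not specify a particular data structure at this point.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Histogram:
    """Hypothetical equal-width histogram over [0, upper); values are assumed non-negative."""
    upper: float
    buckets: List[int] = field(default_factory=lambda: [0] * 8)

    def update(self, value: float) -> None:
        """Update-able: fold one new value into the stored summary in place."""
        width = self.upper / len(self.buckets)
        index = min(int(value / width), len(self.buckets) - 1)  # clamp overflow
        self.buckets[index] += 1

    def merge(self, other: "Histogram") -> "Histogram":
        """Merge-able: bucket-wise addition loses nothing when layouts match."""
        if (self.upper, len(self.buckets)) != (other.upper, len(other.buckets)):
            raise ValueError("histograms must share range and bucket count")
        return Histogram(self.upper, [a + b for a, b in zip(self.buckets, other.buckets)])

    def reduce(self) -> "Histogram":
        """Reducible: halve the bucket count, so each bucket becomes twice as wide."""
        if len(self.buckets) % 2 != 0:
            raise ValueError("bucket count must be even to halve")
        halved = [self.buckets[i] + self.buckets[i + 1]
                  for i in range(0, len(self.buckets), 2)]
        return Histogram(self.upper, halved)


site_a = Histogram(upper=80.0)
site_b = Histogram(upper=80.0)
for latency in (5.0, 12.0, 33.0):
    site_a.update(latency)
for latency in (7.0, 79.0):
    site_b.update(latency)

merged = site_a.merge(site_b)  # identical to histogramming all five values at once
smaller = merged.reduce()      # 4 buckets of width 20 instead of 8 of width 10
```

The accuracy cost of `reduce` is predictable in the sense that the bucket width, and therefore the worst-case error of any quantile read from the histogram, doubles with each halving.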
The Data Cube Model
Aggregation used for:
• Updates
• Roll-ups
• Merging cubes
• Summarizing cubes

| Counts by URL | 12:00 | 12:01 | 12:02 |
| --- | --- | --- | --- |
| www.mysite.com/a | 3 | 5 | 0 |
| www.mysite.com/b | 0 | 2 | 0 |
| www.yoursite.com | 5 | 4 | … |
| www.her-site.com | 8 | 12 | … |
Cube: A multidimensional array, indexed by a set of dimensions, whose cells hold aggregates.
Cubes have an aggregation function: Agg(cell, cell) → cell
Cubes can be “Rolled Up”
| Counts by URL | 12:00 | 12:01 | 12:02 |
| --- | --- | --- | --- |
| www.mysite.com/a | 3 | 5 | 0 |
| www.mysite.com/b | 0 | 2 | 0 |
| www.yoursite.com | 5 | 4 | … |
| www.her-site.com | 8 | 12 | … |
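Rolled up over time:

| Counts by URL | * |
| --- | --- |
| www.mysite.com/a | 8 |
| www.mysite.com/b | 2 |
| www.yoursite.com | 9 |
| www.her-site.com | 20 |

Rolled up over URL:

| Counts by URL | 12:00 | 12:01 | 12:02 |
| --- | --- | --- | --- |
| * | 16 | 23 | … |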
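The two roll-ups above can be reproduced with a short Python sketch. The dictionary-of-cells representation and the `roll_up` helper are assumptions for illustration, not JetStream's cube implementation. Only cells whose values appear on the slide are included; the 12:02 cells shown as "…" are left out rather than guessed.

```python
import operator
from typing import Callable, Dict, Tuple

Cell = Tuple[str, str]  # (url, minute)

# Cell values copied from the slide; elided cells are omitted.
cube: Dict[Cell, int] = {
    ("www.mysite.com/a", "12:00"): 3,
    ("www.mysite.com/a", "12:01"): 5,
    ("www.mysite.com/a", "12:02"): 0,
    ("www.mysite.com/b", "12:00"): 0,
    ("www.mysite.com/b", "12:01"): 2,
    ("www.mysite.com/b", "12:02"): 0,
    ("www.yoursite.com", "12:00"): 5,
    ("www.yoursite.com", "12:01"): 4,
    ("www.her-site.com", "12:00"): 8,
    ("www.her-site.com", "12:01"): 12,
}


def roll_up(
    cells: Dict[Cell, int],
    keep_dimension: int,
    agg: Callable[[int, int], int] = operator.add,
) -> Dict[str, int]:
    """Collapse every dimension except one, combining cells with the cube's Agg."""
    result: Dict[str, int] = {}
    for key, value in cells.items():
        kept = key[keep_dimension]
        result[kept] = agg(result[kept], value) if kept in result else value
    return result


by_url = roll_up(cube, keep_dimension=0)     # roll up over time
by_minute = roll_up(cube, keep_dimension=1)  # roll up over URL
```

The same `agg` function serves for updates, merges, and roll-ups, which is why the slide lists all of them as uses of aggregation.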
Cubes Unify Storage and Aggregation
[Diagram: a stream of updates is applied to the stored data; a standing query sends updates downstream, and a one-off query reads the stored data.]
Degradation: The Big Picture
[Diagram: local data → dataflow operators → summarized or approximated data → network → dataflow operators, with a feedback control loop.]
• Level of degradation auto-tuned to match bandwidth.
• Challenge: Supporting mergeability and flexible policies
Mergeability Imposes Constraints
• Insight: Degradation may be discontinuous

Window boundaries shown on the slide:
Every 5: 01 - 05, 06 - 10, 11 - 15, 16 - 20, 21 - 25, 26 - 30
Every 6: 01 - 06, 07 - 12, 13 - 18, 19 - 24, 25 - 30
Every 10: 01 - 10, 11 - 20, 21 - 30
Every 30??: 01 - 30
Every 5: 02 - 06, 07 - 11, 12 - 16, 17 - 21, 22 - 26, 27 - 31
??????
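One reading of these window lists is that summaries can only be merged exactly when every fine window nests inside a coarse one. The Python sketch below checks that property for the windows above; the helper names are hypothetical, and `math.lcm` requires Python 3.9 or later.

```python
import math
from typing import List, Tuple

Window = Tuple[int, int]  # inclusive (first, last)


def windows(period: int, start: int, end: int) -> List[Window]:
    """Inclusive windows of the given period whose starts lie in start..end."""
    return [(lo, lo + period - 1) for lo in range(start, end + 1, period)]


def nests_inside(fine: List[Window], coarse: List[Window]) -> bool:
    """True if every fine window lies wholly within some coarse window."""
    return all(
        any(c_lo <= f_lo and f_hi <= c_hi for c_lo, c_hi in coarse)
        for f_lo, f_hi in fine
    )


every_5 = windows(5, 1, 30)
every_6 = windows(6, 1, 30)
every_10 = windows(10, 1, 30)
every_30 = windows(30, 1, 30)
every_5_offset = windows(5, 2, 31)  # the shifted row, 02 - 06 through 27 - 31

# Every-5 data rolls up cleanly into every-10, but every-6 data does not.
five_into_ten = nests_inside(every_5, every_10)
six_into_ten = nests_inside(every_6, every_10)

# Periods 5 and 6 only share boundaries at their least common multiple.
common_period = math.lcm(5, 6)

# Shifted windows straddle the every-10 boundaries, so they do not nest.
offset_into_ten = nests_inside(every_5_offset, every_10)
```

Under this reading, an operator cannot lengthen its period smoothly (5, then 6, then 7); it has to jump between periods that divide one another, which would explain why degradation may be discontinuous.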
There Are Many Ways to Degrade Data
• Can coarsen a dimension
• Can drop low-rank values
Coarsening Does Not Always Help
[Figure: savings from aggregation (y-axis, ticks from 1 to 256 in powers of two) versus aggregation time period (x-axis: 5s, minute, 5m, hour, day), with series for Domains and URLs.]
Degradations Have Trade-offs
| Name | Fixed BW Savings | Fixed Accuracy Cost | Parameter |
| --- | --- | --- | --- |
| Dim. Coarsening | Usually no | Yes | Dimension Scale |
| Drop values (locally) | Yes | No | Cut-off |
| Drop values (globally) | No, multi-round protocol | Yes | Cut-off |
| Audiovisual downsampling | Yes | Yes | Sample rate |
| Histogram Coarsening | Yes | Yes | Number of Buckets |
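Two of the degradations listed above, dimension coarsening and local value dropping, can be sketched in Python as follows. The function names are hypothetical, and treating the cut-off as a rank (keep the top k entries) is an assumption made for this example.

```python
from collections import Counter
from typing import Dict


def coarsen_to_domain(counts_by_url: Dict[str, int]) -> Dict[str, int]:
    """Dimension coarsening: roll URLs up to their domain (text before the first '/')."""
    by_domain: Counter = Counter()
    for url, count in counts_by_url.items():
        by_domain[url.split("/", 1)[0]] += count
    return dict(by_domain)


def keep_top_k(counts: Dict[str, int], k: int) -> Dict[str, int]:
    """Local value dropping: keep only the k entries with the highest counts."""
    return dict(Counter(counts).most_common(k))


counts_by_url = {
    "www.mysite.com/a": 8,
    "www.mysite.com/b": 2,
    "www.yoursite.com": 9,
    "www.her-site.com": 20,
}

coarsened = coarsen_to_domain(counts_by_url)  # 4 entries shrink to 3
top_two = keep_top_k(counts_by_url, k=2)      # always exactly 2 entries
```

The example mirrors the trade-off in the table: coarsening keeps totals exact but its size reduction depends on the data, while keeping the top k gives a fixed output size at an accuracy cost that depends on how much weight the dropped entries carried.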
A Simple Idea that Does Not Work
• We have sensors that report congestion…
• Have operators read sensor and adjust themselves?
[Diagram: incoming data → coarsening operator → sampled data → network. Congestion report: "Sending 4x too much".]
Increase aggregation period up to 10 sec. If insufficient, use sampling.
Challenge: Composite Policies
• Chaos if two operators are simultaneously responding to the same sensor
[Diagram: incoming data → coarsening operator → sampling operator → network. Congestion report: "Sending 4x too much".]
Interfacing with Operators
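[Diagram: incoming data → coarsening operator → sampling operator → network, with a controller that receives the congestion report "Sending 4x too much".]
Operator to controller: "Shrinking data by 50%. Possible levels: [0%, 50%, 75%, 95%, …]"
Controller to operator: "Go to level 75%"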
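The operator and controller exchange can be sketched in Python as follows. Everything here is an assumption for illustration: the `DegradationOperator` interface, the priority-ordered policy, and the level values are hypothetical and are not claimed to be JetStream's actual control algorithm. The sketch treats the overload factor as measured against undegraded data and recomputes all levels from scratch.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import List


@dataclass
class DegradationOperator:
    """Hypothetical interface: an operator advertises the reduction levels it
    supports and lets a controller select one."""
    name: str
    levels: List[Fraction]  # fraction of data removed, ascending, starting at 0
    current: Fraction = Fraction(0)

    def go_to_level(self, level: Fraction) -> None:
        if level not in self.levels:
            raise ValueError(f"{self.name} does not support level {level}")
        self.current = level


def apply_policy(operators: List[DegradationOperator], overload: Fraction) -> None:
    """Choose levels so the output is at most 1/overload of the undegraded data.

    Operators are tuned in priority order: an earlier operator is pushed as far
    as needed (or to its maximum) before a later one is touched.
    """
    target_keep = Fraction(1) / overload  # fraction of data we may still send
    keep_so_far = Fraction(1)
    for op in operators:
        chosen = op.levels[-1]  # fall back to maximum degradation
        for level in op.levels:
            if keep_so_far * (1 - level) <= target_keep:
                chosen = level  # mildest level that meets the target
                break
        op.go_to_level(chosen)
        keep_so_far *= 1 - chosen


coarsening = DegradationOperator(
    "coarsening", [Fraction(0), Fraction(1, 2), Fraction(3, 4)])
sampling = DegradationOperator(
    "sampling", [Fraction(0), Fraction(1, 2), Fraction(3, 4), Fraction(19, 20)])

# "Sending 4x too much": coarsening alone can absorb this, so sampling stays off.
apply_policy([coarsening, sampling], overload=Fraction(4))
```

Because a single controller sets every level, the two operators never react to the same sensor independently, which is the failure mode the previous slide warns about.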
Experimental Setup
80 nodes on VICCI testbed at three sites (Seattle, Atlanta, and Germany)
Policy: Drop data if insufficient BW
Princeton
Without Degradation
[Figure: bandwidth (Mbits/sec, 0 to 800) versus experiment time (0 to 140 minutes), annotated "Drop BW"; below it, latency (sec, 0 to 1000) versus elapsed time (0 to 140 minutes), showing median latency, 95th percentile latency, and maximum latency.]
Degradation Keeps Latency Bounded
[Figure: bandwidth (Mbits/sec, 0 to 400) versus experiment time (0 to 90 minutes), annotated "Bandwidth Shaping"; below it, latency (sec, 0 to 20) versus elapsed time (0 to 90 minutes), showing median latency and 95th percentile latency.]
Showing maximum latencies
[Figure: latency (sec, 0 to 40) versus elapsed time (0 to 90 minutes), showing median latency, 95th percentile latency, and maximum latency.]
Programming Ease
| Scenario | Lines of code |
| --- | --- |
| Slow requests | 5 |
| Requests by URL | 5 |
| Bandwidth by node | 15 |
| Bad referrers | 16 |
| Latency and size quantiles | 25 |
| Success by domain | 30 |
| Top 10 domains by period | 40 |
| Big Requests | 97 |
Conclusions and Future Work
• Useful to embed aggregation and degradation abstractions in streaming systems.
• Aggregation can be unified with storage.
• System must accommodate degradation semantics.
• Open questions:
  • How to guide users to the right degradation policy?
  • How to embed abstractions in higher-level language?