system-level critical path analysis

System-level critical path analysis. Progress Report Meeting, December 11th, 2013. Francis Giraldeau [email protected] Under the direction of Michel Dagenais, DORSAL Lab, École Polytechnique de Montréal.


Page 1: System-level critical path analysis

System-level critical path analysis

Progress Report Meeting
December 11th, 2013

Francis Giraldeau
[email protected]

Under the direction of Michel Dagenais
DORSAL Lab, École Polytechnique de Montréal

Page 2: System-level critical path analysis

General objective

Provide trace analysis tools to understand the overall performance of a distributed application.

Page 3: System-level critical path analysis

apt-get install tree

Callgrind output

Blocking : 37%

What is the app waiting for?

Running : 63%
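A running/not-running split like the one above can be derived from scheduler events. The following is a minimal sketch, not the tool's actual implementation: it assumes a simplified `sched_switch` record reduced to a hypothetical `(timestamp, prev_tid, next_tid)` tuple, and the sample events are made up. It also lumps preemption together with blocking; telling them apart would require the previous task state.

```python
from typing import Iterable, Tuple

# Hypothetical simplified event: (timestamp, prev_tid, next_tid).
# A real sched_switch event carries more fields (for example prev_state).
SchedSwitch = Tuple[int, int, int]


def running_fraction(events: Iterable[SchedSwitch], tid: int,
                     start: int, end: int) -> float:
    """Return the fraction of [start, end] during which `tid` was on a CPU."""
    running_since = None
    running_total = 0
    for ts, prev_tid, next_tid in sorted(events):
        if next_tid == tid and running_since is None:
            # The task is scheduled in.
            running_since = max(ts, start)
        elif prev_tid == tid and running_since is not None:
            # The task is scheduled out: close the running interval.
            running_total += min(ts, end) - running_since
            running_since = None
    if running_since is not None:
        # The task was still running at the end of the window.
        running_total += end - running_since
    return running_total / (end - start)


# Made-up events: task 42 runs from t=0 to t=63, then waits until t=100.
events = [(0, 0, 42), (63, 42, 0), (100, 0, 42)]
frac = running_fraction(events, tid=42, start=0, end=100)
print(f"running: {frac:.0%}, not running: {1 - frac:.0%}")
```

The time spent off the CPU is what the critical flow view then explains, by following what the task was waiting for.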

Page 4: System-level critical path analysis

Critical Flow View

[Figure: critical flow view. Labels: running, timer, mandb, dpkg, wait I/O.]

Page 5: System-level critical path analysis

Performance impact

● Sysbench experiments: CPU, I/O, mysql
● lttng 2.3.0 - Dominus Vobiscum
● Ubuntu 13.04 – kernel 3.8.0-34-generic
● i7-3770 8GB RAM
● Single hard drive 7200 RPM (trace+load)
● No event loss achieved using ionice and renice

Page 6: System-level critical path analysis

[Figure: Tracing overhead according to configuration. Bar chart of overhead (axis from 0% to 45%) for sysbench CPU, sysbench I/O and sysbench mysql, under three configurations: all events, graph + sys, graph.]

Page 7: System-level critical path analysis

[Figure: Trace size according to configuration. Bar chart of trace size in MB (axis from 0 to 3500) for sysbench CPU, sysbench I/O and sysbench mysql, under three configurations: all events, graph + sys, graph.]

Page 8: System-level critical path analysis

[Figure: Event type proportion. Horizontal bar chart (axis from 0 to 0.7) with one bar per event type: sched_switch, sched_wakeup, softirq_exit, softirq_entry, hrtimer_expire_exit, hrtimer_expire_entry, irq_handler_exit, irq_handler_entry, sched_wakeup_new, sched_process_fork, sched_process_exit, sched_process_exec.]

Average event size: 36 bytes

Page 9: System-level critical path analysis

Reduce overhead strategies

● Target required events with conditions
● Reduce event size
    ● Define new tracepoints (TP) with minimal fields
    ● Record only syscall entry ID, no args
● Record interrupt context instead of entry/exit events

Page 10: System-level critical path analysis

Critical path recovery of distributed apps

Page 11: System-level critical path analysis

Recovering dependencies over TCP

● Recording TCP headers
● Match packets
● Link related nodes in the graph
● Critical path computation: no change!
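The packet matching step can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: the `PacketEvent` record, its field names and the matching key (addresses, ports and sequence number) are hypothetical, and retransmissions, which reuse a sequence number, are ignored.

```python
from typing import Dict, List, NamedTuple, Tuple


class PacketEvent(NamedTuple):
    """Hypothetical trace record holding the recorded TCP header fields."""
    host: str        # host where the event was recorded
    ts: int          # local timestamp on that host
    direction: str   # "send" or "recv"
    saddr: str
    daddr: str
    sport: int
    dport: int
    seq: int         # TCP sequence number


def header_key(ev: PacketEvent) -> tuple:
    """Fields that identify the same packet on the sender and the receiver."""
    return (ev.saddr, ev.daddr, ev.sport, ev.dport, ev.seq)


def match_packets(events: List[PacketEvent]) -> List[Tuple[PacketEvent, PacketEvent]]:
    """Pair each recv event with the send event carrying the same header key."""
    # Index the sends first so the result does not depend on event order,
    # since clocks of different hosts are not assumed to be synchronized.
    sends: Dict[tuple, PacketEvent] = {
        header_key(ev): ev for ev in events if ev.direction == "send"
    }
    links = []
    for ev in events:
        if ev.direction == "recv" and header_key(ev) in sends:
            links.append((sends[header_key(ev)], ev))
    return links


# Made-up request/response exchange between two hosts.
events = [
    PacketEvent("client", 10, "send", "10.0.0.1", "10.0.0.2", 5000, 80, 1000),
    PacketEvent("server", 12, "recv", "10.0.0.1", "10.0.0.2", 5000, 80, 1000),
    PacketEvent("server", 20, "send", "10.0.0.2", "10.0.0.1", 80, 5000, 7000),
    PacketEvent("client", 25, "recv", "10.0.0.2", "10.0.0.1", 80, 5000, 7000),
]
links = match_packets(events)
for sent, received in links:
    print(f"{sent.host}@{sent.ts} -> {received.host}@{received.ts}")
```

Each matched pair becomes an edge between the sending and receiving nodes of the graph, which is why the critical path computation itself does not need to change.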

Page 12: System-level critical path analysis

Principle of operation

[Figure: principle of operation. Labels: write, softirq, WAIT_CPU, wakeup, network.]

Page 13: System-level critical path analysis

RPC Server

Commands: hog or sleep

Default control flow view

Critical Flow View : request hog()

Critical Flow View : request sleep()

Page 14: System-level critical path analysis

Python Django unit test using a PostgreSQL database

Critical Flow View with default instrumentation

Critical Flow View with TCP Packet Matching

Page 15: System-level critical path analysis

Bridge the gap between kernel trace and app code

Page 16: System-level critical path analysis

Locate system calls in the code

0 7ffff78fb740 __write
1 7ffff78892f3 _IO_file_write
2 7ffff78891d2 _IO_file_seek
3 7ffff788a905 _IO_do_write
4 7ffff7889b71 _IO_file_xsputn
5 7ffff7859044 _IO_vfprintf
6 7ffff78630a9 _IO_printf
7 400942       main
8 7ffff7830ea5 __libc_start_main
9 400809       _start

[Figure label: write]

Page 17: System-level critical path analysis

Recording ELF call stack

● Instruction pointer: mostly in libc
● Frame pointers: fast, chained list of callers
    ● -fomit-frame-pointer: silly optimization on x86
● Scrape the stack: record everything that looks like a return address; yields false positives
● Unwind: recover the register state from the stack for each frame (using eh_frame)
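The frame pointer approach can be illustrated with a small simulation. This is a sketch only: stack memory is modeled as a Python dictionary, the stack addresses are made up, and the return addresses are borrowed from the backtrace shown earlier. It relies on the x86-64 layout used when frame pointers are kept, where the saved frame pointer of the caller sits at `[fp]` and the return address at `[fp + 8]`.

```python
from typing import Dict, List

WORD = 8  # bytes per stack slot on x86-64


def walk_frame_pointers(memory: Dict[int, int], fp: int,
                        max_depth: int = 64) -> List[int]:
    """Follow the chain of saved frame pointers and collect return addresses."""
    return_addresses = []
    while fp != 0 and len(return_addresses) < max_depth:
        return_addresses.append(memory[fp + WORD])  # return address into the caller
        fp = memory[fp]                             # caller's saved frame pointer
    return return_addresses


# Simulated stack: three frames, the outermost one has a null saved pointer.
memory = {
    0x7ffe0100: 0x7ffe0140, 0x7ffe0108: 0x400942,
    0x7ffe0140: 0x7ffe0180, 0x7ffe0148: 0x7ffff7830ea5,
    0x7ffe0180: 0x0,        0x7ffe0188: 0x400809,
}
chain = walk_frame_pointers(memory, 0x7ffe0100)
print([hex(addr) for addr in chain])
```

Each frame costs two memory reads, which is why this method is fast; it stops working as soon as one function in the chain was compiled without a frame pointer.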

Page 18: System-level critical path analysis

WAMS: where are my syscalls?

simple implementation of online unwind with ptrace

$ wams sleep 1
...
35
ip = 7ffff7ad28c0 nanosleep
ip = 403de7
ip = 403c7a
ip = 4016fa
ip = 7ffff7a32ea5 __libc_start_main
ip = 4017c9

Strace-plus (strace and unwind): https://code.google.com/p/strace-plus/

Wams source code: https://github.com/giraldeau/wams

Page 19: System-level critical path analysis

Unwind over ptrace overhead

sys_mmap64() : 12us

● Number of calls to ptrace(PTRACE_PEEKDATA): 105
● One frame processing: 274us
● 14 frames: ~4ms

Each system call adds milliseconds of overhead
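The figures above can be checked with a few lines of arithmetic. This sketch assumes that the 12us figure is the duration of the system call without unwinding, which the slide does not state explicitly.

```python
FRAME_COST_US = 274    # one frame processing, from the slide
FRAMES = 14            # call stack depth, from the slide
SYSCALL_COST_US = 12   # sys_mmap64(), assumed to be the cost without unwinding

unwind_cost_us = FRAME_COST_US * FRAMES
slowdown = (SYSCALL_COST_US + unwind_cost_us) / SYSCALL_COST_US

print(f"unwind cost: {unwind_cost_us / 1000:.2f} ms")
print(f"slowdown factor: {slowdown:.0f}x")
```

Under that assumption, unwinding dominates the cost of the system call by more than two orders of magnitude.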

Page 20: System-level critical path analysis

Perf callchain

Record registers + stack + mmap, offline unwind

Can be performed on sys_enter, sched_switch and sched_wakeup

$ sudo perf record -g --call-graph dwarf \
    -a -R -r 1 -m 4096 -f -c 1 \
    -e sched:sched_wakeup \
    -e sched:sched_switch -- $CMD

sample time 105358012591826 cpu 2 tid 11546 cmd mandb

Kernel code:

ffffffff8108ca72 ttwu_do_wakeup
ffffffff8108f044 try_to_wake_up
ffffffff8108f242 default_wake_function
ffffffff81086655 __wake_up_common
ffffffff81089eb8 __wake_up
ffffffff8140e35e tty_wakeup
ffffffff8141a263 pty_write
ffffffff8141328d n_tty_write
ffffffff81410209 tty_write
ffffffff8119411c vfs_write
ffffffff81194462 sys_write
ffffffff816d37dd system_call_fastpath

App code:

7ffff72c4740 __GI___libc_write
7ffff72e8961 __printf_chk
403a1a main
7ffff71f9ea5 __libc_start_main
403b09 _start
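The boundary between the kernel and application parts of such a call chain can be found from the addresses alone. The sketch below assumes x86-64 Linux with the usual split, where kernel addresses lie in the upper canonical half of the address space; the `split_callchain` helper is hypothetical and the frames are a subset of the sample above.

```python
from typing import List, Tuple

# Assumption: x86-64 Linux, kernel mapped in the upper canonical half.
KERNEL_SPACE_START = 0xFFFF800000000000

Frame = Tuple[int, str]  # (instruction pointer, symbol name)


def split_callchain(frames: List[Frame]) -> Tuple[List[Frame], List[Frame]]:
    """Separate a call chain into its kernel frames and its user frames."""
    kernel = [f for f in frames if f[0] >= KERNEL_SPACE_START]
    user = [f for f in frames if f[0] < KERNEL_SPACE_START]
    return kernel, user


frames = [
    (0xffffffff81194462, "sys_write"),
    (0xffffffff816d37dd, "system_call_fastpath"),
    (0x7ffff72c4740, "__GI___libc_write"),
    (0x403a1a, "main"),
]
kernel, user = split_callchain(frames)
print("kernel:", [name for _, name in kernel])
print("app:   ", [name for _, name in user])
```

The first user frame is the libc wrapper; the frames after it are the ones that locate the system call in the application code.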

Page 21: System-level critical path analysis

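[Figure: Elapsed time according to trace configuration. Horizontal bar chart of elapsed time in seconds (axis from 0 to 14) for four configurations: no trace, sys_entry, sched_switch, ptrace. Annotations: "Lost events", "Huge trace size".]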

Page 22: System-level critical path analysis

Future work

● Multi-host support
● On-line computation
● Reduce call chain recovery overhead
● Fix sched_wakeup IPI (linux >= 3.10)

Page 23: System-level critical path analysis

Thanks to Professor Michel Dagenais and our partners EfficiOS and Ericsson.

Special thanks to Geneviève Bastien for her excellent work on Luna Dorsal.

Software:

http://secretaire.dorsal.polymtl.ca/~fgiraldeau/workload-kit/

http://secretaire.dorsal.polymtl.ca/~fgiraldeau/traceset/

https://github.com/giraldeau