Reverse Engineering for Reuse
Univ.Prof. Dipl.-Ing. Dr. techn. Harald GALL
Universität Zürich
Technische Universität Wien
(c) 2006, H.Gall Reverse Engineering.2
Motivation for Reverse Engineering
• Existing software that is in successful use
• High investments in existing software
• Good solutions
• Goals:
  ☞ Exploit the reuse potential
  ☞ Extract components for reuse
  ☞ Increase the degree of reuse
Reverse Engineering - Definition
• Analysis of an existing system with the goal of
  ☞ identifying its components and their relationships
  ☞ creating representations at a higher level of abstraction
• The system itself is not changed!
• Applies to all phases of the software life cycle
• Subprocesses:
  ☞ Redocumentation
  ☞ Design Recovery
Redocumentation
• Creation or revision of semantically equivalent representations of the system at the same level of abstraction
• Examples:
  ☞ Redocumentation of the source code (code to code)
  ☞ Redocumentation of the design (design to design)
Design Recovery
• Additional information (knowledge about the system and its application domain) is used to generate abstractions.
• Representations:
  ☞ Data flow diagrams, control flow diagrams
  ☞ Informal descriptions of the software system and its domain (diagrams, text, etc.)
• The results are semantically rich representations (abstractions) of the system
Restructuring
• Transformation from one representation into another at the same level of abstraction.
• The functionality and semantics of the software system are not changed!
• Examples:
  ☞ Restructuring the source code into new logical units (modules)
  ☞ Restructuring the design into modified components
Refactoring
• Refactoring is a technique to restructure code in a disciplined way.
• For a long time it was a piece of programmer knowledge, done with varying degrees of discipline by experienced developers, but not passed on in a coherent way
• Martin Fowler's "www.refactoring.com"
Reverse Engineering Tools
• Tool overview: http://scgwiki.iam.unibe.ch:8080/SCG/370
• CodeCrawler: http://www.iam.unibe.ch/~scg/Research/CodeCrawler/index.html
• Imagix4D: http://www.imagix.com/
• Rigi: http://www.rigi.csc.uvic.ca/
• XGvis: http://www.research.att.com/areas/stat/xgobi/
• IBM Structured Analysis Tool: http://www.alphaworks.ibm.com/tech/sa4j
• Java Clone Detection CloneDR: http://www.semdesigns.com/Products/Clone/download.asp
Re-Engineering
• Modification of the software system in order to reimplement it in a changed form. New requirements are also taken into account.
• Re-Engineering := Reverse Engineering + Δ + Forward Engineering
Reverse Engineering Terminology
[Chikofsky/Cross, 1990]
[Figure: three levels of abstraction: Requirements, Design, Implementation. Forward Engineering leads from Requirements to Design and from Design to Implementation. Reverse Engineering and Design Recovery lead in the opposite direction. Re-Engineering (renovation) connects adjacent levels. Restructuring stays within a single level; at the Implementation level it is joined by Redocumentation.]
Example: Multitasking Window System
• Knowledge about key structures such as:
  ☞ process table, window table, window management module, process management module, etc.
• Search the domain model for these concepts and instantiate them to obtain an architectural overview
[Figure: Multitasking Window Manager, composed of Process table, Window table, Process Management Module, Window Management Module, ...]
Semantic Information in the Code

    #include <stdio.h>
    #include "h0001.h"
    #include "h0002.h"
    #include "h0003.h"

    f0001(a0001)
    unsigned int a0001;
    {
        unsigned int i0001;

        f0002(g0005, d0001, d0002);
        f0002(a0001, d0003, d0002);
        f0003(g0001[a0001].s0001, g0001[a0001].s0002);
        g0006 = a0001;
        i0001 = g0001[a0001].s0003;
        if (!f0004(i0001) && (g0002->g0003)[i0001].s0004 == d0004)
            f0005(i0001);
    }
Semantic Information in the Code /2

    #include <stdio.h>
    #include "proc.h"
    #include "windows.h"
    #include "globdefs.h"

    change_window(nw)
    unsigned int nw;
    {
        unsigned int pn;

        border_attribute(cwin, NORM_ATTR, INV_ATTR);
        border_attribute(nw, NORMHLIT_ATTR, INV_ATTR);
        move_cursor(wintbl[nw].crow, wintbl[nw].ccol);
        cwin = nw;
        pn = wintbl[nw].pnumb;
        if (!outrange(pn) && (g->proctbl)[pn].procstate == SUSPENDED)
            resume(pn);
    }
Semantic Information in the Code /3

    #include <stdio.h>
    #include "proc.h"
    #include "windows.h"
    #include "globdefs.h"

    change_window(nw)      /* change current window to window nw */
    unsigned int nw;       /* number of target window */
    {
        unsigned int pn;

        /* restore border of current window to un-highlighted */
        border_attribute(cwin, NORM_ATTR, INV_ATTR);
        /* highlight border of new current window */
        border_attribute(nw, NORMHLIT_ATTR, INV_ATTR);
        /* move physical cursor to new window where cursor was left
           and make nw the current window */
        move_cursor(wintbl[nw].crow, wintbl[nw].ccol);
        cwin = nw;
        /* resume the process associated with the new window
           if it is suspended */
        pn = wintbl[nw].pnumb;
        if (!outrange(pn) && (g->proctbl)[pn].procstate == SUSPENDED)
            resume(pn);
    }
Abstraction-to-Code Mapping
• Direct binding:
  ☞ Association by means of linguistic idioms
• Indirect binding:
  ☞ Association through substructures
Abstract Design Idioms in the Domain Model
[Figure: domain model concepts process table and window table (both kinds of table), together with queue, process and data]
Linguistic idiom:
    [ pr.c | prc ] [ .? | .. | ... | .... ] [ t.b | tbl ]
Data object idiom:
• Process Number
• Process State
• Process Name
• Location of saved Envir.
• Location of Process
• ...
Search by Means of Linguistic Idioms

    process proctbl[MAXPROCS];        /* process table array */
    ........
    typedef struct procentry          /* process table entry */
    {
        unsigned int savesp;          /* save sp register */
        unsigned int savess;          /* save ss register */
        unsigned int pspseg;          /* PSP seg addr this proc */
        unsigned int windno;          /* window number this proc */
        unsigned int procstate;       /* process state */
        char procname[MAXPNAME+1];    /* process name */
        int pnum;                     /* process number for this entry */
        ......
    } process;
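The search with a linguistic idiom can be pictured as a pattern match over the identifiers of the program. The following is a minimal sketch, not the tooling behind the slides: it assumes the idiom [ pr.c | prc ] [ .? | .. | ... | .... ] [ t.b | tbl ] can be read as a POSIX extended regular expression with up to four arbitrary characters in the middle. The function name matches_process_table_idiom and the case-insensitive matching are choices made only for this illustration.

```c
#include <assert.h>
#include <regex.h>    /* POSIX regular expressions */
#include <stddef.h>

/*
 * Assumed reading of the idiom for "process table":
 *   (pr.c | prc)  (0 to 4 arbitrary characters)  (t.b | tbl)
 * anchored at the start of the identifier.
 */
static const char PROCESS_TABLE_IDIOM[] = "^(pr.c|prc).{0,4}(t.b|tbl)";

/* Returns 1 if the identifier matches the idiom, 0 if it does not,
 * and -1 if the pattern could not be compiled. */
int matches_process_table_idiom(const char *identifier)
{
    regex_t pattern;
    int result;

    assert(identifier != NULL);
    if (regcomp(&pattern, PROCESS_TABLE_IDIOM,
                REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;
    result = (regexec(&pattern, identifier, 0, NULL, 0) == 0) ? 1 : 0;
    regfree(&pattern);
    return result;
}
```

Under these assumptions proctbl is accepted as a candidate for the concept "process table", while wintbl and procstate are rejected. A match is only a hint; the substructures of the candidate still have to confirm it.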
Substructure Bindings in the Source Code
The declaration of proctbl and struct procentry from the previous slide is bound, through its substructures (the struct fields), to the components of the data object idiom:
• Process Number
• Process State
• Process Name
• Location of saved Envir.
• Location of Process
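Indirect binding can be sketched as a check of how many components of the data object idiom are covered by the fields of a candidate structure. The code below is an illustrative sketch only: the name fragments ("num", "state", "name") and the function count_bound_components are assumptions introduced here, and plain substring search stands in for whatever matching a real tool would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One component of the data object idiom and a name fragment that is
 * assumed, for this sketch, to signal it inside a field name. */
struct idiom_component {
    const char *concept;    /* e.g. "Process State" */
    const char *fragment;   /* e.g. "state" */
};

static const struct idiom_component PROCESS_ENTRY_IDIOM[] = {
    { "Process Number", "num"   },
    { "Process State",  "state" },
    { "Process Name",   "name"  },
};

#define N_COMPONENTS \
    (sizeof PROCESS_ENTRY_IDIOM / sizeof PROCESS_ENTRY_IDIOM[0])

/* Counts how many idiom components are bound by at least one field. */
size_t count_bound_components(const char *const fields[], size_t n_fields)
{
    size_t bound = 0;
    size_t c, f;

    assert(fields != NULL || n_fields == 0);
    for (c = 0; c < N_COMPONENTS; c++) {
        for (f = 0; f < n_fields; f++) {
            if (strstr(fields[f], PROCESS_ENTRY_IDIOM[c].fragment) != NULL) {
                bound++;
                break;   /* one matching field is enough for this component */
            }
        }
    }
    return bound;
}
```

Applied to the field names of procentry (savesp, savess, pspseg, windno, procstate, procname, pnum), all three components in this reduced idiom are bound, which supports the hypothesis that procentry is a process table entry. Substring matching is deliberately naive and will also produce false positives on unrelated fields.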
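Resulting Partial Mapping
[Figure: the architectural overview (Multitasking Window Manager with Process table, Window table, Process Management Module, Window Management Module, ...) in which the Process table concept and its components Process state, Process name and Process number are mapped to the process table definition in the code]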
Object-Oriented Reengineering Patterns
Serge Demeyer
Stéphane Ducasse
Oscar Nierstrasz
www.iam.unibe.ch/~scg/OORP
(c) Demeyer, Ducasse, Nierstrasz Reverse Engineering.22
Reverse Engineering Patterns
• What and Why
• Setting Direction
  ☞ Most Valuable First
• First Contact
  ☞ Chat with the Maintainers
  ☞ Interview during Demo
• Initial Understanding
  ☞ Analyze the Persistent Data
  ☞ Study Exceptional Entities
• Detailed Model Capture
  ☞ Tie Code and Questions
  ☞ Step through the Execution
  ☞ Look for the Contracts
• Conclusion
What and Why?
Definition
Reverse Engineering is the process of analysing a subject system
  ☞ to identify the system's components and their interrelationships and
  ☞ create representations of the system in another form or at a higher level of abstraction.
(Chikofsky & Cross, '90)
Motivation
Understanding other people's code (cf. newcomers in the team, code reviewing, original developers left, ...)
Generating UML diagrams is NOT reverse engineering ... but it is a valuable support tool
The Reengineering Life-Cycle
[Figure: the reengineering life cycle across Requirements, Designs and Code, with the steps (0) requirement analysis, (1) model capture, (2) problem detection, (3) problem resolution, (4) program transformation]
Steps in focus: (0) req. analysis, (1) model capture
Issues:
• scale
• speed
• accuracy
• politics
Forces — Setting Direction
• Conflicting interests (technical, ergonomic, economic, political)
• Presence/absence of original developers
• Legacy architecture
• Which problems to tackle?
  ☞ Interesting vs important problems?
  ☞ Wrap, refactor or rewrite?
Setting Direction
[Figure: pattern map]
• Set direction: Agree on Maxims
• Maintain direction: Appoint a Navigator
• Coordinate direction: Speak to the Round Table
• Where to start: Most Valuable First
• What to do: Fix Problems, Not Symptoms
• What not to do: If It Ain't Broke Don't Fix It
• How to do it: Keep it Simple
Principles & guidelines for software project management, especially relevant for reengineering projects
Most Valuable First
Problem: Which problems should you focus on first?
Solution: Work on aspects that are most valuable to your customer
• Maximize commitment, early results; build confidence
• Difficulties and hints:
  ☞ Which stakeholder do you listen to?
  ☞ What measurable goal to aim for?
  ☞ Consult change logs for high activity
  ☞ Play the Planning Game
  ☞ Wrap, refactor or rewrite? See Fix Problems, Not Symptoms
Forces — First Contact
• Legacy systems are large and complex
  ☞ Split the system into manageable pieces
• Time is scarce
  ☞ Apply lightweight techniques to assess feasibility and risks
• First impressions are dangerous
  ☞ Always double-check your sources
• People have different agendas
  ☞ Build confidence; be wary of skeptics
First Contact
[Figure: pattern map for a feasibility assessment (one week time)]
• System experts (talk about it):
  ☞ Chat with the Maintainers (talk with developers)
  ☞ Interview during Demo (talk with end users)
• Software system:
  ☞ Skim the Documentation (read about it)
  ☞ Read All the Code in One Hour (read it)
  ☞ Do a Mock Installation (compile it)
• Verify what you hear
Chat with the Maintainers
Problem: What are the history and politics of the legacy system?
Solution: Discuss the problems with the system maintainers.
• Documentation will mislead you (various reasons)
• Stakeholders will mislead you (various reasons)
• The maintainers know both the technical and political history
Chat with the Maintainers
Questions to ask:
• Easiest/hardest bug to fix in recent months?
• How are change requests made and evaluated?
• How did the development/maintenance team evolve during the project?
• How good is the code? The documentation?
• Why was the reengineering project started? What do you hope to gain?
"The major problems of our work are not so much technological as sociological." (DeMarco and Lister, Peopleware '99)
Read all the Code in One Hour
Problem: How can you get a first impression of the quality of the source code?
Solution: Scan all the code in a single, short session.
• Use a checklist (code review guidelines, coding styles etc.)
• Look for functional tests and unit tests
• Look for abstract classes and root classes that define domain abstractions
• Beware of comments
• Log all your questions!
"I took a course in speed reading and read 'War and Peace' in twenty minutes. It's about Russia." (Woody Allen)
Interview during Demo
Problem: What are the typical usage scenarios?
Solution: Ask the user!
• ... however
  ☞ Which user?
  ☞ Users complain
  ☞ What should you ask?
• Solution: interview during demo
  - select several users
  - demo puts a user in a positive mindset
  - demo steers the interview
First Project Plan
Use standard templates, including:
• project scope
  ☞ see "Setting Direction"
• opportunities
  ☞ e.g., skilled maintainers, readable source-code, documentation
• risks
  ☞ e.g., absent test-suites, missing libraries, …
  ☞ record likelihood (unlikely, possible, likely) & impact (high, moderate, low) for causing problems
• go/no-go decision
• activities
  ☞ fish-eye view
Forces — Initial Understanding
• Data is deceptive
  ☞ Always double-check your sources
• Understanding entails iteration
  ☞ Plan iteration and feedback loops
• Knowledge must be shared
  ☞ "Put the map on the wall"
• Teams need to communicate
  ☞ "Use their language"
Initial Understanding
[Figure: pattern map; goal: understand ⇒ higher-level model]
• Top down:
  ☞ Speculate about Design (recover design)
• Bottom up:
  ☞ Analyze the Persistent Data (recover database)
  ☞ Study the Exceptional Entities (identify problems)
Analyze the Persistent Data
Problem: Which objects represent valuable data?
Solution: Analyze the database schema
• Prepare Model
  ☞ tables ⇒ classes; columns ⇒ attributes
  ☞ candidate keys (naming conventions + unique indices)
  ☞ foreign keys (column types + naming conventions + view declarations + join clauses)
• Incorporate Inheritance
  ☞ one to one; rolled down; rolled up
• Incorporate Associations
  ☞ association classes (e.g. many-to-many associations)
  ☞ qualified associations
• Verification
  ☞ Data samples + SQL statements
Example: One To One
Tables:
  Person:   id: char(5), name: char(40), address: char(60)
  Patient:  id: char(5), insuranceID: char(7), insurance: char(5)
  Salesman: id: char(5), company: char(40)
Classes (Person as superclass of Patient and Salesman):
  Person:   id: char(5), name: char(40), address: char(60)
  Patient:  id: char(5), insuranceID: char(7), insurance: char(5)
  Salesman: id: char(5), company: char(40)
Example: Rolled Down
Tables:
  Patient:  id: char(5), name: char(40), address: char(60), insuranceID: char(7), insurance: char(5)
  Salesman: id: char(5), name: char(40), address: char(60), company: char(40)
Classes (Person as superclass of Patient and Salesman):
  Person:   id: char(5), name: char(40), address: char(60)
  Patient:  id: char(5), insuranceID: char(7), insurance: char(5)
  Salesman: id: char(5), company: char(40)
Example: Rolled Up
Table:
  Person: id: char(5), name: char(40), address: char(60), insuranceID: char(7) «optional», insurance: char(5) «optional», company: char(40) «optional»
Classes (Person as superclass of Patient and Salesman):
  Person:   id: char(5), name: char(40), address: char(60)
  Patient:  id: char(5), insuranceID: char(7), insurance: char(5)
  Salesman: id: char(5), company: char(40)
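For the rolled-up case, the subclass of a row has to be recovered from which optional columns carry a value. The sketch below illustrates that reasoning under stated assumptions: optional columns are modelled as pointers that are NULL when empty, and the names person_row, person_kind and classify_row are hypothetical. Rows that fill columns of both subclasses are reported as ambiguous, since the schema alone cannot say what they mean.

```c
#include <assert.h>
#include <stddef.h>

/* One row of the rolled-up Person table. Optional columns are modelled
 * as pointers that are NULL when the column holds no value. */
struct person_row {
    const char *id;
    const char *name;
    const char *address;
    const char *insurance_id;   /* optional, Patient only */
    const char *insurance;      /* optional, Patient only */
    const char *company;        /* optional, Salesman only */
};

enum person_kind { KIND_PERSON, KIND_PATIENT, KIND_SALESMAN, KIND_AMBIGUOUS };

/* Decides which class of the recovered hierarchy a row belongs to. */
enum person_kind classify_row(const struct person_row *row)
{
    int is_patient, is_salesman;

    assert(row != NULL);
    is_patient = row->insurance_id != NULL || row->insurance != NULL;
    is_salesman = row->company != NULL;

    if (is_patient && is_salesman)
        return KIND_AMBIGUOUS;   /* needs a look at the actual data */
    if (is_patient)
        return KIND_PATIENT;
    if (is_salesman)
        return KIND_SALESMAN;
    return KIND_PERSON;
}
```

Running such a classification over data samples is one way to carry out the verification step: many ambiguous rows suggest that the speculated hierarchy does not fit the data.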
Speculate about Design
Problem: How do you recover design from code?
Solution: Develop hypotheses and check them
• Develop a plausible class diagram and iteratively check and refine your design against the actual code.
Variants:
• Speculate about Business Objects
• Speculate about Design Patterns
• Speculate about Architecture
Study the Exceptional Entities
Problem: How can you quickly identify design problems?
Solution: Measure software entities and study the anomalous ones
• Use simple metrics
• Visualize metrics to get an overview
• Browse the code to get insight into the anomalies
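A simple way to find the anomalous entities is to flag values that lie far above the average of a metric. The following sketch assumes one metric value per entity (for example lines of code per class) and a threshold of k standard deviations above the mean; both the threshold rule and the name flag_outliers are assumptions made for illustration, not a rule taken from the pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Marks every entity whose metric value lies more than k standard
 * deviations above the mean. flags[i] is set to 1 for an outlier and
 * to 0 otherwise; the return value is the number of outliers. The
 * comparison uses squared deviations, so no square root is needed. */
size_t flag_outliers(const double values[], size_t n, double k, int flags[])
{
    double sum = 0.0, mean, variance = 0.0;
    size_t i, count = 0;

    if (n == 0)
        return 0;
    assert(values != NULL && flags != NULL);

    for (i = 0; i < n; i++)
        sum += values[i];
    mean = sum / (double)n;

    for (i = 0; i < n; i++)
        variance += (values[i] - mean) * (values[i] - mean);
    variance /= (double)n;   /* population variance */

    for (i = 0; i < n; i++) {
        double d = values[i] - mean;
        flags[i] = (d > 0.0 && d * d > k * k * variance) ? 1 : 0;
        count += (size_t)flags[i];
    }
    return count;
}
```

For the sizes 120, 95, 110, 130, 100 and 2400 with k = 2, only the last entity is flagged. The flagged entities are candidates for browsing, not a verdict: a large class may be perfectly reasonable.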
Visualizing Metrics
Use simple metrics and layout algorithms.
[Figure: a node drawn as a rectangle whose position (x, y), width, height and colour each represent one metric]
Visualize up to 5 metrics per node
Initial Understanding (revisited)
[Figure: the same pattern map, now with ITERATION between the patterns; goal: understand ⇒ higher-level model]
• Top down:
  ☞ Speculate about Design (recover design)
• Bottom up:
  ☞ Analyze the Persistent Data (recover database)
  ☞ Study the Exceptional Entities (identify problems)
Forces — Detailed Model Capture
• Details matter
  ☞ Pay attention to the details!
• Design remains implicit
  ☞ Record design rationale when you discover it!
• Design evolves
  ☞ Important issues are reflected in changes to the code!
• Code only exposes static structure
  ☞ Study dynamic behaviour to extract detailed design
Detailed Model Capture
[Figure: pattern map. Expose the design & make sure it stays exposed]
• Keep track of your understanding: Tie Code and Questions
• Expose design: Refactor to Understand
• Expose collaborations: Step through the Execution
• Expose contracts: Look for the Contracts
  ☞ Use Your Tools
  ☞ Look for Key Methods
  ☞ Look for Constructor Calls
  ☞ Look for Template/Hook Methods
  ☞ Look for Super Calls
• Expose evolution: Learn from the Past
• Write Tests to Understand
Tie Code and Questions
Problem: How do you keep track of your understanding?
Solution: Annotate the code
• List questions, hypotheses, tasks and observations.
• Identify yourself!
• Use conventions to locate/extract annotations.
• Annotate as comments, or as methods
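A fixed convention makes annotations easy to locate and extract with a few lines of code. The sketch below assumes a hypothetical tag, @reeng, followed by the author's initials and the note; neither the tag nor the function names annotation_text and count_annotations come from the pattern itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical convention: every annotation is a comment that contains
 * this tag, followed by the author's initials and the note. */
#define ANNOTATION_TAG "@reeng"

/* Returns a pointer to the annotation text that follows the tag on this
 * line (leading blanks skipped), or NULL if the line has no annotation. */
const char *annotation_text(const char *line)
{
    const char *text;

    assert(line != NULL);
    text = strstr(line, ANNOTATION_TAG);
    if (text == NULL)
        return NULL;
    text += strlen(ANNOTATION_TAG);
    while (*text == ' ' || *text == '\t')
        text++;
    return text;
}

/* Counts the annotated lines in an array of source lines. */
size_t count_annotations(const char *const lines[], size_t n_lines)
{
    size_t i, count = 0;

    for (i = 0; i < n_lines; i++)
        if (annotation_text(lines[i]) != NULL)
            count++;
    return count;
}
```

A line such as `cwin = nw; /* @reeng AB question: who else writes cwin? */` would be picked up, and the initials AB identify who asked. Because the annotation lives in a comment, it stays next to the code it refers to.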
Refactor to Understand
Problem: How do you decipher cryptic code?
Solution: Refactor it till it makes sense
• Goal (for now) is to understand, not to reengineer
• Work with a copy of the code
• Refactoring requires an adequate test base
  ☞ If this is missing, Write Tests to Understand
• Hints:
  ☞ Rename attributes to convey roles
  ☞ Rename methods and classes to reveal intent
  ☞ Remove duplicated code
  ☞ Replace condition branches by methods
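The hint "Replace condition branches by methods" can be illustrated with the condition from the change_window example shown earlier. This is a simplified sketch, not the original system: the process table is reduced to a plain global array, and MAXPROCS, the state codes and the bodies of outrange and resume are assumptions made so that the example is self-contained. The names is_suspended_process and resume_if_suspended are introduced here.

```c
#include <assert.h>

#define MAXPROCS  8   /* assumed table size */
#define RUNNING   0   /* assumed state codes */
#define SUSPENDED 1

struct procentry {
    unsigned int procstate;   /* process state */
};

struct procentry proctbl[MAXPROCS];   /* simplified: a plain global array */

/* Returns nonzero if pn is not a valid index into proctbl. */
int outrange(unsigned int pn)
{
    return pn >= MAXPROCS;
}

/* Extracted condition: the name states the intent that the original
 * expression only implied. */
int is_suspended_process(unsigned int pn)
{
    return !outrange(pn) && proctbl[pn].procstate == SUSPENDED;
}

/* Marks a suspended process as running again. */
void resume(unsigned int pn)
{
    assert(is_suspended_process(pn));
    proctbl[pn].procstate = RUNNING;
}

/* Resumes the process if it is suspended.
 * Returns 1 if a process was resumed, 0 otherwise. */
int resume_if_suspended(unsigned int pn)
{
    if (!is_suspended_process(pn))
        return 0;
    resume(pn);
    return 1;
}
```

After the extraction the call site reads almost like the comment in the documented version of change_window, which is the point of refactoring to understand: the code itself now records what was learned.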
Step Through the Execution
Problem: How do you uncover the run-time architecture?
Solution: Execute scenarios of known use cases and step through the code with a debugger
• Difficulties
  ☞ OO source code exposes a class hierarchy, not the run-time object collaborations
  ☞ Collaborations are spread throughout the code
  ☞ Polymorphism may hide which classes are instantiated
• Focussed use of a debugger can expose collaborations
Look for the Contracts
Problem: Which contracts does a class support?
Solution: Look for common programming idioms
• Look for "key methods"
  ☞ Intention-revealing names
  ☞ Key parameter types
  ☞ Recurring parameter types represent temporary associations
• Look for constructor calls
• Look for Template/Hook methods
• Look for super calls
• Use your tools!
Learn from the Past
Problem: How did the system get the way it is?
Solution: Compare versions to discover where code was removed
• Removed functionality is a sign of design evolution
• Use or develop appropriate tools
• Look for signs of:☞ Unstable design — repeated growth and refactoring
☞ Mature design — growth, refactoring and stability
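Comparing versions can start from something as simple as the lists of function names in two releases. The sketch below reports the names that exist in the older list but no longer in the newer one; it assumes the name lists have already been extracted by some other tool, and find_removed is a hypothetical helper written for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes into removed[] the names that occur in old_names but not in
 * new_names and returns how many there are. removed[] must have room
 * for n_old entries. */
size_t find_removed(const char *const old_names[], size_t n_old,
                    const char *const new_names[], size_t n_new,
                    const char *removed[])
{
    size_t i, j, count = 0;

    assert(removed != NULL);
    for (i = 0; i < n_old; i++) {
        int still_present = 0;

        for (j = 0; j < n_new; j++) {
            if (strcmp(old_names[i], new_names[j]) == 0) {
                still_present = 1;
                break;
            }
        }
        if (!still_present)
            removed[count++] = old_names[i];
    }
    return count;
}
```

A name that disappears may have been deleted, renamed or moved, so each hit is a question to follow up in the code and with the maintainers rather than a finding in itself.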