Software Refactoring and Its Impact on Code
Refactoring
What if your design is wrong?
Refactoring is the process of changing a software system:
- In such a way that it does not alter the externally-visible behavior of the code, yet improves its internal structure.
- But many changes do alter externally-visible behavior, so every change must be checked with care.
Standard Refactoring
- There are some refactoring operations that can be applied in many places. Whole catalogues are devoted to these.
Rename
- Names matter; fix them when they fail to express purpose clearly.
- Can be applied to: methods, fields, local variables, types, packages.
- References to the named element can also be updated (and usually should be).
- What if the name is already being used?
- Obvious problem sometimes. Can’t have two methods with identical signatures.
- Possible, but poor practice in other situations:
- Hiding a local variable declared in an outer scope (legal in some languages, but a compilation error in Java).
- Hiding a field of the same name from a superclass.
- What about polymorphism?
- Need to worry about super- and subclasses.
- What if it forms part of external API?
- Mark old name as “deprecated”; yuck!
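A minimal sketch of the deprecation approach, assuming a hypothetical Account class whose method calc was renamed to computeInterest. All names here are invented for illustration. The old name survives only as a forwarding stub so that external callers keep compiling.

```java
// Hypothetical class: calc() has been renamed to computeInterest().
class Account {
    private final double balance;
    private final double rate;

    Account(double balance, double rate) {
        this.balance = balance;
        this.rate = rate;
    }

    // New, more expressive name.
    double computeInterest() {
        return balance * rate;
    }

    /** @deprecated use {@link #computeInterest()} instead. */
    @Deprecated
    double calc() {
        // Old name kept for external callers; it only forwards to the new one.
        return computeInterest();
    }
}
```

Callers of calc() now get a compiler warning rather than an error, which gives them time to migrate before the old name is removed.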
Pull Up
- You realize that each subclass of a given class implements the same method:
- Eliminate the redundant code by moving these methods to the superclass.
- Need to ensure that the methods are identical:
- If not, need to modify each to pull out the common functionality: not meaning-preserving.
- Sometimes, you might not want that method in the superclass:
- Insert an additional abstract class between the superclass and the subclasses.
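The sketch below illustrates Pull Up together with the intermediate abstract class, assuming a hypothetical hierarchy (Figure, AreaFigure, Disc, Box, Dot) invented for this example. The describe method is assumed to have been duplicated verbatim in Disc and Box before the refactoring.

```java
// Hypothetical hierarchy; names are illustrative only.
abstract class Figure {
    abstract String name();
}

// Intermediate abstract class: holds the pulled-up method
// without forcing it on every Figure.
abstract class AreaFigure extends Figure {
    abstract double area();

    // Pulled up: formerly an identical copy in both Disc and Box.
    String describe() {
        return name() + " with area " + area();
    }
}

class Disc extends AreaFigure {
    private final double radius;
    Disc(double radius) { this.radius = radius; }
    @Override String name() { return "Disc"; }
    @Override double area() { return Math.PI * radius * radius; }
}

class Box extends AreaFigure {
    private final double side;
    Box(double side) { this.side = side; }
    @Override String name() { return "Box"; }
    @Override double area() { return side * side; }
}

// A Figure with no area, so it does not inherit describe().
class Dot extends Figure {
    @Override String name() { return "Dot"; }
}
```

Pulling describe all the way up into Figure would have given Dot a method that makes no sense for it. The intermediate class avoids that.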
Inline
- Sometimes the implementation of a method is as obvious as its name.
- Replace calls to the method with the implementation itself.
- Beware: simple methods can be helpful for evolvability!
- What if the implementation accesses private fields or methods?
- What if you end up with “magic numbers” replicated all over the code?
- Consider replacing the magic number with a static final field reference.
- This refactoring can be abused badly, leading to bloated source code and replicated errors.
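As a rough illustration of Inline, assume a hypothetical Visitor class with a trivial helper isAdult. The "before" and "after" call sites are shown side by side as two methods so the block compiles as one unit. The constant ADULT_AGE is likewise an invented name.

```java
// Hypothetical class; names are illustrative only.
class Visitor {
    // Named constant in place of the magic number 18 that inlining
    // would otherwise replicate at every former call site.
    static final int ADULT_AGE = 18;

    private final int age;

    Visitor(int age) { this.age = age; }

    // A trivial helper whose body is as obvious as its name.
    boolean isAdult() {
        return age >= ADULT_AGE;
    }

    // Before inlining: the call site delegates to the helper.
    boolean mayEnterBefore() {
        return isAdult();
    }

    // After inlining: the call is replaced by the helper's body.
    boolean mayEnterAfter() {
        return age >= ADULT_AGE;
    }
}
```

If the age rule later changes, the "before" version needs one edit in isAdult, while the "after" version needs an edit at every inlined site. That is the evolvability cost noted above.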
Add Parameter
- Decide that a method implementation needs an extra piece of information: Pass it as an argument.
- Where does that argument come from? Sometimes a default value can be used. Sometimes, the extra argument needs to come from the caller of the caller of the caller…
- What about super- and subclasses?
- Big concern is that long parameter lists are dangerous:
- The more parameters, the less likely that developers will understand all the options.
- Demands a lot of excess baggage in situations where flexibility is not needed.
- Can lead to widespread, non-trivial modification.
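One way to limit the ripple effect of Add Parameter is to keep the old signature as an overload that supplies a default value. The sketch below assumes a hypothetical Greeter class. The method names and the default "Hello" are invented for illustration.

```java
// Hypothetical class; names and default value are illustrative only.
class Greeter {
    // Original signature, kept so existing callers are unaffected.
    // It supplies a default for the new parameter.
    String greet(String name) {
        return greet(name, "Hello");
    }

    // New signature: the extra piece of information arrives as an argument.
    String greet(String name, String salutation) {
        return salutation + ", " + name + "!";
    }
}
```

This only works when a sensible default exists. Otherwise every caller, and possibly its callers in turn, must be changed to provide the value.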
Tool Support
- Software tools can automate the tedious parts of many refactorings.
- Some transformations would not alter the behavior but are beyond the ability of a tool to perform correctly.
- Tools generally assume that you control all the code, so watch out!
Lexing
- Starts from a stream of characters.
- Identifies keywords, numbers, whitespace, comments, etc.
- Groups important characters into tokens.
- (Usually) discards whitespace & comments.
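The lexing steps above can be sketched as a small hand-written lexer. This is a simplified illustration, not how any particular compiler does it. The token kinds, the keyword list, and the class name TinyLexer are assumptions made for the example, and only // line comments are recognized.

```java
import java.util.ArrayList;
import java.util.List;

// A toy lexer; token kinds and keyword set are a hypothetical minimal choice.
class TinyLexer {
    enum Kind { KEYWORD, IDENTIFIER, NUMBER, SYMBOL }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
        @Override public String toString() { return kind + "(" + text + ")"; }
    }

    private static final List<String> KEYWORDS = List.of("int", "return", "if", "else");

    static List<Token> lex(String src) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace is discarded
            } else if (src.startsWith("//", i)) {
                // line comment: skip to end of line, emit nothing
                while (i < src.length() && src.charAt(i) != '\n') i++;
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < src.length() && Character.isDigit(src.charAt(i))) i++;
                tokens.add(new Token(Kind.NUMBER, src.substring(start, i)));
            } else if (Character.isJavaIdentifierStart(c)) {
                int start = i;
                while (i < src.length() && Character.isJavaIdentifierPart(src.charAt(i))) i++;
                String word = src.substring(start, i);
                Kind kind = KEYWORDS.contains(word) ? Kind.KEYWORD : Kind.IDENTIFIER;
                tokens.add(new Token(kind, word));
            } else {
                // anything else becomes a single-character symbol token
                tokens.add(new Token(Kind.SYMBOL, String.valueOf(c)));
                i++;
            }
        }
        return tokens;
    }
}
```

For the input "int x = 42; // answer" this yields five tokens: a keyword, an identifier, a symbol, a number, and a symbol. The comment and the spaces leave no trace.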
Parsing
- Identifies specific patterns in token sequences.
- Groups tokens into syntactically meaningful constructs.
- Output is usually organized as a tree: an Abstract Syntax Tree (AST).
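To make the parsing step concrete, here is a minimal recursive-descent sketch that groups a token sequence into an AST. It assumes a deliberately tiny grammar (integers, + and *) and takes tokens as plain strings. The class TinyParser and its node types are hypothetical.

```java
import java.util.List;

// A toy parser; the grammar and node types are assumptions for illustration.
class TinyParser {
    interface Node {}

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class BinOp implements Node {
        final String op;
        final Node left;
        final Node right;
        BinOp(String op, Node left, Node right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
        // Prefix form makes the tree shape visible.
        @Override public String toString() {
            return "(" + op + " " + left + " " + right + ")";
        }
    }

    private final List<String> tokens;
    private int pos = 0;

    TinyParser(List<String> tokens) { this.tokens = tokens; }

    // expr := term ("+" term)*
    Node parseExpr() {
        Node left = parseTerm();
        while (pos < tokens.size() && tokens.get(pos).equals("+")) {
            pos++;
            left = new BinOp("+", left, parseTerm());
        }
        return left;
    }

    // term := NUMBER ("*" NUMBER)*
    private Node parseTerm() {
        Node left = parseNumber();
        while (pos < tokens.size() && tokens.get(pos).equals("*")) {
            pos++;
            left = new BinOp("*", left, parseNumber());
        }
        return left;
    }

    private Node parseNumber() {
        if (pos >= tokens.size()) {
            throw new IllegalStateException("unexpected end of input");
        }
        return new Num(Integer.parseInt(tokens.get(pos++)));
    }
}
```

Parsing the tokens 1 + 2 * 3 produces a tree whose printed form is (+ 1 (* 2 3)). The multiplication sits deeper in the tree because the grammar gives it higher precedence.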
Semantic Information
Compiler Overview
Source code characters -> Lexer -> Tokens -> Parser -> Concrete Syntax Tree -> Basic Optimization -> Abstract Syntax Tree -> Semantic Analysis Steps (looks up the symbol table in this step) -> Binary (object code or bytecode)
Refactoring Tools
- Developer indicates element to change, and kind of change.
- Tool constructs ASTs for the program.
- Tool locates the node that represents the element to change.
- Tool determines other elements that reference it.
- Tool determines where/how the element would move/change.
- Tool determines whether these consequences would violate any constraints for that refactoring kind.
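The workflow above can be sketched on a toy model. Real tools operate on full ASTs with type and scope information. Here a single scope is reduced to a set of declared names plus a list of references, and ToyScope and its members are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// A toy stand-in for an AST: the names declared in one scope,
// plus the references (uses) of those names.
class ToyScope {
    final Set<String> declarations = new LinkedHashSet<>();
    final List<String> references = new ArrayList<>();

    // Returns true if the rename was applied, false if a constraint was violated.
    boolean rename(String oldName, String newName) {
        // Locate the element to change.
        if (!declarations.contains(oldName)) {
            return false;
        }
        // Constraint check: the new name must not collide with an existing declaration.
        if (declarations.contains(newName)) {
            return false;
        }
        // Apply the change to the declaration ...
        declarations.remove(oldName);
        declarations.add(newName);
        // ... and to every element that references it.
        references.replaceAll(ref -> ref.equals(oldName) ? newName : ref);
        return true;
    }
}
```

Note that the constraint is checked before anything is modified, so a rejected rename leaves the model untouched.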
Pitfalls
- In constructing a refactoring tool, it is not always possible to account for the full set of constraints that human developers would need to worry about.
- “If it’s automated, it must work right!”
- Maybe the tool is not implemented correctly.
- Maybe the tool fails to consider all the consequences that should concern you.
- Are the resulting IS-A and HAS-A properties meaningful?
- Understand what you are trying to do.
- Inspect the results; run tests!
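One hypothetical example of a consequence a tool may miss: a method name stored in a string and resolved by reflection. The sketch assumes an automated rename changed render to renderHtml in an invented Report class. A tool that follows only AST references would not see the string "render" used elsewhere.

```java
import java.lang.reflect.Method;

// Hypothetical class: suppose a tool renamed render() to renderHtml().
class Report {
    public String renderHtml() {
        return "<p>ok</p>";
    }
}

class ReportLookup {
    // The method name lives in a string, so it is not an AST reference
    // to the declaration and may be left unchanged by a rename tool.
    static boolean methodExists(String methodName) {
        try {
            Method m = Report.class.getMethod(methodName);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false; // the stale name no longer resolves
        }
    }
}
```

Here methodExists("render") now returns false even though the code compiles cleanly. Only a test that exercises the lookup would reveal the problem.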