Inheriting a large existing codebase is often intimidating. Many agile practices help you learn the details, including the original XP metaphor, pair programming, test-driven development, daily stand-ups, and showcases. Other valuable practices also help, including an excellent onboarding program, always-available mentors, and an easy-to-set-up environment.
We’ve been considering a number of techniques to learn as much about the system as possible. Here are just a few of the ones that spring to mind:
- Uncover all external dependencies – External dependencies and integration points often kill fast feedback loops, whether you are running local tests through an interface or just deploying an application and checking that everything it depends on is live. Each dependency adds complexity to deployment, another point of failure, and possibly another communication bottleneck with outside parties. Examples of external dependencies include specially licensed software or services, databases, file systems, other software applications, web services, REST services, and messaging queues or buses.
- Validate your own understanding of the architecture – An architecture diagram is often useful when starting to navigate a codebase. The implementation of that architecture may not be as clear as a high level diagram, so it’s important to uncover the flow of a system. What we’ve found useful is building up a flow through the system using a specific example scenario to understand the interactions of classes as they fit into the architecture.
- Read through tests – If you’re lucky, your system will have plenty of tests. Start with the higher-level ones that walk through system-level interactions, delving into the more granular ones when the behaviour is still unclear.
- Try to write some automated tests – Try to test something in the system and you’ll suddenly discover you’re pulling a string that happens to be connected to everything else. You may learn what happens to be the most used (or abused) classes and where all the dependencies start to overlap.
- Generate diagrams using analysis tools – Consider different visualisations of code to understand how all the parts of the system fit together.
- Write down questions as you go (and get them answered) – Ask lots of questions once you’ve made some attempt to build your own understanding. It will take a while to pick up the domain vocabulary, and your questions will be more useful the more context you have.
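The "most used (or abused) classes" mentioned in the testing bullet can be roughly approximated by counting how many other classes depend on each one (its fan-in). The following is a minimal sketch, not a real tool: the class names and the hand-written `DEPENDENCIES` map are hypothetical, and in practice the map would come from whatever your analysis tool exports.

```python
from collections import Counter

# Hypothetical dependency map: each class -> the classes it references.
# Assumed for illustration; real data would come from an analysis tool.
DEPENDENCIES = {
    "OrderController": {"OrderService", "Logger"},
    "OrderService": {"OrderRepository", "PaymentGateway", "Logger"},
    "InvoiceService": {"OrderRepository", "Logger"},
    "OrderRepository": {"Logger"},
}


def fan_in(dependencies):
    """Count how many distinct classes depend on each class."""
    counts = Counter()
    for source, targets in dependencies.items():
        for target in targets:
            if target != source:  # ignore self-references
                counts[target] += 1
    return counts


def most_used(dependencies, top=3):
    """Return the `top` classes by fan-in, ties broken alphabetically."""
    counts = fan_in(dependencies)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:top]


if __name__ == "__main__":
    for name, count in most_used(DEPENDENCIES):
        print(f"{name}: used by {count} classes")
```

Classes near the top of this list are likely the ones you will keep bumping into when you try to put anything under test, so they are good candidates for early questions.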
Leave a comment if you have other strategies that you have found particularly useful. We’d certainly appreciate it right now.
One thing that’s worked pretty well for me: install a dependency analysis tool of some sort (e.g. JDepend) and strangle the dependencies as much as you can. Look for loops, chains running longer than expected, or any unusual patterns, and dig into those.
Most of the time, you’ll uncover some missing analogy somewhere, or come up with tons of relevant questions (why does this service need access to that repository? why is this being done in code, instead of the database / vice-versa? etc.)
If you manage to hit a consistent dependency pattern (which should look like a neat little staircase between packages or classes vs themselves), go find the story behind it. In my experience, it never happens by accident or “just doing good OO”. In this case, the team had some amazing new ways to encourage this sort of stuff to happen naturally, and there were loads and loads of good lessons to be learned.
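To make "loops" and "chains running longer than expected" concrete, here is a small sketch of the two checks over a package dependency graph. It is not how JDepend works internally; the package names and the `PACKAGES` map are hypothetical, and the recursive depth-first search is only meant for small graphs.

```python
# Hypothetical package graph: package -> packages it depends on.
PACKAGES = {
    "web": {"service"},
    "service": {"repository", "billing"},
    "billing": {"service"},  # deliberate loop: service <-> billing
    "repository": {"model"},
    "model": set(),
}


def find_cycle(graph):
    """Return one dependency cycle as a list of nodes, or None if none exists."""
    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if nxt in in_progress:
                # Reached a node already on the current path: a loop.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in sorted(graph):
        if node not in done:
            found = visit(node)
            if found:
                return found
    return None


def longest_chain(graph):
    """Return the longest dependency chain; assumes the graph has no cycles."""
    best = {}

    def chain_from(node):
        if node not in best:
            tails = [chain_from(n) for n in sorted(graph.get(node, ()))]
            best[node] = [node] + max(tails, key=len, default=[])
        return best[node]

    return max((chain_from(n) for n in sorted(graph)), key=len, default=[])


if __name__ == "__main__":
    print("cycle:", find_cycle(PACKAGES))
    # Break the loop (billing no longer depends on service), then measure chains.
    acyclic = {**PACKAGES, "billing": set()}
    print("longest chain:", " -> ".join(longest_chain(acyclic)))
```

Either result is mainly a prompt for questions: why does one package reach back into another, and does a long chain reflect real layering or accidental coupling?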