Code Health/office-hours/meeting notes/20190709-Cyclical Depedencies

Code Health Office Hours - Cyclical Dependencies

edit

Attendees: JR, Daniel, Guillaume L., Brennen, Pita, James F., thcipriani, Lars Wirzenius, Amir Sarabadani, Dayllan, Antoine

  • Guillaume: Why are circular dependencies bad? What are you trying to solve by addressing them?
  • Daniel: First off, why are dependencies bad in general? If code depends on some other code, it's influenced by that code. It's useful to be sure that code isn't impacted by changes to other code. The more isolated, the less we have to worry. Circular dependencies mean that two components can't be understood in isolation. They pretend to be two things but are in fact one.
  • Guillaume: Why is this bad? In a way it means we have something cohesive...
  • Daniel: If I have to change one bit of code, what do I have to look at in order to change it?
  • Guillaume: Dependencies increase the cost of change?
  • Daniel: Yes. They also increase the cost of onboarding - of understanding code. They make testing harder, and make it hard to have a clear sense of ownership or responsibility.
  • Guillaume: Questions?
    • Lars: You mentioned cohesion - but dependencies are also about coupling. Coupling is bad. It's one of the leading reasons why software is bad.
    • Timo: There are various classes of circular dependencies:
      • Isolated componentes very tightly coupled - tech debt that's otherwise fairly innocent. Downsides are relatively limited.
      • More commonly at WMF, it's a 1-many relationship, where logic is distributed across many components. And to do this well, you have to understand all the different systems, which becomes harder as we grow.
    • Daniel: Roughly 60% of MediaWiki's codebase is one big tangle.
  • Guillaume: What can we do about it?
    • Daniel: Core Platform Team is developing an initiative around decoupling. Several strategies:
      • Breaking up classes.
      • Abstraction: You define an interface and bind to that interface instead of the implementation with all the stuff. Interfaces oriented around what the caller needs. Allows the calling code to only depend on the methods it needs.
      • But where to start? Probably the classes used most across the codebase. Title and User object (sp/case?)
      • Tooling to find cycles.
    • Timo: We're not working in isolation - we have 3rd-party devs, extensions, people who need to understand the code. Have to find a bridge from the old way to the new way. Not just intent, but how people feel about / understand the code. A class like Title or User can be difficult here.
      • Naming may not be technically significant, but it is socially significant.
      • Have to take into account momentum of developer community as a whole.
    • Daniel: Having a good migration path is important, especially in an environment this complex and multi-stakeholder.
      • Sometimes it's obvious what the ideal structure would be, but that's completely incompatible with what we have now. Makes it better to go for something less clean that eases the migration.
      • (Discussion of Title objects.)
  • Timo: Revision refactor. We split one class into a dozen others, and then the old class can be re-implemented as a wrapper around the others. With titles, it's harder. Widespread breakage is harder to deal with than a limited set of cases.
  • Daniel: A good place to plug in conversation about stability of extension interface... Questions / comments?
  • Antoine: if two classes are so connected, maybe we should just merge them?
  • Daniel: in MediaWiki that is rarely the case. Usually both classes have very different concepts. We're not usually talking about 2 classes - we're talking about thousands.
  • Guillaume: Domain driven design?
    • Daniel: That's a tricky topic. Annoyed by purists around domain-driven design, but I find the ideas inspiring / useful. A purist approach would split MW into a few dozen isolated contexts - this provides flexibility, but comes with massive overhead. Need to find balance.
      • An idea I find useful is that I fall into the trap of making 2 things that seem similar the same.
      • SOLID principles
  • Guillaume: High-level overview of SOLID? Agreement seems hard here...
    • Daniel: Specific situations are what's really difficult here.
  • Timo: do you see the two principles as mutually exclusive or complementary.
    • Daniel: One I would use more for interfaces and the other for classes
  • Timo: The MCR refactor contains a lot of interfaces without implementations which I found strange
    • If there's an interface with only one implementation then why have the interface? Static analysis tools
  • Amir: it's not a bad idea in general to have an interface with only one implementation, it may help guide 3rd parties creating their own implementations
  • Daniel: Related to using interfaces as extension points
  • Amir: are there any plans in the decoupling of classes to improve DI?
  • Daniel: we need to test to refactor, but in order to test we need to break dependencies -- lots of refactoring edge-cases (some of which could have been caught by better tests)
    • Daniel: improve e2e tests prior to refactoring rather than improving unit tests
  • Timo: once wmf-config is loaded many classes are available that are not safe to use, but we have a need to instatiate some of those available classes at that time -- the way we do that is not via a singleton, but there is no global state (we think), but it's risky
  • Timo: making classes too small to avoid cyclical dependencies runs the risk of creating classes that are no longer meaningful on their own
  • Guillaume: What impact will the cyclic dependency refactoring have on someone writing extensions, etc?
    • Daniel: midterm impact is that the codebase becomes easier to understand since classes can be understood in isolation (hopefully)
  • Guillaume: how can we help you move forward?
    • Daniel: Stop introducing cyclic depndencies, it's so easy to introduce cylic depndencies. Making it obvious that you are introducing would help a lot
    • Daniel: for example: value object should return other value objects by id
  • Antoine: daniel, can you give a quick overview of your tool to detect cycles? :)
    • Daniel: https://phabricator.wikimedia.org/diffusion/MTDA/ (email announcement: https://lists.wikimedia.org/pipermail/wikitech-l/2019-May/092066.html )
    • Daniel: Several metrics exists: 1- The total lines of the big giant blob of classes that all coupled together. 2- The list of "bad" dependencies using the paper for fixing messy ontologies 3- List of classes that depend on lots of classes and lots of classes depend on the given class, making it a critical point of failure in the code
    • Daniel: One annoying thing is that it doesn't care about deprecated features, so they have to stay there for two releases to make sure nothing breaks and metrics stays the same

Chat Transcript

edit

Antoine Musso: just buy a new one!!! it is more cost effective ;:]]
Brennen Bearnes: not at, what, $350 a keyboard?
Jean-Rene Branaa: https://etherpad.wikimedia.org/p/CodeHealthOfficeHours-Cyclicaldependencies
Greg Grossmeier: $250 :)
Jean-Rene Branaa: https://etherpad.wikimedia.org/p/CodeHealthOfficeHours-Cyclicaldependencies
Lars Wirzenius: o/
Timo Tijhof: o/
Jean-Rene Branaa: https://etherpad.wikimedia.org/p/CodeHealthOfficeHours-Cyclicaldependencies
Antoine Musso: if two classes are very strongly connected, one might as well jsut merge them?
Antoine Musso: (but then you get strongly connected methods instead of objects bah ..)
Timo Tijhof: @Antoine: Yeah, I made a distinction earlier between cycles of 1:1(:1...) which can be easily merged and limit their destructive impact (just treat them as one, being separate might even make it easier in some way to maintain).
Timo Tijhof: the other kind of cycle is 1:many, where the real logic of "thing" is not in class Thing, but in 100 other unrelated classes.
Timo Tijhof: This is the kind of cycle quite common in MW, and hard to break (not mergeable)
Guillaume Lederrey: merge as long as you dont break SRP, or be prepared to split again soon...
Timo Tijhof: e.g. EditPage containing logic of when the user is blocked, or ViewAction deciding when a page is deleted.
Daniel Kinzler: https://phabricator.wikimedia.org/T208776
Daniel Kinzler: a "short time" :)
Antoine Musso: +1 :)
Daniel Kinzler: https://en.wikipedia.org/wiki/Interface_segregation_principle
https://en.wikipedia.org/wiki/Dependency_inversion_principle
Timo Tijhof: o/
Brennen Bearnes: (i'm tapping out on notes for a bit)
Jean-Rene Branaa: for newcomers: https://etherpad.wikimedia.org/p/CodeHealthOfficeHours-Cyclicaldependencies
Antoine Musso: I think there is a PSR for a caching interface which could replace our BagOStuff
Antoine Musso: but I disgress
Amir Sarabadani: o/
Timo Tijhof: Yagni: You aren't gonna need it
Jean-Rene Branaa: for newcomers: https://etherpad.wikimedia.org/p/CodeHealthOfficeHours-Cyclicaldependencies
Guillaume Lederrey: o/
Antoine Musso: daniel, can you give a quick overview of your tool to detect cycles? :)
Daniel Kinzler: https://phabricator.wikimedia.org/diffusion/MTDA/
Lars Wirzenius: thanks, this was good, but I gotta go now
Daniel Kinzler: https://lists.wikimedia.org/pipermail/wikitech-l/2019-May/092066.html
Tyler Cipriani: https://dephpend.com/
Antoine Musso: great session thank you!
Brennen Bearnes: very interesting discussion, thanks folks.
Guillaume Lederrey: I love to get feedback!