Hacker News Clone

Happiness is a freshly organized codebase

by felixrieseberg on 5/7/2020, 7:07:19 PM with 90 comments

by ryanianian on 5/7/2020, 8:40:35 PM
The concept of linting repo structure is an excellent one.
I cannot even get started in projects without a sane repo layout. Source files scattered everywhere unrelated to each other, utils junk-drawers with dozens of files, multiple top-level source directories without explicit rationale, tests and source totally disjoint, hacks to modify build and run paths, implicit dependencies between directories, awful convoluted build-system configuration to match all of these idiosyncrasies, and impossible or very difficult editor/IDE integration as a result. Even Rails-esque apps (which come with a reasonable structure) get really messy really quickly if somebody doesn't have the diligence to stay on top of it.
I want there to be a {sane, polyglot, large-ish-scale} project layout convention that plays well with {intuition/discoverability, build-systems, editors, project size}, but it seems there's an inherent tension between these. Maven tried. Kinda. I wonder if there's a fundamental problem we aren't solving somewhere.
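To make the "linting repo structure" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the rule table, the banned directory names, and the `lint_layout` function are hypothetical and not taken from the article or from any existing tool.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical rules: files matching a glob may only live under these top-level dirs.
# Order matters: the first matching pattern wins.
RULES = {
    "*_test.py": ("tests",),
    "*.py": ("src", "tests"),
}

# Hypothetical list of junk-drawer directory names to reject anywhere in a path.
BANNED_DIR_NAMES = {"utils", "helpers", "misc"}


def lint_layout(paths):
    """Return human-readable violations for the given repo-relative paths."""
    violations = []
    for raw in paths:
        path = PurePosixPath(raw)
        # Flag junk-drawer directories at any depth.
        for part in path.parts[:-1]:
            if part.lower() in BANNED_DIR_NAMES:
                violations.append(f"{raw}: junk-drawer directory '{part}'")
        # Check the top-level directory against the first matching rule.
        for pattern, allowed_roots in RULES.items():
            if fnmatchcase(path.name, pattern):
                if path.parts[0] not in allowed_roots:
                    allowed = ", ".join(allowed_roots)
                    violations.append(f"{raw}: expected under one of: {allowed}")
                break
    return violations


if __name__ == "__main__":
    for problem in lint_layout(["src/utils/strings.py", "src/app/main_test.py"]):
        print(problem)
```

A check like this can run anywhere a list of file paths is available, whether in CI or locally before committing.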
by bob1029 on 5/7/2020, 9:55:56 PM
I feel like a lot of the pain in codebase organization boils down to having a technical project structure (i.e. layout on disk) that does not align well with the business. Obviously, it's impossible to force something to directly align with such disparate and abstract requirements, so you have to create abstractions (layers) to enable a hypothetically-pure realm.
In our architecture, we've created roughly 2 different kinds of abstraction: Platform and Business.
The Platform abstractions are only intended to support themselves and the Business abstractions. These are not supposed to be easily-accessible by non-technical folks. Their entire purpose is to make the Business abstractions as elegant as possible. The idea is these should be slowly-changing and stable across all use cases. Developers are only expected to visit this realm maybe once every other week. We effectively treat ourselves as our own customer with this layer.
The Business abstractions are built using only first-party primitives and our Platform layer. Most of this is very clean because we explicitly force all non-business concerns out into the Platform layer. Things like validation logic are written using purely functional code in this realm, which makes it very easy to reason about. Developers are expected to live in this area most of the time. Organization of business abstractions typically falls along lines of usage or logical business activity. We explicitly duplicate some code in order to maintain very clean separation of differing business activities. I've watched first hand as the subtle variances of a single combined Customer model across hundreds of contexts eventually turned it into a boat anchor.
As a consequence of all this, our project managers and other non-code-experts are actually able to review large parts of the codebase and derive meaningful insights without a developer babysitting them. We are at a point where project managers can open PRs for adjusting basic validation and parameter items in our customers' biz code. Developers also enjoy not having to do massive altitude shifts on abstractions while reading through 1 source file. You either spend a day working platform concerns, or you are in the business layer. Context switching sucks and we built to avoid it as much as possible. IMO, having a bad project structure is one direct side-effect of being forced to context switch all the time.
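As a rough illustration of what "validation logic written using purely functional code" could look like, here is a Python sketch. The `OrderRequest` type, the limits, and `validate_order` are all made up for the example; the comment does not show the actual code or say which language it is written in.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OrderRequest:
    """Hypothetical business input; immutable so validation cannot mutate it."""
    quantity: int
    discount_percent: float


# Hypothetical business parameters, the kind of values a non-developer
# could plausibly adjust in a PR.
MAX_QUANTITY = 500
MAX_DISCOUNT_PERCENT = 15.0


def validate_order(order: OrderRequest) -> List[str]:
    """Pure function: same input always yields the same list of errors."""
    errors = []
    if order.quantity <= 0:
        errors.append("quantity must be positive")
    if order.quantity > MAX_QUANTITY:
        errors.append(f"quantity must not exceed {MAX_QUANTITY}")
    if not 0.0 <= order.discount_percent <= MAX_DISCOUNT_PERCENT:
        errors.append(
            f"discount must be between 0 and {MAX_DISCOUNT_PERCENT} percent"
        )
    return errors
```

No I/O, logging, or persistence appears here; in the layering described above those concerns would sit in the Platform layer.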
by rudi-c on 5/7/2020, 8:40:33 PM
I've encountered similar difficulties around codebases with a lack of file hierarchy structure. But one major difficulty in fixing the issue is that moving a lot of files around tends to trash `git blame`, which is often more valuable than knowing what folder to put a new file in. Is that something you've encountered?
by kccqzy on 5/7/2020, 8:35:20 PM
I sometimes wish Git properly recorded copy and move information. It turns out Git's heuristics for detecting copies and moves work about 95% of the time, and for the remaining 5% it's mildly annoying to read a git blame where every line is attributed to the move. You can blame further across that move, but that's manual. If copy and move information were perfectly recorded, I'd have more incentive to do these kinds of code reorganization.
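For reference, the heuristics mentioned above can be turned up with existing `git blame` flags. Below is a small Python sketch that only builds the command line; the helper names `blame_command` and `run_blame` are hypothetical, and the optional ignore file assumes a Git version that supports `--ignore-revs-file`.

```python
import subprocess


def blame_command(path, ignore_revs_file=None):
    """Build a `git blame` argv that tries hard to look through moves and copies."""
    # -M: detect lines moved or copied within the same file.
    # -C (given twice): also look for lines copied from other files, including
    #   files in the commit that created this file.
    cmd = ["git", "blame", "-M", "-C", "-C"]
    if ignore_revs_file:
        # Skip the commits listed in this file (e.g. a big "move everything" commit).
        cmd += ["--ignore-revs-file", ignore_revs_file]
    return cmd + ["--", path]


def run_blame(path, ignore_revs_file=None):
    """Run the command in the current repository and return its text output."""
    result = subprocess.run(
        blame_command(path, ignore_revs_file),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

This does not make the detection perfect; it only automates part of the "blame further across the move" step.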
by dangwu on 5/7/2020, 7:22:46 PM
Interesting read!
> No more dumping ground folders like “Helper” or “Utility”
If you’re organizing by feature and you have one of these helper/utility classes that support multiple features, where does it live? Would you consider each utility to be its own “feature”?
by nickjj on 5/8/2020, 12:20:26 AM
For new projects (especially as a solo developer), there's also a related topic of not taking advantage of tools like kanban boards, or having a place outside of your source code to organize your thoughts and research.
It's very possible to wind up with massive comment dumps of things to research, alternate implementations, notes to yourself and other things littered in your code base where you haven't made any git commits yet.
This really leaves things in a messy state where you feel like the project is never going to be finished.
An example of this and how I solved this problem with a kanban board can be found here: https://youtu.be/HHOkcCqsipE?t=76
by rukittenme on 5/7/2020, 8:55:54 PM
I've been wondering a lot recently if folders make projects better or worse.
For example, when I write a library it's usually very simple. There's a single directory which contains all of the source files. When people use the library they:
"""
import lib
lib.run()
"""
Dead simple, no complex module paths to remember, no hierarchical folder structure forcing you to code based on a pattern rather than functionality. Pure bliss.
But on the other hand, I have projects that contain 100k lines of source code. I can't just leave it out in the cold. So poor baby gets a couple of folders.
But I do hate it. I hate writing the code. I hate reading it. I hate finding it 6 months after the fact.
That's probably just the nature of the job. It is work at the end of the day. Maybe it's just doomed to be hard.
by ldd on 5/7/2020, 10:15:33 PM
There is this wonderful utility: dependency-cruiser[0] for javascript / typescript projects.
It visualizes dependencies in a project. I found it so so easy to refactor and move files around after I started using it. I am not usually a visually-oriented person, but for this use case, and to be happy, `dependency-cruiser` surely helps.
[0]: https://github.com/sverweij/dependency-cruiser
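To show the underlying idea of extracting a dependency graph from source files, here is a rough Python analog. It is not dependency-cruiser and does not use its API: it parses Python imports with the standard `ast` module, and the function names and sample module names are made up for the example.

```python
import ast


def imported_modules(source):
    """Return the sorted top-level module names imported by some Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute `from x import y`; relative imports are ignored in this sketch.
            found.add(node.module.split(".")[0])
    return sorted(found)


def dependency_graph(sources):
    """Map each module name to the other modules in `sources` that it imports."""
    return {
        name: [dep for dep in imported_modules(text) if dep in sources and dep != name]
        for name, text in sources.items()
    }


if __name__ == "__main__":
    # Hypothetical three-module project, given as name -> source text.
    project = {
        "app": "import billing\nfrom users import find\nimport os\n",
        "billing": "import users\n",
        "users": "",
    }
    for module, deps in dependency_graph(project).items():
        print(module, "->", ", ".join(deps) or "(nothing)")
```

With a graph like this in hand, it becomes easier to see which files can move together before a reorganization.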
by battery_cowboy on 5/8/2020, 3:33:39 AM
What if we just got rid of files and put our code into a database?
You'd just have a "new code block" button, which creates the editor tab for your code, usually a function or a class, and usually one item per block. When you save it, it puts it into a database and you can version things easily. You can call other functions from the block and your editor will show their code when you mouseover or maybe some other method, just like today. Basically the same as today, but you don't need to worry about where some code lives. You back it with a great search feature to find stuff.
Hell, let's just eliminate pathed files, why do we care about file paths with the level of search today? Just store everything in a key:value store directly on the hard drive, no paths needed. For legacy, just add keys for '/etc/fstab' or whatever.
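A toy version of this idea, sketched in Python with the standard `sqlite3` module. The `CodeStore` class, its table layout, and the block names are assumptions made for the example, not an existing tool: each save creates a new version, and search replaces file paths.

```python
import sqlite3


class CodeStore:
    """Stores named code blocks with simple integer versioning."""

    def __init__(self, db_path=":memory:"):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS blocks ("
            " name TEXT NOT NULL, version INTEGER NOT NULL, body TEXT NOT NULL,"
            " PRIMARY KEY (name, version))"
        )

    def save(self, name, body):
        """Store a new version of a block and return its version number."""
        row = self.db.execute(
            "SELECT COALESCE(MAX(version), 0) FROM blocks WHERE name = ?", (name,)
        ).fetchone()
        version = row[0] + 1
        self.db.execute("INSERT INTO blocks VALUES (?, ?, ?)", (name, version, body))
        self.db.commit()
        return version

    def load(self, name, version=None):
        """Return the body of the given version, or the latest if version is None."""
        if version is None:
            query = "SELECT body FROM blocks WHERE name = ? ORDER BY version DESC LIMIT 1"
            row = self.db.execute(query, (name,)).fetchone()
        else:
            query = "SELECT body FROM blocks WHERE name = ? AND version = ?"
            row = self.db.execute(query, (name, version)).fetchone()
        return row[0] if row else None

    def search(self, text):
        """Return names of blocks whose latest version contains `text`."""
        rows = self.db.execute(
            "SELECT b.name FROM blocks b"
            " WHERE b.version = (SELECT MAX(version) FROM blocks WHERE name = b.name)"
            " AND instr(b.body, ?) > 0 ORDER BY b.name",
            (text,),
        ).fetchall()
        return [row[0] for row in rows]
```

Usage would be along the lines of `store = CodeStore()`, then `store.save("greet", "def greet(): ...")`, then `store.search("greet")` to find the block without knowing where it "lives".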
by peter_d_sherman on 5/10/2020, 4:59:01 AM
>"The Slack iOS team lived in these conditions for a few too many years. We got here as a result of some attempts to organize source files (several times), a lack of architecture pattern in the codebase, and a high growth of developers over a couple years. To put things into context, we have roughly 13,000 files (and counting), about 27 top level directories, a mix of files in Objective-C and Swift, and around 40 iOS developers that work in one monorepo."
An extensive, unrefactored codebase is no different than a jungle.
You might have a 10x'er programmer on your staff, and he might hold the programming equivalent of a machete, but if the rate of his refactoring (assuming your corporate rules let him) is slower than the rate of new code being added by other employees, he is going to fail, no matter how good he is!
I need to write a future essay about 10x'ers and how a 10x'er is only as good as a combination of factors: how well the codebase is refactored, how well he knows the codebase, how much corporate rules/policies permit refactoring (or not), how little time he has to waste solving stupid one-time issues from single customers, and how much help or pushback he is getting from the rest of the team.
In other words, given the right set of conditions between codebase size and obfuscation, limiting corporate policies (i.e., "you can't refactor", "you can't make just one mistake in your code, because it's all mission critical, and if you do, you will be fired, and by the way, there's no test environment!"), distraction ("you have to help this customer with his cosmetic problem before you are permitted to tackle the guts of the system"), and pushback from the rest of the team, you can actually change 10x'ers (and higher!) to 1x'ers and below...
The reverse is true too...
I'll make anyone a "My Fair Lady" / Eliza Doolittle style bet (or the reverse!) that what I say is true!
That is, that 1x'ers can be taught to become 10x'ers, and conversely, 10x'ers can be hampered by a variety of factors ("The Perfect Storm") resulting in them being slowed to 1x'ers, or below...
by stefan_ on 5/8/2020, 1:24:32 AM
> Danger is a tool we integrated into our Continuous Integration system that performs post-commit automated checks
You mean like the post-commit hook that Git offers out of the box? It's even named the exact same! I feel like we don't spend all this time on fast build, test, and deploy cycles only to then commit, navigate to some website to create a "merge request", wait for some fairy to allocate computing resources for trivial checks my computer could have done, only to get some pure noise "don't put this file here" comment and repeat the cycle all over again.
by soedirgo on 5/8/2020, 12:02:06 AM
In Go, there's a standard project layout [1]. It'd be nice to have a project layout linter in Go Report Card [2].
[1]: https://github.com/golang-standards/project-layout
[2]: https://goreportcard.com
by momokoko on 5/8/2020, 1:50:03 AM
Needing organized codebases is a personality type, as there is zero academic research I’m aware of that has ever shown strict organization results in fewer bugs or faster development.
These are all done by people that need to feel in control and that everything has a place. So much money and time has been wasted on stuff like this with zero proven or measured benefit.