I've read the C++ FAQs (which are not FAQs at all but questions that
should be asked frequently) many times. A great book. Still, there
are some points that strike me as plain weird. For example, there is an
item approximately like this:
- Are small projects good to get your feet wet?
- No, often small projects give you wrong intuitions about design. It's hard to say which projects are small and which are big, but a project with less than 10KLOC (LinesOfCode) is almost certainly small and a project with more than 200KLOC is almost certainly big.
Okay, now. I've programmed for five years in a procedural fashion, five
years in OO, one year in functional and three years in a hybrid style.
I've grokked
BehaviorOrientedProgramming,
IncrementalDevelopment,
ClassInvariants,
UnitTests, etc ad nauseum, but I still
claim that
writing a single application of more than two hundred thousand lines of code is madness. Sure, it can be done. I've seen
procedural programs of >200KLOC. With OOP, you can relatively
easily go as high as 2MLOC, but you shouldn't. Here is why.
There are many ways to abstract big projects that have many cooperating
parts. They use different mechanisms for communication between the
components. There are API's (
ApplicationProgrammingInterface), ABI's
(
ApplicationBinaryInterface), streams, API's to streams (like Xlib,
CORBA, XDR-RPC, XML-RPC and so on), internal command languages and
procedure databases (
AlternateHardAndSoftLayers), to name those that
come to my mind first. OOP falls in the first class of abstraction, it
helps to build API's. (You might claim that it also gives ABI's, but
I've learned not to trust this.)
Roughly, the former two (API's and ABI's) are used more in the commercial
world, while the latter three (streams and layered languages) are used
more in the
UnixLike world, and especially, in the
FreeSoftwareMovement
world. I'm strongly of the opinion that the
UnixLike way is better, and
I'll try to assess the forces that cause this division:
- In the commercial world, solutions tend to be total (not modular), because they are sold in one package and they are not expected to be mixed with other solutions. Interoperability is not only irrelevant, it is often also a threat.
- API's are a great way to ensure totalness, because they keep most communication within the application while forgetting the metadata of what the communication is about. ABI's, on the other hand, help to write API-level modules without disclosing the source code.
- Character streams are the traditional UnixLike way of abstraction. Even many driver stacks in various kernels are modeled as stream filter pipelines. But in addition to being traditional, they are LanguageAgnostic, UnitTestable, DistributionAgnostic, and they provide StrictEncapsulation.
- In the OpenSource world, products are often not made with a specific need in mind (whereas commercial products are designed to be useful / sound useful to the client), but to solve a specific problem. The difference between these two is subtle and often results in commercial products aiming for adaptability, OpenSource products aiming for wide applicability. (This points to another weird Q&A in C++ FAQs, see below.)
- In bigger programs, the Unix tradition mandates exposing the internals. Why? Because otherwise the program cannot be used in a modular fashion. Hence AlternateHardAndSoftLayers. This also enables people to follow the Unix philosophy of using the most appropriate tool for each task (low level languages for low level programming, high level languages for high level programming).
The FAQ mentioned above was something like:
- Why is it important to handle change?
- Changing requirements. Customers change their requirements, and software that cannot adapt to changing requirements is software that does not get buyed.
But in the
OpenSource world, you don't have customers. Programmers
state the requirements of the program. When they don't meet the
requirements of end users, the end users should pick another program.
This also makes it important that programmers not change the program's
requirements, because then there will be end users for whom the program
no longer provides a solution.
Of course, all these methods of abstraction have their uses. For
example, API's are great as parts of development environment, such as
standard libraries. Besides, the methods can be used to wrap each
other. But often, API's are left as the only interface to a commercial
product, and this shows in the design so that stream-wrapping or
language-wrapping the API would be clumsy.
All in all, the all-object, one-language approach seems to result in
solutions that are monolithic, bloated, often also slow (because of the
inability / lack of will to program the parts with domain-specific
languages), tightly dependent of one application architecture / language
platform, mystic and difficult to grasp. Besides, one often ends
writing everything from the ground up in the same style because the
environment is the same. However in my experience it's best to choose
the programming style depending of the size of the subsystem.
Comments?
--
PanuKalliokoski
As a
RelationalWeenie, my solution to "big programs" is often to let the database be the
BigRiverOfCommunication between many smaller "tasks" or "events". True, this may perhaps bother
StaticTyping fans, but it works well in my observation. Your
GrokScope is reduced to the current task at hand, shared libraries, and the (good) schema. It allows contractors to walk in off the street and start adding to the project in a few hours. See:
ProceduralMethodologies.
Relational databases are a great way of gluing things together. Sometimes, however, you just can't afford
DatabaseVendorLock. And I can tell writing a multi-lingual framework in a database-independent way is
not nice.
However, relational databases often
are better than domain-specific server services. You can always go there with your SQL monitor and see what's happening. The database also stays comprehensible as long as you understand the RDBMS, not only as long as you still understand your own application.