Introduction and Foundations

1 Mainstream Software Projects Perform Poorly

The Standish Group has been publishing “Chaos Reports” for over 25 years. Results reported in the 2013 edition are as follows:
• Only 39% of software projects are successful. The Standish Group defines success as finishing within 10% of agreed schedule, budget, and scope.
• 43% of software projects are challenged. A challenged project deviates by more than 10% in at least one of schedule, budget, or scope but still delivers software of value.
• 18% of software projects fail. A failed project is cancelled before delivering software of any value.

Of the software projects that do deliver (the successful and challenged projects combined), the Chaos report shows that the average project is:
• 42% late. A planned 10-month project is more reasonably expected to take about 14 months.
• 35% over budget. A planned $10 million project is more reasonably expected to cost about $13.5 million.
• 25% under scope. In spite of overrunning both schedule and budget, software projects do not deliver all agreed-on functionality. A project that expects to satisfy 100 requirements should be more reasonably expected to satisfy only 75.

Examination of mainstream software projects reveals common reasons why they get into trouble. The three most significant reasons, in decreasing order of importance, are:
• Vague, ambiguous, incomplete requirements.
• Overdependence on testing.
• “Self-documenting code” is a myth.

Vague, ambiguous, incomplete requirements

The root cause is overdependence on natural languages to specify requirements.
• Ambiguity is built into natural languages. The same word often has different meanings, and different words can have the same meaning.
• Natural languages are verbose. It takes many words to precisely communicate even relatively simple ideas. People are reluctant to say things in a precise way because of the effort.
• Most requirements aren’t changing; they are only being clarified. Natural languages make it difficult to communicate requirements at the level of precision needed.

Overdependence on testing
• 56% of defects are faulty requirements and 27% are faulty design, so 83% of software defects exist before the corresponding code is ever written!
• Making matters worse, a typical software test team is only 60-70% effective at finding defects.
• Do we need to be better at testing? The truth is that you need to be better at avoiding defects, and you need to be better at finding and fixing requirements and design defects well before writing code.

“Self-documenting code” is a myth
• Code describes a solution, not the problem, and it’s impossible to distinguish has-to-be from happens-to-be.
• Code will only allow you to figure out what it does, which is not necessarily what it is intended to do.
• Code does not answer the question, “Why does this code look the way it does?”
• Most critically, you need an answer to: “What will happen if I change it?” If some change is made to this code, can it, or will it, break something?

2 Model-Based Software Engineering
• “… the profession in which a knowledge of the mathematical and computing sciences gained by study, experience, and practice is applied with judgment to develop ways to utilize, economically, computing systems for the benefit of mankind.”

True engineering discipline
• The mathematics (the underlying formalisms) gives a model its single, precise, unambiguous interpretation.
• But everyday developers don’t have to work in that ultra-formal world. They can work in the comfortable world of class diagrams (based on set theory, relational algebra, measurement theory, etc.) and state charts (based on finite automata theory, functions, etc.), knowing that someone has already provided the linkage to the underlying formalisms.

Goal
• “… change the nature of programming from a private, puzzle solving activity to a public, mathematics based activity of translating specifications into programs … that can be expected to both run and do the right thing with little or no debugging.”

3 The Nature of Code

Syntax and Semantics in Programming Languages
• Syntax is structure: how words are assembled into sentences.
• Semantics is meaning: what those words and sentences mean.
• An operation’s “signature” is the operation’s name, return type(s), and ordered list of parameters and types: just a syntactic declaration of the interface.
• The behavior (semantics) of the operation should at least be implied by the operation and parameter names, but precise behavior is not specified.
• In the vast majority of mainstream software organizations, the only way for a client developer to be sure of the semantics of a called operation is to read the called operation’s method code.
• As a result, client code is written with knowledge of server code, so client code is almost certainly more tightly coupled to server code than it should be. This destroys encapsulation.
• The compiler/linker (and any static or dynamic code analysis tool, for that matter) only enforces syntactic consistency and is of essentially no help in recognizing semantic inconsistency.
• A significant fraction of software defects are rooted in semantic inconsistencies between client code and server code, and these defects are amazingly difficult to isolate and correct.

Design by Contract and Software Semantics
• Traditionally there has been an overemphasis on programming language syntax and an underemphasis on semantics.
• The remedy for this underemphasis on code semantics is “design by contract.”
• Under design by contract, an operation provides a “contract”, which has two parts:
-A “requires” or “preconditions” clause: conditions that are the responsibility of the client program(mer) to assure before calling this operation.
-A “guarantees” or “postconditions” clause: conditions that are the responsibility of the server program(mer) to make true by the time this operation completes.

An Example Operation with Its Contract

    public static boolean isLanguageInstalled(LanguageCode l, boolean b) {
        // requires
        //   l refers to a valid, recognized written language
        // guarantees
        //   if the language package identified by l is present
        //     then TRUE will be returned
        //     otherwise FALSE will be returned
        // some method implementation goes here
    }

The Importance of Semantics in Programming
• By exporting a signature and contract (through code documentation generation tools like JavaDoc, Doxygen, or any equivalent), the client program(mer) knows precisely the syntax and semantics of that operation without having to read the method implementation or, worse yet, guess.
• Defective code is syntactically correct (or it wouldn’t compile), but it is semantically meaningless in light of the behavior the stakeholders want.
• Debugging means trying to root out all those pesky semantic inconsistencies.

Software Automates “Business”
• All nontrivial software exists to automate some kind of business: that software is intended to enforce some set of policies and/or carry out some set of processes.
• For software developers to be successful automating someone’s business, those developers need to understand that business at least as well as (if not better than) the business experts understand it.
• Policies and processes automated in software can be very complex, and the biggest problem in mainstream software development is vague, ambiguous, and incomplete requirements.
• So, we need Model-Based Software Requirements.
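
Returning to the design-by-contract operation described earlier in this section, a contract can also be checked at run time as well as documented. The following is only a minimal sketch under stated assumptions, not the original implementation: the `LanguagePackages` class, the `LanguageCode` enum values, the `install` operation, and the in-memory set of installed packages are all hypothetical, and the sketch takes only the language code as a parameter.

```java
import java.util.EnumSet;
import java.util.Objects;
import java.util.Set;

public class LanguagePackages {
    // Hypothetical list of recognized written languages (an assumption for this sketch).
    public enum LanguageCode { EN, FR, DE, JA }

    // Hypothetical stand-in for wherever installed packages are really recorded.
    private final Set<LanguageCode> installed = EnumSet.noneOf(LanguageCode.class);

    // requires
    //   l refers to a valid, recognized written language
    // guarantees
    //   the language package identified by l is present
    public void install(LanguageCode l) {
        Objects.requireNonNull(l, "precondition violated: l must be a recognized language");
        installed.add(l);
    }

    // requires
    //   l refers to a valid, recognized written language
    // guarantees
    //   if the language package identified by l is present
    //     then true is returned
    //     otherwise false is returned
    public boolean isLanguageInstalled(LanguageCode l) {
        // The enum type makes every non-null value a recognized language,
        // so the only precondition left to check at run time is non-null.
        Objects.requireNonNull(l, "precondition violated: l must be a recognized language");
        return installed.contains(l);
    }

    public static void main(String[] args) {
        LanguagePackages packages = new LanguagePackages();
        packages.install(LanguageCode.FR);
        System.out.println(packages.isLanguageInstalled(LanguageCode.FR));
        System.out.println(packages.isLanguageInstalled(LanguageCode.DE));
    }
}
```

A client can write a correct call from the signature and the contract comments alone. A precondition violation fails immediately at the call boundary instead of surfacing later as a hard-to-trace semantic inconsistency.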

Semantic Models of Automation Technology
• Not only can we model the semantics of a business, but we can also model the semantics of an automation technology: a programming language.
• Today, developers build an implicit mental model of programming language semantics as they learn to program in that language. On the other hand, if language designers published an explicit semantic model, it could help avoid common misinterpretations of language elements as well as ensure that all compilers for that language behave in exactly the same way.

Code Is a Mapping
• Lines of code are a mapping (in the set theory sense) from elements of the semantic model of business policies and processes onto elements of the semantic model of the technology.
• For a given semantic model of a business and a given semantic model of a technology, many mappings are possible. Three critical properties must be present in any correct mapping:
-Sufficiently complete: every element in the semantic model of the business that stakeholders want automated needs to be mapped to at least one element of the semantic model of the technology. Nothing stakeholders want automated can be unmapped.
-Preserve semantics: all business policy and process semantics must be faithfully represented in technology semantics.
-Satisfy all nonfunctional requirements: all nonfunctional requirements such as technology, performance, reliability, portability, scalability, accuracy, security, etc. are met.

The Most Important Implication of “Code Is a Mapping”
• Real-world policies and processes are far too complex for developers to keep straight in their heads.
• The root of most software defects is simply incomplete, inconsistent, ambiguous, misunderstood, or incorrect business policy and process semantics.
• Most remaining defects are incorrect mappings of policy and process semantics onto technology semantics.
• The remaining defects are failures to satisfy one or more nonfunctional requirements.
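
One way to picture “code is a mapping” is to annotate each code element with the semantic model element it maps from. The sketch below is illustrative only and assumes an invented, very small order-handling policy: the `Order` class, its states, its events, and the “at least one line” guard are all hypothetical and do not come from any real semantic model.

```java
public class Order {
    // Model states -> Java enum constants.
    public enum State { OPEN, PLACED, CANCELLED }

    // Model: a new Order starts in state OPEN -> field initializer.
    private State state = State.OPEN;

    // Model attribute "number of order lines" -> int field.
    private int lineCount = 0;

    // Model event "add line", allowed only while OPEN -> method with a state check.
    public void addLine() {
        requireState(State.OPEN);
        lineCount++;
    }

    // Model transition OPEN -> PLACED on event "place",
    // with guard "has at least one line" -> method with two checks.
    public void place() {
        requireState(State.OPEN);
        if (lineCount == 0) {
            throw new IllegalStateException("guard violated: an order needs at least one line");
        }
        state = State.PLACED;
    }

    // Model transition OPEN -> CANCELLED on event "cancel".
    public void cancel() {
        requireState(State.OPEN);
        state = State.CANCELLED;
    }

    public State state() {
        return state;
    }

    // Preserving semantics: any transition the model does not define is rejected.
    private void requireState(State expected) {
        if (state != expected) {
            throw new IllegalStateException("event not allowed in state " + state);
        }
    }

    public static void main(String[] args) {
        Order order = new Order();
        order.addLine();
        order.place();
        System.out.println(order.state());
    }
}
```

In this sketch every state, event, and guard of the assumed model has a counterpart in code (sufficiently complete), and events the model does not allow are refused (preserve semantics). Nonfunctional requirements are outside what such a small example can show.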

4 Fundamental Principles

The fundamental principles are part of software engineering practice: ways of approaching problem solving that help manage the complexity that needs to be managed.
• Focus on semantics
• Control complexity
• Use appropriate abstractions
• Encapsulate accidental complexity
• Maximize cohesion and minimize coupling
• Design to invariants and design for change
• Avoid premature design and optimization
• Name things carefully

Focus on Semantics

You should do all of the following to have an adequate focus on semantics:
• Treat the semantic model as exactly what it is intended to be: a complete, consistent, clear, concise, precise, unambiguous, and validated specification of policy and process semantics.
• Use design by contract and Liskov substitutability, which are two of several ways to focus on semantics in design and code.
• Use programming by intent, assertions, and proof of correctness, which are other ways.
• Understand that code is a mapping from those policy and process semantics onto automation technology (i.e., programming language) semantics.

Control Complexity

Software complexity comes at a cost:
• More complex things cost more and take longer to understand, design, and build.
• More complex things cost more and take longer to maintain.
• More complex things tend to have more defects, further increasing cost and time to maintain.

Essential Versus Accidental Complexity
• Essential complexity is in the problem space: it is in the policies and processes being automated.
• The other kind of complexity is accidental: it is in the solution space.
-Threaded code is more complex than non-threaded code. Caching, data compression, data de-normalization, and making software scalable are other examples of solution space complexities.
-Structural complexity is also in the solution space.
These have nothing to do with the policies and processes being automated and everything to do with how the developers are doing that automation.
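
The distinction between the two kinds of complexity can be sketched in code. The example below is a rough illustration under stated assumptions: the shipping-cost rule, the `ShippingPolicy` interface, and both implementing classes are hypothetical names invented here. The business rule is the essential part; the cache is accidental complexity of the kind listed above.

```java
import java.util.HashMap;
import java.util.Map;

public class ComplexityKinds {
    // Problem-space view: what a shipping cost is, independent of how it is computed.
    public interface ShippingPolicy {
        int costInCents(int weightKg);
    }

    // Essential complexity: the (hypothetical) business rule itself.
    public static class StandardShippingPolicy implements ShippingPolicy {
        @Override
        public int costInCents(int weightKg) {
            return 500 + 200 * weightKg; // flat fee plus a per-kilogram charge
        }
    }

    // Accidental complexity: caching changes how the answer is produced,
    // not what the answer is.
    public static class CachingShippingPolicy implements ShippingPolicy {
        private final ShippingPolicy delegate;
        private final Map<Integer, Integer> cache = new HashMap<>();
        private int delegateCalls = 0;

        public CachingShippingPolicy(ShippingPolicy delegate) {
            this.delegate = delegate;
        }

        @Override
        public int costInCents(int weightKg) {
            Integer cached = cache.get(weightKg);
            if (cached == null) {
                delegateCalls++;
                cached = delegate.costInCents(weightKg);
                cache.put(weightKg, cached);
            }
            return cached;
        }

        // Exposed only so the effect of the cache can be observed in this sketch.
        public int delegateCalls() {
            return delegateCalls;
        }
    }

    public static void main(String[] args) {
        ShippingPolicy plain = new StandardShippingPolicy();
        CachingShippingPolicy cached = new CachingShippingPolicy(plain);
        System.out.println(plain.costInCents(3));
        System.out.println(cached.costInCents(3));
        System.out.println(cached.costInCents(3));
        System.out.println(cached.delegateCalls());
    }
}
```

Both policies return the same cost for the same weight. Everything in `CachingShippingPolicy` beyond the delegated call exists only for solution-space reasons, which is why it can sit behind the same interface without clients noticing.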

Necessary Versus Unnecessary Accidental Complexity
• Necessary accidental complexity is unavoidable due to nontrivial performance requirements. Without that solution space complexity, the software would not satisfy stakeholder requirements.
• Unnecessary accidental complexity, on the other hand, is solution space complexity that is not helping to meet any performance requirement.
-Technical debt can be considered unnecessary accidental complexity.
-Refactoring is generally intended to reduce unnecessary accidental complexity.
-Developers should strive to eliminate unnecessary accidental complexity at every reasonable opportunity.

Manage Complexity
• As for essential complexity and necessary accidental complexity, the best that can be done is to manage it.
• You can never eliminate all complexity.
• The best you can do is to eliminate unnecessary accidental complexity and manage as much of the remaining complexity as you can by reducing the amount you have to deal with at any one time.

The remaining principles are all ways to:
• Manage essential complexity
• Manage necessary accidental complexity
• Eliminate unnecessary accidental complexity

Use Appropriate Abstractions

Abstraction can be defined as: “The principle of ignoring those aspects of a subject that are not relevant to the current purpose in order to concentrate solely on those that are.” Abstraction is “permission to ignore.”

Abstracting Away Implementation Technology
• Model-based software engineering intentionally abstracts automation technology away from the semantic model. Abstracting away solution complexities allows the focus to be solely on understanding the policies and processes to be automated (completely, consistently, clearly, concisely, and correctly) without being distracted by technical details that are irrelevant at that time.

Abstracting Away Business Details
• Abstraction also applies in the problem (policy and process) space:
-A class is an abstraction of a set of existing or potential business-relevant things that are subject to the same policies, processes, and constraints.

Abstraction is the single most powerful complexity management tool software professionals have. Developers who are good at abstracting (those who can cover a problem space with the fewest, cleanest, simplest abstractions) write the smallest, cleanest, simplest, most understandable, and easiest-to-maintain code. Developers who have difficulty abstracting tend to write a lot of ugly, complex, defect-ridden, and hard-to-maintain code.

Encapsulate Accidental Complexity
• Abstraction and encapsulation are related concepts, but they differ in a very important way.
• Abstraction is permission to ignore.
• Encapsulation takes that one step further by actively preventing you from knowing.
• Encapsulation doesn’t help manage essential complexity because that can’t be hidden.
• Encapsulation helps manage accidental complexity: we want to hide as much accidental complexity (implementation detail) as possible.
• Encapsulation means we don’t ever want client developers to know how a called operation is implemented.

Develop to an interface, not through an interface
• Client developers clearly need to know the syntax of the server operation’s interface: they see that in the signature. But operation semantics can’t be hidden either.
• The semantics need to be exposed, and that’s the role of the contract.
• Using only the signature (syntax) and contract (semantics), client developers can code to an interface, not through it.
• Implementation details are accidental complexity because they relate to how the function is designed and built.
• Without design by contract, there can be no encapsulation!

Maximize Cohesion and Minimize Coupling
• We build software by first decomposing the problem into pieces that are small enough to solve.
• We then solve those pieces.
• Finally, we compose those small solutions into something that solves the original problem.
• It’s a process of decomposition, solution, and then recomposition.
• This leads to important questions: “How can I tell a good decomposition from a bad decomposition?” “Is there a better decomposition than this one?” “If so, what would that better decomposition look like?”

Cohesion
• Cohesion considers the extent to which elements in a decomposition solve single subproblems.
• Things that belong together should be close together, and things that don’t belong together should be separated.
• Clearly, the version of isLanguageInstalled() shown below is solving two very different subproblems: querying for a language package and deleting one.
• The remedy is to split the code into two distinct operations, one to only query and the other to only delete.
• After the split, the two operations will each do one single thing; they will now each be highly cohesive.

    public static boolean isLanguageInstalled(LanguageCode l, boolean b) {
        // requires
        //   l refers to a valid, recognized written language
        // guarantees
        //   when b is TRUE
        //     then returns TRUE when language package l is present
        //     otherwise returns FALSE
        //   when b is not TRUE
        //     then deletes language package l and returns FALSE
        // some method implementation goes here
    }

Coupling
• Coupling is about connections between elements.
• When two things do need to be connected, that connection should be as loose as possible.
-Programming to an interface, instead of programming through an interface, is one way to reduce coupling.
When a system is loosely coupled, changes are localized and don’t ripple. Cohesion and coupling are pervasive properties that apply at all levels of decomposition, not just to operations.

Design to Invariants and Design for Change
• Design to invariants: the design has to be based on something, and it makes the most sense to base it on things that are least likely to change.
• Design for change: add configuration zones where requirements are likely to change.
• Design-for-change strategies include:
• Separate common from variable functionality.
-Conditional compilation.
-Inheritance: the superclasses represent invariants, and subclasses represent variants.
-Aspect-oriented programming: variation is hidden inside aspects.
• Hide variation behind an abstract interface.
-Architectural layering.
-Several design patterns including Adapter, Bridge, Strategy, etc.
• Use delayed binding.
-Using #define constants instead of magic numbers.
-Dependency injection.
-Self-configuration.
-Data-driven design.
• Software can be structured around the invariants, with configuration zones put in for the requirements that are likely to change.
• Adding scope to a system necessarily introduces complexity, even if it is invariant.
• Adding the ability to handle variation is even more complex.
• Be sure any complexity added to handle variation is adequately compensated by increased value.

Avoid Premature Design and Optimization
• Fudd’s First Law of Creativity says, “To come up with a really good idea, start by coming up with as many ideas as you can, then throw away the bad ones.”
• Tactics for avoiding premature design in the semantic model are as follows:
-Convince yourself of the value of Fudd’s First Law of Creativity.
-Overcome the instinctive urge to solve problems too early by telling yourself, “If I can just wait until I really understand the problem, then (and only then) will I be able to come up with a killer solution.”
-Be careful to model only the true nature of the policies and processes.
-Follow the rules and guidelines presented along with each semantic model element.
-Review models with others.
• Premature optimization in design also risks suboptimal solutions, and there can be significant cost, schedule, and (unnecessary) accidental complexity implications as well.
-Start with the simplest possible design: develop the initial design and code emphasizing simplicity, readability, maintainability, modularity, etc.
-Only optimize when and where there is sufficient value.

Name Things Carefully
• Naming is vitally important, particularly in how it affects abstraction and encapsulation.
• Names should:
-Be in the problem space (policy and process) vocabulary.
-Fully and accurately describe the meaning of the named thing.
-Be unique and have exactly one meaning within their context.
-Be clear, obvious, readable, and pronounceable to a typical reader.
-Not be contractions (e.g., getWindow() rather than getWin()).
-Abbreviate only when necessary and then always consistently.
-Be similar for similar things (e.g., don’t use dependent.GetId(), supervisor(), employee.Id.Get(), or candidate.Id() for operations that all return entity identifiers).
-Be contrasting for contrasting things (e.g., open/close, start/finish, … not open/finish).
-Be identical for the same thing. Sometimes the same concept appears in different places, and that one concept needs to have the same name in all places to emphasize that it is the same thing.
• A good name is like a walking comment.

Summary
• Complexity costs time and money, both in initial development and in long-term maintenance.
• We need to be smart about complexity: find ways to eliminate it when it has negative value and manage it when it has positive value.
• The fundamental principles are part of software engineering practice. They are ways of approaching problem solving that help manage the complexity that needs to be managed, because it’s complexity with net economic benefit, and that help avoid or eliminate unnecessary complexity, because it’s complexity that doesn’t bring enough economic benefit to justify its presence.
• These are capabilities common to both business and technology.