Lesson 1 - Introduction to C# and the .NET framework
Lesson highlights
Are you looking for a quick reference on the .NET framework instead of the full lesson? Here it is:
C# programs are compiled into instructions for the CLR virtual machine, which is included in every modern version of Windows.
The advantages are:
- Revealing errors in source code
- Stability
- Simple development
- Speed
- Low vulnerability
- Portability
The .NET framework covers programming languages, Visual Studio, Virtual Machine (CLR) and a complete set of easy-to-use libraries for developing console apps, databases, forms, etc.
Would you like to learn more? A complete lesson on this topic follows.
Welcome to the first lesson of the C# .NET course. This course is all about the C# language and the .NET framework. We'll go through step by step, from the very beginning to the more complex structures, object models, databases, and web applications. With a little patience and persistence, you will become a good programmer.
To fully understand the C# language, we'll have to look to the past and get a good understanding of how programming languages have evolved over the course of time. Doing so will enable us to understand how C# works, and why it is deemed an all-around good programming language to work with.
Evolution of programming languages
1st generation languages - Machine code
Computer processors can perform a limited number of simple instructions, which are stored as a sequence of bits, i.e. numbers. These instructions are usually written in hexadecimal to make reading them less of a chore. However, the instructions are so primitive that all you can really do is move values between memory and the processor, perform basic arithmetic on them, and jump between instructions. At this level, one does not simply add two numbers together. Instead, we load the first number from its address in memory, add the value stored at the second address, and store the result at a third address, which takes multiple instructions. Here's what adding two numbers would look like in hex:
2104 1105 3106 7001 0053 FFFE 0000
The processor receives these instructions in binary. This sort of code is practically unreadable and depends on the instruction set of the given CPU. I assure you, programming in this "language" is extremely tedious. Unfortunately, every program must ultimately be translated into this binary format so that it can be executed by a computer processor.
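To make the hex listing above less cryptic, here is a minimal C# sketch of a toy processor that executes it. The encoding is an assumption inferred from the listing, not something the lesson states: the first hex digit is taken as the operation (2 = load, 1 = add, 3 = store), the remaining three digits as a memory address, 7001 as "halt", and the program is assumed to be loaded at address 100 (hex). The names TinyMachine and Run are hypothetical.

```csharp
using System;

public static class TinyMachine
{
    // Loads the program at the given origin, runs it until the halt word,
    // and returns the final contents of memory.
    public static ushort[] Run(ushort[] program, int origin)
    {
        ushort[] memory = new ushort[4096];   // 3 hex digits of address = 4096 words
        Array.Copy(program, 0, memory, origin, program.Length);

        int pc = origin;   // program counter: address of the next instruction
        ushort acc = 0;    // accumulator: the single working register

        while (true)
        {
            ushort instruction = memory[pc];
            pc++;

            if (instruction == 0x7001) break;     // assumed encoding of "halt"

            int opcode = instruction >> 12;       // first hex digit
            int address = instruction & 0x0FFF;   // remaining three hex digits

            switch (opcode)
            {
                case 2:   // load the word at the address into the accumulator
                    acc = memory[address];
                    break;
                case 1:   // add the word at the address (16-bit wrap-around)
                    acc = unchecked((ushort)(acc + memory[address]));
                    break;
                case 3:   // store the accumulator at the address
                    memory[address] = acc;
                    break;
                default:
                    throw new InvalidOperationException(
                        "Unknown instruction " + instruction.ToString("X4"));
            }
        }
        return memory;
    }

    public static void Main()
    {
        ushort[] program = { 0x2104, 0x1105, 0x3106, 0x7001, 0x0053, 0xFFFE, 0x0000 };
        ushort[] memory = Run(program, 0x100);
        Console.WriteLine((short)memory[0x106]);   // prints 81
    }
}
```

Under these assumptions the last three words are data: 0053 is 83, FFFE is -2 in 16-bit two's complement, and the final word receives the result.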
2nd generation languages - Assembler
Assembly language (ASM for short), named after the assembler program that translates it, is no simpler than machine code, but at least it's human-readable! Here, the instructions have short text codes (mnemonics), so that people don't have to memorize every single one of the number combinations. The instruction codes are later translated into binary code. Adding two numbers up in ASM would go something like this:
ORG 100
LDA A
ADD B
STA C
HLT
A, DEC 83
B, DEC -2
C, DEC 0
END
It's a bit more human-readable, but most people, including me, would still have no clue how this program works.
3rd generation languages
Third generation languages finally offer a good amount of abstraction from how the computer sees the program. Rather than forcing us to adapt to the computer's arcane way of thinking, these languages focus more on how we see the program. Numbers are represented as variables, and the code looks almost like mathematical notation.
Adding up two numbers in the C language would go like this:
int main(void)
{
    int a, b, c;
    a = 83;
    b = -2;
    c = a + b;
    return 0;
}
Pretty much anyone could work out what this program does just by looking at it. It sums 83 and -2 up, and stores the result in a variable named c. The main advantage third generation languages had over all of the previous languages was high readability.
As time went on and code optimization was in demand, object-oriented programming was brought into play, which we will get into later. Third generation languages are essentially divided into the following categories:
Compiled languages
Compiled aka unmanaged languages have their source code in a language that people can fully understand. The source code must still be translated into machine code so that it can be executed by the processor. This translation is provided by a compiler, which compiles the entire program into machine code.
Compiled languages have the following advantages:
- Speed - The only delay is the one-time compilation. Once a program is compiled, it runs as quickly as a program written in ASM, or even more quickly thanks to compiler optimizations.
- Inaccessibility of source code - The program is distributed in the compiled form, which makes modifying it very difficult if you don't have the source code.
- Easy to detect errors in source code - If there is an error in the source code, compilation fails and the programmer gets to see where the mistake is. This greatly simplifies software development.
There are, as you may have guessed, disadvantages as well:
- Platform dependency - The compiled program depends on a specific platform, i.e. on the processor and operating system. We cannot take a compiled program and run it on another platform without recompiling it and tweaking it a bit.
- Inability to edit - Once the program is compiled into machine code, the only way to change it is to edit the source code and recompile. That also applies to the earlier generations of languages mentioned above.
- Memory management - Computers execute instructions mechanically, so nothing stops a program from writing outside the memory it has reserved (a memory overflow). Compiled languages typically don't have automatic memory management, so they're more of a hassle to work with. Run-time errors are caused mainly by manual memory management, and they cannot be detected by compilation.
Examples of compiled languages include the C language, its object-oriented successor C++, and Pascal/Delphi.
Interpreted languages
Interpreted languages attempt to solve program portability issues and to make programmers' lives a bit easier. Interpreters work much like compilers do, but instead of translating the entire program all at once, they only translate what is needed at a given moment. The name comes from the human profession of interpreting, where an interpreter is someone who listens and serves as a "middle man" for people who do not speak the same language. In other words, the interpreter translates what each person says into a language that the other understands, and does so while they speak. Interpreted languages work in pretty much the same way: the source code is read line by line, translated into machine instructions, executed, and then the translation is thrown away.
Repeating this translation on every run costs processor power, so interpretation is not the fastest way to get things done.
What advantages does interpretation have, then?
- Portability: The program is fully portable. If the platform has an interpreter, our program will be able to run on said platform (developing an interpreter is much simpler than developing a compiler).
- Simpler development - We no longer have to deal with manual memory management. All of that is done for us by what is known as the garbage collector (we'll get into that and more in the advanced courses). In some cases, we don't even have to specify data types, which makes data structures more comfortable to work with.
- Stability - Because the interpreter actually understands the code, it catches dangerous operations that a compiled program would simply execute. Running a program in an interpreter is therefore safer than running compiled code. This type of language also brings reflection into play, where the program examines itself at run-time (more on this later on in the courses).
- Easy editing - We can write programs in parts, and upload them to the target destination whenever we want because the code doesn't need to be compiled. In other words, it can easily be edited on the fly.
Interpreters have three major disadvantages:
- Speed - Interpretation can be very slow at times, and the program wouldn't use your computer to its full capacity.
- Difficulties in finding errors - Due to compilation being done during run-time, errors won't pop up before the code is executed, which can be very annoying.
- Vulnerability - Since the program is distributed as source code, anyone and everyone can alter or even steal parts of it.
PHP is an interpreted language. A large share of websites are written in this relatively easy language because it gets the job done. Facebook used a custom version of PHP; if you're interested, look up the "HipHop for PHP" project.
Languages with the virtual machine
Hmm, now what if we took the best of both approaches and left out most of the disadvantages? Thus, the virtual machine was born! Languages that run on a virtual machine are the most advanced kind of programming languages, currently the most widespread and the best choice for developing most applications. C# and Java belong to this category.
First and foremost, the source code is compiled into what we call "intermediate code". Microsoft calls it CIL (Common Intermediate Language). It is binary code much like machine code, but it has a considerably simpler instruction set and directly supports object-oriented programming. Thanks to this simplicity, intermediate code can be processed relatively quickly by the virtual machine, i.e. the intermediate code interpreter. In .NET, the virtual machine is called the CLR (Common Language Runtime). It translates CIL into native instructions, which are then fed right into our processor.
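To give a feel for what intermediate code looks like, the sketch below uses reflection to read the raw CIL bytes of a small method from the compiled program. The names CilDemo and GetAddIl are hypothetical. The instructions named in the comments are what the C# compiler typically emits for an optimized build; the exact bytes depend on the compiler version and settings, so treat them as an illustration rather than a guarantee.

```csharp
using System;
using System.Reflection;

public static class CilDemo
{
    // In an optimized build, this body typically compiles to four CIL
    // instructions: ldarg.0, ldarg.1, add, ret.
    public static int Add(int a, int b)
    {
        return a + b;
    }

    // Returns the CIL bytes of Add as they are stored in the compiled assembly.
    public static byte[] GetAddIl()
    {
        MethodInfo method = typeof(CilDemo).GetMethod("Add");
        return method.GetMethodBody().GetILAsByteArray();
    }

    public static void Main()
    {
        Console.WriteLine(Add(83, -2));   // prints 81

        // Prints the CIL bytes in hex, e.g. 02-03-58-2A in an optimized build
        // (a debug build produces a longer sequence).
        Console.WriteLine(BitConverter.ToString(GetAddIl()));
    }
}
```

Note that the stored instructions say nothing about a particular CPU; turning them into native instructions is the job of the CLR on the machine where the program runs.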
By using a virtual machine, we eliminate most of the disadvantages of both interpreters and compilers, while still keeping most of their advantages:
- Revealing errors in source code - CIL compilation easily uncovers bugs in the source code.
- Stability - Because the virtual machine understands the code, it keeps us from performing dangerous operations and alerts us with error messages. Reflection is still available if needed.
- Simple development - We have advanced data structures and libraries available. Memory management is done for us by the garbage collector.
- Good amount of speed - The speed of a virtual machine is somewhere in between that of an interpreter and that of compiled code. The virtual machine keeps the machine code it produces instead of throwing it away like interpreters usually do, a technique known as "Just In Time" (JIT) compilation, and it can optimize code that runs repeatedly. On the other hand, the program is a bit slower because the machine has to translate common libraries at run-time.
- Low vulnerability - The application is distributed as compiled CIL rather than as source code, so it isn't directly human-readable.
- Portability - The final program will run on any hardware that has a suitable virtual machine installed. In fact, it doesn't even depend on the source language. Several people can work on one project, one in C#, the second in Visual Basic, and the third in C++, because the source code is always translated into CIL.
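Three of the points above (stability, garbage collection and reflection) can be sketched in a few lines of C#. This is only an illustration under simple assumptions; the class SafetyDemo and its method names are hypothetical.

```csharp
using System;
using System.Reflection;

public static class SafetyDemo
{
    // Stability: reading past the end of an array is checked by the runtime,
    // which throws an exception instead of silently reading unrelated memory.
    public static bool ReadOutOfRangeIsCaught()
    {
        int[] numbers = { 1, 2, 3 };
        int index = 5;
        try
        {
            Console.WriteLine(numbers[index]);
            return false;   // not reached: the line above throws
        }
        catch (IndexOutOfRangeException)
        {
            return true;
        }
    }

    // Garbage collection: buffers are allocated in a loop and never freed by
    // hand; the garbage collector reclaims them once they are unreachable.
    public static long AllocateWithoutFreeing()
    {
        long totalBytes = 0;
        for (int i = 0; i < 1000; i++)
        {
            byte[] buffer = new byte[100000];
            totalBytes += buffer.Length;
        }
        return totalBytes;
    }

    // Reflection: the program inspects itself and lists the public static
    // methods declared on this class.
    public static string[] OwnMethodNames()
    {
        MethodInfo[] methods = typeof(SafetyDemo).GetMethods(
            BindingFlags.Public | BindingFlags.Static | BindingFlags.DeclaredOnly);
        string[] names = new string[methods.Length];
        for (int i = 0; i < methods.Length; i++)
        {
            names[i] = methods[i].Name;
        }
        return names;
    }

    public static void Main()
    {
        Console.WriteLine(ReadOutOfRangeIsCaught());    // prints True
        Console.WriteLine(AllocateWithoutFreeing());    // prints 100000000
        Console.WriteLine(string.Join(", ", OwnMethodNames()));
    }
}
```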
Languages with a VM are designed for object-oriented programming and represent the most modern way to develop software. There are also 4th and 5th generation languages, but they are very specialized and we won't cover them today.
The .NET Framework
We've already explained how C# works. Now we'll go over what the .NET framework is. It is meant to cover four things: language, Visual Studio, Virtual Machine (CLR) and Libraries.
Language
As I mentioned above, .NET offers several languages to work in. C# is the most popular and most advanced of them, and it was designed specifically for .NET.
Visual Studio
Visual Studio is an IDE (Integrated Development Environment), the environment in which we write source code and which also helps us with the development itself. VS is respected even among Java programmers. It is a modern IDE that is free to use if you get the Community edition.
Virtual machine
The CLR is a virtual machine that translates CIL into instructions for the current CPU and executes them.
Libraries
Libraries are probably the biggest advantage of .NET. Microsoft provides a complete set of libraries with pre-made structures and components, e.g. for console apps, databases, forms, etc. Since MS is also the author of Windows, their components fit nicely there. To run an application, the target machine must have the version of .NET for which the application was written. The good news is that modern Windows versions always have .NET installed.
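As a small taste of these pre-made structures, the sketch below sorts a few numbers with List<T> and builds a line of text with StringBuilder, both of which ship with .NET. The class LibraryDemo and the method SortedReport are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class LibraryDemo
{
    // Sorts the values with the library's List<T> and joins them into one
    // line of text with StringBuilder. No sorting code has to be written by hand.
    public static string SortedReport(IEnumerable<int> values)
    {
        List<int> list = new List<int>(values);
        list.Sort();

        StringBuilder text = new StringBuilder();
        foreach (int value in list)
        {
            if (text.Length > 0)
            {
                text.Append(", ");
            }
            text.Append(value);
        }
        return text.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(SortedReport(new[] { 83, 7, 42 }));   // prints 7, 42, 83
    }
}
```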
The .NET Framework was built up in layers, version by version:
Version 2.0 contains the CLR virtual machine itself and the Base Class Library. Version 3.0 brought new ways of developing form applications and workflows. Version 3.5 is especially interesting for us since it brought LINQ, which we will get into later on. Version 4.0 added the ability to run LINQ queries on multi-core processors. In 2012, version 4.5 was released. It simplifies writing asynchronous functions, which we will also cover later on.
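The course covers these features properly later on. As a quick preview, here is a hedged sketch that touches each of them once: a LINQ query, the same query run in parallel, and an asynchronous method. The class VersionFeatures and its method names are hypothetical, and the 100 ms delay merely stands in for some slow operation.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class VersionFeatures
{
    // LINQ: pick the even numbers, square them, and add them up.
    public static int SumOfEvenSquares(int[] numbers)
    {
        return numbers.Where(n => n % 2 == 0).Select(n => n * n).Sum();
    }

    // Parallel LINQ: the same query, but AsParallel() lets the runtime
    // spread the work across several processor cores.
    public static int ParallelSumOfEvenSquares(int[] numbers)
    {
        return numbers.AsParallel().Where(n => n % 2 == 0).Select(n => n * n).Sum();
    }

    // async/await: the method waits without blocking a thread, then
    // continues where it left off.
    public static async Task<int> SumLaterAsync(int[] numbers)
    {
        await Task.Delay(100);   // stand-in for a slow operation
        return SumOfEvenSquares(numbers);
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(SumOfEvenSquares(numbers));           // prints 56 (4 + 16 + 36)
        Console.WriteLine(ParallelSumOfEvenSquares(numbers));   // prints 56
        Console.WriteLine(SumLaterAsync(numbers).Result);       // prints 56
    }
}
```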
Now we know what we're going to work with. In the next lesson, Visual Studio and your first C# .NET console application, we'll work with Visual Studio to create our first program.